What do vegetables and data have in common? They both bring more benefits in their raw form. While standard GA4 reports can quickly satisfy your hunger, raw data lets you cook something unique and get fresh insights. You probably have daily access to gigabytes of user data, but this amount doesn’t bring value until you make it work.
In this article, we look at what raw data is, why you need it, and how to get and use it. Raw data is unprocessed information collected from various sources. Once it’s properly structured and integrated, it helps organizations analyze customer behavior and run effective marketing.
Raw data refers to unprocessed and unstructured data collected directly from various sources. It contains all available details without any manipulation or analysis.
Note: This article was created in 2020 and updated in May 2025 due to significant changes in privacy restrictions that formed the new rules for data collection and user behavior analytics processes.
The words “data” and “information” are often used as synonyms. However, the two are fundamentally different.
Data is a recorded set of facts about events and phenomena, stored on a storage medium. Information is the result of processing data to solve specific tasks.
Marketing often uses raw data in the form of customer behavior records. This could include customer IDs, impressions, clicks, comments, purchase dates, product SKUs, and transaction amounts.
This data can be collected from various sources, such as online sales platforms, point-of-sale systems, or customer surveys.
Analyzing each data point like this can help marketing analysts uncover trends, identify customer preferences, and develop targeted marketing strategies to enhance customer engagement and boost sales.
For example, you can collect data in a Google BigQuery repository, and when you run a SQL query on it, BigQuery returns information in response. You can also analyze a small chunk of data in Google Sheets with functions like VLOOKUP, XLOOKUP, and QUERY, the IMPORT functions, pivot tables, and more.
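To make the data-versus-information distinction concrete, here is a minimal sketch that runs one SQL query over a handful of raw rows. It uses Python's built-in `sqlite3` module purely as a local stand-in for a warehouse such as BigQuery; the table name, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical raw rows: (user_id, event_name, revenue).
RAW_EVENTS = [
    ("u1", "page_view", 0.0),
    ("u1", "purchase", 40.0),
    ("u2", "page_view", 0.0),
    ("u3", "page_view", 0.0),
    ("u3", "purchase", 25.5),
]

def summarize(events):
    """Turn raw rows (data) into an aggregated answer (information)."""
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse table
    conn.execute("CREATE TABLE events (user_id TEXT, event_name TEXT, revenue REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", events)
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT user_id),       -- unique users
               SUM(event_name = 'purchase'),  -- number of purchase events
               SUM(revenue)                   -- total revenue
        FROM events
        """
    ).fetchone()
    conn.close()
    return {"users": row[0], "purchases": row[1], "revenue": row[2]}

summary = summarize(RAW_EVENTS)
```

The five stored rows are the data. The three numbers in `summary` are the information a specific question extracted from them, and a different question would only need a different query over the same rows.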
In informatics, analytics, marketing, and some other fields, these “data” and “information” have special names: raw (unprocessed) and aggregated (processed) data.
How does aggregation work in Google Analytics 4 reports?
When the number of rows with values for one parameter exceeds the specified limit (50,000 per day and 1 million per period), the system aggregates the remaining values into a single (other) row.
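The roll-up described above can be sketched in a few lines. This is an illustration only, not GA4's actual algorithm: the limit of 3 is a hypothetical stand-in for the real limits, and keeping the most frequent values is an assumption made for the example.

```python
from collections import Counter

ROW_LIMIT = 3  # hypothetical tiny limit so the effect is visible

def build_report(page_views, row_limit=ROW_LIMIT):
    """Keep the top `row_limit` values and fold everything else into '(other)'."""
    ranked = Counter(page_views).most_common()
    report = dict(ranked[:row_limit])
    overflow = sum(count for _, count in ranked[row_limit:])
    if overflow:
        report["(other)"] = overflow
    return report

# Hypothetical raw hits, one page path per hit.
raw_hits = ["/home"] * 5 + ["/cart"] * 3 + ["/blog"] * 2 + ["/faq", "/about", "/jobs"]
report = build_report(raw_hits)
```

In `report`, the three least visited pages are no longer distinguishable. With the raw hits you can still break the `(other)` row back down; with only the aggregated report you cannot.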
Here are some raw data examples collected from various sources:
The less raw data you use for analysis, the less accurate your results are. Sampling can distort reporting and lead to inefficient decisions. As a result of sampling, you risk overlooking ads that make a profit or, conversely, spending money on inefficient campaigns.
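A small, deliberately contrived sketch shows how this can happen. It takes every tenth session as the sample and scales the totals back up, which is a simplification: real sampling is random and the size of the error varies. The campaign names and revenue figures are hypothetical, and the data is arranged so that the sample misses the rare campaign.

```python
def revenue_by_campaign(sessions):
    """Sum revenue per campaign over (campaign, revenue) pairs."""
    totals = {}
    for campaign, revenue in sessions:
        totals[campaign] = totals.get(campaign, 0) + revenue
    return totals

def sampled_estimate(sessions, step):
    """Use every `step`-th session and scale the totals back up."""
    sample = sessions[::step]
    return {c: r * step for c, r in revenue_by_campaign(sample).items()}

# Hypothetical data: 'niche' is rare but profitable.
sessions = [("niche", 50) if i % 10 == 7 else ("brand", 1) for i in range(100)]

full = revenue_by_campaign(sessions)            # computed on all 100 sessions
estimate = sampled_estimate(sessions, step=10)  # computed on 10 sessions
```

On the full data, `niche` earns most of the revenue. In the sampled estimate it does not appear at all, while `brand` is slightly overstated, so a decision based on the sample alone would cut the campaign that actually pays.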
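Working with aggregated information to track your website’s main KPIs is convenient, but it’s not enough to solve more complex problems. Only with raw data can you:
- Perform a deep analysis of metrics and their dependencies
- Track the user's entire journey from first touch to purchase
- Build any reports without the limits and restrictions of Google Analytics and reveal valuable insights
- Merge information from different sources and set up advanced analytics
- Create complex sales funnels that match your business structure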
Here are a few significant benefits of using raw data.
By collecting statistics from your website to Google BigQuery or another tool, you can bypass sampling and other restrictions of Google Analytics 4. This will allow you to analyze your complete figures and make better decisions based on the analyzed data.
See also: How to collect complete user behavior data on your website and cost data from advertising services with minimal resource expenses.
Google Analytics 4 and most other analytics systems limit your ability to generate reports. For example, they limit the number of dimensions and metrics in a report and which of them can be combined. With access to raw data, you can build reports with any number and combination of dimensions and metrics you need.
For example, you can conduct a cohort analysis in terms of dimensions that are relevant to your business.
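As a rough illustration, a weekly cohort table can be built from raw events with nothing more than a user ID and a date. The sketch below groups users by the ISO week of their first visit and counts how many are active in each following week. The event list is hypothetical, and the weekly granularity is just one possible choice of cohort dimension.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw events: (user_id, event_date).
events = [
    ("u1", date(2025, 1, 6)), ("u1", date(2025, 1, 14)),
    ("u2", date(2025, 1, 7)),
    ("u3", date(2025, 1, 13)), ("u3", date(2025, 1, 20)),
    ("u1", date(2025, 1, 21)),
]

def weekly_cohorts(events):
    """Map cohort week -> {weeks since first visit -> number of active users}."""
    first_seen = {}
    for user, day in sorted(events, key=lambda e: e[1]):
        first_seen.setdefault(user, day)  # earliest date wins
    active = defaultdict(lambda: defaultdict(set))
    for user, day in events:
        start = first_seen[user]
        cohort = start.isocalendar()[:2]  # (ISO year, ISO week) of first visit
        offset = (day - start).days // 7  # full weeks since the first visit
        active[cohort][offset].add(user)
    return {c: {k: len(u) for k, u in sorted(w.items())} for c, w in active.items()}

cohorts = weekly_cohorts(events)
```

Because the grouping key is ordinary code, you could swap the ISO week for a first-purchase category, acquisition channel, or any other dimension present in your raw data.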
To set up advanced analytics, you can combine website statistics with information from advertising services, call tracking systems, emails, and your CRM.
With these statistics, you can consider all user touchpoints with your company, analyze the conversion paths, evaluate the impact of all marketing efforts (both online and offline) on business indicators, find the most effective marketing channels, and quickly optimize those that lose money.
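The merge itself can be sketched as a join on shared keys. The example below assumes, hypothetically, that website sessions and CRM orders share a `user_id` and that ad spend is available per channel. It credits each order to a single recorded channel per user, which is a strong simplification of real attribution.

```python
# Hypothetical extracts from three systems.
sessions = [  # website: which channel brought which user
    {"user_id": "u1", "channel": "google_cpc"},
    {"user_id": "u2", "channel": "facebook"},
    {"user_id": "u3", "channel": "google_cpc"},
]
crm_orders = [  # CRM: confirmed orders, including offline ones
    {"user_id": "u1", "amount": 120.0},
    {"user_id": "u3", "amount": 80.0},
    {"user_id": "u2", "amount": 30.0},
]
ad_costs = {"google_cpc": 100.0, "facebook": 60.0}  # ad services: spend by channel

def roas_by_channel(sessions, crm_orders, ad_costs):
    """Join orders to channels via user_id, then divide revenue by ad spend."""
    channel_of = {s["user_id"]: s["channel"] for s in sessions}
    revenue = {}
    for order in crm_orders:
        channel = channel_of.get(order["user_id"], "(unknown)")
        revenue[channel] = revenue.get(channel, 0.0) + order["amount"]
    return {ch: revenue.get(ch, 0.0) / cost for ch, cost in ad_costs.items() if cost}

roas = roas_by_channel(sessions, crm_orders, ad_costs)
```

In this toy data, `google_cpc` returns twice its cost while `facebook` returns half, a comparison that no single source could produce on its own.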
You can also check how to use analytics to create reports and avoid depleting your budget.
Using raw data, you can segment users based on their actions on your website (browsing pages, clicking links, adding items to the shopping cart, etc.), then send them triggered emails. In addition, you can automatically upload audiences to advertising services to launch remarketing campaigns and set up a bid management strategy for each audience segment.
Audience segmentation helps you make ads more relevant, increase customer conversions and loyalty, optimize your marketing strategy, and reduce costs.
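Here is a minimal sketch of action-based segmentation over raw events. The event names and the three segment rules are hypothetical; a real setup would use your own event schema and business rules.

```python
from collections import defaultdict

# Hypothetical raw events: (user_id, event_name).
events = [
    ("u1", "view_item"), ("u1", "add_to_cart"),
    ("u2", "view_item"),
    ("u3", "view_item"), ("u3", "add_to_cart"), ("u3", "purchase"),
]

def segment_users(events):
    """Assign each user to exactly one segment based on the actions they took."""
    actions = defaultdict(set)
    for user, name in events:
        actions[user].add(name)
    segments = {"buyers": set(), "abandoned_cart": set(), "browsers": set()}
    for user, names in actions.items():
        if "purchase" in names:
            segments["buyers"].add(user)
        elif "add_to_cart" in names:
            segments["abandoned_cart"].add(user)
        else:
            segments["browsers"].add(user)
    return segments

segments = segment_users(events)
```

Each resulting set of user IDs is the kind of audience list you could pass to an email tool or an advertising service, for example a cart reminder for `abandoned_cart`.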
Only with raw data can you detect suspicious activity on your website, such as an unusually high number of daily registrations. You can also identify unscrupulous CPA partners who may substitute the traffic source on the application page.
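One simple way to spot a registration spike is to compare each day with a typical day. The sketch below flags days whose count exceeds three times the median. Both the threshold and the sample log are hypothetical, and this is an illustrative heuristic rather than a production fraud detection method.

```python
from collections import Counter
from datetime import date
from statistics import median

def suspicious_days(registration_dates, factor=3):
    """Flag days whose registration count exceeds `factor` times the median day."""
    per_day = Counter(registration_dates)
    typical = median(per_day.values())
    return sorted(day for day, n in per_day.items() if n > factor * typical)

# Hypothetical raw registration log: one date per sign-up.
log = (
    [date(2025, 5, 1)] * 10 + [date(2025, 5, 2)] * 12 +
    [date(2025, 5, 3)] * 11 + [date(2025, 5, 4)] * 60
)
flagged = suspicious_days(log)
```

A flagged day is only a starting point. With raw data you can then drill into that day's registrations by traffic source, IP range, or partner ID to see where the spike came from.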
Here is how we tackled fraud in CPA networks for Raiffeisen Bank International.
By collecting data in Google BigQuery, you're independent of ETL services and other tools that you use. This means you can benefit from your statistics even if you decide to disconnect from a service and use your own solution.
Using raw data involves several steps, depending on the specific needs and objectives of the analysis. Here are the detailed steps on how to work with raw data:
After collecting the data, it often needs cleaning and preprocessing. This step involves:
This step is the core of data processing: you apply statistical, machine learning, or other analytical techniques to extract insights, patterns, and relationships from the data.
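To collect, store, and process raw data, we recommend using Google BigQuery cloud storage because it:
- Allows you to upload large amounts of information and quickly process it with SQL
- Scales flexibly and provides more opportunities as your business grows
- Guarantees security and gives you full control over access to your project with your Google account and two-factor authentication
- Allows you to pay only for the volume of statistics collected and processed
- Seamlessly integrates with other Google products and popular visualization and reporting systems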
Read also: What problems might you encounter when building reports in Google Analytics, and how can you solve them using Google BigQuery?
OWOX BI collects source data for Google BigQuery directly from your website. This service isn’t constrained by Google Analytics 4's limitations, allowing you to build reports without sampling and according to any parameters.
At the same time, OWOX BI uses a data structure compatible with Google Analytics 4, meaning you can run any SQL queries written for Google Analytics 4. This saves time for your team when preparing reports.
OWOX BI sends complete, non-aggregated statistics on user behavior on your website to Google BigQuery. It also supports an unlimited event size. As a result, you’ll get a full picture of user activity on your website.
In addition, with OWOX BI, you can collect unlimited user parameters and dimensions in Google BigQuery. You can use them to segment users by any feature and build deep reports for detailed analysis.
To collect raw data from your website to Google BigQuery:
OWOX BI collects raw data from your website and automatically combines it with statistics from advertising services, call tracking systems, email systems, and CRMs so you can receive reports without the help of analysts or knowledge of SQL.
With the simple report builder in the OWOX BI Smart Data service, you can select the necessary metrics and build any reports on advertising campaigns, ROPO, RFM, LTV, and cohort analysis.
Here are a few benefits of using OWOX BI to build reports with your raw data.
Do you regularly need ad performance reports but lack the time to study SQL or wait for a response from an analyst? With OWOX BI, you don't need to understand the data structure. Simply select the template you like and get the reports you need. All of them are updated automatically.
OWOX BI automatically collects and prepares all of the data required for your reports when needed, and doesn't limit you to ready-made dashboards. Connect your accounts once and spend time on analysis rather than data preparation.
Raw data can be overwhelming to analyze, challenging to manage, and may contain errors or inconsistencies. It lacks context and requires processing, cleaning, and transformation before it's meaningful. Extracting insights from raw data often demands significant time and expertise.
Using raw data allows for detailed analysis, customization, and flexibility. It provides a granular view of information, enabling deeper insights, custom calculations, and compatibility with various analytical tools, making it invaluable for research, decision-making, and data-driven strategies.
Raw data refers to unprocessed unstructured data collected directly from data sources, containing all the details available without further analysis. Real data, on the other hand, refers to processed and transformed information that has been refined for your specific use, often presenting insights and patterns derived from raw data.
Data is a recorded set of facts about events and phenomena, stored on a storage medium. Information is the result of processing data to solve specific tasks. For example, you can collect data in a Google BigQuery repository, and when you run a SQL query on it, BigQuery returns information in response. In informatics, analytics, marketing, and some other fields, these “data” and “information” have special names: raw (unprocessed) and aggregated (processed) data.
Only with raw data can you:
- Perform a deep analysis of metrics and their dependencies
- Track the user's entire journey from first touch to purchase
- Build any reports without the limits and restrictions of Google Analytics and reveal valuable insights
- Merge information from different sources and set up advanced analytics
- Create complex sales funnels that match your business structure
To collect, store, and process raw data, we recommend using Google BigQuery cloud storage because it:
- Allows you to upload large amounts of information and quickly process it with SQL
- Scales flexibly and provides more opportunities as your business grows
- Guarantees security and gives you full control over access to your project with your Google account and two-factor authentication
- Allows you to pay only for the volume of statistics collected and processed
- Seamlessly integrates with other Google products and popular visualization and reporting systems
Google Analytics is an undisputed leader among web analytics services. It's free, easy to work with, and provides insights into the key KPIs of online businesses. However, the system has limitations that prevent you from digging deeper into the data and exploring it from all sides.
1. The data you see in Google Analytics reports is always aggregated, and this process is beyond your control.
2. Sampling, which can seriously distort your data and lead to wrong business decisions.
3. Reports can contain only a limited number and only specific combinations of parameters and key figures.
4. A limit on the number of rows.
5. Data processing time: with the free version of Google Analytics, you may need to wait 24-48 hours for the system to finish processing data.
1. OWOX BI Pipeline.
2. Use Google Analytics APIs.
3. BigQuery export for Google Analytics 360.
4. Build your own connector.