Poor-quality data can make every downstream action useless, whether that’s calculating attribution, sending bids to advertising services, or building reports. That’s why assuring data quality continues to be the biggest challenge in digital analytics. It’s common to say that analysts are responsible for all data-related issues. But is this true?
Who is responsible for data quality in a company? Contrary to popular belief, it’s not only the analysts. For example, marketers work with UTM tags, engineers apply tracking codes, etc. So it’s no surprise that chaos occurs when working with data: each employee has many tasks, and it isn’t clear who’s doing what, who’s responsible for what, and who should be asked for the result.
In this article, we try to understand who is responsible for data quality at each stage and how to manage it.
Even within one company, the world of data can be filled with discrepancies and misunderstandings. To empower business users with quality data and avoid missing valuable data, you need to plan the collection of all necessary marketing data. By preparing the data workflow, you show colleagues in all departments how data is related, so it becomes easy to connect the dots. However, that’s only the first step. Let’s see what the other steps are in preparing data for reports and dashboards:
Yet, regardless of all preparation, decision-makers often encounter a report or dashboard with poor quality data. And the first thing they do is turn to the analyst with a question such as “Why is there a discrepancy?” or “Is the data relevant here?”
However, the reality is that different specialists are involved in these processes: data engineers set up the analytics system, marketers add UTM tags, and users enter data. Let’s see in detail what stages you should go through and how they should be implemented to provide users with high-quality data.
Though this step looks like the easiest, there are several hidden obstacles. First of all, you have to plan to collect all data from all sources, factoring in all customer touchpoints. Sometimes this planning step is skipped, but doing so is unreasonable and risky. Taking an unstructured approach leads to getting incomplete or incorrect data.
The main challenge is that you have to collect fragmented data from different advertising platforms and services you work with. Since processing massive data arrays in the shortest possible time is complicated and resource-intensive, let’s see what possible bottlenecks can appear:
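During this step, among all other challenges, you have to consider controlling access to data. For this, we recommend using the classic RACI matrix, which defines roles for each process and makes clear who does the work, who answers for the result, who advises, and who is kept informed. Here are the possible roles: Responsible (R), the person who carries out a particular process; Accountable or Approver (A), the person who answers for the result of the work; Consulted (C), a person who advises and provides the knowledge needed to carry out the process; and Informed (I), a person who must be kept informed about the progress of the work.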
According to the RACI matrix, the roles and responsibilities for data collection look like this:
The next step is to decide where to store all obtained data. If you want to gain complete control over your raw data without modifying it, we recommend using a single storage with automated data import. Since using your own servers to store every byte of data will cost a fortune, we recommend using cloud solutions that save your resources and provide access to data from anywhere.
The best option for this task is Google BigQuery, as it considers the needs of marketers and can be used for storing raw data from websites, CRM systems, advertising platforms, etc. Today, there are tons of marketing software solutions. We recommend OWOX BI, which automatically collects data into a data warehouse (or data lake) from different services and websites.
Let’s see what classic errors can occur when collecting raw data:
According to the matrix, in this process, the marketer is a consultant and source of knowledge: for example, knowledge about which accounts you need to download data from, what the UTM tags are, and how advertising campaigns are marked up.
Developers also want to know what changes would happen to containers if Google Tag Manager were used, since they are responsible for the website’s loading speed.
At this point, data engineers are already performing the responsible role because they are configuring data pipelines. And analysts are responsible for the result of the work. Even if one employee performs these functions, there will actually be two roles. So if the company has only one analyst, we still recommend implementing the matrix by roles. Then, with the growth of the company, you’ll have a job description for a new colleague, and it will be clear what the responsibilities are for a specific role.
The stakeholder at this stage is interested in knowing what data is available and what problems there are with its quality, since this determines priorities and the resources devoted to collecting data. For this purpose, many of our clients use the OWOX BI Data Monitoring feature.
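The role assignments described for this stage can be written down as a small data structure, which makes them easy to review and check. The following is only a minimal sketch: the names `Raci`, `IMPORT_STAGE`, `who`, and `check_stage` are hypothetical, treating developers as "informed" is an assumption based on the description above, and the rule of exactly one accountable role per stage is a common RACI convention rather than something this article prescribes.

```python
from enum import Enum


class Raci(Enum):
    RESPONSIBLE = "R"  # carries out the work
    ACCOUNTABLE = "A"  # answers for the result (the approver)
    CONSULTED = "C"    # provides knowledge and advice
    INFORMED = "I"     # is kept up to date


# Assignments for the raw data import stage. Each participant maps to a
# set of roles, because one person can hold more than one role.
# Assumption: developers are treated as "informed" here.
IMPORT_STAGE = {
    "data engineer": {Raci.RESPONSIBLE},
    "analyst": {Raci.ACCOUNTABLE},
    "marketer": {Raci.CONSULTED},
    "developer": {Raci.INFORMED},
    "stakeholder": {Raci.INFORMED},
}


def who(assignments, role):
    """Return the participants holding the given role, sorted by name."""
    return sorted(name for name, roles in assignments.items() if role in roles)


def check_stage(assignments):
    """Return a list of problems found in one stage's assignments."""
    problems = []
    if not who(assignments, Raci.RESPONSIBLE):
        problems.append("nobody is responsible for doing the work")
    # Assumed convention: exactly one participant answers for the result.
    if len(who(assignments, Raci.ACCOUNTABLE)) != 1:
        problems.append("exactly one participant should be accountable")
    return problems


print(who(IMPORT_STAGE, Raci.INFORMED))
print(check_stage(IMPORT_STAGE))
```

Keeping one such mapping per stage also covers the case where a single employee fills several roles: the roles stay separate in the matrix even if the same name appears in each of them.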
Data preparation is the next step. It’s often called data mart preparation, where a data mart is a flat structure containing the parameters and metrics that will be presented on the dashboard. An analyst who is limited in tools, budget, and time often skips the stage of preparing business data and immediately prepares a data mart. In practice, this means raw data is collected in a data warehouse, a million different SQL queries and Python and R scripts are run on top of it, and this mess somehow results in something on the dashboard.
If you keep skipping the preparation of business-ready data, it will lead to repeated errors that need to be corrected in each of the sources. Other things that could go wrong include:
The simplest and most common example of a mistake is the definition of a new user and a returning user. Most businesses don’t make this distinction in the same way as Google Analytics. Therefore, the logic of user type definitions is often duplicated in different reports. Frequent errors also include incomprehensible report logic. The first thing the business customer will ask about when looking at the report is how it was built, what assumptions it was based on, why this data was used, and so on. Therefore, the preparation of business data is a stage you definitely shouldn’t skip. Building a data mart from raw data is like not washing vegetables and fruits before eating them.
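One way to avoid duplicating the user type logic across reports is to define it once and have every report call the same definition. The sketch below is illustrative only: the function name `user_type`, the `LOOKBACK` constant, and the 90-day rule are assumptions, not a definition taken from this article or from Google Analytics. Your stakeholders decide what the actual rule is.

```python
from datetime import date, timedelta

# Hypothetical business rule: a user counts as "returning" if they had
# an earlier session within the lookback window.
LOOKBACK = timedelta(days=90)


def user_type(session_date, previous_session_dates):
    """Classify the user behind a session as 'new' or 'returning'.

    Every report should call this one function instead of
    re-implementing the rule in its own query or script.
    """
    for earlier in previous_session_dates:
        gap = session_date - earlier
        # Only strictly earlier sessions inside the window count.
        if timedelta(0) < gap <= LOOKBACK:
            return "returning"
    return "new"


print(user_type(date(2024, 6, 1), [date(2024, 5, 1)]))
print(user_type(date(2024, 6, 1), [date(2023, 1, 1)]))
```

If the business later changes the window or the rule, the change is made in one place and every report picks it up.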
If we assign responsibilities according to the matrix, then for data preparation, we’ll get this:
Business-ready data is a cleaned final dataset that corresponds to the business model. It’s ready-made data that can be sent to any data visualization service (Power BI, Tableau, Google Data Studio, etc.).
Naturally, different businesses operate with different models. For example, the definitions of “users,” “B2B users,” “transactions,” “leads,” etc. mean different things for different companies. These business objects actually answer the question of how a business thinks about its business model in terms of data. This is a description of the business at its core and not the structure of events in Google Analytics.
The data model allows all employees to synchronize and have a general understanding of how data is used and what is understood about it. Therefore, converting raw data to business-ready data is an important stage that cannot be skipped.
What could go wrong at this stage:
Here, you need to decide which data model to choose and how to control changes in the logic of data transformation. Accordingly, these are the roles of participants in the change process:
The stakeholder is no longer just informed but becomes a consultant. They make decisions like what should be understood as a new or returning user. The task of the analyst at this stage is to involve stakeholders as much as possible in making these decisions. Otherwise, the best thing that can happen is that the analyst will be asked to redo the report.
In our experience, some companies still don’t prepare business-ready data and build reports on raw data. The main problem with this approach is endless debugging and rewriting of SQL queries. In the long run, it’s cheaper and easier to work with prepared data than to keep going back to raw data and doing the same things again and again.
OWOX BI automatically collects raw data from different sources and converts it into a report-friendly format. As a result, you receive ready-made datasets that are automatically transformed into the desired structure, taking into account nuances important for marketers. You won’t have to spend time developing and supporting complex transformations, delve into the data structure, or spend hours looking for the causes of discrepancies.
The next stage is preparing the data mart. Simply put, this is a prepared table containing the exact data needed by certain users of a particular department, which makes the data much easier to use.
Why do analysts need a data mart, and why should you not skip this stage? Marketers and other employees without analytical skills find it difficult to work with raw data. The task of the analyst is to provide all employees with access to data in the most convenient form so they don’t have to write complex SQL queries every time.
A data mart helps solve this problem. When it’s populated properly, it includes exactly the data slice a particular department needs for its work. Colleagues will know exactly how to use such a table and will understand the context of the parameters and metrics presented in it.
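To make the idea of a department-specific slice concrete, here is a minimal sketch that turns business-ready rows into a flat table for a marketing team. Everything in it is hypothetical: the field names, the sample rows, and the function `build_marketing_mart` are assumptions for illustration. In practice this step would more likely be a scheduled query in the data warehouse than a Python function.

```python
from collections import defaultdict

# Hypothetical business-ready rows, one per session.
SESSIONS = [
    {"date": "2024-06-01", "campaign": "summer_sale", "cost": 12.5, "revenue": 40.0},
    {"date": "2024-06-01", "campaign": "summer_sale", "cost": 7.5, "revenue": 0.0},
    {"date": "2024-06-01", "campaign": "brand", "cost": 5.0, "revenue": 20.0},
]


def build_marketing_mart(rows):
    """Aggregate rows into one flat row per date and campaign."""
    totals = defaultdict(lambda: {"sessions": 0, "cost": 0.0, "revenue": 0.0})
    for row in rows:
        key = (row["date"], row["campaign"])
        totals[key]["sessions"] += 1
        totals[key]["cost"] += row["cost"]
        totals[key]["revenue"] += row["revenue"]
    # Sort by date, then campaign, so the output order is stable.
    return [
        {"date": day, "campaign": campaign, **metrics}
        for (day, campaign), metrics in sorted(totals.items())
    ]


for mart_row in build_marketing_mart(SESSIONS):
    print(mart_row)
```

The mart contains only the parameters and metrics the department works with, so colleagues can filter and chart it without writing joins or aggregations themselves.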
The main cases in which something can go wrong when preparing the data mart are:
Let’s see who is responsible for what at this stage according to the matrix:
It’s obvious that data preparation is the responsibility of data analysts along with stakeholders and data engineers, who are consultants in the process. Note that OWOX BI analysts can handle this task for you. We can collect and merge data, model it for your business model, and prepare a data mart accompanied by detailed instructions with a description of the build logic, allowing you to make changes on your side if necessary (for example, adding new fields).
Visually presenting data in reports and dashboards is the final stage, and the one all the previous work leads up to. Obviously, data should be presented in a way that is both informative and user-friendly. In addition, automated and properly configured visualizations significantly reduce the time it takes to find risk zones, problems, and growth possibilities.
If you have prepared business-ready data and a data mart, you will have no difficulties with visualizations. However, mistakes can still occur, such as:
According to the RACI matrix, the analyst already has a dual role — approver and responsible. The stakeholder is a consultant here, and most likely they have answered in advance the question of what decisions they plan to make and what hypotheses they want to test. These hypotheses form the basis for the design of the visualization with which the analyst works.
The RACI matrix isn’t an answer to all possible questions about working with data, but it definitely can ease the implementation and application of the data flow in your company.
Since people in different roles are involved in different stages of the data flow, it’s wrong to assume that the analyst is solely responsible for data quality. Data quality is also the responsibility of all colleagues who are involved in data markup, delivery, preparation, or management decisions.
Data is never perfect: it’s impossible to permanently get rid of data discrepancies, make data fully consistent, and rid it of noise and duplication. These problems always occur, especially in a data reality that changes as quickly as marketing. However, you can proactively identify these problems and set a goal to make your data quality known. For example, you can obtain answers to questions such as “When was the data last updated?”, “At what granularity is the data available?”, “What errors in the data do we know about?”, and “What metrics can we work with?”
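Some of these questions can be answered automatically each time data is loaded. The following is a minimal sketch under stated assumptions: the function `describe_quality`, the `loaded_at` and `id` fields, and the 24-hour freshness threshold are all hypothetical, and the only known error it reports is duplicated identifiers.

```python
from datetime import datetime, timedelta, timezone


def describe_quality(rows, now, max_age=timedelta(hours=24)):
    """Summarize freshness and known problems for a list of loaded rows.

    Each row is assumed to be a dict with a timezone-aware 'loaded_at'
    datetime and an 'id' (both field names are hypothetical).
    """
    if not rows:
        return {"last_updated": None, "is_fresh": False, "duplicate_ids": []}
    last_updated = max(row["loaded_at"] for row in rows)
    seen, duplicates = set(), set()
    for row in rows:
        if row["id"] in seen:
            duplicates.add(row["id"])
        seen.add(row["id"])
    return {
        "last_updated": last_updated,
        "is_fresh": now - last_updated <= max_age,
        "duplicate_ids": sorted(duplicates),
    }


rows = [
    {"id": 1, "loaded_at": datetime(2024, 6, 1, 6, 0, tzinfo=timezone.utc)},
    {"id": 2, "loaded_at": datetime(2024, 6, 2, 6, 0, tzinfo=timezone.utc)},
    {"id": 1, "loaded_at": datetime(2024, 6, 2, 6, 0, tzinfo=timezone.utc)},
]
report = describe_quality(rows, now=datetime(2024, 6, 2, 12, 0, tzinfo=timezone.utc))
print(report)
```

Publishing a summary like this next to a dashboard lets its users see how fresh the data is and which problems are already known before they question the numbers.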
For those who want to contribute to improving their company’s data quality, we recommend three simple steps:
Poor data quality can result in incorrect decision making, loss of revenue, brand damage, compliance issues, and operational inefficiencies.
Organizations can ensure data quality by establishing data quality standards and policies, conducting regular data audits, investing in data quality control tools, and fostering a culture of data excellence and accountability among employees.
Everyone within an organization is responsible for data quality, from data entry personnel to management. However, ultimately, the organization's leadership and data governance team are accountable for ensuring data quality standards are established and enforced.