- Detective – identify defects but do not stop them from initial processing.
- Corrective – fix or enrich data that is defective or deficient.
- Preventative – identify and stop defective data from being processed.
When designing controls, consider the following:
- Where will the controls be placed (implemented) along your information supply chain?
- In which cases will the control stop data from being processed, versus allowing processing with notification for follow-up?
- Who will monitor for control exceptions (defects)?
- If a risk, issue, or defect is identified, how will it be tracked, prioritized, and resolved? Who will do this?
- How will controls for process-level risk differ from controls for systemic (entity-level) risk? Do you need/want both? Why, or why not?
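The three control types, and the choice between stopping a record and letting it through with a notification, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`customer_id`, `country`, `amount`), the specific rules, and the `ControlResult` and `run_controls` names are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass, field


@dataclass
class ControlResult:
    accepted: list = field(default_factory=list)    # records passed on for processing
    rejected: list = field(default_factory=list)    # records stopped by a preventative control
    exceptions: list = field(default_factory=list)  # (control type, reason, record) for follow-up


def run_controls(records):
    """Apply one hypothetical control of each type to a batch of dict records."""
    result = ControlResult()
    for original in records:
        record = dict(original)  # work on a copy so the input is not mutated

        # Preventative: a record without a customer_id is stopped.
        if not record.get("customer_id"):
            result.rejected.append(record)
            result.exceptions.append(("preventative", "missing customer_id", record))
            continue

        # Corrective: fix a malformed country code in place.
        country = record.get("country")
        if isinstance(country, str) and country != country.strip().upper():
            record["country"] = country.strip().upper()
            result.exceptions.append(("corrective", "normalized country", record))

        # Detective: flag a suspicious amount but allow processing to continue.
        amount = record.get("amount")
        if isinstance(amount, (int, float)) and amount < 0:
            result.exceptions.append(("detective", "negative amount", record))

        result.accepted.append(record)
    return result
```

The `exceptions` list is where the monitoring and tracking questions above come into play: someone has to own that list, prioritize it, and resolve what it contains.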
Data Quality Reporting & Metrics
Data quality reports and metrics provide information about data quality levels across a given process, a data set, or the enterprise. They are a type of detective control. They tell the parties responsible for ensuring data quality how well they are doing their job, and they tell data users about the overall quality of the data. Reports and metrics can present a single point-in-time view or show trends and variance over time. Metrics (facts) on the reports can be summarized, aggregated, and viewed from different perspectives and across various dimensions.
When defining and developing data quality metrics and reports, consider the following:
- Who will be able to view the reports and what metrics and metric views will they want to see?
- How often will the reports be refreshed?
- What is the right mix of pre-canned and ad-hoc reports?
- Are there existing company quality control metrics that can be leveraged?
- Can existing operations monitoring techniques used at your organization be leveraged, such as Six Sigma, Total Quality Management (TQM), and Statistical Process Control (SPC)?
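As one possible illustration of applying Statistical Process Control to a data quality metric, the sketch below computes a defect rate per batch and flags batches that fall outside three-sigma limits on a p-chart. The function names and the batch layout (pairs of defect count and record count) are assumptions made for the example, and it assumes a non-empty list of batches with positive record counts.

```python
import math


def p_chart_limits(batches):
    """Return (center line, [(lower, upper) per batch]) for a p-chart.

    batches is a list of (defect_count, record_count) pairs.
    """
    total_defects = sum(defects for defects, _ in batches)
    total_records = sum(count for _, count in batches)
    p_bar = total_defects / total_records  # overall defect rate (center line)

    limits = []
    for _, count in batches:
        sigma = math.sqrt(p_bar * (1 - p_bar) / count)
        lower = max(0.0, p_bar - 3 * sigma)
        upper = min(1.0, p_bar + 3 * sigma)
        limits.append((lower, upper))
    return p_bar, limits


def out_of_control(batches):
    """Return the indexes of batches whose defect rate is outside its limits."""
    _, limits = p_chart_limits(batches)
    flagged = []
    for index, ((defects, count), (lower, upper)) in enumerate(zip(batches, limits)):
        rate = defects / count
        if rate < lower or rate > upper:
            flagged.append(index)
    return flagged
```

A report built on this kind of calculation gives both the point-in-time view (the latest batch's defect rate) and the trend view (the series of rates against the center line and limits).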
What Data Should Have Quality Controls?
Ultimately, all organizational data should have some degree of quality control. However, it is not realistic to assume that all data will have the same level of rigor applied. A key initial step on the path to better data quality is to combine the dimensions of criticality, commonality, and quality, and apply them to your organization’s data assets to establish a prioritized list of data categories.
Some data is more critical than other data. Its degree of importance is driven by its usage – i.e., the criticality of the processes, reports, or decisions relying upon that data. Each organization will have a different view of its most critical processes and decisions, and therefore of its most critical pieces of information. To make things more complex, this view will change over time.
Additionally, some data is more highly shared. Customer and product data may be used repeatedly across many of your organization’s processes, while other information may be limited to use by a single function. A good indicator of the importance of a data subject to your entire enterprise is its level of sharing or common use. Finally, some data is simply better than other data. Higher quality data typically needs less attention than data that is known to be defective.
Once intelligence is gathered, create a prioritized list of processes and/or data categories. Warning: the prioritization process has the potential to drag on if there is unresolved conflict or an overly broad span of coverage. Be sure that basic decision-making roles, processes, and timeframes are in place prior to embarking on this initial step. Most importantly, get started. With each successive data quality project, more data risks and defects will be addressed, you will become more adept at prioritization, control coverage will expand, and you will mature in the art of designing and implementing controls.
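One simple way to turn the three dimensions into a prioritized list is a weighted score. The sketch below is only an illustration: the 1-to-5 rating scale, the weights, and the names `DataCategory`, `priority_score`, and `prioritize` are assumptions for the example, not a prescribed method. Quality is inverted so that data known to be defective ranks higher.

```python
from dataclasses import dataclass


@dataclass
class DataCategory:
    name: str
    criticality: int  # 1 (low) to 5 (high)
    commonality: int  # 1 (single function) to 5 (shared enterprise-wide)
    quality: int      # 1 (known to be defective) to 5 (high quality)


def priority_score(category, weights=(0.5, 0.3, 0.2)):
    """Weighted score; the default weights are illustrative only."""
    w_criticality, w_commonality, w_quality = weights
    quality_gap = 6 - category.quality  # poorer quality means a larger gap
    return (
        w_criticality * category.criticality
        + w_commonality * category.commonality
        + w_quality * quality_gap
    )


def prioritize(categories, weights=(0.5, 0.3, 0.2)):
    """Return categories ordered from highest to lowest priority."""
    return sorted(categories, key=lambda c: priority_score(c, weights), reverse=True)
```

Agreeing on the weights and rating scale up front is one way to keep the prioritization discussion from dragging on, since disagreements are then about inputs rather than about the final ranking.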
What are some Best Practices for Improving Data Quality?
Improving data quality is an iterative process by which both the data itself and the controls that manage data quality levels are improved and matured over successive cycles. Below are guiding principles to follow:
- Understand current data quality levels and practices at a broad, high level; get started in a targeted area, then expand scope.
- Data Quality is never “done.” It is a continuous improvement process baked into everyday business and IT functions.
- Data content is owned by the business. Data systems are run by IT. Data Quality depends upon both constituent groups.
- Don’t drive the effort exclusively from one side or the other.
- Establish a core competency center in data quality.
When forming this center, create a team (or teams) with the following attributes:
- Strong communication skills and ability to understand and work with others.
- Substantial data analysis skills from both a business and technology perspective.
- Deep expertise in statistics, quality control, and process engineering.
- Demonstrated knowledge and experience using a data quality tool.
- Demonstrated ability to get things done.
While there may be no one person on the team who embodies all of the above characteristics, the combined group should cover them all. The number of teams and their size will vary with the nature of your organization and the initial scope. The general guideline is to start small: one team of no more than three to five people. Additional data quality competency teams can be established later, if needed.
- Controls must be both designed and executed effectively.
- It does no good to architect an elegant suite of controlled processes if the operational function that will run the controls and react to deviations does not function well.
- A diligent operational function may be hampered by poorly placed or highly manual controls that limit control coverage or reduce the efficiency with which defects are identified, prioritized, and fixed.
- Focus on both operational controls at a local process level and enterprise control reports that gauge overall data quality risk for the enterprise.
In conclusion, data is an extremely important asset to organizations, and it must be continuously improved based upon data, consumer feedback, and priorities. With a well-planned Data Quality effort, the value your organization receives from its data, relative to the associated acquisition, management, and usage costs and risks, will increase significantly.