
Does Your Data Measure Up? How to Assess Data Quality

October 3, 2018   Big Data

Christopher Tozzi


Businesses today are increasingly dependent on an ever-growing flood of information. Whether it’s sales records, financial and accounting data, or sensitive customer information, the accuracy and adequacy of a company’s data are critical. If portions of that information are inaccurate or incomplete, the effect on the organization can range from embarrassing to catastrophic.

That’s why you, as an IT professional, should be committed to ensuring that the information your company relies on meets the highest data quality standards.

Measuring Data Quality

The term “data quality” refers to the suitability of data to serve its intended purpose. So, measuring data quality involves performing data quality assessments to determine the degree to which your data adequately supports the business needs of the company.

A data quality assessment is done by measuring particular features of the data to see if they meet defined standards. Each such feature is called a “data quality dimension,” and is rated according to a relevant metric that provides an objective assessment of quality.

The industry hasn’t yet settled on a standard set of data quality dimensions, but the following is a representative group:

Completeness, Validity, Timeliness, Consistency, Integrity

Let’s take a brief look at each of these and at the metrics used in assessing them.



Completeness

Completeness relates to whether all required information is present in the data set. For example, if the customer information in a database is required to include both first and last names, any record in which the first name or last name field is not populated is marked as incomplete. The metric used in assessing this dimension is the percentage of records that are complete.
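As a rough illustration of the completeness metric, here is a minimal Python sketch. The record layout, the field names (first_name, last_name) and the helper completeness_pct are assumptions made for this example; they are not taken from any particular tool.

```python
def completeness_pct(records, required_fields):
    """Return the percentage of records in which every required field is populated."""
    if not records:
        return 100.0  # assumption: an empty data set has nothing incomplete

    def is_complete(record):
        # A field counts as populated if it is present, not None,
        # and not blank after stripping whitespace.
        return all(str(record.get(field) or "").strip() for field in required_fields)

    complete = sum(1 for record in records if is_complete(record))
    return 100.0 * complete / len(records)


# Hypothetical customer records: two of the four are missing a name part.
customers = [
    {"first_name": "Ada", "last_name": "Lovelace"},
    {"first_name": "", "last_name": "Hopper"},
    {"first_name": "Alan", "last_name": None},
    {"first_name": "Edsger", "last_name": "Dijkstra"},
]

print(completeness_pct(customers, ["first_name", "last_name"]))  # 50.0
```

Treating blank strings and None the same way is a design choice; a real assessment would need to agree on what "not populated" means for each field.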

Validity

Data is characterized as valid if it matches the rules specified for it. Those rules typically include specifications such as format (number of digits, etc.), allowable types (integer, floating point, string, etc.), and range (minimum and maximum values). For example, a telephone number field that contains the string ‘1809 Oak Street’ is not valid. The metric for this dimension is the percentage of records in which all values are valid.
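The validity metric can be sketched in Python along the following lines. The phone pattern, the age range of 0 to 130 and the field names are hypothetical rules chosen for illustration; real rules would come from your own data specification.

```python
import re

# Hypothetical format rule: an optional "+", a leading digit, then 6 to 19
# more digits or common separators. Real phone rules vary by country.
PHONE_PATTERN = re.compile(r"^\+?[0-9][0-9\-\s().]{6,19}$")


def is_valid_record(record):
    """Check format (phone), type (age is an integer) and range (0 to 130)."""
    phone = record.get("phone")
    phone_ok = isinstance(phone, str) and bool(PHONE_PATTERN.match(phone))
    age = record.get("age")
    age_ok = isinstance(age, int) and not isinstance(age, bool) and 0 <= age <= 130
    return phone_ok and age_ok


def validity_pct(records):
    """Return the percentage of records in which all checked values are valid."""
    if not records:
        return 100.0  # assumption: nothing to violate in an empty data set
    return 100.0 * sum(1 for r in records if is_valid_record(r)) / len(records)


records = [
    {"phone": "555-867-5309", "age": 34},     # valid
    {"phone": "1809 Oak Street", "age": 41},  # format violation
    {"phone": "555-201-7788", "age": 212},    # range violation
    {"phone": "555-443-1200", "age": "29"},   # type violation
]

print(validity_pct(records))  # 25.0
```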

Timeliness

Timeliness relates to whether information is up-to-date for the intended use. In other words, is the correct information available when needed? For example, if a customer has notified the company of an address change, but the new address is not in the database at the time billing statements are processed, that entry fails the timeliness test. The metric used to measure timeliness is the time difference between when data is needed and when it is available.
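A minimal Python sketch of the timeliness metric follows. The timestamps and the helper timeliness_lag are invented for this example; the point is only that the metric is a simple difference between two points in time.

```python
from datetime import datetime, timedelta


def timeliness_lag(needed_at, available_at):
    """Return how long after it was needed the data became available.

    A positive result means the data arrived late; zero or negative
    means it was available in time.
    """
    return available_at - needed_at


# Hypothetical timestamps: billing statements run before the new address is loaded.
billing_run = datetime(2018, 10, 1, 2, 0)
address_loaded = datetime(2018, 10, 2, 14, 30)

lag = timeliness_lag(billing_run, address_loaded)
print(lag)                     # 1 day, 12:30:00
print(lag <= timedelta(0))     # False: the entry fails the timeliness test
```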

Consistency

A data item is consistent if all representations of that item across data stores match. If, for example, a birth date is entered in one system using the U.S. mm/dd/yyyy format, but it is imported into another system where the date is entered using the European dd/mm/yyyy standard, that data lacks consistency. A paper published in the April 2002 edition of Communications of the ACM defines the metric for consistency as “the ratio of violations of a specific consistency type to the total number of consistency checks subtracted from one.” In other words, the score is one minus the violation ratio.
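One way to read the quoted definition is as one minus the share of checks that fail. The Python sketch below applies that reading to two hypothetical systems keyed by customer ID; the system names and sample dates are assumptions, and the comparison is a deliberately naive string match.

```python
def consistency_score(system_a, system_b):
    """Return 1 - (violations / checks) over the keys both systems share."""
    shared_keys = system_a.keys() & system_b.keys()
    if not shared_keys:
        return 1.0  # assumption: no checks performed means no violations found
    violations = sum(1 for key in shared_keys if system_a[key] != system_b[key])
    return 1 - violations / len(shared_keys)


# Hypothetical birth dates: the CRM stores mm/dd/yyyy, billing stores dd/mm/yyyy.
crm = {"c1": "03/04/1985", "c2": "12/25/1990", "c3": "07/07/1977", "c4": "05/05/2000"}
billing = {"c1": "04/03/1985", "c2": "25/12/1990", "c3": "07/07/1977", "c4": "05/05/2000"}

print(consistency_score(crm, billing))  # 0.5
```

Note that c3 and c4 pass only because the day and month happen to be equal. A string comparison cannot tell which format was intended, so a fuller check would parse each value using the format its source system declares.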

Integrity

When critical linkages between data elements are missing, that data is said to lack integrity. An example would be a Sales Transactions table in which the customer ID points to a record in the Customers table. If a customer record is deleted without updating related tables, records in the Sales Transactions table that point to that particular customer become “orphans” because their parent record no longer exists. This represents a loss of referential integrity. An appropriate metric for data integrity would be the number of orphan records present in a database.
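Counting orphan records can be sketched in a few lines of Python. The in-memory tables and the column names (customer_id, sale_id) are hypothetical stand-ins for the Customers and Sales Transactions tables described above.

```python
def count_orphans(sales, customers):
    """Count sales rows whose customer_id has no matching customer record."""
    known_ids = {customer["customer_id"] for customer in customers}
    return sum(1 for sale in sales if sale["customer_id"] not in known_ids)


# Hypothetical data: customer 3 was deleted, but two of its sales remain.
customers = [
    {"customer_id": 1, "name": "Acme Corp"},
    {"customer_id": 2, "name": "Globex"},
]
sales = [
    {"sale_id": 101, "customer_id": 1},
    {"sale_id": 102, "customer_id": 3},
    {"sale_id": 103, "customer_id": 2},
    {"sale_id": 104, "customer_id": 3},
]

print(count_orphans(sales, customers))  # 2
```

In a relational database the same count is usually obtained with an anti-join, and a foreign key constraint can prevent orphans from appearing in the first place.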

How To Start

If you’ve never done a data quality assessment before, it can look a bit daunting. But it needn’t be. Sophisticated automated data quality solutions, such as those provided by Syncsort, can make the process straightforward.

Check out our eBook on 4 ways to measure data quality.


Syncsort Blog
