
How the Cloud Complicates Data Quality (and How You Can Fix It)

October 8, 2018   Big Data

Christopher Tozzi


By now, you’ve heard all about the advantages of cloud computing. But the cloud also has its downsides. Among them are the special challenges for data quality that arise when data and data applications move to the cloud.

This does not mean that you shouldn’t use the cloud for storing and processing data. It does mean, however, that you need to take special care to manage data quality in cloud environments. Let’s explore how.

The Benefits of Cloud-Based Data Management

Lest this article appear to have an anti-cloud bent, let me make clear that the cloud is by no means a bad solution for data management.

When you move data and data analytics to the cloud, you get lots of benefits. The most obvious is scalability, or the ability to quickly increase or decrease the amount of infrastructure available for hosting and processing data.

Other benefits of cloud-based data management include easy-to-deploy data analytics tools, because you can take advantage of tools that your cloud provider offers as a service. Managing data in the cloud can also help you to avoid network bottlenecks. If your data originates in the cloud and you store and process it in the cloud, you don’t have to worry about delays while you wait to move data over the Internet to an on-premise environment.


Data Quality Drawbacks in the Cloud

On the other hand, when data management takes place in the cloud, data quality can suffer if you don’t take steps to address it. That is true for a number of reasons:

  • In the cloud, you often have little control over how the tools you rely on collect and process your data. You have to use whichever tools your cloud vendor provides, and your ability to tweak the way those tools work is typically limited. If you run Hadoop on-premise, for example, you can configure it and modify it to your heart’s content. But in the cloud, you’re stuck with whichever Hadoop-as-a-Service solution your cloud vendor offers. This can create data quality challenges because it limits your ability to transform and standardize data in ways that make data sets consistent and predictable.
  • When you move data within the cloud, or between the cloud and on-premise infrastructure (if you choose to do that), you run the risk of formatting problems, data loss, inaccurate timestamps and other issues that undercut data quality. For example, if you move block data from a virtual server disk into a cloud-based file-storage service, formatting differences could cause data quality problems. Or data could be damaged while being transferred over the network.
  • Cloud data can become very big, fast. The fact that the cloud is so scalable makes it easy to store huge volumes of information in the cloud. The more data you have, the harder it can be to maintain data quality.
  • Cloud services are always changing and being updated – and unlike software that you set up and manage yourself on-premise, cloud-based tools may not always notify you when they are modified. Changes to your cloud-based tools can cause data quality issues if, for example, a tool modifies the way it structures data and your other tools are not configured to handle the new format.
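
To make the last point concrete, here is a minimal Python sketch of a check that flags records whose structure no longer matches what downstream tools expect. The field names and the expected schema are hypothetical, chosen only for illustration; a real pipeline would define its own.

```python
# Hypothetical schema: the fields and types your downstream tools expect.
EXPECTED_SCHEMA = {"order_id": int, "amount": float, "created_at": str}


def find_schema_drift(record, expected=EXPECTED_SCHEMA):
    """Return a list of differences between a record and the expected schema."""
    problems = []
    for field, expected_type in expected.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    for field in record:
        if field not in expected:
            problems.append(f"unexpected field: {field}")
    return problems


# A record in the old format passes; one in a changed format is flagged.
old_format = {"order_id": 17, "amount": 9.99, "created_at": "2018-10-08T12:00:00"}
new_format = {"order_id": "17", "amount": 9.99, "created": "2018-10-08T12:00:00"}

print(find_schema_drift(old_format))
print(find_schema_drift(new_format))
```

Running a check like this on a sample of incoming records is one way to notice a silent format change before it spreads to other data sets.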


Solutions: Maximizing Data Quality in the Cloud

So, what’s a forward-thinking data management team to do? Avoiding the cloud entirely is not the answer; that would put your organization at a disadvantage by denying it the benefits of the cloud.

Instead, you want to be sure that, when you take advantage of the cloud to assist in data management, you put data quality measures into place at the same time.

The most obvious and most fundamental way of doing this is to ensure that you run automated data quality checks on all of your data, whether it is based in the cloud or not. You should always have data quality checks in place.
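
As a rough illustration of what an automated check might look like, the Python sketch below screens a batch of records for missing values, unparseable timestamps and duplicate IDs. The required fields and the rules themselves are assumptions made for this example, not a prescribed standard; dedicated data quality tools go much further.

```python
from datetime import datetime

# Hypothetical required fields for this example.
REQUIRED_FIELDS = ("id", "timestamp", "value")


def check_records(records):
    """Split records into clean ones and (index, reason) issues."""
    clean, issues, seen_ids = [], [], set()
    for index, record in enumerate(records):
        missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
        if missing:
            issues.append((index, "missing " + ", ".join(missing)))
            continue
        try:
            # Accept only ISO 8601 timestamps such as 2018-10-08T09:30:00.
            datetime.fromisoformat(record["timestamp"])
        except (TypeError, ValueError):
            issues.append((index, "bad timestamp"))
            continue
        if record["id"] in seen_ids:
            issues.append((index, "duplicate id"))
            continue
        seen_ids.add(record["id"])
        clean.append(record)
    return clean, issues


records = [
    {"id": 1, "timestamp": "2018-10-08T09:30:00", "value": 4.2},
    {"id": 2, "timestamp": "10/08/2018", "value": 3.1},
    {"id": 1, "timestamp": "2018-10-08T09:31:00", "value": 4.2},
    {"id": 3, "timestamp": "2018-10-08T09:32:00", "value": None},
]
clean, issues = check_records(records)
print(len(clean), issues)
```

The same function can run against data wherever it lives, which fits the goal of checking everything, cloud-based or not.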

At the same time, taking steps to minimize the number of data migrations between different services, or between the cloud and on-premise, can also improve data quality. So can a policy for archiving or deleting data from the cloud when you no longer need it, in order to avoid having your data sets grow too large and unwieldy.
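
When a migration is unavoidable, one common way to catch data damaged in transit is to compare checksums before and after the move. The Python sketch below does this for two local files that stand in for the source and the destination. In a real cloud migration, the destination checksum would more likely come from your storage service, and how to obtain it depends on the provider, so treat the file paths and helper names here as illustrative assumptions.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(source_path, destination_path):
    """Return True when both files have identical contents."""
    return sha256_of_file(source_path) == sha256_of_file(destination_path)


# Demo: two temporary files stand in for the source and the migrated copy.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "source.csv")
    copy = os.path.join(workdir, "copy.csv")
    for path in (source, copy):
        with open(path, "wb") as handle:
            handle.write(b"id,value\n1,4.2\n")
    intact_before = transfer_is_intact(source, copy)

    # Simulate corruption of the copy after the transfer.
    with open(copy, "ab") as handle:
        handle.write(b"garbage")
    intact_after = transfer_is_intact(source, copy)

print(intact_before, intact_after)
```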

Finally, remember that you don’t need to use all of your cloud vendor’s data management and analytics tools if you don’t want to. You can always take advantage of the cloud for data management in some ways, while still performing other tasks on-premise – or in your own custom cloud-based environment. You could, for example, set up your own Hadoop environment, using a distribution of your choice, in the cloud, rather than adopting the Hadoop-as-a-Service that the cloud vendor supplies.

The bottom line: It’s possible to enjoy the benefits of the cloud and ensure data quality at the same time. But it won’t happen without the right processes in place.

For more information on achieving high-quality data in the cloud, read TDWI Checklist Report: Cloud Data-Quality Tool Considerations.


Syncsort Blog
