Exploratory and Confirmatory Analysis: What’s the Difference?

December 3, 2017   Sisense

How does a detective solve a case? She pulls together all the evidence she has, all the data that’s available to her, and she looks for clues and patterns.

At the same time, she takes a good hard look at individual pieces of evidence. What supports her hypothesis? What bucks the trend? Which factors work against her narrative? What questions does she still need to answer… and what does she need to do next in order to answer them?

Then, adding to the mix her wealth of experience and ingrained intuition, she builds a picture of what really took place – and perhaps even predicts what might happen next.

But that’s not the end of the story. We don’t simply take the detective’s word for it that she’s solved the crime. We take her findings to a court and make her prove it.

In a nutshell, that’s the difference between Exploratory and Confirmatory Analysis.

Data analysis is a broad church, and managing this process successfully involves several rounds of testing, experimenting, hypothesizing, checking, and interrogating both your data and approach.

Putting your case together, and then ripping apart what you think you’re certain about to challenge your own assumptions, are both crucial to Business Intelligence.

Before you can do either of these things, however, you have to be sure that you can tell them apart.

What is Exploratory Data Analysis?

Exploratory data analysis (EDA) is the first part of your data analysis process. There are several important things to do at this stage, but it boils down to this: figuring out what to make of the data, establishing the questions you want to ask and how you’re going to frame them, and coming up with the best way to present and manipulate the data you have to draw out those important insights.

That’s what it is, but how does it work?

As the name suggests, you’re exploring – looking for clues. You’re teasing out trends and patterns, as well as deviations from the model, outliers, and unexpected results, using quantitative and visual methods. What you find out now will help you decide the questions to ask, the research areas to explore and, generally, the next steps to take.

Exploratory Data Analysis involves things like: establishing the data’s underlying structure, identifying mistakes and missing data, establishing the key variables, spotting anomalies, checking assumptions and testing hypotheses in relation to a specific model, estimating parameters, establishing confidence intervals and margins of error, and figuring out a “parsimonious model” – i.e. one that you can use to explain the data with the fewest possible predictor variables.
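
A few of these checks can be sketched in a handful of lines. The example below is only an illustration using Python's standard library: the `monthly_logins` figures are invented, `summarize` is a hypothetical helper, and the 1.5 × IQR rule is just one common convention for flagging anomalies, not the only one.

```python
import statistics

def summarize(values):
    """Basic exploratory facts for one numeric column (None marks a missing entry)."""
    present = [v for v in values if v is not None]
    # Quartiles give a picture of the spread that is robust to extreme values.
    q1, median, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "n_missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": median,
        # Values far outside the middle 50% are flagged for a closer look.
        "outliers": [v for v in present if v < low or v > high],
    }

# Hypothetical data: logins per user in one month, with two missing entries.
monthly_logins = [12, 15, 14, None, 13, 16, 15, 98, 14, None, 13]
report = summarize(monthly_logins)
print(report)
```

Here the mean is pulled well above the median by a single extreme value, which is exactly the kind of clue you would follow up on before settling on a model.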

In this way, your Exploratory Data Analysis is your detective work. To make it stick, though, you need Confirmatory Data Analysis.

What is Confirmatory Data Analysis?

Confirmatory Data Analysis is the part where you evaluate your evidence using traditional statistical tools such as significance, inference, and confidence.

At this point, you’re really challenging your assumptions. A big part of confirmatory data analysis is quantifying things like the extent to which any deviation from the model you’ve built could have happened by chance, and at what point you need to start questioning your model.

Confirmatory Data Analysis involves things like: testing hypotheses, producing estimates with a specified level of precision, regression analysis, and variance analysis.

In this way, your confirmatory data analysis is where you put your findings and arguments to trial.
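
To make "estimates with a specified level of precision" concrete, here is a minimal sketch of a confidence interval for a proportion. It assumes the simple normal approximation, which is only reasonable for fairly large samples. The counts (60 cancellations out of 400 customers) are invented, and `proportion_confidence_interval` is a hypothetical helper.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation interval for a proportion, with its margin of error."""
    p_hat = successes / n
    # Critical value for a two-sided interval, about 1.96 at 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (max(0.0, p_hat - margin), min(1.0, p_hat + margin))

# Hypothetical counts: 60 of 400 customers cancelled.
estimate, margin, (low, high) = proportion_confidence_interval(60, 400)
print(f"defection rate {estimate:.1%} +/- {margin:.1%} ({low:.1%} to {high:.1%})")
```

Asking for a higher confidence level widens the interval, while a larger sample narrows it.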

Uses of Confirmatory and Exploratory Data Analysis

In reality, exploratory and confirmatory data analysis aren’t performed one after another, but continually intertwine to help you create the best possible model for analysis.

Let’s take an example of how this might look in practice.

Imagine that in recent months, you’ve seen a surge in the number of users canceling their product subscription. You want to find out why this is, so that you can tackle the underlying cause and reverse the trend.

This would begin as exploratory data analysis. You’d take all of the data you have on the defectors, as well as on happy customers of your product, and start to sift through it looking for clues. After plenty of time spent manipulating the data and looking at it from different angles, you notice that the vast majority of people who defected had signed up during the same month.
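
One of those "different angles" might be a simple breakdown of defection rate by signup month. The sketch below uses invented records, and the `customers` list and `defection_rate_by_month` function are hypothetical names. In practice the data would come from your own customer tables.

```python
from collections import Counter

# Hypothetical records: (signup_month, cancelled)
customers = [
    ("2017-03", False), ("2017-03", False), ("2017-03", True),
    ("2017-04", True), ("2017-04", True), ("2017-04", True), ("2017-04", False),
    ("2017-05", False), ("2017-05", False), ("2017-05", True),
]

def defection_rate_by_month(records):
    """Share of customers in each signup month who later cancelled."""
    signups = Counter(month for month, _ in records)
    defections = Counter(month for month, cancelled in records if cancelled)
    return {month: defections[month] / signups[month] for month in sorted(signups)}

rates = defection_rate_by_month(customers)
for month, rate in rates.items():
    print(f"{month}: {rate:.0%} defected")
```

A month that stands out in a table like this is a lead to investigate, not yet a conclusion.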

On closer investigation, you find out that during the month in question, your marketing team was shifting to a new customer management system and, as a result, the introductory documentation that you usually send to new customers wasn’t always going through. That documentation would have helped new users troubleshoot many of the teething problems they face.

Now you have a hypothesis: people are defecting because they didn’t get the welcome pack (and the easy solution is to make sure they always get a welcome pack!).

But first, you need to be sure that you are right about this cause. Based on your Exploratory Data Analysis, you now build a new predictive model that allows you to compare defection rates between those who received the welcome pack and those who did not. This is rooted in Confirmatory Data Analysis.

The results show a broad correlation between not receiving the welcome pack and defecting. Bingo! You have your answer.
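
One simple way to put the welcome-pack hypothesis on trial is a two-proportion z-test on the defection rates of the two groups. This is a sketch under stated assumptions rather than the model described above: the counts are invented, `two_proportion_z_test` is a hypothetical helper, and the test assumes large, independent samples. It also only tells you whether the gap is bigger than chance would explain. It does not prove that the missing welcome pack caused the defections.

```python
import math

def two_proportion_z_test(defected_a, total_a, defected_b, total_b):
    """Two-sided z-test for a difference between two defection rates."""
    p_a = defected_a / total_a
    p_b = defected_b / total_b
    # Under the null hypothesis both groups share a single pooled rate.
    pooled = (defected_a + defected_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 45 of 100 customers without the welcome pack defected,
# versus 60 of 400 customers who received it.
z, p_value = two_proportion_z_test(45, 100, 60, 400)
print(f"z = {z:.2f}, p-value = {p_value:.2g}")
```

A very small p-value says that a gap this large would be unlikely if the welcome pack made no difference, which is the kind of evidence that would stand up in court.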

Exploratory Data Analysis and Big Data

Getting a feel for the data is one thing, but what about when you’re dealing with enormous data pools?

After all, there are already so many different ways you can approach Exploratory Data Analysis: transforming the data through nonlinear operators, projecting it into a different subspace and examining the resulting distribution, or slicing and dicing it along different combinations of dimensions… add sprawling amounts of data into the mix and suddenly the whole “playing detective” element feels a lot more daunting.

The important thing is to ensure that you have the right tech stack in place to cope with this, and to make sure you have access to the data you need in real time.

Two of the best statistical programming packages available for conducting Exploratory Data Analysis are R and S-Plus; R is particularly powerful and easily integrated with many BI platforms. That’s the first thing to consider.

The next step is ensuring that your BI platform has a comprehensive set of data connectors that – crucially – allow data to flow in both directions. This means that you can keep importing Exploratory Data Analysis results and models from, for example, R to visualize and interrogate them, and also send data back from your BI solution to R, so that your model and results update automatically as new information flows in.

In this way, you not only strengthen your Exploratory Data Analysis, you incorporate Confirmatory Data Analysis, too – covering all your bases of collecting, presenting and testing your evidence to help reach a genuinely insightful conclusion.

Your honor, we rest our case.

Ready to learn how to incorporate R for deeper statistical learning? You can watch our webinar with renowned R expert Jared Lander to learn how R can be used to solve real-life business problems.
