
Visual Awesomeness Unlocked – Box-and-Whisker Plots

January 31, 2016   Self-Service BI

By Amir Netz, Technical Fellow and Mey Meenakshisundaram, Product Manager

Numbers tell the story. But when you have diverse data points and sources, a single aggregate that stands in for the whole range of numbers often does not tell the full story.

Showing averages over time or across some series of data often allows us to answer questions like: How long did the app take to load on a mobile device? To answer this question, we would most commonly find all data points for the day and then compute the average. While the average is often a useful metric, by itself it is a lossy compression of the data. What if a sizable number of customers are experiencing slow load times even though the average is within the limits of our expectations? Imagine that we had a dataset showing that, on average, it took 300ms to load the app. We may be happy with that metric, but what happens if every now and then it takes 6000ms to load? The 300ms average hides that alarmingly bad experience for a sizable customer base. This is where other metrics, such as the median and the 95th percentile, come into play and give us a better understanding of the data.
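As a rough illustration of that point, here is a minimal Python sketch. The load times are made up (they are not from any real dataset) and were chosen only so that the mean lands on 300ms while a few loads take 6000ms. It uses the standard library's `statistics` module, whose default quantile method is just one of several conventions.

```python
import statistics

# Hypothetical load times in milliseconds: 96 fast loads and 4 very slow ones,
# chosen so that the mean comes out at exactly 300 ms.
load_times_ms = [62.5] * 96 + [6000.0] * 4

mean_ms = statistics.mean(load_times_ms)
median_ms = statistics.median(load_times_ms)

# quantiles(n=100) returns the 99 cut points between percentiles,
# so index 94 is the 95th percentile and index 98 is the 99th.
percentiles = statistics.quantiles(load_times_ms, n=100)
p95_ms = percentiles[94]
p99_ms = percentiles[98]
max_ms = max(load_times_ms)

print(f"mean={mean_ms} median={median_ms} p95={p95_ms} p99={p99_ms} max={max_ms}")
```

In this made-up sample only 4% of loads are slow, so the median and even the 95th percentile still look healthy, and it is the 99th percentile and the maximum that expose the 6000ms loads. Which tail metric reveals the problem depends on how often the slow loads occur, which is one reason to look at the whole distribution rather than a single number.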

Half a century ago, one mathematician thought outside the box to solve this problem and came up with the box plot. In his words, the greatest value of a picture is when it forces us to notice what we never expected to see, and the box plot does this perfectly.

The box-and-whisker plot allows us to see a number of different things in the data series more deeply. We can see outliers, clusters of data points, and different volumes of data points between series, all things that summary statistics can hide. A box-and-whisker plot uses simple glyphs that summarize a quantitative distribution with five values: the smallest and largest values, the lower quartile, the median, and the upper quartile. This summary approach allows the viewer to easily recognize differences between distributions and to see beyond a standard mean value plot.
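The five values behind a box-and-whisker glyph can be computed in a few lines. The sketch below is only an illustration: the function name `five_number_summary` is hypothetical, and the quartiles follow the default ("exclusive") method of Python's `statistics.quantiles`, which may differ slightly from what a particular charting tool uses.

```python
import statistics

def five_number_summary(values):
    """Return (min, lower quartile, median, upper quartile, max) for a series."""
    ordered = sorted(values)
    # n=4 yields the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    return ordered[0], q1, median, q3, ordered[-1]

# Small made-up series, only to show the shape of the result.
summary = five_number_summary([7, 1, 9, 3, 11, 5, 2, 8, 4, 10, 6])
print(summary)
```

The box is drawn between the lower and upper quartiles with a line at the median. Where the whiskers end varies between implementations: here they would simply be the smallest and largest values.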

This week we have two submissions to the gallery for Box and Whisker: one from Brad Sarsfield and another from Jan Pieter Posthuma. Thanks to both of them for producing this very important visual and publishing it to the gallery.

In Brad’s chart, every data point is plotted as a circle on the axis, which lets us visualize the distribution of the data points. The chart treats the top and bottom 5% as ‘outliers’, colors them red, and marks the ‘whiskers’ at those points, the 95th and 5th percentiles. You can also adjust these percentile values to meet your needs. In this chart, you have to explicitly select ‘Don’t summarize’ in the Values bucket to view each series and data point.
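To make the percentile-based whiskers concrete, here is a small Python sketch. It is not the visual's actual implementation: the function names are hypothetical, and the nearest-rank percentile rule is an assumption, since the post does not say how the visual interpolates.

```python
def nearest_rank_percentile(sorted_values, pct):
    """Return the pct-th percentile (integer pct, 1..100) by the nearest-rank rule."""
    n = len(sorted_values)
    rank = (pct * n + 99) // 100  # integer ceiling of pct * n / 100
    return sorted_values[max(rank, 1) - 1]

def percentile_whiskers(values, lower_pct=5, upper_pct=95):
    """Place whiskers at two percentiles and flag everything beyond them."""
    ordered = sorted(values)
    low = nearest_rank_percentile(ordered, lower_pct)
    high = nearest_rank_percentile(ordered, upper_pct)
    # Points strictly outside the whiskers are the ones a chart would highlight.
    outliers = [v for v in ordered if v < low or v > high]
    return low, high, outliers

low, high, outliers = percentile_whiskers(range(1, 101))
print(low, high, outliers)
```

Changing `lower_pct` and `upper_pct` corresponds to adjusting the whisker percentiles in the visual's formatting options.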

The one from Jan Pieter uses a category to color the boxes. It has a second ‘Samples’ category to provide different sample results for one experiment group. The values are aggregated at this second group. But if you want to treat each data point separately, you can add a column that has a unique value for each row and put it in the Samples bucket.
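The effect of the Samples bucket can be sketched in Python. This is a guess at the behavior described above, not the visual's code: the function name is hypothetical, and the aggregation is assumed to be a sum because the post does not say which aggregation is applied.

```python
from collections import defaultdict

def values_per_category(rows, aggregate=sum):
    """Aggregate values per (category, sample), then collect them per category.

    rows is an iterable of (category, sample, value) tuples. The aggregate
    function is an assumption; the real visual may use a different one.
    """
    grouped = defaultdict(list)
    for category, sample, value in rows:
        grouped[(category, sample)].append(value)
    per_category = defaultdict(list)
    for (category, _sample), values in grouped.items():
        per_category[category].append(aggregate(values))
    return dict(per_category)

rows = [("A", "s1", 10), ("A", "s1", 20), ("A", "s2", 5)]

# Two rows share sample "s1", so they collapse into one aggregated point.
aggregated = values_per_category(rows)

# Replacing the sample with a unique row index keeps every data point.
unique_rows = [(cat, i, val) for i, (cat, _s, val) in enumerate(rows)]
separate = values_per_category(unique_rows)

print(aggregated, separate)
```

The box for category "A" would be drawn over two points in the first case and over all three in the second.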

Here is the video from Brad


Make sure to mark the aggregation as ‘Don’t summarize’ in the Value bucket for each series.


In the formatting section, you can also specify the percentile for each quantile.


Here is the one from Jan.


To use it, simply download the Box and Whisker chart from the visuals gallery and import it into your Power BI report.

Here are the links to Brad’s and Jan’s Box plots.

You can also download the sample pbix file attached to this post.

As usual, we can’t wait to hear your thoughts and your ideas for improvements.

Enjoy!

Perfect Benchmarking-Tool! Thanks a lot !!!

Awesome, thanks for this great viz!

Yet to try this but it looks like it will fill a big gap in allowing data to be viewed in the context of what is normal!  Good work!

Thank you for working on this, it’s a great improvement :)

Thanks. Great visual.

There is a similar style of tool in Technical Analysis for share trading. It’s called a candlestick chart. One other feature of the candlestick chart is that the box is coloured in if the share closes lower than it opens on that day; it is left uncoloured if it closes higher.

A really cool feature might be to colour the box if the mean is lower than in the previous time period and leave it uncoloured if it is higher. This way you also get a quick visual indicator of where the mean is moving.

This visualization is FLAWED. I’ve been using Box and Whisker plots for a long time and never seen anything like this. Excel 2016 B&W plot does it correctly (and is compatible with R’s standard box and whisker plot).

How Excel, R, and other packages calculate limits:

1) Q1 is ALWAYS the 25th percentile

2) Q2 is ALWAYS the median, or the 50th percentile

3) Q3 is ALWAYS the 75th percentile

4) The bottom of the lower whisker is ALWAYS Q1 - 1.5 × IQR

5) The top of the upper whisker is ALWAYS Q3 + 1.5 × IQR

6) Values outside these limits are outliers

7) There are various methods of calculating the median – which method is used?
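The limits in this list can be sketched in Python, using the conventional lower fence of Q1 - 1.5 × IQR. This is an illustration only: the function name is hypothetical, the sample data is made up, and the whiskers are drawn to the most extreme data points inside the fences, which is one common convention. The `method` argument of `statistics.quantiles` shows how the choice of quartile method changes the result.

```python
import statistics

def tukey_box_stats(values, method="inclusive"):
    """Quartiles, 1.5 * IQR fences, whisker ends and outliers for one series."""
    ordered = sorted(values)
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method=method)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in ordered if lower_fence <= v <= upper_fence]
    return {
        "q1": q1, "median": q2, "q3": q3,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        # Whiskers stop at the most extreme points that are still inside the fences.
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": [v for v in ordered if v < lower_fence or v > upper_fence],
    }

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 30]
inclusive = tukey_box_stats(data, method="inclusive")
exclusive = tukey_box_stats(data, method="exclusive")
print(inclusive)
print(exclusive["q1"], exclusive["q3"])
```

On this sample the two methods give different quartiles (3.75 and 9.25 versus 3.25 and 9.75), so two tools can draw slightly different boxes for the same data even when both follow the rules above.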

When you don’t follow these calculations, the user won’t know what they are looking at, and cannot relate the visualization to any similar visualization done in another application. And why build a visualization so contrary to Excel, which many Power BI folks will be using? Also, B&W plots are equally valid when shown horizontally (a fact that the Excel team readily acknowledges).

I implore you to redo this chart to conform to Excel’s box and whisker plot, with the addition of a horizontal orientation option.

@Colin – These visualizations are provided by the community for the community. In the Visual gallery, each of these visuals has a ‘Contact Author’ link that you can use to send feedback or report issues. So please use the ‘Contact Author’ link in the gallery to reach out to the authors directly.

BTW, Brad’s visual allows you to specify the percentile for each quartile. If you check the Wikipedia article en.wikipedia.org/…/Box_plot, it clearly says that the top & bottom of the box are always the first and third quartiles but the ends of the whiskers can represent several possible alternative values.

Mey, thanks – I’ll contact the author.

“BTW, Brad’s visual allows you to specify the percentile for each quartile”

And that’s a problem. When is a quartile something other than a quartile? (25, 50, 75)?

“If you check the Wikipedia article en.wikipedia.org/…/Box_plot, it clearly says that the top & bottom of the box are always the first and third quartiles but the ends of the whiskers can represent several possible alternative values”

True, but if you’re building a box plot in Power BI, why would you go out of your way to do something different from Excel?
