Tag Archives: Pivot

Tip – Create several AutoSum Measures in with one click #Excel Power Pivot – #powerbi

April 14, 2017 / Erik Svensen

Tip – Create several AutoSum Measures in with one click #Excel Power Pivot – #powerbi

Just wanted to share a quick tip I just stumbled upon when designing a data model in Excel Power Pivot.

For creating a measure in Power Pivot, I place the cursor under the column and then I click the AutoSum button on the Home tab –

 Tip – Create several AutoSum Measures in with one click #Excel Power Pivot – #powerbi

And then I get the formula created quick and I can modify the name.

 Tip – Create several AutoSum Measures in with one click #Excel Power Pivot – #powerbi

But did you know that if you multiselect several cells in the Calculation Area and click the AutoSum – you get formulas for all the columns you have selected

 Tip – Create several AutoSum Measures in with one click #Excel Power Pivot – #powerbi

Hope you find this useful as well …

Advertisements

Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

image thumb 3 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

The Pivot Pictured Above Acts as if We’ve Swapped Out Fields on Rows – in Response to a Slicer Click!

First off…my first post! Being one of the newest (and youngest) members of the PowerPivotPro family has been very exciting so far. As a way of introducing myself, I’d like to share a creative solution to a problem I’m sure many of you have encountered when building a report or dashboard.

Typical Client Report Requests:

  • Simple but complex
  • High-level yet detailed
  • Compact yet containing everything…

Quite understandably, clients love to channel their inner Marie Antoinette by basically asking to have their cake and eat it too. I actually relish those scenarios, they allow me to flex my “think outside the box” muscles!  And hey, it’s great to be working with a toolset that truly CAN accommodate the special demands posed by real-world situations.

Well one such scenario had a customer wanting a summary Pivot Table with about five columns (fields) worth of details in it. No problem, done and done! The problem we were encountering however…was that 90% of the real-estate space on our reporting page had already been used for other visuals, none of which this client wanted eliminated or reduced to make room. So I’m left with the predicament of figuring out how to fit all this data onto this dashboard…Tetris mode engage!

Unfortunately despite my best efforts to rearrange the dashboard (accompanied by my 80’s Rush Mixtape) I simply could not find any way to display a wide Pivot Table on this dashboard. So I circled back to the drawing board and asked myself what variables I could manipulate to achieve the desired outcome.

I realized that I had an assumption that the PivotTable had to be fixed, meaning that it always has to show all levels of the data. However I LOVE to design visuals for clients that are dynamic, only showing the relevant data to them (often based on slicer selections). So I politely asked my previous assumption to leave and invited over my good friend paradigm shift. After some long conversations and extensive Google searches I actually ran across a PowerPivotPro Blog Post written by Rob that inspired my eventual solution.

Discovering this post almost felt like a relay race and I was being passed the baton to cross the finish line. Using the idea from this post that a slicer could change the axis of a chart, I realized the same would work in a PivotTable. All five columns in my table were part of a hierarchy…so why not use this technique to display a single column that would DYNAMICALLY switch between any of the five levels of this hierarchy based on slicer selections. I would now be able to create a table that is both compact and would display all the data the client needed.

Time to break out the cake forks!

Now for the fun part as I share this recipe for success (that was the last cake joke I promise). The general idea will be to create a hierarchy in the data model, and then reference those in an Excel set to be used in my Pivot Table. I’ll be using tables from the publicly available Northwind DW data set for this example.

Download Completed Example Workbook

X

Get Your Files

FIRST, create a Customer Geography Hierarchy in the data model on the DimCustomer Table.

Hierarchy in the Data Model table:

Customer Geo Hierarchy with box thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

SECOND, create a new DAX Measure called “Distinct Count of Country.” This will be used in our set to indicate whether or not a selection was made on our country Slicer.

=DISTINCTCOUNT( DimCustomer[Country] )

Now some of you technically savvy readers may be thinking “why didn’t he use the DAX Function HASONEVALUE?” I’ll explain more on this later when I explain how to write the set using MDX.

THIRD, create a new set for your pivot table referencing our recently created Hierarchy and DAX Measure. Note that the only way to access your sets is through a conditional ribbon that is displayed only when a cell selection is on a Pivot Table.

Opening the Set Manager window:

Sets with boxes thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

Creating a new set using MDX:

Set Manager with boxes thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

Writing the MDX Code:

MDX with boxes thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

This MDX query works by utilizing an IIF statement, which operates the same was as in DAX or Excel. It checks to see if our (Distinct Count of Country) DAX Measure is greater than 1 (indicating no slicer selection). If TRUE it returns the Country column from our hierarchy, if FALSE (slicer selection has been made) then it returns the City column. It’s important to note that I must reference the columns in the hierarchy, if you were to put just the column names in this query it would not run. It’s also important that the “Recalculate set with every update” box is checked, this makes sure the MDX statement is calculated every time someone uses a slicer, otherwise it’ll appear like the set isn’t working.

Keen observers take note! Here’s where I explain WHY I used a DISTINCTCOUNT rather than HASONEVALUE in my DAX Measure. Let’s say a client would like to multi-select countries in the slicer and still have it display the City column on rows in the Pivot Table. If I were to use HASONEVALUE in my DAX Measure I would only switch to the city column when a single value was selected.

The way I’ve designed it we can change the value in the MDX query from 1 to any number we’d like (E.g. 2, 3, etc.) which gives us the flexibility to allow multiple slicer selections and still have it switch to the City column.

“Clever Girl…”

Clever Girl thumb Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

I’m not actually sure if I’m supposed to be the hunter or dinosaur in this analogy from Jurassic Park…but either way I felt clever for that last step.

FOURTH, we can now use our newly created set in a Pivot Table. You’ll notice that a new folder called Sets is now nested in our DimCustomer Table.

Placing the Customer Geo set on rows in our Pivot Table:

PivotTable with boxes thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

Making a slicer selection to observe the dynamic switch from Country to City. Pretty cool!

PivotTable with slicer selection AND Boxes thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

My client’s reaction could be summed up in a single word spoken by the immortal Keanu Reeves…

”Whoa…”

Now some of you may have noticed that I have a Total Sales value at the top of my Pivot Table. Now’s my chance to point out one unfortunate drawback of using sets on rows, it eliminates the totals row at the bottom of the Pivot Table. All is not lost though my friends, for every every problem a solution can always be found! In this case I created an artificial “Total’s row” at the top of the Pivot Table. I did this using the tried and true CUBEVALUE function to call the measure I’m already using in my PivotTable. NOTE that you need to make sure you connect all slicers (via the slicer_name) in the cube string for them to slice the CUBEVALUE as well. Finally, just a bit of dash formatting and some elbow grease and we have ourselves a Totals row!

CUBEVALUE Formula used in the cell for totals:

Totals Row with box thumb Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

There you have it, your very own Mighty Morphing Pivot Table. wlEmoticon smile 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

Let’s block ads! (Why?)

PowerPivotPro

Fully-Customizable DAX-Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!

image Fully Customizable DAX Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!

This Email and the Highlighted Text were Automatically Generated in Response to DAX
“Detectors” Scanning Our Results During Scheduled Refresh
I’ve Wanted this Feature Forever.  We Now Have It.

image 1 Fully Customizable DAX Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!First, we are very excited to announce new class dates:  Houston (September 27-28), Los Angeles (Oct 11-12), Indianapolis (Oct 25-26), and London UK (Nov 15-16).

Reserve your spot today!

(And we’re conducting a special class Thu-Fri this week in Bentonville AR, but that one is by invite only – drop us a note via the contact form if you work in Bentonville and are interested in joining us at the last minute.)

image 2 Fully Customizable DAX Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!“Inventing” new techniques/capabilities is probably my favorite thing to do.  Look no further than my last post, Movers and Fakers, for a fun (and hopefully useful!) example.  Some of my other past favorites include Fixing Stale Slicers, Frankenspark, Hyperlink “Drilldown”, Top N vs. Bottom N vs. “All Others”, Sort by Slicer, and honestly dozens more (I really should do a “greatest hits” style post some day, given how long this site has now been running – nearly seven years! – but that will have to wait).

All of those were created using “just” the built-in features of the products, but sometimes, you need actual software changes in order to pull something off.  Today’s technique (and contest!) has long struck me as precisely that – something we needed Microsoft to do for us – but then recently, an inspiration struck me…

First let’s talk about the need in question:  a failed refresh needs to result in notifications, not just an error in the log.  If a data source has outright vanished, or is no longer reachable, or the password has expired, etc., your scheduled refresh will fail and you will see that in the logs, but will you be notified as well?  In on-premises environments, you will not.  Your dashboard/report consumers will stumble onto this problem, get confused, and swamp you with emails.  Or worse, they get confused and NOT swamp you with emails, meaning they just silently lose trust in the overall system.  (We’ve seen this happen a few times with borderline serious consequences).

PowerBI.com does offer email notification on failure, so at the moment, this gap really only exists in on-prem environments (which still represent the majority of our clients, but hey – it’s still progress).

image 3 Fully Customizable DAX Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!

Bad Timing, Total Garbage Data, or Some Data Missing:  Three Cases When “Success” Isn’t Success

Even more critically, sometimes the refresh technically succeeds and you STILL need to be notified.  This is where robust Alerts/Notifications come in, and so far, we don’t have those really have those baked into the MS platform.

The three primary examples of when we need to be notified of failure even when technically the refresh succeeded:

  1. When a data source wasn’t yet “ready” to be refreshed – most commonly, this happens when the database doesn’t yet have the latest data in it (because some other process didn’t run, or is late), so you end up not having the data for today (or the most current week/month/etc.), or worse, you have some of the data for today but not all of it.
  2. When a data source “succeeds” but returns garbage data – we’ve seen examples, for instance, where all numeric columns came back loaded with nothing but zeroes (again due to a problem upstream from the data source) – this can happen to all of the data or just the latest rows.
  3. When certain segments of the latest data didn’t show up – (aka “Not All Precincts Reporting”) Another very common scenario occurs when, say, 95% of your locations reported data for the latest time period, but a few failed to transmit/submit their corner of the data, so you end up with most of the latest data, but not all of it.  (Particularly common in retail sales environments, for example).

image 4 Fully Customizable DAX Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!

REAL Things that are Good, Bad, or Even Just Strange/New:  Three Things We Want to Know About

Lastly, if the data refreshes correctly and completely, but there are REAL things happening in the data that are interesting, it would be good to know about those as well.

PowerBI.com does have a “select this tile and notify me if it goes above or below a certain number” feature, which again shows that Microsoft is on the right track and making progress, but there is a LOT of functionality missing:

  1. Scan across a whole list of entities (locations, products, etc.) and let me know if ANY of them go above or below a threshold (rather than letting me know if the overall total crosses a threshold, which is pretty much what the current feature does).
  2. Check individual entities against their own individual thresholds – maybe X% is the right threshold for Socks, but Y% is the right threshold for Pants, Z% for Shoes, etc. and I want to be able to specify those in a table somewhere.
  3. Perform “invisible” validation – the current PowerBI.com functionality is hooked to visible tiles.  We also need/want to devise tests like “IF Measure1 > Measure2, let me know” and I don’t want to display a validation-only measure to the user in order to set it up.
  4. Spot Outliers – you could argue that this is kinda the same as #2, but as in the Movers and Fakers post, we can dynamically calculate new thresholds for each entity (or intersection of entities!) on an ongoing basis.
  5. Do ANY of this On-Prem – and lastly, remember that even the current data alerts feature from MS is cloud only (and currently iPhone-only?), so even if the simplest flavor of it is sufficient, many people still don’t have access to it.

There’s a VERY good reason why PowerBI Alerts are missing a lot of things that we want:  it’s a VERY difficult feature to make complete.  I’m sympathetic rather than critical.  Think about it:  it would need its own expression-design UX, similar to the Conditional Formatting dialogs in Excel, or the Rules Wizard in Outlook, to even handle 25% of the real-world scenarios.  And these are not cheap things to design and build – trust me as someone who has attempted it back when I worked there:

image 5 Fully Customizable DAX Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!

To Provide Us With a Properly-Flexible Alerts UX, Power BI Would Need Dialogs Like These – But FAR More Sophisticated

But why invent new expression languages and user experiences when we have DAX and Excel???  THAT was the inspiration that hit me the other day.  We don’t need, or even WANT, more UX around alerts!  We want alerts to be tied into the robust tools we already have!

So, check out these two excerpts from my current Power Pivot workbook:

image 6 Fully Customizable DAX Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!

Each of the named ranges (in green) has a well-known name (UpdateAlert, UpdateAlert1, etc.), and if those are non-blank after a scheduled refresh, we get an email alert!

The pivot at the top is 100% “lifted” from Movers and Fakers – in fact, that post is what got me thinking about Alerts again.  Go skim that post now if you’ve never seen it before, because it’s important to know how the pivot in question ONLY returns Education / Subcategory pairings that have dramatically spiked or fallen in the most recent month, relative to the preceding 12 months.

In short, the pivot in question uses the DAX language to scan for outliers – and that’s a far more robust and complete language to use than any UX that could be built for finding outliers.

Then, on our hidden validation sheet, we use regular Excel formulas to do the following:

  1. Reference into the pivot
  2. If the pivot is non-empty (meaning, it found some outliers), our Validation sheet picks those up
  3. For the things it finds, our Validation sheet constructs “friendly” text strings (using basic concatenation and IF’s)…
  4. And those friendly strings are anchored into cells that are Named Ranges with special names like UpdateAlert

This combination of DAX and Excel formulas can do ABSOLUTELY ANYTHING.

That’s right, ANYTHING.  Whatever your heart can desire or your brain can imagine, you can build your very own custom Alerts feature around it, as if the entire Microsoft development team were at your disposal.  (In a way, they are, because they gave us DAX, and they gave us Excel.)

Yeah, great question!  Well, here’s where we DO need a little bit of software help – but it’s help we already have (otherwise I wouldn’t be writing this post).

The short version is that Power Update now examines your workbook during a refresh (after the refresh, technically) and checks a series of Named Ranges that conform to a set of patterns that we decided upon:

  • If one of the UpdateError, UpdateError1, UpdateError2… (continues checking until the named range with the next consecutive postfix number is empty) named ranges are not empty then their contents is added to the {errorList} email element and the task fails.
  • If one of the UpdateWarning, UpdateWarning1, UpdateWarning2… named ranges are not empty then their contents is added to the {warningList} email element and the task succeeds.
  • If one of the UpdateAlert, UpdateAlert1, UpdateAlert2… named ranges are not empty then their contents is added to the {alertList} email element and the task succeeds and forces the success email to be sent even if it’s disabled.

So, you create (some or all of) those named ranges and give them formulas to populate them.  Go nuts – you don’t have to reference into pivots either, you can alternately use Cube Formulas (maybe even leverage CUBESET() !), DAX Query tables…  maybe even do some CONCATENATEX() in your DAX if you want…

Let’s recap that…

Here’s what you do:

  1. Grab the latest version of Power Update from here.  Remember, it’s 100% free for one workbook, so this costs nothing if you don’t already have it.
  2. Create a new schedule in Power Update, and configure Email Settings to work with your email server (it works GREAT with O365 email for instance).
  3. Use a combination of DAX, Excel formulas, and the named ranges listed above to set up the alerts you want, ranging from Errors (which are treated as a failure) to Warnings and Alerts (which are not treated as refresh failures).
  4. Make sure you include the tag or tags in the Email Settings dialog for what you want to see:  {errorList}, {warningList}, and/or {alertList} – note that these tag names are CASE SENSITIVE in this first release!

image 7 Fully Customizable DAX Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!image 8 Fully Customizable DAX Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!

In Email Settings, Make Sure You Add Your Desired Tags:
{errorList}, {warningList}, and/or {alertList} to Succeeded/Failed Messages!

Now run a refresh and see what happens!

If you’re a diehard PowerBI virtuoso and don’t use Excel-based Power Pivot for anything in production, you absolutely CAN still make use of this technique to set up hyper-flexible alerts on your PowerBI models!

The easiest thing to do would be to put all of the DAX validation into your PowerBI model (the PBIX file), upload it, and then create a lightweight Excel file for the validation.  That Excel file can then be connected directly to your Power BI model in the cloud.  Then you set up Power Update to refresh the XLSX validation file – even if you’re using PowerBI’s built-in refresh scheduler for the dashboards, you can instruct Power Update to run a refresh of the validation workbook shortly after the scheduled refresh of the PBIX.

These are the official (internal) PowerPivotPro polos and sweatshirts that we issue to consultants and trainers on our team:

image 9 Fully Customizable DAX Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!

Want One of These?  Or One of Each?  How About a Free License to the Full-Power Version of Power Update?
Or Maybe Even Name-Your-Own-Prize?  Enter our Alerts Contest!
(Winners Will of Course Be Blogged About Here as Well)

We think that people will immediately start using these capabilities to do completely unexpected and cool things.  Which got us thinking…  let’s get some of those ideas flowing, and explicitly encourage our community here on PowerPivotPro.com to get involved…  with a contest!

Here are the Rules:

  1. The contest will run from August 2, 2016 through Friday, August 19th.
  2. You must get it working with Power Update (and prove it with screenshots), but you do NOT have to purchase anything – the free version will suffice nicely.
  3. Feel free to submit multiple entries, and we will use the latest entry from each contestant.
  4. In other words, we expect iteration – we’ll even give you feedback on your earlier versions so you can improve them, so enter early!
  5. Your final entry must be in our hands by 11:59 PM, US Eastern time, on Friday Aug 19th, 2016.
  6. We will select at least one winner, but reserve the right to declare as many as 3-5 winners depending on the quality and quantity of entries.  Excellence will be rewarded.
  7. Only one free license to Power Update will be awarded, but all winners will receive the clothes simple smile Fully Customizable DAX Based Alerts in Power BI and Power Pivot, Our New Class Schedule, and Announcing a Contest!
  8. Yes, we have women’s sizes.
  9. We will also separately declare one internal winner from the PowerPivotPro team.
  10. I (Rob) am the final judge.  Muhaha.
  11. Entries must include a writeup (preferably in PPTX or DOCX format) with screenshots.  Concise and clear writing earns you extra points, as does description of the problem, the business impact, the human element, funny stories, how the idea came to you, what would make it better… whatever you think is worth telling us.  Be yourselves folks.
  12. You do NOT have to provide us with the workbooks or PBIX files (but if they are not sensitive, please feel free to include them!)
  13. Do NOT send us sensitive business data – we will politely delete it and ask you to resubmit without such disclosure.
  14. Entries must be emailed to ideas@powerpivotpro.com

If you’d like to grab the workbook I used for this blog post, with its named ranges defined, you can download it here.  (It won’t refresh since you don’t have the Access database that “powers” it, but that’s not the important part – you need the named range examples and the underlying DAX).

If you have questions about the idea, the contest, or how Power Update works, please leave a comment on this post.

Let’s block ads! (Why?)

PowerPivotPro

Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

image Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

Out of 407 Total Combinations of Subcategory and Region, These 8 Stand Out This Month
(These 8 Combinations Differed GREATLY from their Respective 12-month Averages)

(What’s this, you say?  An actual “how-to” post from Rob?  Well YEAH, it was about time!)

DAX is fantastic at calculating numbers for us, and producing metrics that drop jaws all over the world.  It is, quite simply, The Best Thing Ever.  You want an all-up metric that properly accounts for seasonality and price changes simultaneously across all divisions of your organization?  That thing you’ve always talked about like it was The Holy grail?  Well, it’s really no problem.

But I’ve now had a long obsession with making DAX do more than that for us.  I want it to not just produce the metrics, but also to then robo-scan said metrics on our behalf and report back on interesting things that it found.  I don’t necessarily get too many chance to indulge this desire, and that’s a shame, so today I’m gonna use the old reliable AdventureWorks fake data.

The basic idea here is “alert me if something has changed dramatically.”  If there’s a corner of my business that has spiked or crashed in a big way, I want to know.  If something has dramatically improved in a particular region, I may want to dive into that and see if it’s something we can replicate elsewhere.  And if something has fallen off a cliff, well, I need to know that for obvious reasons too.  And both kinds of dramatic change, positive and negative, can easily be obscured by overall aggregate values (so in some sense this is a similar theme to “Sara Problem?”)

So the first inclination is to evaluate distance from average performance.  And maybe that would be fine with high-volume situations, but when we’re subdividing our business into hundreds or perhaps thousands of micro-segments, we end up looking at smaller and smaller sample sizes, and “normal” variation can be legitimately more random than we expect.

Consider these two series of numbers:

12, 2, 12, 2, 12, 2, 15

7, 7, 7, 7, 7, 7, 15

Both series have an average of 7 over the first six data points, and both end with 15 as the seventh data point.  If we just go by the average, that final value (15) is just as “much” of an outlier in the first series as the second: in both cases, our most recent number is 8 units higher than our average.

But our eyes can tell that 15 is far more “outlandish” in the second series than the first, right?  We were just cruising along, hyper-consistent at 7, and suddenly, BAM, huge spike to 15.  I can definitely imagine wanting to be notified of the second series’ spike and NOT caring about the first series’ spike.

So we bring Standard Deviation into the picture – not to replace “distance from average,” but to complement it.  Series one has a standard deviation of 5 and series 2 has a standard deviation of 0.  In some sense our eyes are “seeing” the standard deviation difference between these two series of numbers – series one can’t make up its mind, is super hot and cold, whereas series 2 is like a metronome.  Standard deviation is, in other words, a measure of how “choppy” a data set is, or what is considered “normal variation.”

Now, bringing Standard Deviation and “Distance from Average” together…  if we compare distance from average to standard deviation, we get a sense if this most recent change is a bigger than normal change!

In series one, 15 differs from the average by 2 units more than its standard deviation, or a 60% spike outside of normal variation:  (15 – 7 – 5) / 5 = 60%

In series two, 15 differs from the average by 8 units more than its standard deviation, or a 160% spike outside of normal variation:  (15 – 7 – 0) / 5 = 160%

The sensitivity thresholds for each biz are going to be different.  Hey, 60% may be plenty to catch your eye (probably not though).  But 160% is getting closer, and 300% is definitely attention-getting.

[Most recent 12 months sales avg] :=

CALCULATE([Total Sales],
FILTER(ALL(Calendar),
Calendar[MonthID]<=MAX(Calendar[MonthID]) &&
Calendar[MonthID]>=MAX(Calendar[MonthID])-11
)
) / 12

I’m choosing to use the “greatest formula in the world” flavor of calendar navigation here rather than the built-in functions like DATEADD because we can generally make them work in ALL biz scenarios, but also because in this case I’m just more comfortable being “hands-on” with the logic rather than trusting the magic functions to do the right thing.

This relies on a MonthID column that in my Calendar table:

image 1 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

Subtract the First Year in Your Data Set (2001 in my case) from CalendarYear, Then Multiply by 12 and add MonthNumberOfYear – This Yields a MonthID that You Can Navigate with Arithmetic

Moving on to more formulas…

[Most recent month sales] :=

CALCULATE([Total Sales],
FILTER(ALL(Calendar), Calendar[MonthID]=MAX(Calendar[MonthID]))
)

[Absolute Delta from Average] :=

ABS([Most recent month sales]-[Most recent 12 months sales avg])

I’m taking the Absolute Value here because I want to treat “up” and “down” spike equivalently, AND because STDEV never returns negative numbers.

[STDEV Sales over Last 12 Months] :=

STDEVX.P(FILTER(ALL(Calendar),
Calendar[MonthID]<=MAX(Calendar[MonthID]) &&
Calendar[MonthID]>=MAX(Calendar[MonthID])-11
),
[Total Sales] )

And finally, the “detector” measure:

[Am I Big Change] :=

IF([Abs Delta From Average] >
(20*[STDEV Sales Over Last 12 Months]),
1,
BLANK()
)

I’ve highlighted that 20 because that’s nuts – YOU WON’T USE 20 THERE!  That 20 means I’m saying “let me know when the latest number is TWO THOUSAND PERCENT the normal standard deviation, whereas earlier in this post, I mentioned that 300% is a lot.

Seriously, you need to tune your formula to a sensitivity that fits your situation.  And you do that by changing that 20 to other values.  Start with 3, because that theoretically means you’ll catch only the most-outlying 0.3% .

Well remember, I’m working with fake data here, and I had to crank the factor up to 20 before I got “small” results (like the 8 combinations of Subcategory and Region displayed in the pivot at the top of the post).  Hopefully, you can use something more reasonable, like 3, but play around with it and find out.  (You could also use a disconnected slicer and make it a variable that the user can control of course.)

Furthermore, my fake data was loaded with lots of silly cases where Subcat/Region pairs exhibited ZERO standard deviation, which is just not going to happen in real life.  So I filtered out any cases where that happened, too – just to make the results look more realistic.

image 2 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

Because My Fake Data Was Loaded with LOTS of Zero-Deviation Examples, I Added the Highlighted Test
(You Probably/Hopefully Don’t Need That Part)

First I created a Flattened Pivot…

image 3 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

Then added my Subcategory and Territory fields, plus the measures:

image 4 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

But I don’t want to see Subtotals here, so I turn those off…

image 5 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

And then I filter the “inner” field (Region in this case) to only show me [Am I Big Change] = 1…

image 6 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

image 7 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

And then, because my Calendar table “overruns” my Sales table (it contains dates that “happen” after my last Sales row of data), I add a HasData report filter:

image 8 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

Which itself is a calculated column in my Calendar table:

image 9 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

That prevents me from EVER using a “bad” month (one that contains no data) as my most recent month.

First off, we can add a Month-Year Slicer and go look at outliers in past months as well:

image 10 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

image 11 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

We are NOT Locked into the Strictly Most-Recent Month (July 2004) –
We Can Slice Back and Look at Other Months
(And we see there are more outlier combinations in May than in April)

Furthermore this is NOT “locked in” to Subcategories and Regions specifically.  We can try out other combinations on Rows of the pivot, as long as we remember to re-set that [Am I Big Change]=1 filter on the innermost field…

image 12 Movers and Fakers: Spotting Outlier Performance in Power Pivot / Power BI

Customer Education Level and Subcategory Work Too – No Formula Changes Required
(Although you might need to tune the sensitivity differently)

We’d be very interested in real-world applications of this technique, because such things simply make us happy when we hear about them.  So please circle back here if you have success… but also if you don’t.

Also, any ideas how this can be improved?  Flaws in my thinking?  Throw those at us too.

Let’s block ads! (Why?)

PowerPivotPro

Hey Siri! At Apple WWDC 2016, Tim Cook needs to make big data, AI pivot

Apple needs to change its attitude and approach to customer data, back away from the big data corner it has painted itself into and use its upcoming World Wide Developer Conference (WWDC) to lay out some sort of artificial intelligence vision.

Amazon has Alexa and its Echo. Google has Home, Assistant and a bevy of other services. Microsoft has Cortana.

Meanwhile, Apple has its long-in-the-tooth Siri that reportedly will be opened up to third party developers.

Over the last two years, Apple has dug its heels in on privacy, vilified ad models to some degree and knocked Silicon Valley rivals (read Facebook and Google) for using customers as the products and collecting too much information.

In many ways, Apple in the Tim Cook era has been about privacy concerns and keeping customer data local. Here are a few notable Cook quotes over the last two years.

Cook on Charlie Rose in September 2014:

Designing for Usability in Power Pivot and Power BI

Can your data model (Power Pivot or Power BI) easily be used by 100s of users? Is your model fairly intuitive to understand for the end-user even with little or no training?

Usability may be the key to achieving the Self-Service BI nirvana.

The very first Power Pivot model that I had designed…

Number of users I expected to use my first Power Pivot Model    1 (myself)
Number of users that actually ended up using the model    600+

I learnt my lesson the hard way. The one-audience data model was filled with oddities that was not always easy for end users to understand. I ended up doing a lot of video tutorials (now you know how I got good at video wlEmoticon winkingsmile Designing for Usability in Power Pivot and Power BI , check out our YouTube channel).

Now when I design a model for any client, I design with usability in mind. Such that perhaps not right away, but at some point 100s of users can use the BI Model for Self-Service BI. They can easily navigate the model to create their own Reports and Dashboards (in Excel or Power BI).

I use a few techniques when designing for usability; good Data Modeling (star schema), naming conventions, use of Perspectives, automated Data Dictionary (documentation) are some that I lean on.

In this blog post, I would illustrate one scenario where, some would argue I go to great lengths to make the model more usable. As you will realize, it is well worth it.
GOAL: Out goal is to combine two data tables – Sales and Budget – which are at varying granularity, but without the use of a an extra lookup table.

image thumb 31 Designing for Usability in Power Pivot and Power BI
GOAL: Combine Sales and Budget without creating a “Subcategory” Lookup table

Jump to section:
Setup and the Classic Solution
Usability Issues with the Classic Solution
Elegant Solution
Results
Epilogue: Single Data Table with Differing Granularity

Need help designing your BI Model? Check out our Training and Consulting Services

In our book, Chapter 18 Multiple Data Tables – Differing Granularity, we show how you can combine Data Tables of differing granularity.
(You have a copy of our book, right? Hmmm…why would you wait to buy the #1 bestselling Power BI book on the market, Carpe diem my friend!).
Grain or granularity is a fancy word, which simply means what one row of data represents. For example in our data set:-

  • Sales – is at a Daily and Product SKU (Individual Product or Product Key) level granularity
  • Budget – is at a Monthly and Product SubCategory level granularity

image thumb 20 Designing for Usability in Power Pivot and Power BI
Sales Table at grain – ProductKey, Date

image thumb 21 Designing for Usability in Power Pivot and Power BI
Budget Table at grain – Product SubCategory and Month

You would encounter this pattern frequently when working with real world data sets – Data tables of differing granularity. And the book already shows you how to solve this using a shared Lookup table.

image thumb 18 Designing for Usability in Power Pivot and Power BI
Sales and Budget linked by a newly created SubCategory Lookup Table

Creating a shared lookup table, lets you:

  • Slice and dice your Sales and Budget numbers using any of the shared lookup tables (Calendar, SubCategory in this case)
  • Define “Hybrid” measures which do math across the two data tables (e.g. Variance to Budget, or VTB = [Total Sales] – [Total Budget])

image thumb 22 Designing for Usability in Power Pivot and Power BI
Slice and dice Sales and Budget using Lookup Tables and define “Hybrid” measures

You would encounter this pattern frequently when working with real world data sets – Data tables of differing granularity.

When I think of Data Modeling, usually I have two criteria in mind – Performance and Usability. Let us consider these for the problem at hand:
(Data modeling best practices are sprinkled throughout our book. I also wrote an article which summarizes the key best practices: Data Modeling for Power Pivot and Power BI).

  • Performance: Having a cascading Lookup table (SubCategory) with just one or two columns is generally inefficient. But performance is not our main criteria here.
  • Usability: With the Classic solution, you end up with the “SubCategory” column in two places – in Products table and SubCategory table. And users are expected to remember to use the right SubCategory column.
    – Use SubCategory[SubCategory] when using measures on [Budget] or hybrid measures
    – Can use Product[SubCategory] when not using [Budget] related measures

If you or any user using the model to create Pivots/Reports uses the Category/SubCategory from the “wrong” table, then they can get unexpected results (see image below). You could hide the Category/SubCategory columns from the Product table, but that only feels like a hack.
– Category/SubCategory are Product attributes and logically belong in the Product table
– For users that do not care about [Budget] table and would never use those measures, this feels like an additional burden for them to find the Category/SubCategory in a separate table.

image thumb 32 Designing for Usability in Power Pivot and Power BI
You always need to remember to use the Category/SubCategory fields from the “right” table, else you may not get the correct results

Here is how we can go about combining the Products and the newly created SubCategory table for a more usable and elegant solution. I have highlighted the key steps, you can examine the solution file to go step by step in the Power BI Query.
Note: Don’t be scared of the M code, 99% of these steps were generated using the Query ribbon – I have included those screenshots as well.

1. Create a SubCategory Lookup Table

image thumb 24 Designing for Usability in Power Pivot and Power BI
Steps to create a SubCategory Lookup Table (click to enlarge)

This is so far identical to how we built the SubCategory Lookup Table in the classic solution.

2. Append Products and SubCategory Table

image thumb 25 Designing for Usability in Power Pivot and Power BI
Append Queries is available on the Query ribbon interface

In the new appended table, the ProductKey is null for all rows coming from the SubCategory table.

image thumb 26 Designing for Usability in Power Pivot and Power BI
ProductKey now has null values for the rows coming from SubCategory Lookup

This won’t do for a Lookup table (key used to create relationship cannot contain null values), thus we create a new key column.

3. Create New Product Key

image thumb 27 Designing for Usability in Power Pivot and Power BI
Add Index Column is available on the Query ribbon interface

Now we have a unique key for our appended table.

image thumb 28 Designing for Usability in Power Pivot and Power BI
Now we have a NewProductKey that we could use

4. Update Sales and Budget Tables with New Product Key

All that’s left is to go back to our Data tables – Sales and Budget, and update those to use the NewProductKey.
The steps below are for updating Sales table; Budget table would be updated on similar lines to add the NewProductKey.

image thumb 29 Designing for Usability in Power Pivot and Power BI
Steps to pull in the NewProductKey into our Sales table (click to enlarge)

BONUS CHALLENGE: You can try generating NewProductKey in such a way that the Sales table would not need to be updated with a New Product Key. You would only need to worry about the Budget table.
I can think of a few ways to accomplish that. Up for the challenge? Modify our provided file and post a link to your solution file in comments.

Final Step: Link Sales and Budget to new Products Table

Next, you can link both the Sales and Budget data table to our new Products table using the NewProductKey column!

image thumb 19 Designing for Usability in Power Pivot and Power BI
Sales and Budget linked by a Product Table, even though Budget is set at SubCategory

We do not need to worry about using the right table for the right occasion (based on the measures being used). Furthermore, the Category, SubCategory fields – which are Product attributes – are found in the Products table, which would make more intuitive sense for end users.

image thumb 30 Designing for Usability in Power Pivot and Power BI
Our new model is more user friendly: Product Category/SubCategory can be used to view both Sales & Budget

The differing granularity problem not only occurs when multiple data tables are involved (e.g. Sales and Budget), but can also occur within a single data table. For example, Sales can record transactions at different grain of Geography (Territories table in our model).

  • Some Sales transactions at the lowest granularity (Region in our case)
  • Some Sales transactions at a higher granularity (e.g. Country). This can occur in a scenario where the deal was made, let’s say with the Government of a Country. Then ‘Country’ is the only level of information available.

This can be addressed using a similar approach as outlined above to build a Geography/Territory lookup table at a different grain. If interested in such a problem, leave us a comment and we’ll try to cover this in a follow up blog post.

Download Solution File: Usability_PBIX_Files.zip

Power On!
-Avi Singh

Let’s block ads! (Why?)

PowerPivotPro

Power Pivot versus Power Query: Power BI Dilemma

20160316 PowerPivot vs PowerQuery Power Pivot versus Power Query: Power BI Dilemma
Do we use Power Pivot or Power Query?” – I get this question all the time when teaching Power BI classes.

(Note: This applies in Excel as well as Power BI Desktop world. Even though in PBI Desktop the tools are not labeled as PP & PQ, they are still very much present. See Excel Power BI / PBI Desktop comparison).

I would be presenting this topic at the PASS BA Conference (May 2-4). I would be fleshing this out in the next month, but wanted to solicit your thoughts and comments.

The answer to the question, of course is – You use both!

  • Use Power Query to clean, shape and transform your data
  • Use Power Pivot to model your data and define calculations

Use Power Query to Get Data

On Power BI Desktop, you don’t even have a choice – the only route to connect to data is via the “Get Data/Power Query” interface. Which is A-Okay with me. Even with Excel, I now connect to ANY data using Power Query.

Use Power Query to fill all your Get Data needs

Yes, ANY data. Even if I could connect using Power Pivot to those data sources and did not need any transformation  – I still always use Power Query.

I can think of few scenarios where Power Pivot would be preferred:
– Excel 2010 Users where Power Query cannot send the data directly to the Power Pivot Data Model (without a hack)
– User needs to publish their Excel Model to SSAS and then plan to run the refresh in SSAS (since it does not support Power Query refresh). Very edge case scenario in my mind

And there might be a few other scenarios. If you know any glaring ones, let us know via comments.

But in general (for Excel 2013/2016/PBI Desktop) Power Query is the way to go for getting data.

Use Power Pivot to Model Your Data

Relationships: Please! Don’t use Power Query to flatten data into a single table. Leverage ‘relationships’ in Power Pivot. In fact, if the dataset you are connecting to is “flattened”, use the magic of Power Query to un-flatten – separate Data table and Lookup Tables – and leverage relationships in Power Pivot. For this technique (unflatten), read Chapter 20 – Power Query to the Rescue, Scenario #5 Using Power Query to create a Lookup Table, from our book Power Pivot and Power BI. For general Data Modeling advice, read my earlier article, Data Model: Beast to Beauty.

Calculations: In general, I eschew doing any math/calculations in Power Query. I save that for DAX measures within Power Pivot.

Throwaway calculations, you need in order to shape your tables are obviously okay. Example, to trim your Product table to only Products which have a sale, you may add a column calculating sales for each product. But after the trim is complete in Power Query, you would just remove this column.

There could be few other scenarios where a numeric calculation in Power Query would make sense. Example, really intensive calculations (Ranking) may be manifested as a calculated column using Power Query.

But as a general rule of thumb, for any math/calculations I use DAX measures.

All of that is sage advice, however both of these tools are so capable that they do step on each other a bit. I have run into multiple scenarios where I did have to ask the same question to myself? Do I solve this using Power Query or Power Pivot?

Here are some of the scenarios I have encountered where I needed to weigh between the two. Let us know if you have encountered some of the same scenarios or any new ones.

Sample Scenarios

a) Extract filtered data using Power Query or Use Power Pivot measures with CALCULATE Filters

Let’s say we have car sales data for all Make and Models. And customer states that they are only interested in “Honda”. You can use Power Query to pre-filter the dataset to just Honda and only bring in those rows in your data model.

Or you can bring in the whole dataset, and then write a measure to only count Honda. e.g.

CarsSold = CALCULATE( SUM(CarSales[Units]), CarSales[Make] = “Honda” )

I typically prefer the latter approach. Why? Because I take what the customer is asking me with a pinch of salt. They may ask for “Honda” today but need “Toyota” tomorrow and “BMW” the month after. I try to build data models that can not only answer the questions being asked today but also the questions that may be asked in the future.

Although for clear-cut scenarios, it would make sense to just pre-filter using Power Query.

WINNER: Tie (It depends)

b) Time Intelligence: Year-to-Date, Year-over-Year comparisons etc.

Every time I see people doing time intelligence using Power Query a part of me dies. Power Query is very flexible, and it would let you do things like that – add columns which calculate Year-to-date totals or show the previous year amount.

But this approach is inflexible and wasteful! Power Pivot is extremely strong in Time Intelligence. I have not counted it up myself, but I have been told that Time Intelligence Functions has the largest number of functions than any other DAX function category.

Time-Intelligence is a slam-dunk for Power Pivot. Check our many Time-Intelligence articles on PowerPivotPro.

WINNER: Power Pivot

c) Split Data Rows

This one was a bit complex – but the gist was – I needed to say take the Sales landing in OrganizationA and split them across OrgB & OrgC. I used Power Query to do this split and it was a reasonable solution. However, my concern was that if I was dealing with hundreds of millions of rows – it would result in a large increase in my row count.

WINNER: Power Query for small/medium data sets.

Conclusion

By using the right tool for the right job, you would end up with models that are clean and efficient. I will be preparing a lot more scenarios like above and fleshing these ones out for my PASS BA talk.

Meanwhile, let us know what your thoughts are around this topic.

Power On!
-Avi Singh

Let’s block ads! (Why?)

PowerPivotPro

SQL for Excel Power Pivot Users

Today I am going to spend some time trying to convince (and demonstrate to) Excel Power Pivot users that it is easy and valuable to learn some simple SQL code.  SQL is one of the easiest languages I have ever taught myself and you can do it too.  The reasons SQL is easy to learn include:

  • It has a very simple syntax that uses English language expressions.
  • You can start with simple short queries and get more complex only when you need to.
  • When you need to do something more complex, the Internet is just a quick Google away.
  • The chance of getting something wrong is relatively low (for these simple queries anyhow – see warning at the end).

If you use Power Pivot for Excel and you directly source your data from a database, learning some basic SQL will help you get significantly more value from the process of loading your data.   With some SQL skills, you can add significant flexibility to your data acquisition and loading processes into Power Pivot.  The same opportunity exists with Power BI but with a different UI experience – I will not cover the Power BI UI in this post.

Using SQL will:

  • Allow you to import a non-contiguous subset of records into your data tables that filters out unwanted data – something that is otherwise difficult to achieve with the standard UI.
  • It can help you significantly improve the compression and hence performance of your larger reports.
  • Allow you to import smaller lookup tables based on the subset of records in your data table(s).
  • Allow you to store these loading variations in a bespoke workbook rather than in the source database.

But enough of that – let’s get into it.

One of the first dialogue boxes you will ever see in Power Pivot for Excel is this one.

image thumb SQL for Excel Power Pivot Users

When I first saw the 2 options in this dialogue, it was very clear to me.  I would ALWAYS use option 1 and I would NEVER use option 2.  Now I know better and I am going to share with you why I now routinely use option 2. 

Note: You can only use SQL if you have a supported database as a source.  You can’t write SQL code to selectively import CSV data for example.

I want to start with a data model that already is loaded with data built and is up and running.  The sample I am using is a typical Adventure Works report that I built some time ago for another blog post.  You can Download the files I used from here if you want to play along.  Make sure you unzip the files into a folder – don’t use the files from a zipped folder else the next steps may not work.

The reason I am doing this is that this is my most common use case – I have a report (typically large – maybe 200MB, and someone wants a variation).

Reconnect The Database

If you are following along with the samples, the first task is to reconnect the Power Pivot Workbook to the Access Database.  To do this, go to the Power Pivot Window, select Home\Existing Connections and then select the Access connection (#1 below) and then click Edit (#2).

image thumb 1 SQL for Excel Power Pivot Users

Then just browse to the location of the Access database (it is in the zip file with the Excel workbook), select the Access database, Click Open\Save\Close.  This reconnects the Excel file with the database.

The basic SQL syntax is simply as follows

It can be a lot more complex than this, but the point is that this is enough to get started.

Use Case 1 – Import a Sub Set of a Data Table

Let’s assume that you want to refresh a workbook so it only contains data for the top 10 selling products.  First create a copy of the existing workbook (the one that contains all your data) so you don’t change or corrupt your master copy.  Then use the new workbook (still containing all the data) to find a list of the required product codes you want in your new workbook.  In my example I want the top 10 selling products, so it is easiest to do this in a pivot table like shown below.

image thumb 2 SQL for Excel Power Pivot Users

Once you have this list of values (product keys), you could go into the Power Pivot window, click on Design\Table Properties, and bring up the Import Wizard dialogue.  From there you could manually select the product codes you need as shown below.

image thumb 3 SQL for Excel Power Pivot Users

Now if you just want to re-import the data for 1 or 2 products, or if you need to import a contiguous range of product codes, then this is a pretty easy process.  But if you want to select a non-contiguous set of data like in this example, it is a lot more work. And it gets worse the more items in the list you need to import.  But this problem is easy to solve with some simple SQL.  Here is how I do it.

The first step is to deselect all product codes and then select any 1 item in the list – click OK.  It doesn’t matter which one as it is going to be changed anyway.

image thumb 4 SQL for Excel Power Pivot Users

Then in the same dialogue box, switch from Table Preview to Query Editor as shown below.

image thumb 5 SQL for Excel Power Pivot Users

You will then see the SQL code generated by the wizard (shown below).

image thumb 6 SQL for Excel Power Pivot Users

The SQL code is not very well formatted and is hard to read, so the next thing I like to do is format the SQL code.  I normally use the following website but you can use any online service you like. http://www.sql-format.com/

I copy and paste the SQL code from the Power Pivot window above and then paste it into the formatting service and then click Format.  It gives me something that looks like this

image thumb 7 SQL for Excel Power Pivot Users

I then copy and paste the code back into the dialogue box in Power Pivot, and it looks like this below.

image thumb 8 SQL for Excel Power Pivot Users

Now the reason I selected any single product from the table preview editor earlier in the process is because this forced the table preview editor to write some more of the SQL code for me.  See how it added a WHERE clause?  If there is no WHERE clause, then the query will simply bring back all the records.

The last part of the code is as follows

The next step is to replace this WHERE clause with a list of product keys.  I am using an IN statement to do this as follows:

Note that I have removed some extra unnecessary brackets from the original code line 2.  You don’t need to be a programmer to work out how this works.  It basically says “bring back the columns I have asked for from the Sales table where the product key is in this list of values above. The reason the list of columns is so long (in this case) is because I selected a subset of columns when originally importing my data.  If I selected every column, it would simply say SELECT * FROM [Sales].

To get the list of product codes, you can use the normal Excel tricks to concatenate them into a single string separated by commas.  To do this, I normally get my list of values into a Pivot Table, sort the pivot based on sales (in this case given I am after the top 10 products), copy the product codes from the Pivot Table and then paste them as values somewhere in the spreadsheet.

image thumb 9 SQL for Excel Power Pivot Users

Once you have done this, you can manually combine them into the IN statement needed using Concatenate or similar.  Once I had done this a few times I figured some VBA code would be better.  I wrote this VBA code (below) that allows you to select/highlight a range of cells containing the data (product numbers saved as values in this case) and then run the VBA code to create the IN statement.  Note this range of values can’t be the pivot table, it has to be values in cells.

Here is the code if you want to use it (one code for text, one code for numbers).

After running the VBA or manually concatenating the values you will have this:

IN (312, 310, 313, 314, 311, 361, 353, 363, 359, 357)

Now all you need to do is copy the IN statement and add it to the SQL code in the Query Import dialogue.  Paste it in, click validate (to make sure you haven’t made any mistakes) and then save.

image thumb 10 SQL for Excel Power Pivot Users

The sales table is then re-imported with a sub set of records.

image thumb 11 SQL for Excel Power Pivot Users

Now that the Sales table contains a small list of 10 products, it is not necessary to bring in every product from the products table.  It is easy to fix this using the exact same approach as above – in fact you can use the exact same WHERE clause (assuming the name of the product code column is the same in both the Sales and the Products tables).

  • Go to the products table
  • Click on Table Properties
  • Change the product key filter to select a single product
  • Switch to the Query Editor
  • Format the Text with an online formatter
  • Copy the formatted SQL code back to Power Pivot
  • Reuse the IN statement from before to limit the selected products to be the same ones in your sales table.

You will end up with this.  Validate the formula and then save.

image thumb 12 SQL for Excel Power Pivot Users

As I covered in my last blog post, you can gain significant compression improvements if your workbooks are a lot larger than 2 million rows. To gain the benefits, you will need to sort the data on load.  To do this, simply do the following.

  • Go to your data table
  • Switch to the Query Editor
  • Format the query if you like
  • add an ORDER BY clause to the end.

See the example below.  Now in my test workbook there are only a couple of thousand rows, and so this wont make any difference to my workbook.  However if you have millions of rows, this can make a big difference (read the previous blog I mentioned above to find out why).  You will need to work out which column to sort by in your own data with some trial and error and understanding of the cardinality of your data.

image thumb 13 SQL for Excel Power Pivot Users

This next example is not beginners SQL and also will not work with Access, but it is a good demo of what can be done.  You can copy the pattern I show below or invest some time learning some more complex SQL.  Keep in mind that I am an Excel user with no formal SQL training, and I can now write this code quickly and easily when needed. If I can learn it, other Excel users can learn it too.  Basically what this code does is:

  • Find a list of customer codes that have purchased the products in question
  • Creates a temporary table of these customer codes
  • Uses the temporary table to extract a list of just those customers into the customer table.

I use this pattern to retrieve a subset of dimension tables from a SQL Server DB all the time.  If your data source is SQL Server you would do this as follows (same general process as before).

Go into the Customer table and copy the SQL Code (just copy what is there), format the code at an online formatter.

image thumb 14 SQL for Excel Power Pivot Users

This time I have left it in the formatting tool to show you what I am doing.  If I am doing this in real life, I typically complete this step in SSMS, but the process shown below is fine too.

Now go into the Sales Table and copy the SQL code from there too.  Paste the SQL code from the sales table and paste it before the SQL from the Customer Table into the editor below. You will have something like this.

image thumb 15 SQL for Excel Power Pivot Users

Then I edit the first SQL statement (Sales) so that all it does is create a list of Customer Numbers that have purchased this product (shown below).

image thumb 16 SQL for Excel Power Pivot Users

Then I turn this code snippet into a temporary table that I can reuse in my query.  There are various ways of doing this, but here is what I do (shown below).  This gives me a temporary table called ‘temp’ that contains all the customer numbers I need in my Power Pivot Workbook.

image thumb 17 SQL for Excel Power Pivot Users

Then I change the second part of the SQL statement to just extract those customers as follows.

image thumb 18 SQL for Excel Power Pivot Users

Copy the code back to the customer table in Power Pivot, validate and save.

All of these examples have 1 thing in common.  The code to extract a different set of data is saved inside the Query Editor in a duplicate copy of a Power Pivot Workbook.  If you need the same query over and over, then it would be much better to ask someone to create a query/view for you in the database itself.  However for those times that you want to build an ad hoc report, I think it is much better to store your queries in the workbook itself.  It is accessible to you (the report author) and doesn’t pollute the source database with one off queries that will rarely be reused.  So the end state of the steps outlined above is to create this bespoke workbook that contains all the information for the specific bespoke need without polluting your SQL Server DB.

One thing to be aware of – you can do bad things to your database server if you write bad SQL. Literally you can bring the SQL DB to a standstill. If you do this, it is possible someone who administers the server will come after you.  Worst case they may ban you from refreshing your workbooks against the server.  So I suggest you start out simple and make sure you know what you are doing before you get too tricky.

I would love to hear from others about what other SQL tricks they use with Power Pivot.  I have thought about creating parameter tables to store the list of values and then use Power Query to manage it all, but I am not an expert on Query Folding so I have not gone down this path (yet anyway).

This entry passed through the Full-Text RSS service – if this is your content and you’re reading it on someone else’s site, please read the FAQ at fivefilters.org/content-only/faq.php#publishers.

PowerPivotPro

Data Modeling for Power Pivot and Power BI

What if I told you, there was one key skill, that has set me head and shoulders above the Excel crowd when working with Power Pivot and Power BI? Would you be interested in learning that skill? Great, because I was about to teach it to you anyway.

20160126 FeaturedImage DataModel Data Modeling for Power Pivot and Power BI

But first a story. I had discovered Power Pivot (code-named Gemini back then), while working in the Microsoft Commerce group. But really came in to my element at my next gig, the Microsoft Learning team (read about my Power Pivot Journey). There I met my colleague Gregory Weber (known as Greg). It’s hard not to be impressed by Greg. He was a Microsoft certified instructor and taught courses for many years, both within US and internationally. He has got the trainers voice, and fluidity with words. Along with the ability to deconstruct complex topics and make them easy to understand. He can talk about pretty much any topic under the sun, technical or otherwise. Once you get to know him better, you’ll find out that he enjoys building and programming robots in his spare time (How’s that for a hobby?). Lately he has been pursuing a degree at Georgia Tech, while also helping to teach some of the courses there. Besides all of this, he is a genuinely nice guy.

It was Greg, who suggested that we form a book reading club. Our first book was one I had heard about, but never read – The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling By Ralph Kimball. As a business analyst, I had leaned heavily on Excel, along with a mishmash of other technologies. Data warehouse and data modeling didn’t seem like topics that would be relevant to me; more for an IT/BI team perhaps. But I figured, it couldn’t hurt to learn something new.

Our book club meetings looked more as if, class was in session. We brought in our questions, and Greg patiently answered them, helping us realize the importance of the topics, and trade-offs involved in various choices. As things go, our reading club was disbanded before we were even halfway through the book. But the knowledge that I had gained, helped me grow by leaps and bounds in my Power Pivot and Power BI journey.

Now the truth is, I would never have picked up this book and read it myself (much less understand it). It was only due to a teacher like Greg, that I was able to internalize some of the concepts. So this article, is my humble attempt to pass on the knowledge (hopefully without mangling it). I’m just paying it forward.

The article below, was actually submitted, as a chapter in our new book. However under Rob’s keen eye, we realized that most of these topics were weaved in as best practices in the rest of the chapters anyway. While it didn’t make it in the book, we still feel it would be of value to our readers – hence this article.

If you were reading the book, this article would perhaps fit in best as Chapter 19A. (After Chapters 17/18/19 have covered Multiple Data Tables and Performance). You can get our book at Amazon or MrExcel.

Data Modeling – for Scalability and Usability
    Scaling to Millions of Rows
    Scaling to Multiple Data Sets
        Lookup Tables – the Who, What, Where, When, How
        Matrix of Lookup and Data Tables
        Crazy Lookups and Snowflake: Too Much of a Good Thing
    Usability

Scaling to Millions of Rows

So you’ve read our advice that

• Wide Flat Tables = Poor Performance
• Narrow Data Tables with separate Lookup Tables (Star Schema) = Improved Performance

But you say to yourself, the performance of my model is just fine. Thank you, thank you very much. (You did say that the Elvis way, right?). That is the case for most new users adopting Power Pivot. They are working with relatively small models and no matter how they are structured, your performance is reasonably good. Especially, when compared to all the effort you had to put in earlier (before Power Pivot), to build similar reports.

However this honeymoon period is not going to last forever. Sooner or later you would end up working on a data model, large enough, that all these lessons would be important. It is easier to start off, following the best practices, than later on having to undo your work and unlearn the bad habits.

I would not elaborate too much on this here, we’ve covered this ground already in the Performance chapter(Ch 19). Instead I want to speak about other reasons, why a good Data Model is critical to your success.

Scaling to Multiple Data Sets

So you are a new Power Pivot user and start your journey with a small dataset. Say the Sales data, that we started with or something similar. At this point, for a small dataset, it doesn’t matter much, whether you keep separate Data and Lookup tables (Star-Schema) or a single flat-and-wide table. Since it is a small dataset, the performance would not vary noticeably and you can pretty much write any measure you need, in either design.

DATAMODEL 01 Data Modeling for Power Pivot and Power BI
With a single small dataset, there is not much difference, performance wise,
between the two designs –Star Schema and Flat-and-Wide
(click to enlarge any image in this article)

It is always best to organize your lookup tables near the top and the data table near the bottom. Consider the above Star-Schema screenshot for illustration only.

Now think about, what happens when you bring in another dataset. If we are using flat design, we end up with two flat data tables – Sales and Service Calls in our case. That works as long as you analyze these two datasets separately.

DATAMODEL 02 Data Modeling for Power Pivot and Power BI
Two Flat-and-Wide tables: Sales and Service Calls

DATAMODEL 03 Data Modeling for Power Pivot and Power BI
Can create separate pivots to analyze the datasets separately…

However, often the magic happens when you can show measures from two different data sets side by side in the same pivot, and slice and dice them together. Or even better – write hybrid measures across the two datasets.

DATAMODEL 04A Data Modeling for Power Pivot and Power BI
Analyzing disparate datasets together is the holy grail for analysts

But if you have two flat tables how would you ever be able to accomplish this? This is a big moment, so I want you to pause and think about it.

<<This page intentionally left blank to give you a moment of cogitation>>

The answer is obvious, and we have already implemented this design in our model, and exercises earlier. We need a common lookup table.

DATAMODEL 05 Data Modeling for Power Pivot and Power BI
Common Lookup table would let us connect and analyze the datasets together

Lookup Tables – the Who, What, Where, When, How

And not just one, you need a few. Often when I am working with a dataset and need to think about the lookup tables I might need, I think of the – Who, What, Where, When, How. For example for our Sales data, I might answer that, as below and attempt to create the Lookup tables accordingly.

DATAMODEL 05A Data Modeling for Power Pivot and Power BI
Who, What, Where, When, How – would become our Lookup Tables

DATAMODEL 06 Data Modeling for Power Pivot and Power BI
Both data tables connected to all relevant lookup tables – ready for action!

Diagram View with Multiple Data/Lookup Tables: If you do end up with multiple data and lookup tables, your Diagram View might start to look like a spaghetti chart or a complicated Integrated Circuit diagram. We would still recommend that you keep your Lookup tables near the top and Data tables down below – to relate to our analogy of “relationships flow downhill”. Some people complain that they cannot see the related tables easily this way. However understanding your data and lookup tables and improving your discipline in using them that way should override that.

If you do end up with more tables than you can keep track of, consider creating Perspectives to keep them organized for yourself and for your users. Learn more at http://ppvt.pro/groupTables

Note that, you can have multiple “Who”s and multiple “When”s etc.

For example, for our sales transaction, we can have

Who: Who placed the order? Whose credit card was used to pay for the transaction? Who was it shipped to?
When: When was the order placed? When was it shipped? When did it arrive? When was the customer invoiced? When did the customer finally pay?

You can read about how to handle such cases, where potentially there could be multiple relationship between two tables, in our chapter on “Complicated” Relationships (Ch 22).

Matrix of Lookup and Data Tables

As your model evolves and you pull in additional datasets, these should get plugged into Lookup tables in a matrix like fashion (the rectangular array, not the movie; well maybe a bit like the movie wlEmoticon smile 3 Data Modeling for Power Pivot and Power BI).

DATAMODEL 07 Data Modeling for Power Pivot and Power BI
Data sets should connect to a common set of Lookup tables
(click to enlarge any image in this article)

Even if the number of data tables in your model grow at a prolific rate, the lookup tables should remain a fairly stable set. Your lookup tables typically represent the core entities in your business – like Customer, Product etc. – to which many data sets would hook up to. On occasion as you bring in new datasets (which typically represent business processes), you may need to add new lookup tables as well. But that exercise should be fairly infrequent.

Not all data tables would connect to all lookup tables, only the ones that apply to the specific dataset. That is perfectly all right, in fact to be expected. For instance in the example above, Budget does not connect to Customer or Employee, since Budget is not set at Customer or Employee level (at least for this example scenario).

When using measures across two (or more) datasets in the same pivot (or when using hybrid measures) just be aware that only the common lookup tables would provide filtering on both datasets. You may get invalid results if you use a lookup table which is only connected to one of the two data tables. See example below, but also covered in detail in Chapter 17 Multiple Data Tables, Section ‘Multiple Data Table Gotchas’.

DATAMODEL 07A Data Modeling for Power Pivot and Power BI
Budget data table is not connected to Customer Lookup Table,
thus you would get invalid results if you try to use them together

Crazy Lookups and Snowflake: Too Much of a Good Thing

Whereas Lookup tables are good, there can be too much of a good thing. I have seen client models where everything is a lookup table and the diagram view looks like a mutant octopus.

DATAMODEL 08A Data Modeling for Power Pivot and Power BI
Making every little thing into a lookup table is not good data modeling
(click to enlarge any image in this article)

Or where we have cascading Lookup tables (also called Snowflake).

DATAMODEL 09A Data Modeling for Power Pivot and Power BI
Lookup Tables have Lookup Tables? That may not be ideal
(click to enlarge any image in this article)

None of these are considered good design for performance, scalability and usability reasons (discussed next). Your lookup table should represent something tangible and easily understood – e.g. Product, Purchase Order.

Consider all of this as guideline, a best practice. We’re not too squeamish about breaking these rules. Just be aware when you are not following the guidelines, understand why you are not doing so and account for possible implications.

Usability

Star schema (separate Data and Lookup tables) would typically offer you the best performance and allow you to scale to millions of rows and multiple data sets. But another important implication is usability. Even for the simplest scenario where we have only one dataset – which of the field lists below would you like to use, day-to-day to build new reports and analysis?

DATAMODEL 10B Data Modeling for Power Pivot and Power BI
Star Schema would be much more user friendly once you start building reports

More importantly, which field list would you want other users to be using who connect to your data model? Star Schema groups items into logical entities which are easier to understand; not just by you but for all the users who would end up connecting to your data model and creating reports/analysis against it.

Think you are just creating a quick analysis model for yourself? Power Pivot is so powerful that you would often create models that last longer than you ever expect, be used by more people than you ever imagined. The model that I built at Microsoft, started as a simple model for my own use, but grew over time and went on to be used by hundreds of users and is still in operation long after I left Microsoft.

When we parachute into an organization for our typical 2-3 day consulting engagement, I still create each data model like it’s going to be the next best thing since sliced bread for them. Dare I say, it often is wlEmoticon smile 3 Data Modeling for Power Pivot and Power BI

So dream big and aim high. Put some thought into designing your data model. It only takes a little more work and pays huge dividends in the long run.

Book Cover Page1 A 515x640 Data Modeling for Power Pivot and Power BITo learn more Power Pivot and Power BI best practices and tips & tricks, get our bestselling book, Power Pivot and Power BI, at Amazon or MrExcel. A big thanks to Greg Weber at Microsoft for teaching me all I learnt about data modeling.

Power On!

This entry passed through the Full-Text RSS service – if this is your content and you’re reading it on someone else’s site, please read the FAQ at fivefilters.org/content-only/faq.php#publishers.

PowerPivotPro

For Real This Time: Show Barnes & Noble that Power Pivot and Power BI are Important!

image thumb 13 For Real This Time: Show Barnes & Noble that Power Pivot and Power BI are Important!

Sunday, Feb 7th in a Local B&N:  That Silver/Grey Spined Book is Ours!

image 28 For Real This Time: Show Barnes & Noble that Power Pivot and Power BI are Important!

OK, for real, B&N has them now!  (I have seen them myself, as pictured above, and every B&N I’ve called has them).

Remember, they accepted our challenge of an experiment:  they will stock our new book for a few weeks and see what kind of “action” it gets.  If it does decent sales, they will keep the book on the shelves (AND start stocking more such books, which is just as important IMO, because people scanning the Excel section of a bookstore is one way that the word gets out about these toolsets).

Simple: go into a local Barnes and Noble this week (or weekend) and buy the book “Power Pivot and Power BI.”  It looks like this, for quick reference:

image thumb2 thumb For Real This Time: Show Barnes & Noble that Power Pivot and Power BI are Important!

Buy This Book at a Barnes and Noble, We’ll Send You Another Book Free (See Below)

If you ask for it by title, they might tell you they don’t have it, even though they DO!  (Long story, but they have the title of the 1st edition linked to the new ISBN, but with the new book’s cover…  it’s pretty funny).

Instead, ask them for:

ISBN 978-1615470396.

That’s what you should ask them for on the phone, or at their in-store helpdesk.  Don’t even mention the title or authors.

About half the stores we called had the book, but had NOT yet placed it on shelves – they still had it sitting in the storeroom in the back.

If it IS on the shelf, it is probably in the Technology section, amongst the other Excel books:

image thumb 14 For Real This Time: Show Barnes & Noble that Power Pivot and Power BI are Important!

Half of Stores Have it in the Tech Section, with the XL Books (other stores, it’s still in their back storeroom)
(Many Thanks to Wayne M for this photo from his B&N in Washington DC on Saturday!)

This might seem like a money grab campaign by us, but if it were money we were after, we’d MUCH rather you buy the book from MrExcel.com than Barnes and Noble or Amazon, because our cut of a retail sale is tiny compared to when we sell it ourselves.  (Like, it’s a 3-4x difference, no kidding).

No, this is an awareness campaign, and we’re going to “pay back” anyone who helps.  Buy the book above, and we’ll send you one of these:

image thumb7 thumb For Real This Time: Show Barnes & Noble that Power Pivot and Power BI are Important!

We’ll Send You Your Choice of Any of These Books, Plus a DAX Sticker wlEmoticon smile 2 For Real This Time: Show Barnes & Noble that Power Pivot and Power BI are Important!

Yeah, and we’ll include one of the exclusive DAX stickers from the Indiegogo campaign:

image thumb10 thumb For Real This Time: Show Barnes & Noble that Power Pivot and Power BI are Important!

You Will Also Get One of These wlEmoticon smile 2 For Real This Time: Show Barnes & Noble that Power Pivot and Power BI are Important!

  1. Buy the Power Pivot and Power BI book at B&N.
  2. Keep the receipt.
  3. Mail the physical receipt to the address below.
  4. Include a note as to which of the five books you want.

MrExcel Has Gone Insane
1075 Molaki Drive
Merritt Island FL 32953

We’d love to see pictures of the book on shelves in the physical world, so please please please send some!  (I’m going to be doing the same, canvasing B&N stores this weekend in my area).  Post them as comments, send them to our “empower” email address, tweet them at us, whatever.

Even if you don’t buy the book, send pics in fact if you happen to be in a B&N and see it wlEmoticon smile 2 For Real This Time: Show Barnes & Noble that Power Pivot and Power BI are Important!

This entry passed through the Full-Text RSS service – if this is your content and you’re reading it on someone else’s site, please read the FAQ at fivefilters.org/content-only/faq.php#publishers.

PowerPivotPro