Tag Archives: rows

Use calculated tables to show only used rows in dimensions

I got an interesting request this week, to improve the usability of the tabular model they wanted to filter the dimension table to only contain the rows that have data in the fact table. That way the slicers and other filters will never have data where you don’t have any sales or any facts. Often this is taken care before loading the data by creating view or writing queries but in this case this would have been expensive and time consuming for all tables in the model. But there is a way to do this in Power BI \ SSAS by using calculated tables.

Take the following model:

 Use calculated tables to show only used rows in dimensions

I filtered the FactOnlineSales table in the query to exclude some products. I ended up with a sales table that only contains data for 2 product, the product tables has 2517 products in total though.

If I now want to use the product as slicer or filter Power BI will show all of them, in this case it looks like this:

 Use calculated tables to show only used rows in dimensions

As you can see that is a long list! I just want to filter it down to only show those products where I have sales. To do this I will create a new calculated table that only contains those product that have values in the sales table:

FilteredProduct = FILTER(DimProduct,
                                    COUNTROWS(RELATEDTABLE(FactOnlineSales)) > 0
                        )

In short: Filter the rows in the DimProduct table to keep only those rows where we have more than 0 zero rows in the related sales table for the current product.

Now this works, I just have 2 rows left:

 Use calculated tables to show only used rows in dimensions

But I would like to make it a bit cleaner by hiding the table and setting the relationship to inactive, the problem is that by doing that the expression doesnt work anymore since I am using the RELATEDTABLE function. But there is a way to fix this by adding the USERELATIONSHIP function:

FilteredProduct = FILTER(DimProduct,
                                 CALCULATE(
                                            COUNTROWS(RELATEDTABLE(FactOnlineSales)) > 0
                                           ,USERELATIONSHIP(DimProduct[ProductKey],FactOnlineSales[ProductKey])
                                          )
                          )

Now this does the trick. The model now looks like this:

 Use calculated tables to show only used rows in dimensions

By hiding the DimProduct table the end user will not be able to see the table in the field list and my slicer now returns just the 2 items:

 Use calculated tables to show only used rows in dimensions

Mission accomplished!

Let’s block ads! (Why?)

Kasper On BI

Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

image thumb 3 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

The Pivot Pictured Above Acts as if We’ve Swapped Out Fields on Rows – in Response to a Slicer Click!

First off…my first post! Being one of the newest (and youngest) members of the PowerPivotPro family has been very exciting so far. As a way of introducing myself, I’d like to share a creative solution to a problem I’m sure many of you have encountered when building a report or dashboard.

Typical Client Report Requests:

  • Simple but complex
  • High-level yet detailed
  • Compact yet containing everything…

Quite understandably, clients love to channel their inner Marie Antoinette by basically asking to have their cake and eat it too. I actually relish those scenarios, they allow me to flex my “think outside the box” muscles!  And hey, it’s great to be working with a toolset that truly CAN accommodate the special demands posed by real-world situations.

Well one such scenario had a customer wanting a summary Pivot Table with about five columns (fields) worth of details in it. No problem, done and done! The problem we were encountering however…was that 90% of the real-estate space on our reporting page had already been used for other visuals, none of which this client wanted eliminated or reduced to make room. So I’m left with the predicament of figuring out how to fit all this data onto this dashboard…Tetris mode engage!

Unfortunately despite my best efforts to rearrange the dashboard (accompanied by my 80’s Rush Mixtape) I simply could not find any way to display a wide Pivot Table on this dashboard. So I circled back to the drawing board and asked myself what variables I could manipulate to achieve the desired outcome.

I realized that I had an assumption that the PivotTable had to be fixed, meaning that it always has to show all levels of the data. However I LOVE to design visuals for clients that are dynamic, only showing the relevant data to them (often based on slicer selections). So I politely asked my previous assumption to leave and invited over my good friend paradigm shift. After some long conversations and extensive Google searches I actually ran across a PowerPivotPro Blog Post written by Rob that inspired my eventual solution.

Discovering this post almost felt like a relay race and I was being passed the baton to cross the finish line. Using the idea from this post that a slicer could change the axis of a chart, I realized the same would work in a PivotTable. All five columns in my table were part of a hierarchy…so why not use this technique to display a single column that would DYNAMICALLY switch between any of the five levels of this hierarchy based on slicer selections. I would now be able to create a table that is both compact and would display all the data the client needed.

Time to break out the cake forks!

Now for the fun part as I share this recipe for success (that was the last cake joke I promise). The general idea will be to create a hierarchy in the data model, and then reference those in an Excel set to be used in my Pivot Table. I’ll be using tables from the publicly available Northwind DW data set for this example.

Download Completed Example Workbook

X

Get Your Files

FIRST, create a Customer Geography Hierarchy in the data model on the DimCustomer Table.

Hierarchy in the Data Model table:

Customer Geo Hierarchy with box thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

SECOND, create a new DAX Measure called “Distinct Count of Country.” This will be used in our set to indicate whether or not a selection was made on our country Slicer.

=DISTINCTCOUNT( DimCustomer[Country] )

Now some of you technically savvy readers may be thinking “why didn’t he use the DAX Function HASONEVALUE?” I’ll explain more on this later when I explain how to write the set using MDX.

THIRD, create a new set for your pivot table referencing our recently created Hierarchy and DAX Measure. Note that the only way to access your sets is through a conditional ribbon that is displayed only when a cell selection is on a Pivot Table.

Opening the Set Manager window:

Sets with boxes thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

Creating a new set using MDX:

Set Manager with boxes thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

Writing the MDX Code:

MDX with boxes thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

This MDX query works by utilizing an IIF statement, which operates the same was as in DAX or Excel. It checks to see if our (Distinct Count of Country) DAX Measure is greater than 1 (indicating no slicer selection). If TRUE it returns the Country column from our hierarchy, if FALSE (slicer selection has been made) then it returns the City column. It’s important to note that I must reference the columns in the hierarchy, if you were to put just the column names in this query it would not run. It’s also important that the “Recalculate set with every update” box is checked, this makes sure the MDX statement is calculated every time someone uses a slicer, otherwise it’ll appear like the set isn’t working.

Keen observers take note! Here’s where I explain WHY I used a DISTINCTCOUNT rather than HASONEVALUE in my DAX Measure. Let’s say a client would like to multi-select countries in the slicer and still have it display the City column on rows in the Pivot Table. If I were to use HASONEVALUE in my DAX Measure I would only switch to the city column when a single value was selected.

The way I’ve designed it we can change the value in the MDX query from 1 to any number we’d like (E.g. 2, 3, etc.) which gives us the flexibility to allow multiple slicer selections and still have it switch to the City column.

“Clever Girl…”

Clever Girl thumb Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

I’m not actually sure if I’m supposed to be the hunter or dinosaur in this analogy from Jurassic Park…but either way I felt clever for that last step.

FOURTH, we can now use our newly created set in a Pivot Table. You’ll notice that a new folder called Sets is now nested in our DimCustomer Table.

Placing the Customer Geo set on rows in our Pivot Table:

PivotTable with boxes thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

Making a slicer selection to observe the dynamic switch from Country to City. Pretty cool!

PivotTable with slicer selection AND Boxes thumb 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

My client’s reaction could be summed up in a single word spoken by the immortal Keanu Reeves…

”Whoa…”

Now some of you may have noticed that I have a Total Sales value at the top of my Pivot Table. Now’s my chance to point out one unfortunate drawback of using sets on rows, it eliminates the totals row at the bottom of the Pivot Table. All is not lost though my friends, for every every problem a solution can always be found! In this case I created an artificial “Total’s row” at the top of the Pivot Table. I did this using the tried and true CUBEVALUE function to call the measure I’m already using in my PivotTable. NOTE that you need to make sure you connect all slicers (via the slicer_name) in the cube string for them to slice the CUBEVALUE as well. Finally, just a bit of dash formatting and some elbow grease and we have ourselves a Totals row!

CUBEVALUE Formula used in the cell for totals:

Totals Row with box thumb Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

There you have it, your very own Mighty Morphing Pivot Table. wlEmoticon smile 1 Mighty Morphing Pivot Tables or: How I learned to automatically change hierarchy levels on rows

Let’s block ads! (Why?)

PowerPivotPro

.Net 4.6.2. Framework client driver for Always Encrypted resulting in intermittent failures to decrypt individual rows

The SQL Product team has identified an issue with .Net 4.6.2 framework client driver for Always Encrypted enabled database on SQL Server 2016 and Azure SQL Database. The issue can lead to intermittent failure while trying to decrypt the records from the Always Encrypted enabled database with following error message

Decryption failed. The last 10 bytes of the encrypted column encryption key are: ‘7E-0B-E6-D3-39-CE-35-86-2F-AA’.The first 10 bytes of ciphertext are: ’01-C3-D7-39-33-2F-E6-44-C3-B1′.Specified ciphertext has an invalid authentication tag. 

The above failure to decrypt may potentially lead to incorrect query results which in turn may trigger incorrect behavior in the app, for example, attempts to insert missing values or to perform any other updates that will either produce further errors or produce inconsistent data in the database.

The SQL Product team is aware of the issue and is actively working on the fix which may be made available soon. In the interim, we have following recommendation for the users

  • Users who are on .Net framework 4.6.1 are not impacted by this issue and can ignore this. We recommend not to upgrade to .Net framework 4.6.2 if you are using Always Encrypted database feature until the fix for the issue is released.
  • Users who have the latest version of .Net framework 4.6.2 installed, we recommend, if possible to rollback to previous version of .Net framework 4.6.1. Note: In general, you should not uninstall any versions of the .NET Framework that are installed on your computer without testing,  because the application dependent on that version can potentially break and may not function as desired.
  • If you are unable to uninstall .Net framework 4.6.2 due to application dependency, you can workaround the issue by turning off column key encryption (CEK) by setting the SqlConnection.ColumnEncryptionKeyCacheTtl property to 0 in the .Net framework 4.6.2 driver.

After uninstalling .Net framework 4.6.2 or turning off Column Encryption Key (CEK) caching, users can confirm that the error does not reappear during a table scan (e.g., SELECT * FROM < table with Always Encrypted>) executed from a query window in SSMS. Executing this scan will help validate the error doesn’t occur.

Customers who encounter the above error during the validation scan and are unable to resolve the issue, should contact sqlalwaysencrypted@microsoft.com.  The team will be able to help access and recover all previously encrypted rows that were affected by this bug. There will be no permanent data loss caused as a result of this defect.

To determine which versions of the .NET Framework are installed on a system, see How to: Determine Which .NET Framework Versions Are Installed.

Parikshit Savjani
Senior Program Manager (@talktosavjani)

Let’s block ads! (Why?)

SQL Server Release Services

SQL Pattern Matching Deep Dive – Part 4, Empty matches and unmatched rows?

I have been asked a number of times during and after presenting on this topic (SQL Pattern Matching Deep Dive) what is the difference between the various matching options such as EMPTY MATCHES and UNMATCHED ROWS. This is the area that I am going to cover in this particular blog post, which is No 4 in this deep dive series.

When determining the type of output you want font-size: 12px;”> MATCH_RECOGNIZE to return most developers will opt for one of the following:

  • ONE ROW PER MATCH – each match produces one summary row. This is the default.
  • ALL ROWS PER MATCH – a match spanning multiple rows will produce one output row for each row in the match.

The default behaviour for MATCH_RECOGNIZE is to return one summary row for each match. In the majority of use cases this is probably the ideal solution. However, there are also many use cases that require more detailed information to be returned. If you are debugging your MATCH_RECOGNIZE statement then a little more information can help show how the pattern is being matched to your data set. In some cases you may find it useful, or even necessary, to use the extended syntax of the ALL ROWS PER MATCH keywords. There are three sub options:

  • ALL ROWS PER MATCH SHOW EMPTY MATCHES <- note that this is the default
  • ALL ROWS PER MATCH OMIT EMPTY MATCHES
  • ALL ROWS PER MATCH WITH UNMATCHED ROWS

Let’s look at these sub options in more detail but first a quick point of reference: all the examples shown below use the default AFTER MATCH SKIP PAST LAST ROW syntax. More on this later… 

TICKER DATA

Here is part of the ticker data set that we are going to use in this example – if you want to take a look at the full data set then see the example on the LiveSQL site:

 SQL Pattern Matching Deep Dive   Part 4, Empty matches and unmatched rows?

Empty matches

An obvious first question is: what’s the difference between an “empty match” and an “unmatched row”? This is largely determined by the type of quantifier used as part of the pattern definition. By changing the quantifier it is possible to produce the similar result using both sets of keywords.  To help explore the subtleties of these keywords I have simplified the pattern to just look for price decreases and you should note that we are using the * quantifier to indicate that we are looking for zero or more matches of the DOWN pattern. Therefore, if we run the following code:

SELECT
  symbol,
  tstamp,
  price,
  start_tstamp,
  end_tstamp,
  match_num,
  classifier
FROM ticker
MATCH_RECOGNIZE (
  PARTITION BY symbol ORDER BY tstamp
  MEASURES FIRST(down.tstamp) AS start_tstamp,
           LAST(down.tstamp) AS end_tstamp,
           match_number() AS match_num,
           classifier() AS classifier
ALL ROWS PER MATCH SHOW EMPTY MATCHES
  PATTERN (DOWN*)
  DEFINE
    DOWN AS (price <= PREV(price))
)
WHERE symbol = ‘GLOBEX';

We get the following output:

 SQL Pattern Matching Deep Dive   Part 4, Empty matches and unmatched rows?

You can see that the result set contains all 20 rows that make up the data for my symbol “GLOBEX”. Rows 1- 3, 9, and 13-15 are identified as unmatched rows – the classifier returns null. These rows appear because we have defined the search requirements for pattern DOWN as being zero or more occurrences.

Based on this we can state that an empty match is a row that does not map explicitly to a pattern variable (in this case DOWN). However, it is worth noting that an empty match does in fact have a starting row and it is assigned a sequential match number, based on the ordinal position of its starting row. The above situation is largely the result of the specific quantifier that we are using: * (asterisk). Given that the DOWN variable can be matched zero or more times there is the opportunity for an empty match to occur. As the complexity of the PATTERN increases, adding more variables and using different combinations of quantifiers, the probability of getting empty matches decreases but it is something that you need to consider. Why? Because the MATCH_NUMBER() function counts the empty matches and assigns a number to them – as you can see above. Therefore, if we omit the empty matches from the results the MATCH_NUMBER() column no longer contains a contiguous set of numbers:

So that if we run the following code where we have specified “OMIT EMPTY MATCHES”:

SELECT 
  symbol, 
  tstamp,
  price,
  start_tstamp,
  end_tstamp,
  match_num,
  classifier
FROM ticker 
MATCH_RECOGNIZE ( 
  PARTITION BY symbol ORDER BY tstamp 
  MEASURES FIRST(down.tstamp) AS start_tstamp,
           LAST(down.tstamp) AS end_tstamp,
           match_number() AS match_num,
           classifier() AS classifier
ALL ROWS PER MATCH OMIT EMPTY MATCHES
  PATTERN (DOWN*)
  DEFINE 
    DOWN AS (price <= PREV(price))
)
WHERE symbol = ‘GLOBEX';

We get the following output:

 SQL Pattern Matching Deep Dive   Part 4, Empty matches and unmatched rows?

as you can see the MATCH_NUMBER() column starts with match number 4 followed by match 6 followed by match 10. Therefore, you need to be very careful if you decide to test for a specific match number within the MATCH_RECOGNIZE section and/or the result set because you might get caught out if you are expecting a contiguous series of numbers.  

Summary of EMPTY MATCHES

Some patterns permit empty matches such as those using the asterisk quantifier, as shown above. Three mains points to remember when your pattern permits this type of matching:

  1. The value of MATCH_NUMBER() is the sequential match number of the empty match.
  2. Any COUNT is 0.
  3. Any other aggregate, row pattern navigation operation, or ordinary row pattern column reference is null.

The default is always to return empty matches, therefore, it is always a good idea to determine from the start if your pattern is capable of returning an empty match and how you want to manage those rows: include them (SHOW EMPTY MATCHES) or exclude them (OMIT EMPTY MATCHES). Be careful if you are using MATCH_NUMBER() within the DEFINE section as part of a formula because empty matches increment the MATCH_NUMBER() counter.

Reporting unmatched rows?

Always useful to view the complete result set – at least when you are running your code against test data sets. Getting all the input rows into your output is relatively easy because you just need to include the phrase ALL ROWS PER MATCH WITH UNMATCHED ROWS. Other than for testing purposes I can’t think of a good use case for using this in production so make sure you check your code before submit your production-ready code to your DBA.

What about skipping?

Note that if ALL ROWS PER MATCH WITH UNMATCHED ROWS is used with the default skipping behaviour (AFTER MATCH SKIP PAST LAST ROW), then there is exactly one row in the output for every row in the input. This statement will lead us nicely into the next topic in this deep dive series where I will explore SKIPPING. Taking a quick peak into this next topic…obviously there are many different types of skipping behaviours that are permitted when using WITH UNMATCHED ROWS. It does, in fact, become possible for a row to be mapped by more than one match and appear in the row pattern output table multiple times. Unmatched rows will appear in the output only once.

Can a query contain all three types of match?

Now the big question: Can I have a query where it is possible to have both UNMATCHED ROWS and EMPTY MATCHES? Short answer: Yes.

When the PATTERN clause allows empty matches, nothing in the DEFINE clause can stop the empty matches from happening. However, there are special PATTERN symbols that are called anchors. Anchors work in terms of positions rather than rows. They match a position either at the start or end of a partition, or it used together then across the whole partition.

  • ^ matches the position before the first row in the partition
  • $ matches the position after the last row in the partition

Therefore, using these symbols it is possible to create a PATTERN where the keywords SHOW EMPTY MATCHES, OMIT EMPTY MATCHES, and WITH UNMATCHED ROWS all produce different results from the same result set. For example, let’s start with the following:

SELECT symbol, tstamp, price, mnm, nmr, cls
FROM ticker MATCH_RECOGNIZE(
 PARTITION BY symbol
 ORDER BY tstamp
 MEASURES match_number() AS mnm,
          count(*) AS nmr,
          classifier() AS cls
 ALL ROWS PER MATCH SHOW EMPTY MATCHES
 PATTERN ((^A*)|A+)
 DEFINE A AS price > 11)
WHERE symbol = ‘GLOBEX’
ORDER BY 1, 2;

returns the following 5 rows:

 SQL Pattern Matching Deep Dive   Part 4, Empty matches and unmatched rows?

this shows row 1 as an empty match for the pattern A* because we are matching from the start of the partition. This sets the MATCH_NUMBER() counter to 1. After the empty match the state moves to the pattern A+ for the remainder of the rows. The first match for this pattern starts at row 2 and completes at row 4. The final match in our data set is found at the row containing 15-APR-11. Therefore, if we omit the empty match at row 1 we only get 4 rows returned as shown here:

SELECT symbol, tstamp, price, mnm, nmr, cls
FROM ticker MATCH_RECOGNIZE(
 PARTITION BY symbol
 ORDER BY tstamp
 MEASURES match_number() AS mnm, 
          count(*) AS nmr, 
          classifier() AS cls
 ALL ROWS PER MATCH OMIT EMPTY MATCHES
 PATTERN ((^A*)|A+)
 DEFINE A AS price > 11)
WHERE symbol = ‘GLOBEX’
ORDER BY 1, 2;

returns the following 4 rows:

 SQL Pattern Matching Deep Dive   Part 4, Empty matches and unmatched rows?

Now if we use the last iteration of this example the MATCH_RECOGNIZE statement returns all the rows from the input data. The actual “unmatched rows” are identified as having a NULL match number and NULL classifier. The “empty matches” are identified as having a NULL classifier and in this example the COUNT(*) function returns zero.

SELECT symbol, tstamp, price, mnm, nmr, cls
FROM ticker MATCH_RECOGNIZE(
 PARTITION BY symbol
 ORDER BY tstamp
 MEASURES match_number() AS mnm, 
          count(*) AS nmr, 
          classifier() AS cls
 ALL ROWS PER MATCH WITH UNMATCHED ROWS
 PATTERN ((^A*)|A+)
 DEFINE A AS price > 11)
WHERE symbol = ‘GLOBEX’
ORDER BY 1, 2;

returns all 20 rows from our data set:

 SQL Pattern Matching Deep Dive   Part 4, Empty matches and unmatched rows?

LiveSQL

I have taken all the code and the associated explanations and created a tutorial on LiveSQL so you can try out the code for yourself: https://livesql.oracle.com/apex/livesql/file/tutorial_DZO3CVNYA7IYFU1V8H0PWHPYN.html.

Summary

I hope this helps to explain how the various output keywords that are part of the ALL ROWS PER MATCH syntax can affect the results you get back. You should now understand why your results contains match_number values that are not contiguous and why classifier can return a NULL value along with specific aggregate functions. I expect the hardest concept to understand is the idea of empty matches. As I stated earlier it is always a good idea to determine from the start if your pattern is capable of returning an empty match: are you using an asterisk * within the PATTERN clause? Then you can determine how you want to manage those rows: include the empty matches (SHOW EMPTY MATCHES) or exclude them (OMIT EMPTY MATCHES). Be careful if you are using MATCH_NUMBER() within the DEFINE section as part of a formula because empty matches increment the MATCH_NUMBER() counter.

What should be immediately obvious is that in all the examples I have used the default skip behaviour: AFTER MATCH SKIP PAST LAST ROW. In the next post I will explore the various skip keywords and how they can impact the results returned by your MATCH_RECOGNIZE statement.

What’s next?

In the next post in this series I am going to review the keywords that control where we restart searching once a pattern has been found: the keywords SKIP TO. Feel free to contact me if you have an interesting use cases for SQL pattern matching or if you just want some more information. Always happy to help. My email address is keith.laker@oracle.com

Looking for more Information

Use the tag search to see more information about pattern matching or SQL Analytics or Database 12c.

Let’s block ads! (Why?)

The Data Warehouse Insider