PowerPivot is a remarkable tool that allows regular users to create relational data models and calculations within an Excel workbook. It’s a relatively simple program to use, but it does take a little bit of tinkering to overcome the learning curve. However, if you invest your time in PowerPivot, you will discover its power and become a data wizard!
One of the keys to becoming a data wizard is understanding how context works in tabular data models (such as the ones you build in PowerPivot or SSAS). There are three types of context:
1) Filter context
2) Query context
3) Row context
Now, we could spend hours going over the finer details of how each of these three things influences your tabular data model. But since I mainly want to focus on row context, let’s keep it simple for now. Context refers to the data that is available for a calculation to be performed upon. When you filter data down, you’ve changed the context for your calculation.
Filter context is made when you, the person writing the DAX, code an explicit filter in a measure or calculated column. For example, if you write FILTER(Geography,Geography[City]= "Seattle") in a calculation, then you know that this calculation will only be dealing with data that is for Seattle.
Query context is created by the user, not by you. Most commonly, it’s because the user has pulled a field into the rows of a Pivot Table, and thus your calculation is being sliced by each value in that Pivot Table. Here is a calculation in my data model, not being filtered or sliced by anything. It has the context of my full dataset:
And here is the same calculation, being sliced by ProductID. The data model is queried 8 times - once per product - each time with the query context of only one specific ProductID:
Row Context: The Fussy Child of Relational Data Modeling
Lastly, there is row context. If you are creating a calculated column in PowerPivot, you will encounter row context. Row context means that the calculation you are writing only has the context of the specific row that it’s in; speaking in general terms, it cannot reference any data that is outside of that row without the proper use of certain functions.
Let’s look at an example: A factory has a number of different tools and machines that need to be inspected regularly in order to prevent issues on the production line. The inspections are made using a scanner that outputs the ID number of the tool/machine, the condition that it’s in, and the date of scanning. Here’s what the raw data looks like, right after it’s imported from the scanner:
The supervisor of this factory needs a view that shows him a list of all the products with their latest inspection date and their latest condition. So how do we go about this? Getting the latest inspection date is easy, using query context. The formula for the calculation is =MAX([Date]). When the user pulls ProductID into the rows of their Pivot Table, the calculation works:
But how do we return the latest condition? That’s tougher.
The first task is to create a unique key for each instance of a product being scanned by concatenating the ProductID and the date. In database terms, this is a compound primary key. This was done with the formula =VALUE(CONCATENATE(ScanHistory[ProductID],FORMAT(ScanHistory[Date],"yyMMddHHmmss"))).
This formula uses row context. When I used CONCATENATE(), I referenced [ProductID] and [Date]. Those are two columns, each of which contains many values. But because a calculated column is calculated in row context, each column name only returned one value – the value in that specific row.
The next column is a tricky one. I want to show the Latest ScanID for the product in each row. I want to tell PowerPivot that this calculated column should, for each row, “filter the ScanHistory table down so that the only ProductID showing is the ProductID of the row we’re currently in, then return the max ScanID from those filtered results.” My first thought is to write something like =MAXX(FILTER(ScanHistory,ScanHistory[ProductID]=ScanHistory[ProductID]),[ScanID]), but that doesn’t make sense. How do I distinguish the [ProductID] of this row compared to the [ProductID] of the dataset I’m trying to filter?
Loops, Layers, and EARLIER()
Let’s back up for a second. The calculated column I’m trying to make is a nested loop. The outermost layer of this nesting is the MAXX() function, and the inner layer is FILTER(). The loop works like this:
- Top layer: MAXX(). The MAXX() function is trying to find the maximum [ScanID] of the table we’re calculating inside of it using FILTER(). This layer has row context – it is an aggregation function that does not reference any outside values.
- Inner layer: FILTER(). This FILTER() is nested inside MAXX(), and is calculating a result set from which MAXX() will pull the maximum [ScanID].
The loop goes like this: pick one row, calculate the inner layer, calculate the outer layer, move on to the next row, repeat. EARLIER() is a function that allows you to reference values in the different layers of a nested loop. Specifically, it references the next layer up. This function only works when the layer above has row context (i.e. is not scanning the entire table as FILTER() does).
I can apply EARLIER() to my attempted DAX formula above: =MAXX(FILTER(ScanHistory,ScanHistory[ProductID]=EARLIER(ScanHistory[ProductID])),[ScanID]). When I use EARLIER() in the FILTER() statement, I’m referencing the layer above FILTER(), which has the row context. In this way, I’m finally able to distinguish the [ProductID] of this row compared to the [ProductID] of the dataset I’m trying to filter.
Finally, I use the same concept to retrieve the latest condition for each product:
Now, I can pull LatestCondition into the rows of the Pivot Table and the measure called Latest Inspection (that I created earlier) into the values of the Pivot Table:
Ta-da! Good luck with your row context, nested loops, and use of EARLIER()!
By: Brian Pohl
*The above image of Hagrid comes from Harry Potter and the Sorcerer’s Stone. The rights to Hagrid and all things Harry Potter-related are owned by Warner Bros and J.K. Rowling. Not at all by us.
Posted by admin