PowerPivot is an incredible, under-appreciated tool for data analysis. But using such a robust program can be intimidating at first. This article will cover some basic DAX functions to get you started on the way to becoming a PowerPivot legend: RELATED, IF, and CALCULATE.
Before proceeding, let’s start with the basics. You may want to brush up on normalization – the set of rules for making databases most efficient. Also, here’s our tutorial for getting started in PowerPivot. Once you’ve got your data imported and tables joined, we’re ready to begin. Note that, when creating a relationship between two tables using a particular field, that field cannot have duplicate values in at least one of the tables. Or, in database terms, it must serve as a primary key before a foreign key can be linked to it.
Ok, you’ve got your tables imported and your relationships established. It’s time to make your calculations.
This is the DAX equivalent of Excel’s VLOOKUP function. I have a PowerPivot data model that keeps track of all of the blog posts on Decisive Data’s website. One of my tables is Topics, in which I assign one or more topics to each blog post. This table currently has the fields BlogID (which is linked to the Blogs table) and Topic.
To pull the blog name into this table, I click in the blank cell under “Add Column” in my Topics table, and type =RELATED(Blogs[Title]). Blogs is the foreign table I’m referencing, and [Title] is the name of the column. Because I already created a relationship between these tables based on BlogID, PowerPivot can do the rest:
IF is the fundamental building block of logic, and is present in just about every programming language you can find. The format for IF in DAX is =IF(LogicalTest,ResultIfTrue,ResultIfFalse). Let’s use an IF statement to determine if an Employee is still working at Decisive Data or not. Below is my Employees table:
Notice that only David Brown has an End Date – the rest of us are still working at Decisive Data. I can use this in my IF statement. I will create a new column, just as I did with RELATED above, and type =IF(Employees[End Date]=BLANK(),"Yes","No"). In plain English, it would say “If the End Date column is blank, then show Yes, otherwise show No.”
Ta-da! IF is pretty straightforward to use in DAX, except for when you have more than 2 possible outcomes. Most languages have a structure that allows multiple conditions, like this:
IF condition1 THEN result1 Else If condition2 THEN result2 ELSE result3.
DAX does not have this; you have to nest your IF statements instead. An IF statement like the one above would look like this in DAX: =IF(condition1,result1,IF(condition2,result2,result3)).
I want to find the earliest Start Date of the current employees at Decisive Data, so I’m going to use the Employees table again.
I will start by creating a calculation for this table (remember from the tutorial, calculations are added in the calculation workspace at the bottom of the screen) that says EarliestStart:=MIN(Employees[Start Date]). The format here is CalcName:=MIN(Table[Column]). This will return the earliest start date out of the entire Employees table; however, I want to filter out people who are not current employees. This is where we can use CALCULATE.
CALCULATE is the function used to apply filters to calculations. Its format is =CALCULATE(Calculation,FilterExpression1,FilterExpression2,etc.), and you can add unlimited FilterExpressions or none at all.
My final DAX calculation will look like this: EarliestCurrentEmpStart:=CALCULATE([EarliestStart],Employees[Current Employee]= "Yes"). I already had EarliestStart saved as a calculation from earlier, so I was able to reference it here. However, if I hadn’t created it earlier, I could have put the MIN statement inside of my CALCULATE statement. It would look like this: EarliestCurrentEmpStart:=CALCULATE(MIN(Employees[Start Date]),Employees[Current Employee]= "Yes"). A little less neat, but still perfectly functional.
These three functions are a solid foundation for your future in PowerPivot. One last thought before you go running off to build some data models: if you want to calculate a sum of denormalized data, then I’d recommend this blog post from our resident PowerPivot expert, Tom Escobar. For example, let’s say you have two rows for the same sale in your sales transaction table, but you don’t want the revenue to be counted twice. Your data is denormalized, and this solution will help you get an accurate revenue figure.
Looking to learn more about PowerPivot? Want to see what kinds of wizardry a professional PowerPivot user is capable of? Go check out PowerPivotPro!
Posted by Brian Pohl