If you work in analytics or business intelligence, you are probably already familiar with a date dimension, since almost every analysis has a reference to a date.
But although nearly every data model has a date table, there is quite some confusion about the details. Should I use the column from the type date for the relationship or better the date as integer? Which of both is faster? And do I have to mark my date dimension as date table? Why sometimes my functions work without marking it as date table and sometimes they don’t?
For a long time, I didn’t understand completely myself many details until Alberto Ferrari explained the for me missing details at the PASS Community Summit in November 2022.
In this article, I try to break down this complex topic into an easy guide, in order to understand the details of date table relationships.
The problem and scenario for our demo
Whenever you work with a date related data model, at one point you want to use some time intelligence functions. In DAX, time intelligence functions are easy to work with, but you need a proper date table.
A proper date table is required
Microsoft already defined, what a proper date table is. A date table has to fulfil the following requirements:
- It must have a column of data type date (or date/time)—known as the date column.
- The date column must contain unique values.
- The date column must not contain BLANKs.
- The date column must not have any missing dates.
- The date column must span full years. A year isn’t necessarily a calendar year (January-December).
Microsoft also mentions that “the date table must be marked as a date table“, but that is not required, as we will learn today.
The easiest way to create a date table is with DAX. This also has the advantage, that the table will always be up-to-date as it expands as new data is appearing. If you don’t know how to create a date table, take a look at the following article on SQLBI.
Year-to-date function as example for this article
As an example for this article, we will use the year-to-date function. In finance, a year-to-date-function provides a cumulative view of a given measure from the beginning of the year up to the current date. Here is an example of our [Sales] measure and a corresponding year-to-date or YTD measure with the name [Sales YTD] and how it should look like:
Be aware that for every month, the year-to-date function is summarizing all the values from January 1st of the year until the given date.
In DAX you can achieve that result extremely easy, by just using CALCULATE and the function DATESYTD as a CALCULATE modifier. The code for the measure above looks like this:
Sales YTD = CALCULATE( [Sales], DATESYTD( 'Date'[Date] ) )
So far everything is easy and you might ask what exactly is the problem with all of that. The problem is with the same function and the same data model, but with one change in the relationship, your result might not work anymore and will look like this:
As we didn’t change anything at our DAX code, the question is why does it work once and another time it’s not working anymore?
We focus on a single value
In order to understand our examples better, we will focus on one specific value. For the upcoming examples, I will filter down our table to the calendar year 2008 and the month number 04.
The correct values should look like in the marked row:
Should I use Date or Integer for the connection
As you might guess from the topic, the reason why once it’s working and once not, is the relationship between the tables.
You can usually use two kind of columns to connect the dimensional date table to the fact table. You can either use the column with the datatype date or date/time or you can use a column that represents the date as an integer, what I will call DateID in this article.
I think the difference is visible in the following graphic:
In fact, you could also use a string or any random integer as ID. Both will behave like our example with the date as Integer, the DateID. Also, you can use date/time instead of date, these two types have a similar behavior and are exchangeable. But we will not take a look at this cases as it’s less likely in real life and as the behavior is similar to the ones we anyway take a look at.
Date for the relationship
Let’s take a look first at the most easy solution, we will use the Date column for our relationship.
The relationship is straight forward, we connect our two tables with the date columns.
Here is a representation of how it would look from a table view:
And this is how the relationship actually looks:
Our time-intelligence function works as we expect:
It sounds too easy to be true, but if I use the date column, all of my time intelligence functions will work. We will dive a little deeper into the technical details at the end of the article, but keep that in mind for the beginning.
DateID for the relationship
Our second alternative is that we connect the tables with our DateID column.
Again, here is a representation of how it would look from a table view::
And this is how the relationship actually looks:
This time surprisingly our result is wrong! We don’t get the year-to-date value, we only get the value from the sliced and in the filter context available year and month:
Reason is the filter context
You might wonder why we don’t get the correct result. But in order to understand how we can solve it, we first have to understand the reason, why we don’t get the expected result.
For that, let’s take a look at the filter context of the given cell. For this cell, the filter context is the Date[Calendar Year] = “CY 2008” and the Date[Month Number] = 04.
Our calculate modifier DATESYTD adds another filter context to the calculation. It adds the column Date[Date] to the filter context and with that all the dates from January 1st until April 30th of the given year. But as the inner filter context of the year 2008 and month 04 is still there, all the dates from January to March are gone due to the filter context anyway.
The following graphic shows the filter context of the current cell:
This filter context will then be applied on the underlying date table. Here is a simplified representation. As the table will be filtered for Date[Month Number] = 4 and Date[Calendar Year] = “CY 2008”, the filter for Date[Date] has no effect on the months January to March as these dates were already removed from the first two filters:
You can make that visible in DAX Studio in the xmSQL query:
As you can see, the query anyway is filtered to “CY 2008” and the month 04 as shown in part 1 of the screenshot above. This comes from the actual inner filter context. But also the date is filtered to the tuple of year, month and day as shown in part 2. The decimal numbers you see in part 2 are the days themselves. The representation of a date and date/time in the tabular model is a decimal number. The hint “121 total tuples, not all displayed” shows already that there are in total 121 day (31 days in January + 29 days in 2008 in February + 31 days in March + 30 days in April = 121 days).
For that reason, the result only shows the value of April, because all the other values of the year are removed by the filter context.
Solution one, use ALL
As we know now why the correct values are not shown, we can now think about possible solutions.
I think the first one that comes to my mind would be to remove the inner filter context for the DATE table with the ALL function, as we anyway get the correct dates from the DATESYTD function. We change our measure accordingly:
Sales YTD DateID ALL = CALCULATE( [Sales], DATESYTD( 'Date'[Date] ), ALL( 'Date' ) )
And as we expected, the result is now calculated correctly:
If we take a look again at the xmSQL query, the code changed and the inner filter context was removed:
This way, the calculation returns the correct result. But the clear downside is that I manually have to add an ALL-function every time I use a time intelligence function!
But don’t be worried, because here our next approach will be helpful for you.
Solution two, mark as date table
We have now a solution for the DateID-connection, but we have to manually add an ALL statement to every measure. At this point, you might be wondering what is the deal with the “mark as date table” function?
The “mark as date table” function is making your life easier in the case of a integer-connection. With only a few clicks after selecting the date table, you can mark your table as date table as shown here:
Once you’ve done that, the DAX engine is automatically adding an ALL( ‘DateTable’ ) to every query that is using the date column, like in our case the DATESYTD function.
This means the last approach works again with the original measure:
Sales YTD DateID DateTable = CALCULATE( [Sales], DATESYTD( 'Date'[Date] ) )
And as expected, we get the correct result:
And the xmSQL query is absolutely identical to the manual way, as exactly the same happens in the background:
Summary and how to remember
How about the performance?
As you know now the details of the possible relationships, you might be wondering how are the differences in the performance between these two data types.
Also here our friends from SQLBI did already a great job and compared both approaches. The short summary, it doesn’t really matter, date is a little bit faster and integer needs a tiny little bit more space. But as mentioned, the differences are extremely small. I highly recommend reading the article in order to properly understand the differences.
Why did it work with the DATE column?
If you paid attention, you realized I didn’t explain why it worked immediately with the Date column. From a logical understanding, it should behave like the date as integer. And that was also the part that was driving me crazy for quite a while because for me it didn’t make any sense that it works with a date column but not with a date as integer.
What I was not aware is a special behavior when the column of the relationship is from the type date. In this case, the engine is automatically adding an ALL(‘DateTable’) to the query!
So this means if I just use a column from the type date, an ALL(‘DateTable’) is always added to the queries! Like this, you don’t have to deal at all with adding an ALL or mark-as-date-table.
There is an ALL in every solution
As we also understand now the different behaviours, this means for the query it doesn’t matter if you use a date column for the relationship, if you use an integer column and manually add an ALL( ‘DateTable’ ) or if you use an integer column and mark the table as date table. In all 3 cases there will be an ALL( ‘DateTable’ ) added to the query and the same query will be executed.
I also compared the query plan in DAX Studio for all 3 cases and they are completely identical. I added all the query plans to the zip-file at the end of this article, together with the PBIX file I used to demonstrate the specific cases.
Decision tree to remember
As sometimes it’s hard to remember the details, I thought it’s a good idea to add a small decision tree to choose the right solution for every case:
Like this, you can always refer to the steps you have to take in order to make the time intelligence functions work.
Download of all the files
You can download all the files from this article here: