The importance of relationships in SQL Server 2012 tabular models
An essential feature in BISM (Business Intelligence Semantic Model) tabular data models, the new BI development paradigm built in SQL Server 2012, are the relationships between tables of a particular model, because without them it would be impossible perform a proper analysis of the information contained therein.
Let's say we have a database named AdvWksDW, created from some of the tables belonging to AdventureWorksDW2012 database, but without the corresponding relations between them. The following code block shows the SQL statements necessary for its creation.
CREATE DATABASE AdvWksDW
SELECT * FROM INTO FactInternetSales AdventureWorksDW2012. Dbo.FactInternetSales
SELECT * FROM INTO DimDate AdventureWorksDW2012. Dbo.DimDate
For those readers familiar working with multidimensional analysis models, if we were to build an OLAP cube using this database, the FactInternetSales table represents the fact table because it contains the columns from which we obtain measures or analyze numerical results, while DimDate table represents a dimension table, used to filter the information by the dates in the model.
Then we'll create a tabular analysis project with SQL Server Data Tools (SSDT), the development environment for SQL Server 2012 based on Visual Studio 2010, known in earlier versions as Business Intelligence Development Studio.
In the data model of this project we will include the tables in the database just created, adding also a measure that sums the values in column SalesAmount.
To analyze this model from Excel we will select the menu option SSDT "Model | Analyze in Excel", or we will click on the button on the toolbar that serves the same purpose.
This action will result in opening a new spreadsheet showing the data model in a pivot table which we'll place the measure "Sum of SalesAmount", from FactInternetSales table, into the Values block, and the CalendarYear field from DimDate table in block "Row Labels".
The result we would need to obtain from this pivot table is the sum of SalesAmount field, that belongs to FactInternetSales table, grouped by sale years, that is, for the year in OrderDateKey field. Since we have DimDate table as catalog dates, if we combine both tables in a SQL query using the appropriate date fields, we'll get the expected numbers.
However, the result obtained in the pivot table is very different, because all the cells show the same value: the sum total of SalesAmount field.This is due to the lack of a relationship between fields and DateKey OrderDateKey from FactInternetSales and DimDate tables respectively, which causes tabular model engine don't know how to apply the measure "Sum of SalesAmount".
Relationships in the tabular model
In scenarios like the one just described is where we realize the importance that proper establishment of relations between tables in our model, in order to achieve optimal data analysis.
A tabular model gets, if any, the relationships directly from the data source during execution of the data import wizard.The relationships obtained in this way will name automatic relationships.
If the model has no relationships or we need additional relationships to existing ones, we can create them using the management relationships window available for this purpose in the development environment of Visual Studio 2010; we'll call this type manual relationships.
In our current situation, for the information in model is consistent, we need, as noted above, create a relationship between FactInternetSales and DimDate tables, using OrderDateKey and DateKey fields; so we'll select the menu opion "Table | Create Relationships" which opens the window where we will make this operation by selecting tables and fields members of the relationship.
After relationship creation, we'll back to the pivot table in Excel that we are using to analyze the model, and then clicking on the Refresh option (Data group in Options tab, contained in the top-level tab "PivotTable Tools") a new data read will take place, refreshing the pivot table content, and making sales figures by year show correctly.
The state of the relationship. Active and inactive relationships
In a data model may happen that a table where we have defined measures, several of his fields relate to a single field in another table that we will use to filter data from the first.When this happens, how can we know what field into measures table is relating to the filter table when we analyze the model for both tables? For those readers experienced OLAP cubes development, the question to ask is: what field of the fact table is relating to the dimension table?
The answer to this question is found in the relationship status (active or inactive), and the best way to illustrate this concept is through an example.Therefore, we create a new tabular project in SSDT, whose data source is this time the AdventureWorksDW2012 database, importing into the model, as in the previous example, FactInternetSales and DimDate tables, and also defining a measure with the sum of SalesAmount column.
When we analyze the model in Excel, placing the field CalendarYear in the row labels, the resulting numbers correspond to sales by invoice date.
That will be the same figures we obtained from the SQL query presented in a previous section.
At this point is where the reader will surely wonder: "How does the model I want to get the sum of SalesAmount field based on OrderDateKey field, and not on DueDateKey or ShipDateKey, which are also type date? "
Actually, the model does not know, but merely it uses the active relationship of the model. Let us explain this in more detail:
If we look at the designer's model diagram view, we'll see that from the table FactInternetSales three relationships go to DimDate table.Between the arrows indicating visually the relationships, the one with the continuous line trace corresponds to the active relationship, and therefore, will be used by default when you see the model using the field CalendarYear from DimDate table.Clicking on this relationship, we'll see its detail in the Visual Studio properties window . Note that the Active property is True.
We can also find out the working relationship, and therefore inactive by selecting in Visual Studio the option menu "Table | Manage Relationships", which will open the relationship management window, where the Active column report us this particular aspect.
Querying the model through a inactive relationship
The issue to ask now would be how to question the model, so that it provide us the sum of SalesAmount field, but using any of the other two date fields (ShipDateKey or DueDateKey) that are also related to the DateKey field in DimDate table. In other words, what we need is to alter the context of the currently active default filter.
The solution is to create a new measure, which will also sum the field SalesAmount, but somehow indicating that the grouping field will be one other than the default.For example, if we want the sales results through ShipDateKey field, we'll use the following DAXexpression:
VentasPorFechaEnvio: = CALCULATE (SUM ([SalesAmount]); USERELATIONSHIP (FactInternetSales [ShipDateKey] DimDate [DateKey]))
To add this measure to the data model, within the area reserved for the measures we will click on an empty cell below the SalesAmount column, and write the previous expression in the formulas / expressions panel.
Consider this expression by parts: we use the SUM() function since our main goal is to add the SalesAmount field, but when grouping the sum, we want use the ShipDateKey field through its relationship with the DateKey field of DimDate table, so we must explicitly designate that relationship because isn't active, and this is what we do with USERELATIONSHIP() function.But because of using a relationship that is not active, we need to change the current filter context, what we get using the CALCULATE() function.
As we did in previous examples, if we transfer this to a SQL query to check that the values obtained are the same, the sentence would be as illustrated below.
In this paper we have made an introduction to the relationships in tabular models of the new semantic model for Business Intelligence built in SQL Server 2012 2012.This feature represents a cornerstone in building analysis systems using this technology, so we hope this article will help the reader to start in developing their own analysis models using BISM. The worked examples throughout the article are available at this link.