In this blog post, we will look at Dataflows, a new Power BI functionality similar to capabilities that already exist in services such as Office 365. We must highlight that this new service is still in the Beta stage, so it is currently subject to modifications and updates.
Dataflow is a new Power BI service functionality for self-service ETL processes that allows data analysts and/or business users to build reusable data entities using Power Query and to share them.
In contrast with Power BI datasets which, at the end of the day, are merely Analysis Services tabular models, Dataflows are stored as entities in the Common Data Model (CDM) format in Azure Data Lake Storage Gen2 (ADLS Gen2). Both the Datasets (Analysis Services) and the Dataflows (CDM) are characterized by an underlying technology that is opaque to users.
Dataflow allows users who are familiar with datasets, reports, and Power BI queries to create reusable data building blocks themselves. These can later be used as data sources for their datasets.
Before going deeper into the matter, we must be aware that Dataflow has certain limitations depending on the Power BI license. In order to use this functionality, you must be at least a Pro user. You cannot benefit from using Dataflow with a free standard license.
So, depending on whether you have a Premium or Pro license, you will have access to more or fewer functionalities within Dataflow:
| Functionality | Pro License | Premium License |
|---|---|---|
| Total storage | 10 GB/user | 100 TB/node |
| Dataflow authoring with Power Query Online | Yes | Yes |
| Dataflow management within Power BI | Yes | Yes |
| Dataflows data connector in Power BI Desktop | Yes | Yes |
| Integration with Azure | Yes | Yes |
| Computed entities (in-storage transformations via M) | No | Yes |
| Dataflow incremental refresh | No | Yes |
| Running on Power BI Premium capacity / parallel execution of transforms | No | Yes |
| Dataflow linked entities | No | Yes |
| Standardized schema / built-in support for the Common Data Model | Yes | Yes |
Table 1 Information obtained from Microsoft’s official website (docs.microsoft.com)
Creating and using Dataflows: an example
Next, we will provide a detailed step-by-step guide on how to create and use Dataflows with a Power BI Pro license.
First of all, go to the app.powerbi.com service. Once there, create a Workspace in which you can create your Dataflows. Before you can start using Dataflow, you will be required to download and set up the Data Gateway in order to connect to your data source.
Once you have created your Workspace and your Data Gateway is set up pointing to your data source, go to your new Workspace and create your Dataflow.
A window containing the Dataflow editor will immediately pop up. Here you must first add a new entity that will hold your data. Next, you will be asked to select a data source; in this case, it will be SQL Server.
Next, you will need to set up your connection to this data source by selecting the Gateway you previously downloaded, installed, and configured.
Once the connection to your data source is created, you will be able to import the desired tables using Power Query:
In the next window, you will find a query editor very similar to the one in Power BI Desktop. Here you can modify your source data using M code or basic transformations. A question mark will be visible at the top left of each table icon; it disappears once you click on each table, which prompts the first data refresh. Once you have completed the required modifications in the editor, simply click on “Done” to save your first Dataflow. It can be edited as often as required.
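As an illustration, the M code behind an entity that pulls data from SQL Server and applies a basic transformation might look like the following sketch (the server, database, and table names here are hypothetical placeholders):

```
let
    // Connect to the SQL Server source (hypothetical server and database)
    Source = Sql.Database("myserver.mydomain.com", "SalesDB"),
    // Navigate to the table that will become the entity
    Customers = Source{[Schema = "dbo", Item = "Customers"]}[Data],
    // Basic transformation: keep only the columns we need
    SelectedColumns = Table.SelectColumns(Customers, {"CustomerID", "Name", "Country"})
in
    SelectedColumns
```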
You will be required to refresh your data before connecting to the Dataflow from Power BI.
In addition to this, and in the same manner as with the rest of the Power BI service elements, Dataflows can also be set to carry out scheduled data refreshes. Since we are using a Pro license, and as mentioned in the section on limitations above, we are allowed to refresh the data “only” 8 times a day.
Finally, you will not only be able to connect to your Dataflow from Power BI Desktop, but also to work with it like any other data source: you can establish the required relationships, create DAX measures, edit the queries using M code or basic transformations, etc.
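When you connect from Power BI Desktop using the Dataflows connector, the query it generates follows roughly this navigation pattern (the workspace/dataflow IDs and entity name below are placeholders, not real values):

```
let
    // Entry point of the Power BI Dataflows connector
    Source = PowerBI.Dataflows(null),
    // Navigate to the workspace, then the dataflow, then the entity
    Workspace = Source{[workspaceId = "00000000-0000-0000-0000-000000000000"]}[Data],
    Dataflow = Workspace{[dataflowId = "00000000-0000-0000-0000-000000000000"]}[Data],
    Entity = Dataflow{[entity = "Customers"]}[Data]
in
    Entity
```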
We must also highlight that, since this is a Beta, there are some small faults. Among these, for example, is the fact that Power BI automatically creates a series of relationships when importing your Dataflow tables; there is no guarantee that these relationships are based on the correct fields, and not all tables end up linked to one another. For this reason, we recommend that users check them before they start working on the model.
Any technical information used in this blog was obtained from Microsoft’s official documentation (docs.microsoft.com).