How to Architect Your Power BI Solution
The best place to start when learning to develop Power BI solutions is to review the logical architecture your Data Visualization projects will have. There are many moving parts and pieces to a Power BI solution, especially when looking at the varied data sources you can use. From on-premises to web-based, from public to confidential, you must make sure you expose your data securely. Having a good clear architecture will help you achieve a not only a good solution but also a secure one. The following article will show you how you can architect your own solution.
Architecture
Without getting too deep into the physical architecture a quick review of the various logical architecture components are pictured below. The sections are organized by the numbers referenced in the screen capture. This architecture covers at a high level both the online PowerBI.com (SaaS) and on-premises Power BI Report Server.
Actors / Personas (1) & (2)
There are two groups of actors I classify in Power BI, Developers and Users.
- Developers & Designers – The first group is those that create content. They can create data sets or visualizations. They provide the role of developers and designers of the solution. This group crosses departments and roles. They may not work on Power Bi fulltime but should have a role in your solution.
You cannot create a visualization without data and the DBA and data team needs to be involved not only in granting of access to data, but also to assist in creating views onto the data.
It is important that the data is as close to presentation ready as close to the data source as possible. This is my most important recommendation about data used in a solution. Leverage the power of all that money tied up in the database servers. If 100 people use the same data for reports, having one set of views will save a great deal of time, not only in development but also in auditing the final results. It also saves the 100 people from performing the same transformations and data modifications.
2. User Community – This group of people is the ones that will be the final consumers of the solution. For some organizations, there may be overlap, but these are divided up by the minimum roles required.
Admin / Audit / QA – users who maintain the final systems including security, testing, and approvals. In addition to Quality Assurance (QA), I have split out Audit as a specific group. These are the data owners or a group that is tasked with making sure the data on the final reports balance with the source systems. (This should be job 1)
Included in the QA user role should be the balancing and validation of calculations and formulas used in the solution. Auditing should also be proactive with balance reports to capture breaking changes that may occur during the life of your project. Trust in your solution is an important part of your reputation. If you have incorrect data, it is very easy for your users to lose faith in not only the report but also the team or people producing the analysis.
Analysts & Power Users – These are the core users of your solution. Their demands are different than most and should represent your most vocal source of feedback and upgrade suggestions. You will identify these users early in your solution as they will be the main consumers of your work. They will use the data sets created in Power BI and may also create the dashboards and reports themselves which will build upon your solution
End Users – These are simply the consumers of the reports and dashboards. Sometimes are the executive teams who may have different needs than power users and analysts. Needs such as being able to drill down to detail or are looking for proactive analysis and suggestions based on the data. They will be the source for your Guided Analytics or stories that they will be built from the data. For example, if one metric is showing an issue, then drill to another report with different detail on the member selected which could lead to another report with different analysis. Get to know how your users will use the data, what their questions are and how they developing solutions to issues brought forward from the data.
Data Sources (3) & (4)
Data sources in this architecture are divided into two groups, those in the cloud and those on-premises. Depending on your organization’s policies, you may have to design your solution to operate on data within your firewalls. In following sections, discussions around the Power BI gateways will review how to provide secure access to local data though pushing updates to Power BI. This becomes part of a solution around using the Power BI cloud service to host your dashboard and reports as they need access to the source data to reflect updates.
3. Internet-Based Data Sources – These can take several forms; structured data such as Azure Databases, Excel files in SharePoint or OneDrive, Software as a Service (SaaS) such as Google Analytics or even just web pages scraped for their data. The key differentiator of these sources is that they do not need to access data through your firewalls to your local, on-premises systems.
4. On-Premises Based Data Sources – These sources are within your network. If you are using the Power BI Report Server, which is a server located on your on-premises network, you do not have to worry about your data being housed in the Cloud.
If you are using PowerBI.com you can still use your on-premises data sources using a Power BI Gateway which is referenced below. This is important if you plan to use the Power BI online SaaS destination for your reports and dashboards as the update services need access to process updates if required.
Active Directory Security and Access (5) & (6)
Being able to maintain the security of your data and still use it in the cloud requires the same care as you have in your on-premises data. Azure can be just as secure, if not more secure, than your local data. When using Power BI SaaS and Power BI Report Server access is controlled by Active Directory.
5. Active Directory Domain Services (On-Premises)
Active Directory Domain Services is the foundation for security and access to the objects on your local network. Through users and groups, it provides a broad range of hierarchical style identity and security-related services. For Power BI Report Server, for example, you would create groups that would contain users to provide the correct level of rights to each user based on the role that they would have in your solution. Security roles such as;
- Administrators
- Report Developers
- Data Architecture team members
- Quality Control team
- Readers
- Editors
- Data Source service accounts
- Project Lead
- Read Only users
- Business Analysts
- Designers
- ETL Team
6. Azure Active Directory – This is the cloud version of Active Directory (AD). This allows for object security and identity management in a central location with centralized policies and rules. This mirrors the functionality provided by an on-premises AD. You would use this to provide access to your Power BI reports and dashboards.
Integrating your internal Active directory with Azure Active Directory can be accomplished with Azure Directory Federation Services (ADFS). Azure Active Directory (Azure AD) Connect will allow you to federate with on-premises Active Directory Federation Services (AD FS) and Azure AD. Once you have the federation sign-in, you can sign in to Azure AD-based services with your on-premises passwords with the benefit of not having to enter your passwords again.
Power BI Gateways (7)
7. Power BI Gateway – This allows keeping the dashboards created using on-premises data sources, deployed to the PowerBI.com service, updated using either a refresh schedule or on demand by a data steward. There are a couple of gateways depending on the solution you need. The Power BI gateways are installed on a server or even a desktop machine in your local network. They provide outgoing data feeds to Power BI. Some ports need to be open for outgoing communication only. The following table reviews the current features of each version.
On-premises data gateway | On-premises data gateway (personal mode) | |
Cloud services it works with | Power BI, PowerApps, Azure Logic Apps, Microsoft Flow | Power BI |
Serves multiple users with access control per data source | X | |
Runs as an app for users who aren’t administrators on the computer | X | |
Runs as a single user with your credentials | X | |
Import data and set up scheduled refresh | X | X |
Support for DirectQuery to SQL Server, Oracle, Teradata | X | |
Support for a live connection to Analysis Services | X |
Power BI Reporting Server
8. Power BI Reporting Server – This provides a combined on-premises solution for self-service and enterprise reporting. Power BI Report Server is software installed on-premises that is part of the Power BI Premium offering. The easiest way to look at this software is that it is SQL Server Reporting Services (SSRS) with the ability to host and serve up Power BI reports and datasets. The one thing that this does not have is the ability to host Power BI Dashboards. You can create dashboards in SSRS, but cannot use Power BI dashboards. The functionality of Power BI is the same, only that there is a separate version of Power BI Desktop that publishes to the Power BI Report Server URL instead of the PowerBI.com service.
Power BI Service (SaaS)
9. Power BI Service – This is the Power BI service on the web. This will house the datasets, reports, and dashboards you produce using Power BI Desktop. You can also schedule periodic refreshes of the datasets using the Schedule Refresh options.
The recommended way to share and distribute your Power BI solution is by using Power BI Applications. These can be secured using Azure Active directory and allow you greater flexibility to have team development of your Power BI solutions.
Conclusion
In order to design a successful Power BI solution, you need to address all the architectural elements in your solution. Having a good overview will allow you to avoid problems further down the road.