How to Prevent the Deletion of Datasets in Power BI

Note - This guide is being published in Oct. 2020. This month Microsoft began rolling out the "New Look" to the Power BI service. We have enabled the new look and the features underlying so any screenshots in our guide are based on the new look. Also to note, we are using SSAS tabular with a data gateway and a live connection from the Power BI service.

Update May 2023 - As of May 2023 the protected dataset is still the best approach to prevent the deletion of datasets (and associated reports) in your Power BI tenant. Store your datasets in a workspace that only admins have access to. Keep all master reports in your GIT repository. Share the protected dataset from the admin workspace to user groups in your organization. Hard to believe this is still an issue, but it is.

Overview

Datasets are what we use to build reports in the Power BI service. A dataset will reference a data source that has been configured in the Power BI service. We are using SSAS Tabular to build models that can be consumed by Power BI, we publish these models and then access them through a data gateway. We create the data sources on the data gateways in the Power BI Service and then link the data sets to the data sources.

All reports you build in the Power BI service are going to reference a dataset. If a single dataset has 100 reports built off of that data and somebody comes in an accidentally deletes that dataset, all the reports that are tied to that dataset are also deleted. Power BI is a self-service product and you want your teams to be able to build their own reports. In order for teams of a Power BI workspace to build their own reports you need to grant those users contributor permissions on their workspace. This permission also allows those users the ability to delete datasets and that is where you can run in to trouble.

For some reason Microsoft has not done a great job with options to protect content in Power BI. There is no concept as a recycle bin in the Power BI service, when it's gone it's gone. The security roles for workspaces are too broad, the main roles are admin, contributor, and viewer (we'll get more in to this in a minute) and most users will be in the contributor role if you want them to create their own reports (we do). The contributor role is over-permissioned unfortunately so we have to take extra steps to protect our content in Power BI.

That is the purpose of this guide, to you walk you through steps on how to protect content in the Power BI service and reduce the risk of a dataset (and all of the content that depends on it) from being deleted while at the same time enabling your teams to use self-service reporting and workspace management.

Dataset Basics

  • Datasets depend on data sources. We can define a data source on a data gateway but to create a dataset, we first need to create a report in the Power Bi desktop. In our environment our Power BI desktop report will use a live connection to an SSAS Tabular database/model. When we publish this report to the Power BI service it will create a dataset. We link the dataset to the data source.

  • When you delete a dataset in a workspace, all reports that are built from that dataset in that same workspace will be deleted along with it. If a report was created in another workspace that referenced the deleted dataset, that report will become orphaned and no longer work.

  • Datasets are based on GUIDs and not names. If you delete a dataset and try to republish it again with the same name, as far as Power BI is concerned that is a new dataset and no existing content can be linked to it. Since datasets are based on GUIDs and not names, it is possible to have more than one dataset with the same name so this could cause some confusion in a workspace.

  • A dataset inherits permissions from the workspace it resides in, but it can also be granted additional permissions beyond the workspace permissions by modifying the dataset permissions (we will explain this below).

Workspace Configuration

In order to protect our Power BI datasets we need to store them in a workspace that only your BI developers have access to. From there we will grant build permissions to users on the datasets we publish in this workspace but they won't have access to delete that dataset. Using this along with version control (DevOps, BitBucket, etc.) is the best way to secure and protect content in your environment.

Note - what we are doing here assumes you and your users are using the Power BI environment in the Professional or Premium capacity. You will need this type of licensing to share content across the organization.

In the Power BI Service, create a new workspace named BI Admin. For the members, start by granting admin rights to your BI Developer admin(s). These are the people who will publish, modify, and maintain all the content in this workspace. In the screen shot below you can see we created the new workspace and permissioned myself as admin to the workspace, but there is no content in the workspace yet. While we are at we are also going to add a single user who only has view permissions in the workspace. This is the user we will delegate build permissions to on the dataset.

PBI_Dataset1.jpg

Publish Dataset

Now we are ready to publish our first dataset to the secure workspace.  Remember, we are using an SSAS tabular model with a live connection, your environment might be different but the securing process is still the same.

Open Power BI desktop and make a connection to your data source, for us that is an SSAS tabular model.

PBI_Dataset2.jpg

When you connect to the SSAS tabular model your will see all the tables in the model populate in the report.

PBI_Dataset3.jpg

On the report canvas there is no need to create a visualization, what you should do instead is to make a note about the report and the dataset, something that tells you what it is and what team it is for. For example, if this dataset is for the "Product Team" I might add something like you see in the screen shot below.

PBI_Dataset4.jpg

When you publish your first report for a particular dataset to the Power BI service it will also publish the dataset that report is connected to. In the case of our report here it is connected to the SSAS tabular model for the Adventure Works database. It's important to keep in mind that the dataset in the Power BI service will have the same name as the report it references. You do not want to rename the dataset after it is published, instead give your template report a name you want reflected in the dataset, something like "Product Support Dataset".

Now we are ready to publish this to the Power BI service. Click on the Publish button on the home tab, select the target workspace, in this case the BI Admin workspace, and click the Select button.

PBI_Dataset5.jpg

You will see the following message when the report and dataset have been published.

PBI_Dataset6.jpg

Now let's open up the Power BI workspace we created earlier, you should now see your report and your dataset.

PBI_Dataset7.jpg

I understand the naming around the report doesn't make much sense, but we don't want to rename the dataset after it is published so we give the report a name that best suits the dataset. After we have published the dataset next we need to link it to a data source. Click on the Manage Gateway option in the settings gear. When you are in the Gateway settings, click Add a Data Source and add the model of the SSAS tabular model your dataset is based on. You will need all the details of the SSAS server you are connecting to, this also assumes you already have the data gateway up and running.

PBI_Dataset8.jpg

Once the data source has been added we can go back to settings of the dataset. Expand the Gateway connection section and for the target database, select the data source you just added in the previous step. As you can see in the image below, we named the target data source AdventureWorks.

PBI_Dataset9.jpg

If you recall earlier, we setup a user on the workspace that has only view permissions. That user is a member of the Product Support team and wants to use the new dataset we just published to build reports from. We want to make sure the user can't delete the underlying dataset thus deleting any reports tied to that dataset. So what we are going to do is give the user build permissions on the dataset directly, this will allow them to build reports off of the dataset but will not let them modify the dataset. If this user was a contributor to the workspace they would be able to delete the dataset, that is what we are trying to prevent.

By default the dataset will inherit permissions from the workspace, but we can grant additional permissions on the dataset. Click on the Manage Permissions option of the dataset.

PBI_Dataset10.jpg

Now we can grant the build permission to our view only user, this will allow that user to select this dataset and build reports from it while working in their own workspace. It's basically sharing the dataset without elevating the users rights to the dataset beyond exactly what they need.

PBI_Dataset11.jpg

If our user Kevin wants to build a report from this new dataset, in Kevin's team workspace he can click New>Report. From there he can view all datasets he is permissioned to and can select the Product Support dataset we just published and granted build permissions on.

PBI_Dataset12.jpg

Now we need to select the Product Support dataset and click create. This will effectively create a linked reference to the dataset allowing the user to create a new report in their workspace.

PBI_Dataset13.jpg

When they save the report they won't even see the dataset in their workspace much less be able to delete the master copy of it. This is how we ensure that our users have permissions to create and modify their own content without running the risk of errantly deleting a dataset that then wipes out reports for the rest of the team.

Also, now when we want to add changes to the underlying data model in SSAS all we have to do is update the model, refresh the report in Power BI desktop, and republish the report back to the Power BI service. This will preserve all connections from reports to the dataset while also giving those reports access to the new assets published to the SSAS model.