Over time, user workspaces become cluttered and inefficient. This is a common issue encountered by medium and large organizations utilizing the Power BI Service. Frequently, users create reports that remain unused, others become obsolete, and some are only generated for testing purposes but are not subsequently removed. This surplus of data and unnecessary reports can lead to chaos and hinder effective management of the Power BI platform within the organization. Therefore, it is crucial to take actions aimed at organizing, 'cleaning up,' and getting rid of unused reports, as well as archiving workspaces in Power BI. This is the focus of the project I have been working on in recent days. I invite you to read further.
The project aimed to remove rarely used reports from the Power BI Portal, archive them on Azure Blob Storage, and create a table with information about which users have access to the archived reports.
The project is divided into three parts:
Let's begin by considering how to approach the problem. What tools do we have at our disposal, and which ones should we prioritize using initially, and which as a last resort? We certainly need a way to communicate with Power BI Service – we need to find data about reports to delete, assign permissions to workspaces, archive reports, and then remove them from the Portal. The first tools that come to mind are Power BI API and PowerShell cmdlets. It's worth noting that to execute the described script, we will need very powerful permissions – we require read-write access to all workspaces in the organization, including Personal workspaces.
API or cmdlets are always the preferred methods, as they are significantly faster!
Unfortunately, as of the current date in September 2023, neither Power BI API nor cmdlets provide the capability to grant access to Personal Workspaces using these tools. Therefore, we have to implement a workaround – access will be granted "manually" through a web scraping script in Python. This means that the program will navigate to the Admin Portal page in Power BI and automate the process of granting access to thousands of workspaces (Power BI Portal → Settings → Admin Portal → Workspaces → select a workspace → Get Access).
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium import webdriver
# chromedriver settings
options = webdriver.ChromeOptions()
options.add_argument("--start-maximized")
# parameters
# example path to chrome data
options.add_argument(r"--user-data-dir=C:\Users\UserName\AppData\Local\Google\Chrome\User Data")
# directory with access to PowerBI Admin Portal (thanks to that we can skip MFA)
options.add_argument(r'--profile-directory=Profile 1')
# path to chromedriver
service = Service(r"C:\Users\folders\chromedriver.exe")
driver = webdriver.Chrome(service=service, options=options)
url = "https://app.powerbi.com/admin-portal/workspaces?experience=power-bi"
driver.get(url)
checkbox_index = 2
# counter of workspaces on the page
element_number = 0
counter = 0
while True:
checkbox_id = f"mat-checkbox-{checkbox_index}"
# checkbox
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, checkbox_id))).click()
counter += 1
print(counter)
try:
# get access
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'button.mat-button span.mat-button-wrapper span[localize="Get_Access"]'))).click()
checkbox_index += 11
except:
print(f"Cannot click checkbox with ID: {checkbox_id}")
# unclick checkbox
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, checkbox_id))).click()
checkbox_index += 1
element_number += 1
# go to next page
if element_number == 10:
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'button.mat-paginator-navigation-next[aria-label="Next page"]'))).click()
element_number = 0
In this case, the process is much simpler – there is a PowerShell cmdlet designed specifically for this purpose. Before using it, we need to obtain a list of workspace IDs to which we want to grant access – this step can be accomplished through a Power BI API query.
Import-Module -Name Az.Storage
Import-Module -Name MicrosoftPowerBIMgmt
Import-Module -Name ImportExcel
# PowerBI API parameters
$tenantId = ""
$clientId = ""
$clientSecret = "”
$resource = "https://analysis.windows.net/powerbi/api"
$tokenEndpoint = "https://login.microsoftonline.com/$tenantId/oauth2/token"
$adminAccount = "123abc@email.com"
$body = @{
grant_type = "client_credentials"
client_id = $clientId
client_secret = $clientSecret
resource = $resource
}
# generate access token
$accessTokenResponse = Invoke-RestMethod -Uri $tokenEndpoint -Method Post -Body $body
$accessToken = $accessTokenResponse.access_token
# list all workspaces in the organization
$workspaces = Invoke-RestMethod -Uri 'https://api.powerbi.com/v1.0/myorg/admin/groups?$top=10000' -Method Get -Headers @{
Authorization = "Bearer $accessToken"
}
# connect to PowerBI Service
Connect-PowerBIServiceAccount
# iterate through every workspace
$workspaces.value | ForEach-Object {
$workspaceType = $_.type
$workspace = $_.id
$workspace
if ($workspaceType -eq "Workspace") {
# grant contributor access to the workspaces
Add-PowerBIWorkspaceUser -Scope Organization -Id $workspace -UserEmailAddress $adminAccount -AccessRight Contributor
}
There are multiple ways to obtain a list of unused reports. In Astral Forest, we have implemented the Power BI Monitoring tool, which allows us to monitor and identify unused reports in real-time. In this case, exporting the data from the tool to a CSV file should be sufficient.
In case your organization doesn't have access to the mentioned tool, you can try using alternative, less efficient methods to generate the list. An example of such a solution could be query in Power BI API. However, this solution has limitations – for instance, it only returns unused artifacts from the last 30 days and may also include other objects in the Power BI Portal. I recommend having at least 180 days of usage history of the reports.
Once the necessary permissions have been obtained, we can proceed to the main part of the script. Additionally, we will create a list of users along with the reports assigned to them. This way, if someone wants to retrieve their report after running the script, we will know whether they had permissions for it or not. If you would like to learn more about authorization code or if you encounter issues executing the script, I recommend reading this article.
Import-Module -Name Az.Storage
Import-Module -Name MicrosoftPowerBIMgmt
Import-Module -Name ImportExcel
# PowerBI Admin parameters
$tenantId = ""
$clientId = ""
$clientSecret = ""
$resource = "https://analysis.windows.net/powerbi/api"
$tokenEndpoint = "https://login.microsoftonline.com/$tenantId/oauth2/token"
# PowerBI API parameters
# link for authorization code (past in your browers and copy the part after code=)
# rememer to fill your client_Id!!!
# https://login.microsoftonline.com/common/oauth2/authorize?client_id=xxxxxxxxxxxxxxx8&response_type=code&redirect_uri=https://localhost/redirect/&response_mode=query&scope=openid&state=12345
$code = ''
$clientIdDelete = ""
$clientSecretDelete = ""
# rest of parameters
# path to folder where reports are saved
$baseReportPath = ""
# path to csv with list of reports ids
$filePath = ""
# path to xlsx with reports users
$csvPath = ""
# Azure parameters
$storageAccountName = ""
$storageContainerName = ""
$blobPath = ""
$storageAccountKey = ""
$storageContext = New-AzStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey
$bodyAdmin = @{
grant_type = "client_credentials"
client_id = $clientId
client_secret = $clientSecret
resource = $resource
}
# generate access token for read-only admin API
$accessTokenResponse = Invoke-RestMethod -Uri $tokenEndpoint -Method Post -Body $bodyAdmin
$accessToken = $accessTokenResponse.access_token
$bodyDelete = @{
grant_type = "authorization_code"
client_id = "$clientIdDelete"
client_secret = "$clientSecretDelete"
resource = $resource
code = "$code"
redirect_uri = 'https://localhost/redirect/'
}
$accessTokenResponse = Invoke-RestMethod -Uri $tokenEndpoint -Method Post -Body $bodyDelete
$accessTokenDelete = $accessTokenResponse.access_token
$reports = Invoke-RestMethod -Uri "https://api.powerbi.com/v1.0/myorg/admin/reports" -Method Get -Headers @{
Authorization = "Bearer $accessToken"
}
# get datasetId, reportName and workspaceId
$workspaceMap = @{}
$nameMap = @{}
$datasetMap = @{}
$reports.value | ForEach-Object {
$reportId = $_.id
$workspaceId = $_.workspaceId
$name = $_.name
$datasetId = $_.datasetId
$workspaceMap[$reportId] = $workspaceId
$nameMap[$reportId] = $name
$datasetMap[$reportId] = $datasetId
}
# read xlsx file
$reportsExcel = Import-Excel -Path $filePath -StartRow 2 -NoHeader
$reportsArray = $reportsExcel | ForEach-Object { $_.PSObject.Properties.Value }
$csvData = @()
Connect-PowerBIServiceAccount
foreach ($reportId in $reportsArray) {
$name = $nameMap["$reportId"]
$workspaceId = $workspaceMap["$reportId"]
$datasetId = $datasetMap["$reportId"]
$extension = ".pbix"
# Find users which have access to report
$users = Invoke-RestMethod -Uri "https://api.powerbi.com/v1.0/myorg/admin/reports/$reportId/users" -Method Get -Headers @{
Authorization = "Bearer $accessToken"
}
# for each report get all users who have access to it
$users.value | ForEach-Object {
$user = $_.identifier
$userRole = $_.reportUserAccessRight
$userName = $_.displayName
$principalType = $_.principalType
$mail = $_.identifier
$rowData = [PSCustomObject]@{
"ReportName" = $nameMap["$reportId"]
"ReportID" = $reportId
"UserName" = $userName
"UserEmail" = $user
"UserRole" = $userRole
}
$csvData += $rowData
# if the user is a group, write all the group members and assign them to the report
if ($principalType -eq "Group") {
$members = Invoke-RestMethod -Uri "https://graph.microsoft.com/v1.0/groups/$user/members" -Method Get -Headers @{
Authorization = "Bearer $accessToken"
}
$members.value | ForEach-Object {
$groupUserName = $_.displayName
$groupUserMail = $_.userPrincipalName
$rowData = @{
"ReportName" = $nameMap["$reportId"]
"ReportID" = $reportId
"UserName" = $groupUserName
"UserEmail" = $groupUserMail
"UserRole" = $userRole
}
$csvData += New-Object PSObject -Property $rowData
}
}
}
# check wheter the name of the report already exists
$reportPath = Join-Path -Path $baseReportPath -ChildPath "$name$extension"
$index = 0
while (Test-Path $reportPath) {
$index++
$newName = "${name}_$index"
$reportPath = Join-Path -Path $baseReportPath -ChildPath "$newName$extension"
}
# download the report
Export-PowerBIReport -Id $reportId -OutFile $reportPath
# add number to the name if report exists
if ($index -eq 0) {
$newName = $name
}
else {
$newName = "${name}_$index"
}
# upload report to the blob storage
Set-AzStorageBlobContent -Context $storageContext -Container $storageContainerName -Blob $blobPath/$newName -File $reportPath -Force -StandardBlobTier Cool
# remove dataset and report
Invoke-RestMethod -Uri "https://api.powerbi.com/v1.0/myorg/groups/$workspaceId/datasets/$datasetId" -Method Delete -Headers @{
Authorization = "Bearer $accessTokenDelete"
}
}
# save the user list
$csvData | Export-Csv -Path $csvPath -NoTypeInformation -Encoding UTF8
We've successfully navigated through all the steps. The process is quite intricate, so even with the above assistance, implementing the solution may take some time. However, this marks the initial stride towards a well-organized Power BI environment and lays the foundation for further efforts in enhancing Data Governance within your organization. Best of luck!