Home Business Intelligence Slowly Changing Dimension (SCD) in Power BI, Part 2, Implementing SCD 1

Slowly Changing Dimension (SCD) in Power BI, Part 2, Implementing SCD 1

by datatabloid_difmmk
Power BI Slowly Changing Dimension (SCD), Part 2, Implementing SCD 1

In my last article, I explained what SCD means for business intelligence solutions. He also explained that in a Power BI implementation he was not supposed to handle SCD2, but could handle scenarios similar to SCD1. This post will show you how.

scenario

We have a retail company that sells our products. Every year, the company publishes a list of products, including list prices and store prices, in Excel format. The product list will be released on July 1st, when the fiscal year begins. You need to implement a Power BI solution with up-to-date product data to analyze your sales transactions. The following image shows the product list for 2013.

2013 Product List Excel
2013 product list

So every year I receive an Excel file similar to the image above. Files are stored on your SharePoint Online site.

Scenario description

As explained in the previous post, SCD1 always keeps current data by updating old data with new data. Therefore, the ETL process reads data from the source, identifies existing data in the destination table, inserts new rows into the destination, updates existing rows, and deletes deleted rows.

Here’s why this scenario is similar to SCD1: However, there is one exception.

  • It doesn’t actually update the data in the Excel file, nor does it create an ETL process that reads the data from the Excel file, identifies the changes, and applies the changes to an intermediate Excel file.
  • I need to read the data from the source Excel file, filter out the old data while keeping the latest data, and load the data into the data model.

As you can see, it takes a very different implementation approach, but the result is very similar, no rows are removed.

Implementation

Here’s what we should do to reach our goal:

  • to get the data in Power Query Editor. SharePoint folder connector
  • combine files
  • we, Product No Column for identifying duplicate products
  • we, Reporting date A column that identifies the latest date
  • keep only the most recent rows

Get data from SharePoint Online folders

To retrieve data from multiple files stored in SharePoint Online, SharePoint folder connector. Do the following:

  1. login SharePoint Online Go to the site that holds the product list Excel file and copy the site URL from your browser.
Get the URL of your SharePoint Online site
Get the URL of your SharePoint Online site
  1. from get data In Power BI desktop, SharePoint folder connector
  2. click connection
Connect to SharePoint Online folders from Power BI
Connect to SharePoint Online folders from Power BI
  1. pasting Site URL Copy in step 1
  2. click all right
Connect to SharePoint Online folders from Power BI using the SharePoint Folder Connector
Connect to SharePoint Online folders from Power BI using the SharePoint Folder Connector
  1. click data conversion
Transforming data in Power Query Editor
Transforming data in Power Query Editor

  1. click the filter dropdown. folder path digit
  2. Find and select the Products List folder that hosts your Excel file.
  3. click all right
Filter SharePoint Online site folders in Power BI's Power Query Editor
Filter SharePoint Online site folders in Power Query Editor
  1. the name of the query product
Rename a query in Power BI's Power Query Editor
Rename Query1 to Product

So far, I’ve connected to a SharePoint Online folder in Power Query Editor. The next step is to combine the Excel files.

Combining files

There are multiple options for joining binary files to tables from Power Query Editor. This post uses the easiest method.

  1. click combine files button from content digit
Join an Excel file to a table in Power BI's Power Query Editor
Join an Excel file to a table in Power Query Editor
  1. choose Product list table
  2. click all right
in Power BI's Power Query Editor[ファイルの結合]In the window, select objects to extract from each file
in Power Query Editor[ファイルの結合]In the window, select objects to extract from each file

The above process creates several queries grouped in separate folders as shown in the following image.

Combining Excel files
Combining Excel files

So far, you have successfully merged your Excel files. The next step is to keep only the latest data.

Keep your data up-to-date

The next few steps are to dig into the data and implement mechanisms to identify and retain the most recent data and load it into the data model.

  1. Looking at the result of the combined data, source.name You need to keep the column, so select it and click to remove it. delete column button from House tab
Delete a column in Power BI's Power Query Editor
Delete a column in Power Query Editor

So far, we’ve connected to a SharePoint Online folder and merged the Excel files it contains. Let’s look at the data and see what we got.I sorted the data by Product No To better understand data changes. The following figure shows the result.

Products repeated in different lists

As the image above shows, there are multiple listings displaying multiple products. It’s exactly what we expected. The goal is to keep it based on the latest product data only. Reporting date. Therefore, it is necessary to obtain Product No maximum value of Reporting date. To achieve this, grouping Features of the Power Query Editor. Using Group By from her UI in the Power Query Editor, Table.Group() Works in Power Query. Because the Group By process does not require data to be sorted, sorted rows step. Now let’s get the job done.

  1. choose Product No digit
  2. click grouping columns from Transform tab
  3. come in Reporting date for new column name
  4. select Max from surgery drop down
  5. choose Reporting date from digit drop down
  6. click all right
Table.Group() Group By feature in Power Query for Power BI
Power Query grouping feature

The following figure shows the result.

Group By operation results in Power Query for Power BI
Result of Group By operation

Now you have all the product numbers and the latest reporting date. The only remaining piece of the puzzle is grouped rows Step using the data from the previous step. for that reason, query merging function to perform Table.NestedJoin() Works in Power Query.

  1. choose grouped rows steps from Application procedure list from query settings pain
  2. click query merging button from House tab
  3. Select the Product (Current) table from the dropdown. Note that we have selected the current query (product).
  4. At the top table, press the Ctrl button on your keyboard to Product No and the Reporting date columns are sequential
  5. Do the same for the table below. Note that the order in which you select the columns matters
  6. Please make sure that join kind teeth Left Outer (first to all, second to match)
  7. click all right
Left outer join (merge) in Power Query for Power BI
Left outer join (merge) in Power Query

As mentioned above, in a merge operation Table.NestedJoin() A function that accepts two tables (highlighted in the figure) yellow List of key columns to use in the join (highlighted in formula below) red the name of the new column in type table and the type of bond. In the above operation, grouped rows is the final transformation step. grouped rows A transformation step in itself. Here is the code generated by the Power Query Editor after steps 21-27.

Table.NestedJoin(#"Grouped Rows", {"ProductNumber", "Reporting Date"}, #"Grouped Rows", {"ProductNumber", "Reporting Date"}, "Grouped Rows", JoinKind.LeftOuter)

But that’s not what we want. grouped rows A transformation step in itself. You need to combine the results of grouped rows Transformation step with the result of Deleted column step. So the above formula should be changed to:

Table.NestedJoin(#"Grouped Rows", {"ProductNumber", "Reporting Date"}, #"Removed Columns", {"ProductNumber", "Reporting Date"}, "Grouped Rows", JoinKind.LeftOuter)

The following illustration shows the changes made to the formula and the result.

Combined results of the Transform Grouped Rows step with the results of the Removed Columns step in Power Query for Power BI
The result of combining the results of grouped rows Transformation step with the result of Deleted column step
  1. click expansion button grouped rows digit
  2. deselect Product No When Reporting date Column to keep other columns selected
  3. uncheck Use original column name as prefix option
  4. click all right
Expand structured columns in Power Query for Power BI
Expand structured columns in Power Query

All done! The following figure shows the final result.

Final result of SCD1 implementation
Final result of SCD1 implementation

You can now apply your changes and load the data into the data model. Using this technique, when a new Excel file (new product list) arrives in SharePoint Online, Power BI will perform the transformation steps above to ensure that the latest product data is always loaded into the data model. This is very similar to behavior. of SCD1.

Have you used this method before? Do you have better techniques for handling similar scenarios? You can share your thoughts in the comments section below.

You may also like

Leave a Comment

Subscribe my Newsletter for new blog posts, tips & new photos. Let's stay updated!

About Us

We're a provider of Data IT News and we focus to provide best Data IT News and Tutorials for all its users, we are free and provide tutorials for free. We promise to tell you what's new in the parts of modern life Data professional and we will share lessons to improve knowledge in data science and data analysis field.

Facebook Twitter Youtube Linkedin Instagram

5 Strategies To Reduce IT Support Tickets – Ultimate Guide

Recent Articles

Redefining the Role of IT in a Modern BI World What (Really) Are Issues Faced by Data Scientist in 2022 How I start Data Science Projects | What to do when you're stuck

Featured

5 Strategies To Reduce IT Support Tickets – Ultimate Guide Redefining the Role of IT in a Modern BI World What (Really) Are Issues Faced by Data Scientist in 2022

Copyright ©️ All rights reserved. | Data Tabloid