The Ultimate Guide To Power Query Data Cleaning
Let me paint you a picture.
It’s month-end. You’re neck-deep in reports, someone just asked for a reforecast, and now your dashboard looks like it’s had a stroke because someone uploaded “PARIS”, “paris ”, and “ Paris” as three separate cities. Again.
If you’ve ever burned an hour trying to trace a broken pivot table back to a rogue space or inconsistent formatting, you already know the truth: dirty data is the silent killer of finance. It bloats your workload, nukes your credibility, and makes you want to chuck Excel into oncoming traffic.
But here’s the good news: there’s a better way. And it’s called Power Query.
If you’ve never used it, Power Query is the tool inside Excel and Power BI that lets you transform messy, chaotic data into something that won’t make your eyes bleed. Think of it as a pressure washer for your spreadsheets—built to strip away the grime before you ever load data into your report.
And here’s why I swear by it:
- It’s point-and-click (no scary code required, though you can flex with M language if you’re into that).
- It’s repeatable—set it up once and refresh every week.
- It’s part of the Microsoft stack you’re already using, so no need to buy yet another overpriced SaaS tool.
So roll up your sleeves. We’re going to teach your spreadsheets some manners.
Quick Tour Of Power Query
Alright, let’s get you past the “I’ve-heard-of-it-but-never-touched-it” stage.
Power Query lives inside Excel and Power BI. If you’ve ever clicked “Get Data” and stared blankly at the next 17 options, wondering which one won’t explode your file, this is for you.
Opening the Power Query Editor
Let’s keep it simple. Here’s how you open the gates:
In Excel (365 or 2016+):
- Go to Data > Get & Transform Data > Get Data.
- Pick your poison—Excel file, CSV, SQL Server, SharePoint, etc.
- Click Transform Data (not Load yet! We’re not savages).
- Boom. You’re inside Power Query Editor.
In Power BI:
- Click Home > Transform Data.
- Welcome to the same interface, different house.
Power Query Editor Layout
Here’s your five-second tour before you panic and close the window.
🔹 Ribbon at the top
Think of this like Excel’s toolbar—your command center. You’ve got buttons for:
- Removing rows/columns
- Splitting or merging data
- Data type changes
- Fuzzy matching (yes, really)
- Grouping, filtering, transposing, etc.
💡 Pro tip: Don’t click randomly here like you’re trying to find the “bold” button in Word. Each click adds a step to your query. We’ll get to why that matters soon.
🔹 Preview Pane (middle section)
This shows a sample of your data—not the whole thing. It’s like a trailer for the terrible movie that is your raw export.
This is where you inspect the data, highlight weird values, and decide which columns to keep or purge.
🔹 Query Pane (left sidebar)
Think of this as your “data inventory.” If you’ve pulled in multiple tables, files, or sheets, you’ll see them listed here.
You can rename queries here too. Please rename them. “Table1 (2) Copy Final v3” is not a vibe.
🔹 Applied Steps (right sidebar)
Here’s where the real magic lives. Every action you take—removing a column, renaming a header, filtering rows—is recorded here like a recipe. It’s a chronological list of transformations.
If something breaks later? This is where you debug.
⚠️ Warning: Don’t delete random steps unless you know what you’re doing. You might unravel your entire data pipeline like a Jenga tower built by an intern.
Profiling & Discovery: Sniff Out the Spreadsheet Sabotage
Before you touch anything, rename anything, or do a single transformation… you profile the data.
Why? Because data is like an unreliable narrator. It looks fine at first glance, but then you realize “2023-01-05” is in one format, “1/5/23” is in another, and someone spelled “United States” four different ways. All in one column. Profiling also surfaces incomplete records—rows with missing or partial information—so you can deal with them early, before they quietly skew your analysis.
Power Query’s data profiling tools are your detective kit. They tell you where the mess is hiding so you don’t waste time scrubbing what’s already clean—or worse, miss the part that’s quietly breaking your entire model.
Let’s walk through how I audit a dataset before cleaning a single cell.
Step 1: Turn On Profiling Tools
If you didn’t already do this in Section 2, do it now. Seriously. Right now.
Go to the View tab in Power Query and check:
- ✅ Column quality
- ✅ Column distribution
- ✅ Column profile
This adds three incredibly powerful lenses right below your data.

Step 2: Scan Column Quality
Look just below the column headers. You’ll see tiny bars with color coding:
- ✅ Green = Valid values
- ⚠️ Dark gray = Empty/nulls
- ❌ Red = Errors
This tells you instantly if a column is clean or if it’s riddled with problems. It’s like having a BS detector baked into your spreadsheet.
Example:
Say you’ve got a Transaction Date column:
- 92% green
- 6% blank
- 2% red (invalid dates)
Now you know:
- You’ll need to filter or fill blanks
- You might need to fix inconsistent formats
- There’s likely some bad imports lurking (maybe someone typed “Febtober 31”)
Step 3: Check Column Distribution
Hover over a column, and Power Query will show you a bar chart of distinct values and frequencies.
This is huge when:
- You’re checking if a “Region” column has rogue entries (e.g., “Northeast”, “N-east”, “NE”)
- You want to spot duplicates, outliers, or blank rows fast, so you know exactly what needs removing
- You’re diagnosing why your pivot table has 27 flavors of “France”
Pro move:
Click the column header → Sort → Count of distinct values. You’ll spot inconsistencies instantly.
Step 4: Use Column Profile (Full Stats)
Click into any column, scroll to the bottom of Power Query, and boom—full profiling stats:
- Count
- Distinct count
- Empty values
- Min/max
- Most frequent values
- A value distribution chart for the selected column
This is where I get nerdy. You can spot:
- Text columns with trailing spaces (all values look the same but have different lengths)
- Numbers stored as text (hello, accidental quotes)
- Currency fields with “K” or “M” you’ll need to convert
- Columns where you’ll need to extract specific bits—substrings, codes, IDs—from unstructured text
⚠️ If this is your finance data feed: Check for unexpected decimal places. I once found a client feeding us “Total Revenue” in billions… formatted as text… with four decimal places. That dashboard was lying to everyone until I fixed it here.
Case Study: The City That Wouldn’t Die
One of my clients had a “Location” field. I thought it was fine—until I checked column distribution.
Turns out we had:
- “Paris”
- “paris ”
- “PARIS”
- “ Paris ”
- “París” (with an accent… fancy)
All meant the same thing. But guess what? Excel and Power BI don’t think they’re the same.
This one fix alone—standardizing the Location field—turned a weekly reporting process from 90 minutes of manual cleanup to a 30-second refresh. Why? Because once we knew the problem, Power Query let us solve it permanently. That’s the power of profiling.
Cleaning Essentials: Fixing the Spreadsheet One Column at a Time
Alright, we’ve profiled the data. We’ve seen the horrors: nulls, duplicates, “USA” vs. “U.S.A” vs. “United States of A.” Now it’s time to clean house. This is where Power Query stops being a preview tool and starts being your best friend.
These cleaning steps help prepare your data for accurate analysis and reporting, ensuring that your results are reliable and meaningful.
I’m going to walk you through the cleaning moves I use on nearly every file that crosses my desk—because 90% of data issues come from the same few sins. And once you know how to handle them, you can rinse and repeat on every import, every month, forever.
Trim Spaces, Fix Casing, and Standardize Text
Dirty little secret of finance data: Most “errors” are really just inconsistent formatting.
Trimming whitespace is the crucial first step. Extra spaces make identical values look different to Excel and Power BI, so standardizing text up front makes every downstream step—grouping, merging, pivoting—behave.
Step 1: Trim Whitespace
If your column has “ Paris”, “Paris ”, and “ Paris ”, your formulas and pivots will treat them as three different cities. Kill the spaces.
Here’s how:
- Select the column
- Go to Transform → Click Format → Choose Trim
- (Optional) Click Clean to remove weird non-printable characters too
💡 Pro tip: You can also use the Text.Trim() function in a custom column if you want to clean but preserve the original for audit.
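Here’s a sketch of that audit-friendly approach in M—the column names (“City”, “City Clean”) and the previous step name are placeholders for your own:

```
// Add a cleaned copy while keeping the original column for audit.
// "City", "City Clean", and PreviousStep are illustrative names.
Cleaned = Table.AddColumn(
    PreviousStep,
    "City Clean",
    each Text.Clean(Text.Trim([City])),  // trim spaces, then drop non-printables
    type text
)
```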
Step 2: Fix Capitalization
Want to standardize names or cities? Power Query’s got you.
- Transform → Format → choose:
- lowercase (paris)
- UPPERCASE (PARIS)
- Capitalize Each Word (Paris)
I usually go with proper case for things like customer or region names—it just looks cleaner.
Replace or Correct Values
Now that text is standardized, you need to normalize the actual values—so “NE”, “N-east”, and “Northeast” all collapse into one thing.
Step 1: Replace Manually
- Select the column
- Transform → Replace Values
- Input what you want to find and what to replace it with
I use this when I’ve got a manageable number of variations (like “NE” → “Northeast”).
Step 2: Use a Lookup Table
For recurring clean-up—like mapping 100 vendor name variants to a single standard—you’ll want to create a translation table.
How it works:
- Create a separate table with two columns: Original and Cleaned
- Load both into Power Query
- Merge your messy data with the clean table (Left Join)
- Replace the old value with the cleaned one
🚨 Do this once, and you’ll never fix “G00gle Inc.” by hand again.
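In the Advanced Editor, the whole pattern looks roughly like this—the query and column names (VendorMap, Original, Cleaned) are placeholders:

```
let
    Source = Messy_Vendors,
    // Left join: keep every row, attach the clean name where one exists
    Merged = Table.NestedJoin(Source, {"Vendor"}, VendorMap, {"Original"}, "Map", JoinKind.LeftOuter),
    // Pull the Cleaned column out of the nested table
    Expanded = Table.ExpandTableColumn(Merged, "Map", {"Cleaned"}, {"Vendor Clean"}),
    // Rows with no match come back null — fall back to the raw name
    Final = Table.ReplaceValue(Expanded, null, each [Vendor], Replacer.ReplaceValue, {"Vendor Clean"})
in
    Final
```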
Bonus: Fuzzy Matching
If your text variants are slightly off (“P&G” vs “P and G”), Power Query has fuzzy matching. Just enable it when doing a merge.
- Home → Merge Queries
- Choose fuzzy match in the merge options
- Adjust the threshold if needed (lower = looser match)
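If you’d rather write it yourself, the M equivalent is Table.FuzzyNestedJoin—the names and the 0.8 threshold below are illustrative starting points:

```
Fuzzy = Table.FuzzyNestedJoin(
    Source, {"Vendor"},           // messy side
    VendorMap, {"Original"},      // clean side
    "Map",
    JoinKind.LeftOuter,
    [IgnoreCase = true, IgnoreSpace = true, Threshold = 0.8]  // lower = looser match
)
```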
Remove Duplicates and Filter Junk Rows
Step 1: Remove Duplicates
If you’re working with transaction logs, GL dumps, or exported CRM data—always check for dupes.
How:
- Select the identifying columns (like Invoice Number + Date)
- Home → Remove Rows → Remove Duplicates
Step 2: Filter Out Garbage
Use filters to eliminate:
- Blank rows
- Test data
- Values that say “N/A”, “null”, “undefined”, etc.
⚠️ Some files look fine but include garbage rows 10,000 records down. Always scroll to the bottom of a file preview if performance allows.
Handle Nulls and Missing Values
Null values are like landmines. Left unchecked, they’ll break your formulas later.
Common Approaches:
- Replace Values → Null → “Unknown” or 0 (depends on context)
- Use Fill Down or Fill Up when values are implied by surrounding rows (like a category label that only appears on the first row of a group)
- Add a conditional column flagging nulls for exception reporting
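As M steps, the first and third approaches might look like this (column names are examples):

```
// Replace null spend with 0
NoNulls = Table.ReplaceValue(Source, null, 0, Replacer.ReplaceValue, {"Spend"}),
// Flag missing categories for exception reporting
Flagged = Table.AddColumn(NoNulls, "Missing Category?", each [Category] = null, type logical)
```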
Fix Data Types (This One’s Huge)
If your numbers are stored as text, or dates are in “1/1/2023” vs “2023-01-01” chaos—this step’s for you. Power Query makes it easy to change data types, ensuring consistency and accuracy in your dataset.
How:
- Select a column
- Go to Transform → Data Type → Choose the correct one:
- Text
- Decimal number
- Whole number
- Date
- Date/Time
Pro tip:
Always force the type after all replacements and cleaning steps. Change the type too early and later steps might throw errors—or silently convert values to the wrong thing.
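When you set types through the UI, Power Query writes a single Table.TransformColumnTypes step behind the scenes—something like this, with example column names:

```
Typed = Table.TransformColumnTypes(
    Cleaned,
    {
        {"Invoice Number", type text},
        {"Transaction Date", type date},
        {"Spend", type number},
        {"Quantity", Int64.Type}   // whole number
    }
)
```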
Case Study: Cleaning a Vendor Spend Report
Client sent me a vendor payment file with:
- Vendor names like “AMZ”, “Amazon Inc.”, “Amazon”, and “Amazon LLC”
- Spend in multiple currencies (some with €, some with USD, some with no label at all)
- Totals stored as text (“$3,000.00”, “3K”, “1.5M”)
Here’s what I did:
- Trimmed and formatted all vendor names to proper case
- Built a translation table to unify them under “Amazon”
- Stripped currency symbols using Text.Select() and converted to numbers
- Converted “3K” and “1.5M” using custom logic (K = ×1,000, M = ×1,000,000)
- Replaced nulls with zeros in spend fields
Once done, it turned into a clean, consistent vendor spend model. The original? An absolute circus. But now? It refreshes every week with zero human intervention.
Transforming Columns: Give Your Data a Makeover
Cleaning your data is like scrubbing a raw potato. It’s necessary. But if you stop there, congrats—you have a clean potato. What you really want is mashed potatoes. Or fries. Something delicious. That’s what column transformations are.
Here’s where we shape your data into something analysis-ready—something that flows into your reports, makes sense in your models, and doesn’t make stakeholders squint and ask, “What the hell is this column?”
Splitting Columns
Let’s say you get a column like this: “John Smith – Sales – West” or “5’11”, 185 lbs”. It’s all jammed into one cell, and you need to split it apart before you can work with it. Power Query can also extract specific pieces—names, departments, measurements—as part of the split.
Split by Delimiter
This is your go-to move when the column is structured but messy.
Steps:
- Select the column
- Go to Home → Split Column → By Delimiter
- Choose your delimiter (space, comma, dash, custom)
- Decide how many times to split (first occurrence, each occurrence, etc.)
Example:
“5’11”, 185 lbs” → Split by comma → one column for height, one for weight.
Then:
- Use Text.BeforeDelimiter() or Text.AfterDelimiter() to fine-tune
- Convert to number by stripping text (“185 lbs” → 185)
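A quick sketch of that fine-tuning, assuming the combined text sits in a column called Stats:

```
// Height text before the comma, numeric weight after it
Height = Table.AddColumn(Source, "Height Raw", each Text.BeforeDelimiter([Stats], ","), type text),
Weight = Table.AddColumn(Height, "Weight (lbs)",
    each Number.FromText(Text.Select(Text.AfterDelimiter([Stats], ","), {"0".."9"})),  // keep digits only
    type number)
```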
Merging Columns
Merging is the flip side of splitting. You use it to create a human-readable label or combine fields for lookup tables.
Example:
- First Name: John
- Last Name: Smith
You want: John Smith
Steps:
- Select both columns
- Transform → Merge Columns
- Choose a separator (space, comma, nothing)
Boom. Now you’ve got full names, categories, composite keys—whatever your little analysis-loving heart desires.
Custom Column Calculations
Here’s where Power Query’s M language starts showing off.
You can build logic that:
- Converts units (e.g. “1.5M” → 1,500,000)
- Flags exceptions (if Spend > 1000000 then “High Roller” else “Normal”)
- Cleans embedded symbols (Text.Select([Amount], {"0".."9", "."}) to strip $)
To create a custom column:
- Add Column → Custom Column
- Write your formula using M language syntax
- Use [ColumnName] to reference columns
- Use functions like Text.Upper, Number.FromText, Date.FromText, etc.
Example:
= if [Spend] > 100000 then "High" else "Low"
Or strip “K” and multiply:
= if Text.EndsWith([Amount], "K") then Number.FromText(Text.BeforeDelimiter([Amount], "K")) * 1000 else Number.FromText([Amount])
💡 Pro tip: Custom columns are where automation magic really kicks in. This is how you stop manually fixing columns every. single. week.
Unpivoting (Yes, You Probably Need This)
So many exports come in the dreaded “report format”:
Customer | Jan | Feb | Mar | Apr
---------|-----|-----|-----|----
Acme     | 100 | 200 | 150 | 300
Looks fine, right? Until you try to chart it. Or group it. Or roll it up by quarter.
Unpivoting flips this mess into a usable structure:
Customer | Month | Value
---------|-------|------
Acme     | Jan   | 100
Acme     | Feb   | 200
…
How:
- Select all the month columns
- Transform → Unpivot Columns
🔁 Bonus: Use Pivot Columns to go the other way if needed.
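Under the hood it’s one step. Using Unpivot Other Columns (select Customer instead of the month columns) is usually the safer choice, because new month columns get picked up automatically on refresh:

```
// Everything except Customer becomes Month/Value pairs
Unpivoted = Table.UnpivotOtherColumns(Source, {"Customer"}, "Month", "Value")
```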
Case Study: FIFA Player Data
I once pulled a FIFA dataset where players had:
- Height as “5’11”” (not a number)
- Weight as “176 lbs” (again, not a number)
- Value as “€60M” and “€1.2K” (stored as text)
- Contract end dates embedded in a single column with no year
Here’s what I did:
- Split height into feet and inches, then calculated total inches:
- Custom column: = [Feet]*12 + [Inches]
- Stripped “lbs” from weight and converted to number
- Converted currency by removing symbols and multiplying based on suffix:
= if Text.EndsWith([Value], "M") then Number.FromText(Text.BeforeDelimiter([Value], "M")) * 1000000 else if Text.EndsWith([Value], "K") then Number.FromText(Text.BeforeDelimiter([Value], "K")) * 1000 else Number.FromText([Value])
- Extracted the contract year using a fuzzy text split and a date function
All of this was done in Power Query—with zero manual intervention once it was set up. The result? A dashboard-ready dataset I could update every week with a single click. Power Query provided the answer to my data cleaning challenges, handling errors and text manipulation efficiently.
Automating & Structuring the Query: Set It and Forget It (Mostly)
Here’s a truth bomb: if you clean and transform your data but don’t structure it properly, you’re just signing yourself up for Groundhog Day. Every. Single. Month.
So let’s break the cycle.
This section is all about making your Power Query workflow repeatable, readable, and refreshable. Because I don’t care how slick your formulas are—if you have to manually click through 10 cleanup steps every time you get a new CSV, you’ve already lost.
Structure Your Queries for Scale
The longer your query, the more likely it is to break. That’s not pessimism—it’s Power Query physics.
Use “Reference” Queries
Let’s say you pull in a raw data file, clean it, transform it, build some calculations, and then load it into three different dashboards.
Don’t do all of that in one query.
Instead:
- Load your raw data
- Create a cleaned version (remove errors, fix types, standardize names)
- Reference that cleaned version to create your transformed query
- Reference that for each reporting-specific view
This makes each query easier to debug and update. Plus, you’ll avoid spaghetti logic and midnight tears.
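In the Advanced Editor, a reference query is just a one-liner pointing at the upstream query—everything after that first line is this layer’s own logic (the names and filter below are placeholders):

```
let
    Source = Cleaned_Transactions,                                 // the cleaned layer
    TopVendors = Table.SelectRows(Source, each [Spend] > 100000)   // this layer's own logic
in
    TopVendors
```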
Rename Everything (Seriously, Everything)
Default query names like Table1 (2) or Query5 are how nightmares begin. If you can’t tell what a query does just by looking at its name, rename it.
Examples:
- ✅ Raw_GL_Import
- ✅ Cleaned_Transactions
- ✅ Vendor_Spend_Summary
Same goes for columns:
- Rename Column1 to Invoice Number
- Rename Custom to something real like Annualized Spend
This helps future-you and anyone else who touches your file not lose their minds.
Set Query Load Options
Not every query needs to load into your workbook or data model. In fact, loading too many queries slows everything down and bloats your file size.
Here’s what I do:
- Right-click each query → Enable Load only for final outputs
- Disable load for staging, cleanup, or lookup tables
🧠 Think of it like a restaurant: the kitchen (raw data) doesn’t serve food directly. Only the polished dish (final output) goes to the table (report/dashboard).
Parameterize File Paths and Filters
If you’re working with recurring files (e.g., “Sales_July2025.csv”, “Sales_Aug2025.csv”), don’t hard-code the file path or sheet name.
Instead:
- Go to Manage Parameters
- Create a parameter for the file path or filter (e.g. CurrentMonth = “Aug2025”)
- Reference that parameter in your query steps
Now, to update the report for next month? Just change the parameter. No digging through steps or editing formulas.
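Here’s roughly what a parameterized import looks like in M, assuming a text parameter named CurrentMonth and an illustrative folder path:

```
let
    FilePath = "C:\Reports\Sales_" & CurrentMonth & ".csv",   // path built from the parameter
    Source   = Csv.Document(File.Contents(FilePath), [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Source)
in
    Promoted
```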
Avoid Performance Landmines
Power Query is powerful, but it’s not magic. Here’s how to avoid the slowdowns and timeouts that haunt so many Power BI and Microsoft Excel users:
Filter Early
Apply filters near the top of your query to reduce rows as soon as possible. Less data = faster steps.
Remove Unused Columns Immediately
If you’re not using a column later, ditch it early with Remove Other Columns. Don’t carry dead weight.
Don’t Nest Queries Too Deeply
Merging huge queries into each other recursively? Congratulations, you just built a memory hog. Instead, use lookup tables and keep joins shallow.
Watch for Type Changes
Changing data types multiple times in one query is a silent killer. Set types once, after the major cleanup is done.
Case Study: Weekly Vendor Spend Report Automation
I built a recurring vendor spend report for a client that used to take them 4 hours a week to clean and update manually.
Here’s how we structured it:
- Raw query: pulled in the weekly CSV
- Cleaned query: fixed currency symbols, standard vendor names, nulls, and date types
- Reference query: calculated monthly and YTD spend
- Final report view: filtered to top 20 vendors, grouped by department
- Parameter: file path for new weekly drop
- Schedule: automatic refresh every Monday morning
If something breaks in the Power Query cleaning process, sharing the PBIX file with a colleague or client makes it far easier for them to reproduce and troubleshoot the problem.
They now click once—or just open the file—and it’s done in under 30 seconds.
This is what Power Query is meant for. Not just cleanup… but liberation.
Bonus: Set It to Auto-Refresh
In Power BI:
- You can schedule automatic refreshes (daily, weekly, etc.)
- Just make sure your data source is accessible (OneDrive, SharePoint, SQL, etc.)
- Use a gateway if you’re pulling from on-prem sources
In Excel:
- You can set queries to refresh on open or on a timer:
- Go to Data → Queries & Connections → Right-click query → Properties
- Check Refresh on open or Refresh every X minutes
🧠 If you’re delivering reports to execs or clients, “set it and forget it” is the dream. This is how you get there.
Case Study: Turning a Dumpster Fire CSV into a Refreshable Report
Let me tell you about a client we’ll call “Every Company Ever, Inc.” because they do what 90% of companies do: export raw data from five different systems into Excel, mash it into one file, and email it to finance with the subject line “Clean this up, please :)”.
This file was a monthly recurring nightmare. But after one Power Query intervention, it became a set-it-and-forget-it report that runs in under 60 seconds.
Here’s exactly what we did—step by step.
The Situation: A Weekly Vendor Spend Report That Was Anything But
The client had a “vendor spend report” coming out of an AP system. Here’s what we were dealing with:
- Data delivered weekly as a flat CSV with no formatting
- Vendor names were all over the place—“Google,” “GOOGLE INC,” “G00gle,” and even “Alphabet”
- Spend amounts in multiple currencies, with no standardized formatting (“€3,500”, “$4.2K”, “1.5M USD”)
- Date column with mixed formats—some proper dates, others as text
- Extra garbage rows at the bottom saying “End of Report”
- A 4-hour cleanup process every. single. week.
They were losing a half-day just to prep data before analysis. That’s not reporting—that’s spreadsheet babysitting.
Step-by-Step: How We Fixed It With Power Query
Step 1: Import the Raw File
- Used Get Data → From File → From CSV
- Gave the query a name (Raw_Vendor_Spend)
- Disabled “Load to worksheet” immediately (no one wants to look at the raw mess)
Step 2: Clean It Up
Created a new query: Cleaned_Vendor_Spend, using Raw_Vendor_Spend as the source.
Trimmed and cleaned text:
- Transform → Format → Trim and Clean on all text columns
- Standardized casing to Capitalize Each Word for vendor names
Handled vendor name chaos:
- Merged in a vendor normalization table with fuzzy matching:
Messy Name | Clean Name
-----------|-----------
GOOGLE INC | Google
G00gle     | Google
Alphabet   | Google
Fixed currency symbols and converted to USD:
- Removed “$”, “€”, and “USD” using Text.Remove or Text.Select
- Converted K/M using a custom column:
- = if Text.EndsWith([Amount], "K") then Number.FromText(Text.BeforeDelimiter([Amount], "K")) * 1000 else if Text.EndsWith([Amount], "M") then Number.FromText(Text.BeforeDelimiter([Amount], "M")) * 1000000 else Number.FromText([Amount])
- Added a “Currency” column and applied exchange rates via a merge
Filtered junk rows:
- Used Text.Contains([Vendor], "End of Report") → Filtered out those rows
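As an M step, that filter looks like this—note the null check, since Text.Contains on a null vendor won’t return a logical:

```
Filtered = Table.SelectRows(Cleaned,
    each [Vendor] <> null and not Text.Contains([Vendor], "End of Report"))
```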
Fixed dates:
- Used Date.FromText([Transaction Date]) and flagged rows with errors
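The trick here is try … otherwise, which turns an unparseable date into a null you can flag, instead of an error that kills the refresh (column names are examples):

```
Dated   = Table.AddColumn(Source, "Txn Date",
    each try Date.FromText([Transaction Date]) otherwise null, type nullable date),
Flagged = Table.AddColumn(Dated, "Bad Date?", each [Txn Date] = null, type logical)
```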
Step 3: Build Transform View
Created a new query: Spend_By_Vendor
- Referenced the cleaned query
- Grouped by Vendor Name, Department, and Month
- Aggregated spend with Sum
Renamed the columns to human-friendly titles like:
- Monthly Spend (USD)
- Department Name
- Report Period
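The grouping step behind Spend_By_Vendor looks roughly like this—the source query and the amount column name are placeholders:

```
Grouped = Table.Group(
    Cleaned_Vendor_Spend,
    {"Vendor Name", "Department Name", "Report Period"},
    {{"Monthly Spend (USD)", each List.Sum([Amount USD]), type number}}
)
```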
Step 4: Load to Report
- Only loaded the Spend_By_Vendor query
- Output to Excel table or Power BI report, depending on stakeholder need
- Set to refresh on file open
The Result: From Pain to Push-Button Report
What used to take 4 hours a week turned into a one-click refresh.
Before:
- Manually cleaned vendor names every week
- Copy-pasted into a pivot table
- Wasted hours explaining why numbers didn’t tie out
After:
- Standardized vendor names using a translation table
- Currency conversions handled automatically
- Filters and summaries baked into the query
- Refreshed in under a minute with no intervention
The kicker? We added parameters so if the file name changed (e.g., “Spend_Week32.csv”), all they had to do was update one input. That’s it.
Best Practices & Tips from the Trenches
You’ve cleaned. You’ve transformed. You’ve automated.
Now comes the part most people skip—and pay for later: making your Power Query workflows sustainable.
Because here’s the thing: building it once is easy. Keeping it running week after week, across different files, new team members, and weird exceptions? That’s the real skill.
Let’s talk about the habits that separate finance pros who dabble in Power Query from the ones who own it.
Always Build in Layers
This is the golden rule. Never cram all your cleaning, transforming, and logic into a single monster query.
Instead, use the “staging” method:
- Raw Query – pulls in the messy data
- Cleaned Query – trims, replaces, standardizes
- Transformed Query – adds calculated columns, aggregates
- Final Output – filtered and shaped for your report
🧠 Why this matters: You can troubleshoot specific layers when things break, swap data sources without nuking your logic, and reuse stages in other reports.
Rename Everything Like You’re Handing It to a Stranger
Because future-you is that stranger.
If your query is named Query7 and has steps like Changed Type1, Changed Type2, and Removed Columns3, you’re creating a maintenance nightmare.
Do this instead:
- Name queries clearly: Raw_GL, Cleaned_Vendor_Spend, Top20_Vendors
- Rename every applied step: FilteredNulls, StandardizedCurrency, MergedWithExchangeRates
It takes 30 extra seconds, and it saves you hours of “what the hell did I do here?” six weeks later.
Use Data Profiling Early and Often
We talked about this in Section 3, but let me hammer it home:
Turn on data profiling (View → Column quality/distribution/profile) and never turn it off.
Why?
- It flags dirty columns before they break your formulas
- You can see at a glance if a data dump has nulls, duplicates, or weird outliers
- It helps you sanity check after merges and type changes
🧠 I once caught a bad join because column distribution showed 3,000 nulls after what looked like a successful merge. Profiling saved me from delivering a report with millions in missing transactions.
Stop Manually Cleaning the Same Thing More Than Once
If you’re fixing the same issue—like “GOOGLE INC.” vs. “Google”—in five different workbooks, you’re doing it wrong.
Centralize your logic.
Create one “vendor normalization” table and reuse it everywhere via merge. Same goes for:
- Department mappings
- GL account groupings
- Currency conversions
- Date offsets (e.g., fiscal periods)
Build a “Common Cleanup” query library and reference it in every file.
Save Your Query Logic in a Template File
Your boss is going to say “Hey, can you do this same thing for Marketing data?”
Don’t start from scratch.
Instead:
- Save your best Power Query setup as an Excel/Power BI template
- Swap out the data source
- Adjust only what’s different
Think of this like your personal finance automation playbook. Build it once, rinse and repeat.
Performance Is a Feature
Slow queries kill adoption. If it takes 5 minutes to refresh, people will just go back to manual Excel work.
Speed it up:
- Filter early. Cut down row count before merging.
- Remove unused columns ASAP. Fewer fields = faster memory usage.
- Avoid unnecessary steps. Especially repeated type changes or recalculations.
- Disable load on anything not needed for output.
- Don’t merge full tables if you can merge only needed fields (e.g., just bring in the “Rate” column, not all 50).
⚠️ Power Query isn’t SQL. It’s awesome—but it can get real sluggish real fast if you throw everything and the kitchen sink at it.
Document and Version Control Your Queries
If you’re in a team environment—or even just want to sleep at night—keep track of:
- When you built the query
- What file it came from
- What assumptions you made (e.g., fiscal year starts in July, “K” means thousand)
How to do it:
- Create a “Notes” query or parameter with metadata
- Add a comment block at the top of your M code
- Save old versions of your file with clear version names (Spend_Report_v1.2_cleaned)
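A comment block at the top of your M code can be as simple as this—every detail below is a placeholder to adapt:

```
// ==========================================================
// Query:  Cleaned_Vendor_Spend
// Source: Weekly AP export (Spend_WeekNN.csv)
// Notes:  Fiscal year starts in July; "K" means thousands;
//         all amounts converted to USD
// ==========================================================
let
    Source = Raw_Vendor_Spend
    // ...cleanup steps follow...
in
    Source
```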
