Migration Data to Metabase
At my company, we already had a database server where the warehouse team inputs sales data through a web system. But as I started preparing to build dashboards with Metabase, I realized that the database wasn’t ready for analysis.
- The data was messy and unstructured.
- There were testing records mixed with real data.
- SKUs were inconsistent—different formats, typos, mixed naming.
- Most importantly: the product data was combined in a single column.
For example, one transaction was recorded as a single row, and the product column contained a mix like:
“SKU-A 2 SKU-B 1 SKU-C 5”
This structure made it impossible to analyze product-level sales properly.
Meanwhile, the warehouse team was also maintaining manual spreadsheets for their daily operations. I decided to use those spreadsheets as my starting point, clean the data, and migrate it properly to a structured database.
The Challenges
- Product columns contained combined SKUs and quantities → I needed to split them into separate rows.
- SKU inconsistencies → typos, inconsistent naming.
- Missing Customer IDs → no unique identifier to track customer behavior.
- Most importantly: the product data was combined in a single column.
- Existing database polluted with testing data → not usable for direct visualization.
What I Did
- Collected the manual spreadsheet data from the warehouse team.
- Processed it using Python:
- Split product columns → one row for each product with its corresponding quantity.
- Cleaned and standardized SKU names.
- Generated unique customer IDs using a fallback logic: If phone number exists → use phone number. If phone is empty → use recipient’s name → or user’s name → or as a last fallback, use shipping address.
- Migrated the cleaned data to a new MySQL database: Transactions Table → contains transaction info (date, channel, total amount, customer ID). Product Sales Table → one product per row, linked via transaction_id. After restructuring the data → connected it to Metabase for visualization.
The Result
- Structured and cleaner database → no more errors in Metabase.
- Built transaction and product-level dashboards for tracking marketplace and CS sales.
- Provided a better foundation for further analytics and reporting in the company.
- Successfully transformed unusable data into a reliable source for insights.
What I Learned
- This project taught me that data visualization is only as good as the data underneath. Tools like Metabase are powerful, but they require clean, well-structured data to work properly.
- It was also my first time combining roles of data cleaning, lightweight data engineering, and visualization in a single project.