Cleaning vs Transformation (vs Enrichment)

At a high level, cleaning focuses on the quality of the data, transformation focuses on usability & efficiency, and enrichment focuses on data completeness.

Data cleaning is the process of removing data that does not belong in a dataset (e.g. duplicate, corrupted, or unnecessary data) and standardizing the remaining data (e.g. you might see values of Null, NA, Not Applicable, and ' ' that all mean the same thing). This is a crucial step for ensuring that decisions are made on complete & accurate data, especially when analyzing data from more than one source.

Data transformation is the process of converting data from one format to another for the purposes of warehousing and analysis. The whole purpose of transforming data is to make it easier for computers & end-users to work with!

Data enrichment falls somewhere in the middle of cleaning and transformation. Enrichment is the process of enhancing collected data with relevant context (more data) obtained from additional trusted sources. One outcome of this enrichment process in SourceMedium is our ability to provide 30% more UTM coverage than Google Analytics 4 alone.

SourceMedium’s data cleaning, transformation & enrichment process harmonizes your data sources. Whether it’s correcting data discrepancies or aligning date formats, we ensure your data is consistent, reliable and ready for analysis.
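To make the three steps concrete, here is a minimal Python sketch of how they might look on a handful of order records. It is an illustration only, not SourceMedium's actual pipeline: the field names (`order_id`, `order_date`, `utm_source`), the list of null-like tokens, the accepted date formats, and the `utm_lookup` table are all hypothetical.

```python
from datetime import datetime

# Hypothetical placeholder strings that all mean "no value" (illustrative list only).
NULL_TOKENS = {"", "null", "na", "n/a", "not applicable"}

# Hypothetical input date formats; a real source may use others.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def clean(rows):
    """Cleaning: standardize null-like values to None, then drop exact duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        normalized = {}
        for key, value in row.items():
            text = value.strip() if isinstance(value, str) else value
            is_null = text is None or (
                isinstance(text, str) and text.lower() in NULL_TOKENS
            )
            normalized[key] = None if is_null else text
        # Rows are duplicates when every field matches after standardization.
        fingerprint = tuple(sorted(normalized.items(), key=lambda kv: kv[0]))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(normalized)
    return cleaned


def to_iso_date(raw):
    """Try each known format; return an ISO 8601 date string, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def transform(rows):
    """Transformation: convert every order_date to one consistent format."""
    out = []
    for row in rows:
        converted = dict(row)
        if row.get("order_date") is not None:
            converted["order_date"] = to_iso_date(row["order_date"])
        out.append(converted)
    return out


def enrich(rows, utm_lookup):
    """Enrichment: fill a missing utm_source from an additional (assumed trusted) source."""
    enriched = []
    for row in rows:
        merged = dict(row)
        if merged.get("utm_source") is None:
            merged["utm_source"] = utm_lookup.get(merged["order_id"])
        enriched.append(merged)
    return enriched


raw_orders = [
    {"order_id": "1001", "order_date": "2024-01-05", "utm_source": "NA"},
    {"order_id": "1001", "order_date": "2024-01-05", "utm_source": "Not Applicable"},
    {"order_id": "1002", "order_date": "01/07/2024", "utm_source": "newsletter"},
    {"order_id": "1003", "order_date": "01/09/2024", "utm_source": " "},
]
utm_lookup = {"1001": "facebook"}  # hypothetical second data source

orders = enrich(transform(clean(raw_orders)), utm_lookup)
for order in orders:
    print(order)
```

In this sketch the two `1001` rows only become recognizable as duplicates after `NA` and `Not Applicable` are standardized, which is why standardization runs before de-duplication. Order `1003` keeps an empty `utm_source` because the lookup has no entry for it; enrichment can only add context that the additional source actually contains.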