E: Evaluate

Identify the 'messiness' in your data. Check for erroneous, missing, duplicate, and inconsistent data to understand the scope of your data quality issues.

The S.C.A.N. Methodology

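S.C.A.N. is a four-step process for thoroughly examining your data to uncover quality issues: Spot errors, Check completeness, Assess duplication, and Note inconsistencies.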

Spot errors

Look for obvious errors in data entries and formats, such as typos, incorrect data types, or values that fall outside expected ranges.
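
As a rough illustration of this step, the sketch below runs a type check, a range check, and a format check over a few records. The sample records, the field names (`age`, `signup_date`), the 0 to 120 age range, and the `spot_errors` helper are all assumptions made up for the example, not part of the methodology itself.

```python
from datetime import datetime

# Hypothetical sample records; field names and valid ranges are assumptions.
records = [
    {"id": 1, "age": "34", "signup_date": "2023-04-12"},
    {"id": 2, "age": "abc", "signup_date": "2023-13-01"},
    {"id": 3, "age": "212", "signup_date": "2023-06-30"},
]


def spot_errors(record, min_age=0, max_age=120):
    """Return a list of human-readable problems found in one record."""
    problems = []

    # Type check: age should parse as an integer.
    try:
        age = int(record["age"])
    except (TypeError, ValueError):
        problems.append("age is not an integer")
    else:
        # Range check: only meaningful once the type check has passed.
        if not min_age <= age <= max_age:
            problems.append("age out of range")

    # Format check: dates are expected as YYYY-MM-DD.
    try:
        datetime.strptime(record["signup_date"], "%Y-%m-%d")
    except (TypeError, ValueError):
        problems.append("signup_date is not a valid YYYY-MM-DD date")

    return problems


# Map each record id to the problems found in that record.
error_report = {r["id"]: spot_errors(r) for r in records}
```

Here record 1 comes back clean, record 2 fails both the type and the date check, and record 3 fails the range check. In practice the rules would come from your own data dictionary rather than hard-coded defaults.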

Check completeness

Assess the completeness of your records and fields. Identify missing values and determine their potential impact on your analysis.
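
One way to put numbers on completeness is to compute a missing-value rate per field and count the fully populated records. The sketch below is a minimal version of that idea; the sample rows and the choice to treat both `None` and blank strings as missing are assumptions for illustration.

```python
# Hypothetical sample rows; None and blank strings both count as missing
# (an assumption, adjust to whatever "missing" means in your data).
rows = [
    {"name": "Ada", "email": "ada@example.com", "phone": None},
    {"name": "Grace", "email": "", "phone": "555-0100"},
    {"name": "Alan", "email": "alan@example.com", "phone": "555-0102"},
    {"name": None, "email": "x@example.com", "phone": None},
]


def is_missing(value):
    """True for None and for strings that are empty or only whitespace."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def missing_rates(rows):
    """Fraction of rows with a missing value, per field."""
    if not rows:
        return {}
    fields = sorted({key for row in rows for key in row})
    total = len(rows)
    return {
        field: sum(is_missing(row.get(field)) for row in rows) / total
        for field in fields
    }


rates = missing_rates(rows)

# Number of records with every field populated.
complete_rows = sum(
    all(not is_missing(value) for value in row.values()) for row in rows
)
```

With this sample, `phone` is missing in half of the rows, `name` and `email` in a quarter each, and only one record is fully complete. Rates like these help you judge which gaps matter for the analysis you have in mind.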

Assess duplication

Identify and quantify duplicate records across your datasets. Duplicates can skew analysis and lead to incorrect conclusions.
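
To quantify duplicates, it can help to count them twice: once by exact match and once after light normalization, since near-duplicates often differ only in case or stray whitespace. The following sketch assumes a hypothetical customer list and a matching rule based on `name` and `email`; your own matching key will likely differ.

```python
from collections import Counter

# Hypothetical customer records pulled from more than one source.
customers = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "ada lovelace ", "email": "ADA@example.com"},
    {"name": "Grace Hopper", "email": "grace@example.com"},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
]


def exact_key(row):
    """Match only when name and email are identical character for character."""
    return (row["name"], row["email"])


def normalized_key(row):
    # Assumption: case and surrounding whitespace do not distinguish records.
    return (row["name"].strip().lower(), row["email"].strip().lower())


def count_duplicates(rows, key):
    """Number of rows beyond the first occurrence of each key."""
    counts = Counter(key(row) for row in rows)
    return sum(n - 1 for n in counts.values())


exact_dupes = count_duplicates(customers, exact_key)
near_dupes = count_duplicates(customers, normalized_key)
```

In this sample the exact comparison finds one duplicate, while the normalized comparison finds two. The gap between the two counts is a useful signal of how much duplication is hidden behind formatting differences.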

Note inconsistencies

Document inconsistencies in naming conventions, units of measurement, and categorical values across different data sources.
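
For categorical values, a simple way to document inconsistencies is to group the spellings that collapse to the same form once case, punctuation, and whitespace are ignored. The sketch below does this for a hypothetical `country` field; the sample values and the `canonical` rule are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical values of a "country" field merged from several sources.
countries = ["USA", "usa", "U.S.A.", "United States", "Canada", "canada ", "CANADA"]


def canonical(value):
    # Assumption: case, punctuation, and whitespace carry no meaning here.
    return "".join(ch for ch in value.lower() if ch.isalnum())


def variant_groups(values):
    """Group raw spellings by canonical form, keeping only inconsistent groups."""
    groups = defaultdict(set)
    for value in values:
        groups[canonical(value)].add(value)
    # A group with a single spelling is already consistent, so drop it.
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


inconsistencies = variant_groups(countries)
```

This surfaces the three spellings of "USA" and the three spellings of "Canada". It does not link "United States" to "USA", because true synonyms and unit differences need an explicit mapping table rather than a normalization rule.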

Next Step: Align

After evaluating your data, the next step is to standardize it. Learn how to apply consistent formats and naming conventions with the Align dimension.
