Project information

  • Development Language: Python
  • Technology: Airflow, Soda Core, SQL, Snowflake, Slack
  • Coolness Factor: 🌟🌟🌟
  • Time/Cost Savings: ~0.5-1 FTE

Data Quality

This was a simple, not-so-novel-but-powerful project that automated the majority of our data quality checks. Our QA engineer had been running each SQL check manually, which was extremely time-consuming. I set up Airflow to run a daily Soda Core job that reads the SQL checks from a large YAML file and executes them against our Snowflake database. We added some customizations of our own, such as writing the results to a separate Snowflake table for visibility and sending Slack alerts when specific test cases failed. While this was a pretty simple project, it improved our data observability significantly and helped reduce the turnaround time for data quality issues.
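
To make the customizations concrete, here is a minimal sketch of the post-scan step: turning check results into rows for a results table and into Slack messages for the failures worth alerting on. It is not the project's actual code. The `CheckResult` shape, the `alert_on_failure` flag, the function names, and the sample check names are all hypothetical, and the Snowflake insert and Slack call are left out so the logic stays self-contained.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CheckResult:
    """One executed check (hypothetical shape, not Soda Core's own result format)."""
    name: str
    table: str
    passed: bool
    alert_on_failure: bool  # assumption: only some checks should notify Slack


def to_result_rows(results, run_date):
    """Flatten results into tuples that could be inserted into a results table."""
    return [
        (run_date.isoformat(), r.table, r.name, "pass" if r.passed else "fail")
        for r in results
    ]


def build_slack_alerts(results):
    """Return one message per failed check that is flagged for alerting."""
    return [
        f":red_circle: Data quality check failed: {r.name} on {r.table}"
        for r in results
        if not r.passed and r.alert_on_failure
    ]


# Illustrative data only; the check and table names are made up.
results = [
    CheckResult("row_count_above_zero", "orders", passed=True, alert_on_failure=True),
    CheckResult("no_null_customer_id", "orders", passed=False, alert_on_failure=True),
    CheckResult("fresh_within_24h", "events", passed=False, alert_on_failure=False),
]

rows = to_result_rows(results, date(2024, 1, 1))
alerts = build_slack_alerts(results)
```

Every result is recorded for visibility, while only flagged failures produce an alert, which keeps the Slack channel quiet enough to stay useful. In a real DAG these two functions would sit in a task that runs after the scan task.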