From Detection to Correction: How to Keep Your Production Data Clean and Reliable
8 min read · Apr 24, 2023
In production ML, data quality is everything. No matter how good your models or algorithms are, if the data you feed them is garbage, you’ll get garbage results. But how can you tell whether your data is good or bad? That’s what we’ll explore in this article.
We’ll start by discussing the importance of validating data and detecting data issues in production. Specifically, we’ll focus on two kinds of data issues: drift (data drift and concept drift) and skew (schema skew and distribution skew). These issues can be…