Machine Learning Systems Pt. 2: Data Pipelines with TensorFlow Extended

Building all the data pipeline components for production ML with TFX

Published in Towards AI

10 min read · Mar 28, 2022

In Part 1, I gave an overview of MLOps and some of its primary challenges. Deploying models at scale is difficult because data, business requirements, and code all keep changing.

Machine Learning Systems Pt. 1: Overview and Challenges

A gentle introduction to MLOps

animadurkar.medium.com

In this part, I’ll show how you can build data pipeline components using TensorFlow Extended (TFX). It follows the material taught in the Machine Learning Engineering for Production (MLOps) Specialization by DeepLearning.AI, specifically the second course, on the data lifecycle in production. I’ll work through that course’s final assignment, but applied to a new dataset: the Stroke Prediction Dataset from Kaggle.

Table of Contents

Data Ingestion
Feature Selection
Data Validation and Pipeline
Feature Engineering


Written by Ani Madurkar
