Docker Essentials: Streamlining Multi-Service Application Orchestra

Unlocking the Power of Compose for Seamless Machine Learning Workflows

Afaque Umer
Towards AI


Photo by Larisa Birta on Unsplash

Introduction 📌

In the ever-evolving landscape of machine learning experimentation and application development, navigating the coordination of diverse system components poses a significant logistical challenge. Whether overseeing user interfaces, managing API endpoints, or handling databases, developers consistently grapple with the complexities of deploying and seamlessly connecting these integral elements. Crucially, this challenge transcends the realm of machine learning, extending its reach into diverse environments where interconnected services create distinct universes of dependencies resembling their own solar systems.

Now, honing our focus on the data science domain, envision a scenario where an application is actively used, and our objective is to monitor experiments as users engage with it. A dedicated team of data scientists, driven by innovative insights, works tirelessly to continuously evolve models and conduct experiments, striving to enhance model capabilities. To effectively manage and track these experiments and evaluations, the need for a shared database becomes apparent — an environment where all logs are recorded and visualized in a centralized manner.

Considering the current challenge, one potential solution involves building individual Docker images for each essential component — Streamlit UI, FastAPI server, and MLflow server. While this approach allows for the separation of concerns, it introduces a cumbersome process. Running each container independently, coupled with the intricate tasks of mapping ports and configuring volumes for each, becomes a tedious and repetitive endeavor. Recognizing the need for a more streamlined solution, Docker Compose emerges as the key to simplifying our orchestration challenges. In this blog, we’ll first explore the traditional approach of running these images individually and then unveil how Docker Compose transforms this experience.

Before proceeding, I’d like to mention that this is the third installment in the Docker Essentials series. If you feel the need to refresh your memory or find yourself a bit out of touch with the basics, I highly recommend checking out my previous article. In that piece, I covered the fundamentals of creating a Docker image and running it. Here’s the link for your reference 👇

In this concise blog post, I’ll walk you through the upcoming steps. Without any delay, let’s dive right into it. Here’s an overview of the blog structure:

Table of Contents 🔖

  1. Building UI using Streamlit 🔥
  2. Setting up API with FastAPI ⚡
  3. Building MLflow tracking server 🔎
  4. Orchestrating Containers 🚀

Autobots 🤖 let’s roll out! ⚡

Section 1: Lighting up UI with Streamlit 🔥

For the initial section, let’s swiftly construct a UI using Streamlit. We won’t delve into the code intricacies as it’s self-explanatory — a UI capturing experiment name, parameter, and value for logging into MLflow. Think of it as a frontend application with distinct dependencies and packages, each in its own environment. Let’s proceed to build it, here’s the code for it:

import streamlit as st
import requests

# "fastcontainer" is the hostname of the FastAPI container; more on this name later
FASTAPI_URL = "http://fastcontainer:8000/track-experiment"


def main():
    st.title("MLflow Experiment Tracker")

    experiment_name = st.text_input("Enter Experiment Name", value="Experiment 01")

    col1, col2 = st.columns(2)
    with col1:
        param = st.text_input("Enter Parameter Name", value="Accuracy")
    with col2:
        value = st.text_input("Enter Parameter Value", value="89")

    if st.button("Track Experiment", use_container_width=True):
        try:
            track_experiment(experiment_name, param, value)
            st.success("Experiment tracked successfully ✅")
        except requests.exceptions.RequestException:
            st.error("Failed to connect to FastAPI server", icon="🚨")
        st.write(experiment_name, param, value)


def track_experiment(experiment_name, param, value):
    payload = {
        "experiment_name": experiment_name,
        "param": param,
        "value": str(value),
    }
    response = requests.post(FASTAPI_URL, json=payload)

    if response.status_code == 200:
        st.success("Updated the Database ✅")
    else:
        st.error(
            f"Failed to track experiment. Server returned status code: {response.status_code}",
            icon="🚨",
        )


if __name__ == "__main__":
    main()

The FASTAPI_URL, including the server name, will be discussed later; for now, treat it as a placeholder awaiting the requests we’ll send. Moving on, let’s craft a Dockerfile for this. It’s straightforward — just the Streamlit dependency, and we’ll run the container by mapping the container’s internal port to a port on the host machine. This process was explained in the previous blog, so here’s a glimpse of the Dockerfile.

Image By Author: Dockerfile for Streamlit App
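In case the screenshot doesn’t come through, here’s a minimal sketch of what such a Dockerfile might contain. The base image and file names (app.py, requirements.txt) are illustrative assumptions, not necessarily the exact file shown above:

FROM python:3.10-slim

WORKDIR /app

# Install the Streamlit dependency
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8501

CMD ["streamlit", "run", "app.py", "--server.address=0.0.0.0"]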

Let’s generate a Docker image from this, naming it streamlitui and tagging it v1: docker build -t streamlitui:v1 . (note the trailing dot for the build context). Then initiate it by mapping ports: docker run -p 0.0.0.0:8501:8501 streamlitui:v1.

Image By Author: Docker build and run

Our Docker image has been built, and upon running it, here’s the visual representation of the UI. This marks the conclusion of our first section.

Image By Author: Streamlit App

Section 2: Accelerating APIs with FastAPI ⚡

Now, let’s dive into the exciting part of our journey! Here, we’re going to create a FastAPI server, which is like a traffic cop for our application. It’s all set to manage the POST requests coming from our Streamlit UI.

Looking for a quick refresher on FastAPI fundamentals? Check out this link for a speedy review 👇⚡

The main goal? To effortlessly record these requests into MLflow, our trusty experiment tracker. Here’s a quick look at the FastAPI code 👇

from fastapi import FastAPI
from pydantic import BaseModel
import mlflow

mlflow.set_tracking_uri("sqlite:////app/db/mlflow.db")

app = FastAPI()

class ExperimentData(BaseModel):
    experiment_name: str
    param: str
    value: str

@app.post("/track-experiment")
async def track_experiment(data: ExperimentData):
    mlflow.set_experiment(data.experiment_name)

    with mlflow.start_run():
        mlflow.log_param(data.param, data.value)

    return {"message": "Experiment tracked successfully!"}

To keep things organized, we’ve chosen a local SQLite database within MLflow as the storage space. Imagine it as a digital notebook where we’ll jot down all the details. But here’s the cool twist: we’re not keeping this notebook to ourselves; it’s going to be a shared resource. How? Well, Docker volumes come into play. When we run a container, it operates in its isolated environment, and any files created inside might disappear when the container stops.

To ensure our data persists beyond the container’s lifetime, we’ll harness the power of volumes. Picture a Docker volume as a portal bridging multiple universes. It’s a mechanism for persistently storing and sharing data between Docker containers and the host machine. By associating a volume with a container, any container that mounts the same volume — even after a restart — sees the same persistent data.
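To make this concrete, here’s a quick sketch of the two common forms of the -v flag (the image name here is purely illustrative):

# Named volume: Docker manages where the data lives on the host
docker run -v mydata:/app/db someimage:v1

# Bind mount: an explicit host directory is mapped into the container
docker run -v /home/optimus/Compose_project/data:/app/db someimage:v1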

It’s important to note that when crafting a Dockerfile, setting the working directory to, say, /app means the project’s contents are copied into the container under /app. Consequently, if a volume is mounted at a db/ directory inside the app directory, referencing it requires the absolute path (/app/db), not a relative one. Let’s create the Docker image for this using this Dockerfile.

Image By Author: Dockerfile for FastAPI server
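Again, since the Dockerfile only survives as an image, here’s a rough sketch of what it might look like — the base image, file names, and the uvicorn entrypoint are my assumptions:

FROM python:3.10-slim

WORKDIR /app

# fastapi, uvicorn, and mlflow are the key dependencies here
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Mount point for the shared volume holding the SQLite database
RUN mkdir -p /app/db

EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]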

Now let’s build a Docker image for this, naming it fastapiserver:v1. To run it, we will have to take care of naming the container and mapping ports and volumes. Note that the container must be named fastcontainer, since that’s the hostname the Streamlit app’s FASTAPI_URL points to.

Image By Author: Docker build and run

Here we use the -v flag to link the local directory /home/optimus/Compose_project/data on the host to the /app/db directory inside the container, giving us persistent storage and data sharing between the host and the container.
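Reconstructed from the screenshot, the commands look roughly like this (the exact flags are my assumption). One caveat: for the hostname fastcontainer to be resolvable from the Streamlit container outside of Compose, both containers would need to be attached to the same user-defined Docker network:

docker build -t fastapiserver:v1 .
docker run --name fastcontainer \
  -p 8000:8000 \
  -v /home/optimus/Compose_project/data:/app/db \
  fastapiserver:v1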

Image By Author: FastAPI UI

Let’s assess the functionality of the API using Thunder Client to confirm its operation.

Image By Author: Requests using thunder client

With two containers successfully up and running and the API in operation, the crucial question arises: How do we observe the experiments being tracked? This leads us to the third section, the MLflow container.

Section 3: Crafting the MLflow Container 🚀

This section is straightforward — we’re creating a Docker image solely for accessing the shared database where MLflow logging takes place. The concept here is that with this container running, anyone from the team can access it and track experiments.

Image By Author: Dockerfile for Mlflow server
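Here’s a rough sketch of what this Dockerfile might contain. The exact flags are my reconstruction; the key idea is an mlflow server whose --backend-store-uri points at the SQLite file on the shared volume:

FROM python:3.10-slim

WORKDIR /app

RUN pip install --no-cache-dir mlflow

# Mount point for the shared volume
RUN mkdir -p /app/db

EXPOSE 5000

CMD ["mlflow", "server", "--backend-store-uri", "sqlite:////app/db/mlflow.db", "--host", "0.0.0.0", "--port", "5000"]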

The key considerations lie in port mapping and volume tagging. Let’s swiftly build the image by running docker build -t mlflowserver:v1 . and run it to explore the MLflow UI.

Image By Author: Docker build and run
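Reconstructed, the run command mirrors the FastAPI one (the container name is illustrative); mounting the same host directory means the MLflow server reads the very database the FastAPI container writes to:

docker run --name mlflowcontainer \
  -p 5000:5000 \
  -v /home/optimus/Compose_project/data:/app/db \
  mlflowserver:v1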

The server is now active at port 5000. Let’s delve into it. Recall the request we sent through Thunder Client ⚡ — now we’ll verify if it’s available in the shared database. It’s crucial to note that we’re utilizing a shared database between two distinct containers.

Image By Author: Mlflow UI

And there you have it! We’ve successfully built our containers and ensured correct mapping. This concludes our section.

Now that you’ve gone through the exhausting process of running each image separately, building them one by one, and juggling with port and volume mappings, you’ve probably realized the real-world scenario is anything but simple.

So, how do we make this process less cumbersome? Enter Docker Compose, the hero that simplifies orchestrating multi-service applications. This leads us to our concluding section.

Section 4: Composing the Orchestra 🎺

After navigating the complexities of individual container management, port mappings, and volume configurations, it’s clear that orchestrating multi-service applications can be a daunting task. This is where Docker Compose steps in as a game-changer. Docker Compose is a tool designed to simplify the deployment and management of multi-container applications. By providing a streamlined way to define, manage, and run interconnected services, it significantly eases the burden of intricate orchestration tasks.

It lets you define the services, networks, and volumes your application needs in a single docker-compose.yml file. Before writing one, let’s familiarize ourselves with the file structure. Typically, the Compose file resides in the root directory, and each service is organized in its own subdirectory — an individual universe encapsulating all the necessary files, requirements, and the Dockerfile essential for building that image. Within the Compose file, the paths to these services are specified as relative paths starting from the current directory (denoted by .), i.e., relative to the directory containing the Docker Compose file. Let’s take a look at our Compose folder structure 📂

Image By Author: Project Layout
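If the layout image doesn’t render, the structure is roughly this (the directory names are illustrative):

Compose_project/
├── docker-compose.yml
├── data/                  <- shared host directory backing the SQLite database
├── streamlit_app/
│   ├── Dockerfile
│   ├── app.py
│   └── requirements.txt
├── fastapi_app/
│   ├── Dockerfile
│   ├── main.py
│   └── requirements.txt
└── mlflow_app/
    └── Dockerfile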

Running docker-compose up --build builds the Docker images defined in the Dockerfiles and starts the containers defined in the docker-compose.yml file. Let’s take a look at the compose file.

Image By Author: Docker compose file
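As a sketch, the compose file might look like the following. The service name fastcontainer must match the hostname in the Streamlit app’s FASTAPI_URL, since Compose makes service names resolvable on its default network; the other names and paths are illustrative:

version: "3.8"

services:
  streamlitapp:
    build: ./streamlit_app
    ports:
      - "8501:8501"
    depends_on:
      - fastcontainer

  fastcontainer:
    build: ./fastapi_app
    ports:
      - "8000:8000"
    volumes:
      - ./data:/app/db

  mlflowserver:
    build: ./mlflow_app
    ports:
      - "5000:5000"
    volumes:
      - ./data:/app/db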

Let’s break it down a bit. services defines the different containers that make up your application; the names can be custom. Within each service, build specifies the build context for the Docker image (the path where the Dockerfile for that particular app lives). Dependencies are declared with depends_on to ensure that services start in the correct order. volumes mounts the ./data directory on the host to the /app/db directory in the container, persisting data across container restarts. Together, this Docker Compose configuration coordinates the Streamlit app, FastAPI server, and MLflow server, fostering a seamless and organized environment for developing and deploying the machine learning application.

In the Docker Compose file, I specified the ports as 5000:5000 rather than spelling out a host IP. One thing worth knowing: this short form binds to all network interfaces by default, exactly like 0.0.0.0:5000:5000, so the services remain reachable from other machines on the network. If you want a service to be accessible exclusively via localhost on the host machine, you must bind it to the loopback interface explicitly, as shown below.
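For instance, a host-only binding in the Compose file would look like this:

services:
  mlflowserver:
    ports:
      - "127.0.0.1:5000:5000"   # reachable only from the host machine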

With a single command, docker-compose up, you can deploy the entire application stack. This simplifies the deployment process, making it accessible to developers with varying levels of expertise. So after the build, let’s hit this and test all servers at once 🚀

Image By Author: Docker Compose stack

✨ Voila! With just one command, our services are up and running. 🚀

Let’s put this to the test! Open the Streamlit UI, set an experiment name, add parameters, and track it.

Image By Author: Streamlit Service Container

Now, fire up the MLflow server on the respective port and let’s check if the tracking is in action! 🚀

Image By Author: Mlflow Service Container

Woaaaaah! 🎉 We’ve just successfully crafted our inaugural multi-service application orchestration using Docker Compose. 🔥🚀 Alrighty then, maestro of multi-services, you’ve earned your composer stripes! What are you waiting for? Let’s compose some symphonies! 🎶🚀

I hope you enjoyed this article and found it informative and engaging. You can follow me, Afaque Umer, for more such articles. Thanks for reading 🙏

The entire code for this app can be found in the GitHub repository. Here’s the repo link 🗃️👇

I will try to bring up more machine learning/data science concepts and break down fancy-sounding terms and concepts into simpler ones.

Keep learning 🧠 Keep Sharing 🤝 Stay Awesome 🤘
