Django Docker and Celery

I've finally had the time to create a Django+Celery project that can be run entirely with Docker and Docker Compose. I'd like to share some of the steps that helped me achieve this, using an example project that I created to demo the process. The example project can be viewed on GitHub:

https://github.com/JoeJasinski/docker-django-demo/tree/blogpost

To run this example, you will need to have a recent version of Docker and Docker Compose installed. I'm using Docker 17.03.1 and Docker Compose 1.11.2.

Let's take a look at one of the core files, the docker-compose.yml. You'll notice that I have defined the following services:
  • db - the service running the Postgres database container, needed for the Django app
  • rabbitmq - the service running the RabbitMQ container, needed for queuing jobs submitted through Celery
  • app - the service running the Django app container
  • worker - the service that runs the Celery worker container
  • web - the service that runs the Nginx container, which proxies web requests to the Django service
Some of these services depend on others, and are specified as such by using the depends_on Docker Compose parameter. Some of the services also set environment variables to configure the behavior of their containers.

   e.g.
   environment:
       - DATABASE_URL=postgres://postgres@db/postgres
       - CELERY_BROKER_URL=amqp://guest:guest@rabbitmq:5672//
       - SITE_DIR=/site/
       - PROJECT_NAME=dddemo
       - DJANGO_DEBUG=False
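
To make the role of these variables concrete, here is a minimal sketch of how a Django settings module might consume them. This is an illustration only: the helper names (parse_database_url, parse_bool, load_settings) are hypothetical, and the demo project may well read its settings differently, for example through a third-party URL-parsing package.

```python
import os
from urllib.parse import urlparse


def parse_database_url(url):
    """Convert a postgres:// URL into a Django-style DATABASES entry."""
    parts = urlparse(url)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parts.path.lstrip("/"),
        "USER": parts.username or "",
        "PASSWORD": parts.password or "",
        "HOST": parts.hostname or "",
        "PORT": str(parts.port) if parts.port else "",
    }


def parse_bool(value, default=False):
    """Interpret strings such as "False" or "1" as a boolean."""
    if value is None:
        return default
    return value.strip().lower() in ("1", "true", "yes", "on")


def load_settings(env=os.environ):
    """Build the environment-driven settings from a mapping of variables."""
    return {
        "DEBUG": parse_bool(env.get("DJANGO_DEBUG")),
        "DATABASES": {"default": parse_database_url(env["DATABASE_URL"])},
        "CELERY_BROKER_URL": env["CELERY_BROKER_URL"],
    }


# The values from the docker-compose.yml environment section above.
compose_env = {
    "DATABASE_URL": "postgres://postgres@db/postgres",
    "CELERY_BROKER_URL": "amqp://guest:guest@rabbitmq:5672//",
    "DJANGO_DEBUG": "False",
}
settings = load_settings(compose_env)
print(settings["DATABASES"]["default"]["HOST"])  # the "db" service name
```

Note that the database host is simply the name of the db service; Docker's network resolves service names to the right container.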

Each of these services belongs to the network named jaz, so they can communicate with each other. For services that only need to communicate internally with other services, I've used the expose configuration parameter to make the given ports available to the other containers. Only for the Nginx container have I used the ports parameter, which publishes TCP ports 80 and 443 to the Docker host and makes the app reachable from a web browser.

   e.g.
   web:
     image: nginx:1.11
     ...
     ports:
       - "80:80"
       - "443:443"
     networks:
       - jaz

   networks:
     jaz:

I've created a Docker volume called static-volume to hold static files generated by the app. This is where Django's collectstatic management command copies the static files. The volume is shared between the app service and the Nginx service, and Nginx serves the static files from it.

    services:
       
       ...
       web:
         image: nginx:1.11
         ...
         volumes:
           ...
           - static-volume:/static

    volumes:
      static-volume:

The Postgres, RabbitMQ, and Nginx services are all built from the official Docker images for those services. However, the app and worker services run the Django codebase, and they require some environmental customization. The build docker-compose keyword lets us specify the Dockerfile to use and define the build context (the subdirectory tree the Dockerfile build process has access to).

   e.g.
   app:
     build:
       context: .
       dockerfile: Dockerfile

Therefore, I've created a Dockerfile to build the app service properly. The Dockerfile uses the ubuntu:16.04 base image; however, a more lightweight image could be used instead. One Docker best practice is to run all the apt-get installs plus a cleanup step in a single Dockerfile RUN command. This ensures that when the Docker layer is created, it is not created with extra temporary files needed by Apt. (Specifically, the layer is created only after doing some space cleanup.)

    RUN apt-get update && apt-get install -y \
        build-essential \
        ...
        zlib1g-dev \
        && rm -rf /var/lib/apt/lists/*

Next, in the Dockerfile, we create a virtualenv and some other supporting directories. There is some debate about the need for creating a virtualenv inside of a Docker container. I personally think it is a good idea because it isolates the application Python modules from the OS-level Python modules (just as it does in a non-Docker setup). It offers one more layer of isolation.

    RUN mkdir -p $SITE_DIR
    WORKDIR $SITE_DIR
    RUN mkdir -p proj/ var/log/ htdocs/
    
    RUN python3 -mvenv env/

After creating the virtualenv, I force an upgrade of pip to ensure I'm using the most recent pip version.

    RUN env/bin/pip install pip --upgrade

One important step that may seem out of place is that I copy in the Python requirements.txt file and install the Python requirements early on in the Dockerfile build process. Installing the Python requirements is a time-consuming process, and I can leverage Docker's built-in layer caching to ensure that Docker only needs to reinstall the requirements when the requirements file itself changes. If I were to push that step further down in the Dockerfile, below the step that copies in the codebase, I'd risk unnecessarily re-installing the Python requirements every time I make an arbitrary change to the code.

    COPY requirements.txt requirements.txt
    RUN env/bin/pip install -r requirements.txt

I like to explicitly install uwsgi to ensure it's present, since it's required to even attempt to start the app. I also set some environment variables, such as the database connection string. Note: these environment variables can be overridden by defining them in the environment section of the docker-compose.yml file.

    RUN env/bin/pip install uwsgi

    ENV NUM_THREADS=2
    ENV NUM_PROCS=2

    ENV DJANGO_DATABASE_URL=postgres://postgres@db/postgres

Next, the Dockerfile copies a docker-utils folder into the container. This folder contains most of the Docker-specific scripts and configuration files needed to run Django and other services, and this copy makes these files available to the container.

    COPY docker-utils/ docker-utils/

After that, the entire codebase is copied from the current directory to a proj/ directory inside the container. The proj directory will be where the container runs the codebase from.

    COPY . proj/

Finally, the Dockerfile ENTRYPOINT is set to a script called entrypoint.sh. When no ENTRYPOINT is set, Docker runs the CMD directly (wrapping shell-form commands in /bin/sh -c). Setting our own entrypoint allows us to do some interesting things - more on that later. The Dockerfile CMD is set to another shell script, app-start.sh. This specifies the default command (or script) that is passed to the entrypoint when the container starts. In this case, app-start.sh actually runs the Django process by executing the uwsgi command.

    ENTRYPOINT ["./docker-utils/entrypoint.sh"]
    CMD ["./docker-utils/app-start.sh"]

Let's take a closer look at the entrypoint.sh. This was one of the files in the docker-utils folder copied to the container. The entrypoint.sh is called every time the container starts, regardless of the arguments passed to the container. It is just a shell script that lets us add some additional optional logic. In this example, if someone passes "init" as an argument to the docker-compose run command, we run the Django migrate and collectstatic management commands. If the first argument is "manage", the remaining arguments are passed to manage.py. Anything else is executed as given, which is how the default CMD gets run.

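    #!/bin/bash

    # Exit on errors, unset variables, and failed pipes; echo each command.
    set -euxo pipefail

    # ${1:-} avoids an "unbound variable" error under set -u when no argument is given.
    if [ "${1:-}" == 'init' ]; then
        echo "Run Migrations"
        ${SITE_DIR}/env/bin/python ${SITE_DIR}/proj/manage.py migrate
        ${SITE_DIR}/env/bin/python ${SITE_DIR}/proj/manage.py collectstatic --no-input
    elif [ "${1:-}" == 'manage' ]; then
        shift
        echo "Manage.py $*"
        # Quote "$@" so arguments containing spaces reach manage.py intact.
        ${SITE_DIR}/env/bin/python ${SITE_DIR}/proj/manage.py "$@"
    else
        # Replace the shell with the requested command (by default, the CMD).
        exec "$@"
    fi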

The app-start.sh script is pretty straightforward, as it is a simple wrapper that calls uwsgi to run the Django app. There are a few things worth mentioning about it. First, several configuration options are passed into it via environment variables. We use the --chdir flag to change directory to the proj/ directory (containing the codebase) before running. As required by Django, we set the DJANGO_SETTINGS_MODULE environment variable to make uwsgi aware of the Django settings. Another useful setting is the --python-autoreload=1 parameter, which tells uwsgi to reload when it detects a change to the codebase. This operates very similarly to how runserver reloads and is very useful for development. The remaining options are fairly standard uwsgi command options.

    #!/bin/bash

    echo "Starting uWSGI for ${PROJECT_NAME}"

    $SITE_DIR/env/bin/uwsgi --chdir ${SITE_DIR}proj/ \
        --module=${PROJECT_NAME}.wsgi:application \
        --master \
        --env DJANGO_SETTINGS_MODULE=${PROJECT_NAME}.settings \
        --vacuum \
        --max-requests=5000 \
        --virtualenv ${SITE_DIR}env/ \
        --socket 0.0.0.0:8000 \
        --processes $NUM_PROCS \
        --threads $NUM_THREADS \
        --python-autoreload=1

Now that we've talked a bit about the Dockerfile and docker build process, let's take a closer look at the Nginx service in the docker-compose.yml file. The official Nginx Docker image has a very specific way that you are supposed to customize the Nginx configuration.  Specifically, an Nginx configuration template (default.template.conf) is passed into the container via a Docker volume. When the container is executed, the envsubst command combines the configuration template with environment variables from the container and generates the actual Nginx configuration. This allows you to dynamically craft an Nginx configuration file by passing in different environment variables via the docker-compose.yml file.

   web:
     image: nginx:1.11
     ...
     volumes:

       - ./docker-utils/nginx/default.template.conf:/root/default.template.conf

     command: /bin/bash -c "envsubst '$$NGINX_HTTP_PORT $$NGINX_HTTPS_PORT' < /root/default.template.conf > /etc/nginx/conf.d/default.conf && nginx -g 'daemon off;'"

     environment:
       - NGINX_HTTP_PORT=80
       - NGINX_HTTPS_PORT=443
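
To illustrate what the envsubst step does, here is a rough Python sketch of the same idea. It is not the envsubst implementation, and the template contents are a hypothetical stand-in for default.template.conf. The point it shows is why the command lists the variable names explicitly: only the listed variables are replaced, so Nginx's own runtime variables (such as $host) pass through untouched. The doubled $$ in the compose file is there so that Docker Compose does not try to interpolate those names itself.

```python
import re

# Matches ${NAME} or $NAME, roughly the forms envsubst recognizes.
VARIABLE = re.compile(r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))")


def substitute_listed_vars(template, env, allowed):
    """Replace only the variables named in `allowed`; leave all others as-is."""

    def replace(match):
        name = match.group(1) or match.group(2)
        if name in allowed:
            # A listed but unset variable becomes an empty string.
            return env.get(name, "")
        return match.group(0)

    return VARIABLE.sub(replace, template)


# Hypothetical template; the real default.template.conf lives in docker-utils/nginx/.
template = """server {
    listen ${NGINX_HTTP_PORT};
    location / {
        proxy_set_header Host $host;
    }
}
"""

env = {"NGINX_HTTP_PORT": "80", "NGINX_HTTPS_PORT": "443"}
rendered = substitute_listed_vars(template, env, {"NGINX_HTTP_PORT", "NGINX_HTTPS_PORT"})
print(rendered)
```

In the rendered output the listen line carries the port number, while the $host reference is left for Nginx to resolve at request time.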

Let's take a look at the Celery worker service in the docker-compose.yml file. This service uses the same Dockerfile that was used for the build of the app service, but a different command executes when the container runs. There is nothing magic going on with this command; it simply executes Celery inside of the virtualenv.

   worker:
     build:
       context: .
       dockerfile: Dockerfile
     container_name: dddemo-worker
     command: /site/env/bin/celery worker -A dddemo --workdir /site/proj/ -l info

Finally, we can move away from the Docker-related configuration and take a look at the Celery configuration in the Django project. Much of the following configuration is boilerplate from the Celery 4.0 docs, so I won't go into too much detail.

First, there exists a celery.py file inside of the Django project. This integrates Celery into the Django project, and reads in the Celery configuration from settings.py. Next, there exists an __init__.py at the root of the project, which initializes the aforementioned celery.py. The Django settings.py contains some Celery configuration, including how to connect to the RabbitMQ service. A very simple Celery add task is defined in tasks.py; this task will add two numbers passed to it. A very simple Django view hosts a page at the root url and will execute the add task in the background. The results of the add task can be viewed in the taskresult module in the Django Admin.
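
As a rough idea of what such a tasks.py might look like, here is a minimal sketch. It is an assumption based on the description above and the usual Celery pattern, not a copy of the demo project's file. The try/except fallback is only there so the snippet can run where Celery is not installed; a real tasks.py would import shared_task directly.

```python
try:
    from celery import shared_task
except ImportError:
    # Fallback for illustration only: a no-op decorator so the sketch
    # still runs in an environment without Celery.
    def shared_task(func):
        return func


@shared_task
def add(x, y):
    """Add two numbers; the worker container runs this in the background."""
    return x + y


# Calling the task directly runs it synchronously in the current process.
print(add(2, 3))
```

In a Django view, the task would instead be queued with add.delay(2, 3), which sends a message through RabbitMQ for the worker service to pick up.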

Though this is a very basic prototype example, it demonstrates a complete Django project and backing services needed to execute Celery jobs. The entire distributed system needed for this application can be executed with only two Docker Compose commands.

To get started, the README describes how to run the project and provides some other common commands that can be used to administer the project. Hope this is useful, and let me know your feedback.


