This repository (jiakai-li/page-tracker) was archived by its owner on Dec 18, 2024 and is now read-only.
Page Tracker

This repo contains notes from following the Build Robust Continuous Integration With Docker and Friends tutorial from Real Python.

Overall architecture

(Architecture diagram: page_tracker_image)

There are different methods to achieve a repeatable installation. This project specifically uses a pyproject.toml file without pinning dependency versions, and instead relies on a requirements file and a constraints file for the pinned versions.
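As a sketch of that approach (the package name and dependency list here are illustrative, not copied from the repo), the pyproject.toml declares unpinned dependencies, while exact versions live in constraints.txt:

```toml
# pyproject.toml: dependencies listed without version pins;
# pinned versions are kept separately in constraints.txt
[project]
name = "page-tracker"
version = "1.0.0"
dependencies = [
    "flask",
    "redis",
]
```

Installing with `python -m pip install -c constraints.txt .` then resolves these loose requirements against the pinned constraints.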

You can also use pipenv or poetry. Speaking of which, pipx is also worth a look.

This project follows the src layout, and running the commands below makes development more convenient:

(.venv) $ python -m pip install --editable .                            # Install current project in editable mode
(.venv) $ python -m pip freeze --exclude-editable > constraints.txt     # Remove editable packages from constraints file

Some dependencies are not required by all end users, and can therefore be organized as optional dependencies:

# ...
[project.optional-dependencies]
dev = [
    "pytest",
    # ...
]
# ...

This way you don't force pytest to be installed alongside the main dependencies. You can install the dev optional dependencies using:

(.venv) $ python -m pip install --editable ".[dev]"

Test

  • Unit Test

    Involves testing a program’s individual units or components to ensure that they work as expected. In this simple project, that means testing the functionality of the page_tracker.app.index handler function, which requires mocking the behavior of page_tracker.app.redis. It's worth noting that, apart from the happy path, mocked side effects should also be covered in the unit tests (e.g. test.unit.test_app.test_should_handle_redis_connection_error).

  • Integration Test

    The goal of integration testing is to check how your components interact with each other as parts of a larger system. In this simple project, that means testing the communication with a genuine Redis server instead of a mocked one.

  • End-to-End Test

    Put the complete software stack to the test by simulating an actual user’s flow through the application. As a result, end-to-end testing requires a deployment environment that mimics the production environment as closely as possible. In this simple project, the end-to-end test scenario is similar to the integration test. The main difference, though, is that you’ll be sending an actual HTTP request through the network to a live web server instead of relying on Flask’s test client.

    Now, running the end-to-end tests requires that the Flask app and the Redis server are both running first:

    (.venv) $ docker start redis-server
    (.venv) $ flask --app page_tracker.app run
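The unit-test mocking described above can be sketched with stand-ins (the real page_tracker.app code differs; the names here are illustrative and the Redis client is replaced by a Mock):

```python
# Self-contained sketch of mocking both the happy path and a side
# effect; index() stands in for the real page_tracker.app.index handler.
from unittest.mock import Mock


class RedisError(Exception):
    """Stand-in for redis.RedisError."""


def index(redis):
    """Stand-in handler: count a page view, degrade gracefully on error."""
    try:
        page_views = redis.incr("page_views")
    except RedisError:
        return "Sorry, something went wrong", 500
    return f"This page has been seen {page_views} times.", 200


# Happy path: the mock returns a canned value
mock_redis = Mock()
mock_redis.incr.return_value = 5
body, status = index(mock_redis)
assert status == 200 and "5" in body

# Error path: the mock raises via side_effect, as in
# test_should_handle_redis_connection_error
mock_redis.incr.side_effect = RedisError("connection refused")
body, status = index(mock_redis)
assert status == 500
```

In the actual tests, `unittest.mock.patch("page_tracker.app.redis")` injects such a mock instead of passing it explicitly.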

Static Code Analysis and Security Scanning

This project uses black to flag formatting inconsistencies in your code, isort (which no longer seems actively maintained) to keep your import statements organized according to the official recommendation, and flake8 (which no longer seems actively maintained) to check for other PEP 8 style violations.

(.venv) $ python -m black src/
(.venv) $ python -m isort src/
(.venv) $ python -m flake8 src/

Once everything’s clean, you can lint your code to find potential code smells or ways to improve it using pylint:

(.venv) $ python -m pylint src/

For each unique pylint identifier that you want to exclude, you can:

  • Include the suppressed identifiers in a global configuration file for a permanent effect, or
  • Use a command-line switch to ignore certain errors on a given run, or
  • Add a specially formatted Python comment on a given line to account for special cases like:
    @app.get("/")
    def index():
        try:
            page_views = redis().incr("page_views")
        except RedisError:
            app.logger.exception("Redis error")  # pylint: disable=E1101  <--- suppress E1101 
            return "Sorry, something went wrong \N{pensive face}", 500
        else:
            return f"This page has been seen {page_views} times."

Finally, bandit is used to perform security and vulnerability scanning of your source code before deploying it anywhere:

(.venv) $ python -m bandit -r src/

Dockerize Web Application

One good practice is to create and switch to a regular user without administrative privileges as soon as you don't need them anymore.

RUN useradd --create-home realpython
USER realpython
WORKDIR /home/realpython

Another suggested good practice is to use a dedicated virtual environment even within the container, to avoid the risk of interfering with the container’s own system tools.

Unfortunately, many Linux distributions rely on the global Python installation to run smoothly. If you start installing packages directly into the global Python environment, then you open the door for potential version conflicts.

Instead of activating the environment, it is suggested to modify the PATH environment variable directly:

ENV VIRTUALENV=/home/realpython/venv
RUN python3 -m venv $VIRTUALENV

# Put $VIRTUALENV/bin before $PATH to prioritize it
ENV PATH="$VIRTUALENV/bin:$PATH"

The reasons for doing it this way are:

  • Activating your environment in the usual way would only be temporary and wouldn’t affect Docker containers derived from your image.
  • If you activated the virtual environment using Dockerfile’s RUN instruction, then it would only last until the next instruction in your Dockerfile because each one starts a new shell session.

The third suggested good practice is to leverage layer caching by installing dependencies before copying the source code and running the tests:

# Copy dependency files first
COPY --chown=pagetracker pyproject.toml constraints.txt ./
RUN python -m pip install --upgrade pip setuptools && \
    python -m pip install --no-cache-dir -c constraints.txt ".[dev]"

# Copy source files after the cached dependency layer
COPY --chown=pagetracker src/ src/
COPY --chown=pagetracker test/ test/

# Run test (install the project first)
# The reason for combining the individual commands in one RUN instruction is to reduce the number of layers to cache
RUN python -m pip install . -c constraints.txt && \
    python -m pytest test/unit/ && \
    python -m flake8 src/ && \
    python -m isort src/ --check && \
    python -m black src/ --check --quiet && \
    python -m pylint src/ --disable=C0114,C0116,R1705 && \
    python -m bandit -r src/ --quiet

Multi-Stage Builds

FROM python:3.11.2-slim-bullseye AS builder
# ...

# Building a distribution package
RUN python -m pip wheel --wheel-dir dist/ -c constraints.txt .

FROM python:3.11.2-slim-bullseye AS target

RUN apt-get update && \
    apt-get upgrade -y

RUN useradd --create-home pagetracker
USER pagetracker
WORKDIR /home/pagetracker

ENV VIRTUALENV=/home/pagetracker/venv
RUN python -m venv $VIRTUALENV
ENV PATH="$VIRTUALENV/bin:$PATH"

# Copy the distribution package
COPY --from=builder /home/pagetracker/dist/page_tracker*.whl /home/pagetracker

RUN python -m pip install --upgrade pip setuptools && \
    python -m pip install --no-cache-dir page_tracker*.whl

Version Docker Image

Three versioning strategies:

  • Semantic versioning uses three numbers delimited with a dot to indicate the major, minor, and patch versions.
  • Git commit hash uses the SHA-1 hash of a Git commit tied to the source code in your image. E.g:
    $ docker build -t page-tracker:$(git rev-parse --short HEAD) .
  • Timestamp uses temporal information, such as Unix time, to indicate when the image was built.
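For instance, the timestamp strategy can be sketched in a few lines of Python (the helper name is made up for illustration):

```python
# Build an image tag from the current Unix time, e.g. "page-tracker:1718000000"
import time


def timestamp_tag(image="page-tracker"):
    """Return an image:tag string using Unix time as the tag."""
    return f"{image}:{int(time.time())}"


tag = timestamp_tag()
assert tag.startswith("page-tracker:")
```

The resulting string can then be passed to `docker build -t` just like the Git-hash variant shown above.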

Multi-Container Docker Application

Docker Compose is used to coordinate the different containers to run as a whole application:

services:
  redis:
    image: "redis:7.0.10-bullseye"
    # ...

  web:
    build: ./web
    # ...
    command: "gunicorn page_tracker.app:app --bind 0.0.0.0:8000"

The command override makes sure that a production-grade web server (gunicorn) is used for deployment. By contrast, the development server that Flask provides (reference):

  • It will not handle more than one request at a time by default.
  • If you leave debug mode on and an error pops up, it opens up a shell that allows for arbitrary code to be executed on your server (think os.system('rm -rf /')).
  • The development server doesn't scale well.

Run End-to-End Tests

Docker Compose can also be used to set up a test container for running the end-to-end tests. Profiles can be used to mark the test container service:

services:
# ...

  test:
    profiles:
      - testing  # This is a helpful feature
    build:
      context: ./web
      dockerfile: Dockerfile.dev  # Dockerfile.dev bundles the testing framework
    environment:
      REDIS_URL: "redis://redis:6379"
      FLASK_URL: "http://web:8000"
    networks:
      - backend-network
    depends_on:
      - redis
      - web
    command: >
      sh -c 'python -m pytest test/e2e/ -vv
      --redis-url $$REDIS_URL
      --flask-url $$FLASK_URL'
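The --redis-url and --flask-url options passed above have to be registered with pytest before the e2e tests can consume them; a minimal conftest.py sketch might look like this (the defaults are illustrative):

```python
# conftest.py sketch: register the custom command-line options used by
# the e2e tests (option names match the compose command above; the
# default values are placeholders, not taken from the actual repo).
def pytest_addoption(parser):
    parser.addoption("--redis-url", default="redis://localhost:6379")
    parser.addoption("--flask-url", default="http://localhost:8000")
```

Fixtures can then read the values via `request.config.getoption("--redis-url")` and hand them to the test functions.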

Define a Docker-Based Continuous Integration Pipeline

Depending on your team structure, experience, and other factors, you can choose from different source control branching models, also known as workflows.

Once the code is hosted on GitHub, GitHub Actions lets you specify one or more workflows triggered by certain events, like pushing code to a branch or opening a new pull request. Each workflow can define a number of jobs consisting of steps, which execute on a runner. Each step of a job is implemented by an action that can be either:

  • A custom shell command or a script
  • A GitHub Action defined in another GitHub repository

The workflow CI file defines the four activities below, which are quite self-explanatory:

  • Checkout code from GitHub
  • Run end-to-end tests
  • Login to Docker Hub
  • Push image to Docker Hub
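A minimal sketch of such a workflow might look like the following (the action versions, branch name, image namespace, and secret names are assumptions for illustration, not taken from the actual repo):

```yaml
name: ci
on:
  push:
    branches: [master]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code from GitHub
        uses: actions/checkout@v4
      - name: Run end-to-end tests
        run: docker compose --profile testing up --exit-code-from test
      - name: Login to Docker Hub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: Push image to Docker Hub
        run: |
          docker tag page-tracker:latest user/page-tracker:$(git rev-parse --short HEAD)
          docker push user/page-tracker:$(git rev-parse --short HEAD)
```

The `--exit-code-from test` flag makes the compose run fail the job when the test service exits non-zero, which is what gates the push steps.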

And this completes this note.
