
Merge pull request #4 from jiakai-li/readme
Update README.md
jiakai-li authored Dec 18, 2024
2 parents 080c4fb + 2c6de33 commit 723525d
Showing 5 changed files with 251 additions and 6 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -13,7 +13,7 @@ jobs:
     name: Build Docker image and run end-to-end tests
     runs-on: ubuntu-latest
     steps:
-      - name: Checkout code from Github
+      - name: Checkout code from GitHub
         uses: actions/checkout@v3
       - name: Run end-to-end tests
         run: >
Expand Down
248 changes: 247 additions & 1 deletion README.md
@@ -1,3 +1,249 @@
# Page Tracker

*This repo contains my notes from following the RealPython tutorial [Build Robust Continuous Integration With Docker and Friends](https://realpython.com/docker-continuous-integration).*

## Overall Architecture
![page_tracker_image](./static/page_tracker_architecture.png)

## [Repeatable Installs](https://pip.pypa.io/en/stable/topics/repeatable-installs)

There are different methods to achieve repeatable installs. This repo specifically keeps the `pyproject.toml` file free of pinned dependency versions,
and instead uses a [requirements file](https://pip.pypa.io/en/stable/user_guide/#requirements-files) and a [constraints file](https://pip.pypa.io/en/stable/user_guide/#constraints-files) for the pinned versions.
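
In practice the pins then drive every install. A minimal sketch, assuming the file names used in this repo:

```bash
# Install the project while constraints.txt pins every (transitive) dependency version
(.venv) $ python -m pip install -c constraints.txt .

# Regenerate the pins after the dependencies in pyproject.toml change
(.venv) $ python -m pip freeze --exclude-editable > constraints.txt
```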

You can also use [pipenv](https://pipenv.pypa.io/en/latest) or [poetry](https://python-poetry.org). Speaking of which, [pipx](https://pipx.pypa.io/stable) is also worth a look.

## [Editable Install](https://setuptools.pypa.io/en/latest/userguide/development_mode.html)

This project follows the [src layout](https://packaging.python.org/en/latest/discussions/src-layout-vs-flat-layout), and running the commands below makes development more convenient:
```bash
(.venv) $ python -m pip install --editable . # Install current project in editable mode
(.venv) $ python -m pip freeze --exclude-editable > constraints.txt # Remove editable packages from constraints file
```

## [Optional Dependencies](https://setuptools.pypa.io/en/latest/userguide/dependency_management.html#optional-dependencies)

Some dependencies are not required by every end user and can therefore be declared as optional dependencies, like:
```toml
# ...
[project.optional-dependencies]
dev = [
"pytest",
# ...
]
# ...
```
This way you don't force `pytest` to be installed alongside the main dependencies. You can install the `dev` optional dependencies with:
```bash
(.venv) $ python -m pip install --editable ".[dev]"
```

## Test

- Unit Test

Unit testing exercises a program’s individual units or components to ensure that they work as expected.
In this project, that means testing the `page_tracker.app.index` handler function, which in turn means mocking the behavior of `page_tracker.app.redis`.
It's worth noting that, apart from the happy path, the unit tests should also mock failure side effects (e.g. `test.unit.test_app.test_should_handle_redis_connection_error`); see the mocking sketch after this list.<br><br>

- Integration Test

The goal of integration testing is to check how your components interact with each other as parts of a larger system.
In this project, that means testing the communication with a genuine Redis server instead of a mocked one.<br><br>

- End-to-End Test

End-to-end testing puts the complete software stack to the test by simulating an actual user’s flow through the application. As a result, it requires a deployment environment that mimics the production environment as closely as possible.
In this simple project, the end-to-end test scenario is similar to the integration test.
The main difference, though, is that you’ll be sending an actual HTTP request through the network to a live web server instead of relying on Flask’s test client.<br><br>
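
A minimal sketch of such a unit test, using `unittest.mock.patch` to replace `page_tracker.app.redis` and simulate a connection failure (the real test lives in `test/unit/test_app.py` and its fixture setup may differ):

```python
import unittest.mock

from redis import ConnectionError

from page_tracker.app import app


@unittest.mock.patch("page_tracker.app.redis")
def test_should_handle_redis_connection_error(mock_redis):
    # Make every incr() call fail, as if the Redis server were unreachable
    mock_redis.return_value.incr.side_effect = ConnectionError

    with app.test_client() as http_client:
        response = http_client.get("/")

    # The handler should translate the Redis failure into an HTTP 500
    assert response.status_code == 500
```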

Now, running the end-to-end tests requires both the Flask app and the Redis server to be running first:

```bash
(.venv) $ docker start redis-server
(.venv) $ flask --app page_tracker.app run
```
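
With both services up, the test suite can be pointed at them. A sketch, assuming Flask's default port 5000 and a local Redis on 6379 (`--redis-url` and `--flask-url` are the options this project's end-to-end suite expects, as seen in the Docker Compose section below):

```bash
(.venv) $ python -m pytest test/e2e/ -vv \
    --redis-url redis://localhost:6379 \
    --flask-url http://127.0.0.1:5000
```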

## Static Code Analysis and Security Scanning

This project uses [black](https://black.readthedocs.io/en/stable) to flag formatting inconsistencies in your code,
[isort](https://pycqa.github.io/isort) (which no longer seems to be actively maintained) to ensure that your import statements stay organized according to the official recommendation, and
[flake8](https://github.com/PyCQA/flake8) to check for any other PEP 8 style violations.
```bash
(.venv) $ python -m black src/
(.venv) $ python -m isort src/
(.venv) $ python -m flake8 src/
```

Once everything’s clean, you can lint your code to find potential code smells or ways to improve it using [pylint](https://pylint.readthedocs.io/en/stable)
```bash
(.venv) $ python -m pylint src/
```
For each unique [pylint identifier](https://pylint.readthedocs.io/en/latest/user_guide/messages/index.html) that you want to exclude, you can:
- Include the suppressed identifiers in a global configuration file for a permanent effect, or
- Use a command-line switch to ignore certain errors on a given run, or
- Add a specially formatted Python comment on a given line to account for special cases like:
```python
@app.get("/")
def index():
    try:
        page_views = redis().incr("page_views")
    except RedisError:
        app.logger.exception("Redis error")  # pylint: disable=E1101
        return "Sorry, something went wrong \N{pensive face}", 500
    else:
        return f"This page has been seen {page_views} times."
```
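
The command-line switch from the list above works the same way; the identifiers here are only examples (the Dockerfile later in this note disables C0114, C0116, and R1705 like this):

```bash
(.venv) $ python -m pylint src/ --disable=C0114,C0116,R1705
```
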
And finally, [bandit](https://github.com/PyCQA/bandit) is used to perform security and vulnerability scanning of your source code before deploying it anywhere:
```bash
(.venv) $ python -m bandit -r src/
```

## Dockerize Web Application

One good practice is to create and switch to a regular user without administrative privileges as soon as you don't need them anymore.
```dockerfile
RUN useradd --create-home realpython
USER realpython
WORKDIR /home/realpython
```

Another suggested good practice is to use a dedicated virtual environment even inside the container, to avoid the risk of interfering with the container’s own system tools.

>Unfortunately, many Linux distributions rely on the global Python installation to run smoothly. If you start installing packages directly into the global Python environment, then you open the door for potential version conflicts.

The suggested approach is to modify the `PATH` environment variable directly:
```dockerfile
ENV VIRTUALENV=/home/realpython/venv
RUN python3 -m venv $VIRTUALENV

# Put $VIRTUALENV/bin before $PATH to prioritize it
ENV PATH="$VIRTUALENV/bin:$PATH"
```

The reasons for doing it this way are:
- Activating your environment in the usual way would only be temporary and wouldn’t affect Docker containers derived from your image.
- If you activated the virtual environment using Dockerfile’s `RUN` instruction, then it would only last until the next instruction in your Dockerfile because each one starts a new shell session.

The third suggested good practice is to leverage layer caching by copying and installing the dependency files before copying the source code and running the tests:
```dockerfile
# Copy dependency files first
COPY --chown=pagetracker pyproject.toml constraints.txt ./
RUN python -m pip install --upgrade pip setuptools && \
    python -m pip install --no-cache-dir -c constraints.txt ".[dev]"

# Copy source files after the cached dependency layer
COPY --chown=pagetracker src/ src/
COPY --chown=pagetracker test/ test/

# Run test (install the project first)
# The reason for combining the individual commands in one RUN instruction is to reduce the number of layers to cache
RUN python -m pip install . -c constraints.txt && \
    python -m pytest test/unit/ && \
    python -m flake8 src/ && \
    python -m isort src/ --check && \
    python -m black src/ --check --quiet && \
    python -m pylint src/ --disable=C0114,C0116,R1705 && \
    python -m bandit -r src/ --quiet
```

## Multi-Stage Builds
```dockerfile
FROM python:3.11.2-slim-bullseye AS builder
# ...

# Building a distribution package
RUN python -m pip wheel --wheel-dir dist/ -c constraints.txt .

FROM python:3.11.2-slim-bullseye AS target

RUN apt-get update && \
    apt-get upgrade -y

RUN useradd --create-home pagetracker
USER pagetracker
WORKDIR /home/pagetracker

ENV VIRTUALENV=/home/pagetracker/venv
RUN python -m venv $VIRTUALENV
ENV PATH="$VIRTUALENV/bin:$PATH"

# Copy the distribution package
COPY --from=builder /home/pagetracker/dist/page_tracker*.whl /home/pagetracker

RUN python -m pip install --upgrade pip setuptools && \
    python -m pip install --no-cache-dir page_tracker*.whl
```

## Version Docker Image

Three versioning strategies:
- **Semantic versioning** uses three numbers delimited with a dot to indicate the major, minor, and patch versions.
- **Git commit hash** uses the SHA-1 hash of a Git commit tied to the source code in your image, e.g.:
```bash
$ docker build -t page-tracker:$(git rev-parse --short HEAD) .
```
- **Timestamp** uses temporal information, such as Unix time, to indicate when the image was built.
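
For completeness, minimal sketches of the other two strategies (the tag values are examples; `1.0.0` happens to match the version declared in `pyproject.toml`):

```bash
# Semantic versioning
$ docker build -t page-tracker:1.0.0 .

# Timestamp (seconds since the Unix epoch)
$ docker build -t page-tracker:$(date +%s) .
```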

## Multi-Container Docker Application

Docker Compose is used to coordinate the different containers so that they run as a single application:
```yaml
services:
  redis:
    image: "redis:7.0.10-bullseye"
    # ...

  web:
    build: ./web
    # ...
    command: "gunicorn page_tracker.app:app --bind 0.0.0.0:8000"
```
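
With the compose file in place, the whole application can be brought up with a single command (a sketch; `--build` rebuilds the web image after code changes):

```bash
$ docker compose up --build -d
```
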
The `command` override makes sure that we are using a production-grade web server for deployment. Flask's built-in development server is not production grade ([reference](https://stackoverflow.com/questions/12269537/is-the-server-bundled-with-flask-safe-to-use-in-production)):
- It will not handle more than one request at a time by default.
- If you leave debug mode on and an error pops up, it opens up a shell that allows arbitrary code to be executed on your server (think `os.system('rm -rf /')`).
- The development server doesn't scale well.

## Run End-to-End Tests

Docker Compose can also be used to set up a test container for running the end-to-end tests. [Profiles](https://docs.docker.com/compose/how-tos/profiles) can be used to mark the test service:
```yaml
services:
  # ...

  test:
    profiles:
      - testing  # This is a helpful feature
    build:
      context: ./web
      dockerfile: Dockerfile.dev  # Dockerfile.dev bundles the testing framework
    environment:
      REDIS_URL: "redis://redis:6379"
      FLASK_URL: "http://web:8000"
    networks:
      - backend-network
    depends_on:
      - redis
      - web
    command: >
      sh -c 'python -m pytest test/e2e/ -vv
        --redis-url $$REDIS_URL
        --flask-url $$FLASK_URL'
```
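
Because the `test` service sits behind the `testing` profile, a plain `docker compose up` leaves it out. Running the end-to-end tests explicitly looks something like this (a sketch; the exact invocation used in CI lives in the truncated `ci.yml` above):

```bash
$ docker compose --profile testing up --build --exit-code-from test
```
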
## Define a Docker-Based Continuous Integration Pipeline

Depending on your team structure, experience, and other factors, you can choose from different source control branching models, also known as [workflows](https://www.atlassian.com/git/tutorials/comparing-workflows).

Once the code is hosted on GitHub, [GitHub Actions](https://docs.github.com/en/actions) lets you specify one or more workflows triggered by certain events, like pushing code to a branch or opening a new pull request. Each workflow can define a number of jobs consisting of steps, which execute on a runner. Each step of a job is implemented by an action that can be either:

- A custom shell command or a script, or
- A GitHub Action defined in another GitHub repository

The [workflow CI file](./.github/workflows/ci.yml) defines the following four steps, which are quite self-explanatory (a condensed sketch follows the list):

- Checkout code from GitHub
- Run end-to-end tests
- Login to Docker Hub
- Push image to Docker Hub
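
A condensed sketch of what such a workflow can look like. The step names mirror the list above, while the triggers, action versions, secret names, and image tag are assumptions rather than a copy of this repo's `ci.yml`:

```yaml
name: ci

on:
  push:
    branches: [main]      # assumption: the trigger branches may differ
  pull_request:
    branches: [main]

jobs:
  build:
    name: Build Docker image and run end-to-end tests
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code from GitHub
        uses: actions/checkout@v3
      - name: Run end-to-end tests
        run: >
          docker compose --profile testing up
          --build --exit-code-from test
      - name: Login to Docker Hub
        uses: docker/login-action@v3              # assumption: action version
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}   # assumption: secret names
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: Push image to Docker Hub
        uses: docker/build-push-action@v5         # assumption: action version
        with:
          context: ./web
          push: true
          tags: your-dockerhub-user/page-tracker:${{ github.sha }}  # replace with your repository
```
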
And this completes this note.
Binary file added static/page_tracker_architecture.png
6 changes: 3 additions & 3 deletions web/pyproject.toml
@@ -1,7 +1,7 @@
[build-system]
requires = [
"setuptools>=67.0.0",
-"wheel"
+"wheel",
]
build-backend = "setuptools.build_meta"

@@ -11,7 +11,7 @@ version = "1.0.0"
dependencies = [
"Flask",
"gunicorn",
-"redis"
+"redis",
]

[project.optional-dependencies]
@@ -23,5 +23,5 @@ dev = [
"pylint",
"pytest",
"pytest-timeout",
-"requests"
+"requests",
]
1 change: 0 additions & 1 deletion web/test/unit/test_app.py
@@ -1,5 +1,4 @@
import unittest.mock
-from http.client import responses

from redis import ConnectionError

