pywharf

The private PyPI server powered by flexible backends.


What is it?

pywharf allows you to deploy a private PyPI server and keep your artifacts safe by leveraging the confidentiality, integrity, and availability of your storage backend. The backend mechanism is designed to be flexible, so that developers can add support for a new storage backend at low cost.

Supported backends:

  • GitHub. (Yes, you can now host your Python packages on GitHub by using pywharf.)
  • File system.
  • ... (Upcoming)

Design

The pywharf server serves as an abstraction layer between Python package management tools (pip/poetry/twine) and the storage backends:

  • Package management tools communicate with the pywharf server, following PEP 503 -- Simple Repository API for searching and downloading packages, and the Legacy API for uploading packages.
  • The pywharf server then performs file search/download/upload operations against the configured storage backend.

Usage

Install from PyPI

pip install pywharf==0.2.3

This should bring the executable pywharf into your environment.

$ pywharf --help
SYNOPSIS
    pywharf <command> <command_flags>

SUPPORTED COMMANDS
    server
    update_index
    github.init_pkg_repo
    github.gen_gh_pages

Using the docker image (recommended)

Docker image: pywharf/pywharf:0.2.3. The image tag is the same as the package version in PyPI.

$ docker run --rm pywharf/pywharf:0.2.3 --help
SYNOPSIS
    pywharf <command> <command_flags>

SUPPORTED COMMANDS
    server
    update_index
    github.init_pkg_repo
    github.gen_gh_pages

Run the server

To run the server, use the command pywharf server.

SYNOPSIS
    pywharf server ROOT <flags>

POSITIONAL ARGUMENTS
    ROOT (str):
        Path to the root folder. This folder is for logging,
        file-based lock and any other file I/O.

FLAGS
    --config (Optional[str]):
        Path to the package repository config (TOML),
        or the file content if --config_or_admin_secret_can_be_text is set.
        Default to None.
    --admin_secret (Optional[str]):
        Path to the admin secrets config (TOML) with read/write permission.
        or the file content if --config_or_admin_secret_can_be_text is set.
        This field is required for local index synchronization.
        Default to None.
    --config_or_admin_secret_can_be_text (Optional[bool]):
        Enable passing the file content to --config or --admin_secret.
        Default to False.
    --auth_read_expires (int):
        The expiration time (in seconds) for read authentication.
        Default to 3600.
    --auth_write_expires (int):
        The expiration time (in seconds) for write authentication.
        Default to 300.
    --extra_index_url (str):
        Extra index url for redirection in case package not found.
        If set to empty string explicitly redirection will be suppressed.
        Default to 'https://pypi.org/simple/'.
    --debug (bool):
        Enable debug mode.
        Default to False.
    --host (str):
        The interface to bind to.
        Default to '0.0.0.0'.
    --port (int):
        The port to bind to.
        Default to 8888.
    **waitress_options (Dict[str, Any]):
        Optional arguments that `waitress.serve` takes.
        Details in https://docs.pylonsproject.org/projects/waitress/en/stable/arguments.html.
        Default to {}.

In short, the configuration passed to --config defines mappings from pkg_repo_name to backend-specific settings. In other words, a single server instance can be configured to connect to multiple backends.

Example of the configuration file passed to --config:

[pywharf-pkg-repo]
type = "github"
owner = "pywharf"
repo = "pywharf-pkg-repo"

[local-file-system]
type = "file_system"
read_secret = "foo"
write_secret = "bar"

Example of the admin secret file passed to --admin_secret:

[pywharf-pkg-repo]
type = "github"
raw = "<personal-access-token>"

[local-file-system]
type = "file_system"
raw = "foo"
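
The two files share the same table names, one table per pkg_repo_name. As a minimal sketch of that layout (the helper `render_toml` is hypothetical and not part of pywharf), the example files above can be generated from plain Python dictionaries:

```python
import json


def render_toml(tables):
    """Render {table_name: {key: value}} as a flat TOML document.

    Hypothetical helper for illustration; it only handles simple
    string/int/bool values such as the ones shown in this README.
    """
    chunks = []
    for name, settings in tables.items():
        lines = [f"[{name}]"]
        for key, value in settings.items():
            # json.dumps yields a double-quoted, escaped string, which is
            # also a valid TOML basic string for plain ASCII values.
            lines.append(f"{key} = {json.dumps(value)}")
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


# One server instance, two backends, keyed by pkg_repo_name.
config = {
    "pywharf-pkg-repo": {"type": "github", "owner": "pywharf", "repo": "pywharf-pkg-repo"},
    "local-file-system": {"type": "file_system", "read_secret": "foo", "write_secret": "bar"},
}
admin_secret = {
    "pywharf-pkg-repo": {"type": "github", "raw": "<personal-access-token>"},
    "local-file-system": {"type": "file_system", "raw": "foo"},
}

config_toml = render_toml(config)
admin_secret_toml = render_toml(admin_secret)
print(config_toml)
```

The resulting strings can be written to config.toml and admin_secret.toml, or passed as text when --config_or_admin_secret_can_be_text is set.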

Example run:

docker run --rm \
    -v /path/to/root:/pywharf-root \
    -v /path/to/config.toml:/config.toml \
    -v /path/to/admin_secret.toml:/admin_secret.toml \
    -p 8888:8888 \
    pywharf/pywharf:0.2.3 \
    server \
    /pywharf-root \
    --config=/config.toml \
    --admin_secret=/admin_secret.toml

Server API

Authentication in shell

Most API calls require the pkg_repo_name and the corresponding secret, so that the server can determine which backend to operate on and whether the operation is permitted. Provide them through HTTP basic access authentication, with the pkg_repo_name as the username and the secret as the password.

Some package management tools handle the authentication behind the scenes. Others do not, for example:

  • pip: you need to manually prepend <pkg_repo_name>:<secret>@ to the hostname in the URL, following the pattern https://[username[:password]@]pypi.company.com/simple.
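
As an illustration of the URL form that pip expects, here is a minimal sketch that builds an index URL with the credentials embedded. The function name `index_url_with_auth` is hypothetical, and percent-encoding the credentials is a general precaution for URLs rather than something pywharf documents:

```python
from urllib.parse import quote


def index_url_with_auth(host, port, pkg_repo_name, secret, scheme="http"):
    """Build http://<pkg_repo_name>:<secret>@<host>:<port>/simple/."""
    # Percent-encode both parts so characters such as ':' or '@' inside a
    # secret cannot be mistaken for URL delimiters.
    user = quote(pkg_repo_name, safe="")
    password = quote(secret, safe="")
    return f"{scheme}://{user}:{password}@{host}:{port}/simple/"


url = index_url_with_auth("localhost", 8888, "local-file-system", "foo")
print(url)
```

The printed URL can then be passed to pip, for example with `pip install --index-url <url> <package>`.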

Authentication in browser

Since most browsers today do not support prepending <username>:<password>@ to the hostname in the URL, visit the /login page to submit the pkg_repo_name and the secret. Both are stored in the session cookies. To reset them, visit /logout.

Example: http://localhost:8888/login/

PEP-503, Legacy API

The server follows PEP 503 -- Simple Repository API and Legacy API to define APIs for searching/downloading/uploading package:

  • GET /simple/: List all distributions.
  • GET /simple/<distrib>/: List all packages in a distribution.
  • GET /simple/<distrib>/<filename>: Download a package file.
  • POST /simple/: Upload a package file.

In a nutshell, you need to set the "index url / repository url / ..." to http://<host>:<port>/simple/ for the package management tool.
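
For clients that talk to these endpoints directly, the following sketch builds an authenticated request for the distribution listing using only the standard library. The name normalization is the rule given in PEP 503; whether pywharf applies it to `<distrib>` is an assumption here, and `simple_request` is a hypothetical helper:

```python
import base64
import re
import urllib.request


def normalize(name):
    # Name normalization as specified by PEP 503.
    return re.sub(r"[-_.]+", "-", name).lower()


def simple_request(base_url, pkg_repo_name, secret, distrib=None):
    """Build a GET request for /simple/ or /simple/<distrib>/."""
    url = base_url.rstrip("/") + "/simple/"
    if distrib is not None:
        url += normalize(distrib) + "/"
    # Basic access authentication: pkg_repo_name as username, secret as password.
    token = base64.b64encode(f"{pkg_repo_name}:{secret}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


req = simple_request("http://localhost:8888", "local-file-system", "foo", "My_Package")
print(req.get_method(), req.full_url)
# To actually send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     html = resp.read().decode()
```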

Server management

GET /index_mtime

Get the timestamp of the last index synchronization.

$ curl http://debug:foo@localhost:8888/index_mtime/
1584379892
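
The returned value looks like a Unix timestamp in seconds. Under that assumption, a small sketch for turning the response body into a timezone-aware datetime (the helper name `parse_index_mtime` is hypothetical):

```python
from datetime import datetime, timezone


def parse_index_mtime(body):
    """Parse the /index_mtime/ response body into a UTC datetime."""
    # Assumption: the body is a Unix timestamp expressed in seconds.
    return datetime.fromtimestamp(int(body.strip()), tz=timezone.utc)


synced_at = parse_index_mtime("1584379892\n")
print(synced_at.isoformat())
```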

POST /initialize

Submit the configuration and (re-)initialize the server. This API lets you change the package repository configuration on the fly.

# POST the file content.
$ curl \
    -d "config=${CONFIG}&admin_secret=${ADMIN_SECRET}" \
    -X POST \
    http://localhost:8888/initialize/

# Or, POST the file.
$ curl \
    -F 'config=@/path/to/config.toml' \
    -F 'admin_secret=@/path/to/admin_secret.toml' \
    http://localhost:8888/initialize/
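
The first curl form can also be reproduced from Python. The sketch below is illustrative only: `initialize_request` is a hypothetical helper, and it URL-encodes the two form fields so that newlines, quotes, and `=` characters in the TOML text survive the trip intact.

```python
import urllib.request
from urllib.parse import parse_qs, urlencode


def initialize_request(base_url, config_text, admin_secret_text):
    """Build a form-encoded POST request for /initialize/."""
    body = urlencode({"config": config_text, "admin_secret": admin_secret_text})
    return urllib.request.Request(
        base_url.rstrip("/") + "/initialize/",
        data=body.encode("ascii"),
        method="POST",
    )


config_text = '[local-file-system]\ntype = "file_system"\nread_secret = "foo"\nwrite_secret = "bar"\n'
admin_secret_text = '[local-file-system]\ntype = "file_system"\nraw = "foo"\n'

req = initialize_request("http://localhost:8888", config_text, admin_secret_text)
print(req.get_method(), req.full_url)
# To actually send it against a running server:
# urllib.request.urlopen(req)
```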

Update index

The index file tracks all packages published as of a specific point in time:

  • Remote index file: the index file stored in the backend. By design, this file is updated only by a standalone update index service, never by the pywharf server.
  • Local index file: the index file that the pywharf server synchronizes from the remote index file.

To update the remote index file, use the command pywharf update_index:

SYNOPSIS
    pywharf update_index TYPE NAME <flags>

POSITIONAL ARGUMENTS
    TYPE (str):
        Backend type.
    NAME (str):
        Name of config.

FLAGS
    --secret (Optional[str]):
        The secret with write permission.
    --secret_env (Optional[str]):
        Instead of passing the secret through --secret,
        the secret could be loaded from the environment variable.
    **pkg_repo_configs (Dict[str, Any]):
        Any other backend-specific configs are allowed.

A backend developer can set up an update index service by invoking the pywharf update_index command.
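
A rough sketch of how such a service might assemble the command line from Python follows. Several points are assumptions rather than documented behavior: that --secret_env takes the name of an environment variable, that backend-specific configs are passed as `--key=value` flags, and the variable name `PYWHARF_SECRET`, which is purely illustrative.

```python
import shlex


def update_index_argv(backend_type, name, secret_env, **pkg_repo_configs):
    """Assemble the argv for `pywharf update_index TYPE NAME <flags>`."""
    argv = ["pywharf", "update_index", backend_type, name, f"--secret_env={secret_env}"]
    # Assumption: backend-specific configs are passed as --key=value flags.
    argv += [f"--{key}={value}" for key, value in sorted(pkg_repo_configs.items())]
    return argv


argv = update_index_argv(
    "github",
    "pywharf-pkg-repo",
    "PYWHARF_SECRET",  # hypothetical environment variable holding the secret
    owner="pywharf",
    repo="pywharf-pkg-repo",
)
print(shlex.join(argv))
# To run it: subprocess.run(argv, check=True)
```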

Backend-specific commands

The backend registration mechanism will hook up the backend-specific commands to pywharf. As illustrated, commands github.init_pkg_repo and github.gen_gh_pages are registered by github backend.

$ pywharf --help
SYNOPSIS
    pywharf <command> <command_flags>

SUPPORTED COMMANDS
    server
    update_index
    github.init_pkg_repo
    github.gen_gh_pages

Environment mode

If no argument is passed, pywharf tries to load the arguments from environment variables. This mode is helpful when passing arguments in the shell is not possible.

The format:

  • PYWHARF_COMMAND: to set <command>.
  • PYWHARF_COMMAND_<FLAG>: to set the flag of <command>.
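
The naming scheme can be sketched as a small mapping function. This assumes that flag names are upper-cased, as in the PYWHARF_COMMAND_ROOT variable used in the GitHub workflow example of this README; `to_environment` itself is a hypothetical helper.

```python
def to_environment(command, **flags):
    """Map a command and its flags to PYWHARF_* environment variables."""
    env = {"PYWHARF_COMMAND": command}
    for flag, value in flags.items():
        # Assumption: flag names are upper-cased, as in PYWHARF_COMMAND_ROOT.
        env[f"PYWHARF_COMMAND_{flag.upper()}"] = str(value)
    return env


env = to_environment("server", root="/pywharf-root", port=8888)
for key, value in env.items():
    print(f"{key}={value}")
```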

Backends

GitHub

Introduction

pywharf helps you set up a new GitHub repository to host your packages. Your packages are published as repository releases and secured by a personal access token. See https://github.com/pywharf/pywharf-pkg-repo and https://pywharf.github.io/pywharf-pkg-repo/ for an example.

Configuration and secret

Package repository configuration of GitHub backend:

  • type: must be set to github.
  • owner: repository owner.
  • repo: repository name.
  • branch (optional): the branch to store the remote index file. Default to master.
  • index_filename (optional): the name of remote index file. Default to index.toml.
  • max_file_bytes (optional): limit the maximum size (in bytes) of a package. Default to 2147483647, since GitHub requires each file included in a release to be under 2 GB.
  • sync_index_interval (optional): the sleep time interval (in seconds) before taking the next local index file synchronization. Default to 60.
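
To make the defaults easier to see at a glance, the options above can be mirrored in a small dataclass. This is only an illustration of the documented fields and defaults; the class name `GitHubPkgRepoConfig` is hypothetical and does not refer to pywharf's internal types.

```python
from dataclasses import dataclass


@dataclass
class GitHubPkgRepoConfig:  # hypothetical name, for illustration only
    owner: str
    repo: str
    type: str = "github"
    branch: str = "master"
    index_filename: str = "index.toml"
    max_file_bytes: int = 2**31 - 1  # 2147483647, just under 2 GiB
    sync_index_interval: int = 60  # seconds


cfg = GitHubPkgRepoConfig(owner="pywharf", repo="pywharf-pkg-repo")
print(cfg)
```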

Example configuration of https://github.com/pywharf/pywharf-pkg-repo:

[pywharf-pkg-repo]
type = "github"
owner = "pywharf"
repo = "pywharf-pkg-repo"

The GitHub backend accepts a personal access token (PAT) as the repository secret. The pywharf server calls the GitHub API with the PAT to operate on packages. You can grant users read or write permission based on their team role.

Initialize the repository

To initialize a GitHub repository as the storage backend, run the command github.init_pkg_repo:

docker run --rm pywharf/pywharf:0.2.3 \
    github.init_pkg_repo \
    --name pywharf-pkg-repo \
    --owner pywharf \
    --repo pywharf-pkg-repo \
    --token <personal-access-token> \
    --pywharf_version 0.2.3

This will:

  • Create a new repository <owner>/<repo>.
  • Set up the GitHub workflow to update the remote index file when a new package is published.
  • Print the configuration for you.

If you want to host the index on GitHub Pages, like https://pywharf.github.io/pywharf-pkg-repo/, add --enable_gh_pages to the command.

GitHub workflow integration

To use pywharf with GitHub workflow, take this main.yml as an example.

Firstly, run the server as job service:

services:
  pywharf:
    image: pywharf/pywharf:0.2.3
    ports:
      - 8888:8888
    volumes:
      - pywharf-root:/pywharf-root
    env:
      PYWHARF_COMMAND: server
      PYWHARF_COMMAND_ROOT: /pywharf-root

Secondly, initialize the server with configuration and admin secret (Note: remember to add the admin secret to your repository before using it):

steps:
  - name: Setup pywharf
    run: |
      curl \
          -d "config=${CONFIG}&admin_secret=${ADMIN_SECRET}" \
          -X POST \
          http://localhost:8888/initialize/
    env:
      CONFIG: |
        [pywharf-pkg-repo]
        type = "github"
        owner = "pywharf"
        repo = "pywharf-pkg-repo"
      ADMIN_SECRET: |
        [pywharf-pkg-repo]
        type = "github"
        raw = "${{ secrets.PYWHARF_PKG_REPO_TOKEN }}"

Afterward, set http://localhost:8888/simple/ as the repository url, and you are good to go.

File system

Introduction

You can configure this backend to host the packages in the local file system.

Configuration and secret

Package repository configuration of the file system backend:

  • type: must be set to file_system.
  • read_secret: defines the secret with read only permission.
  • write_secret: defines the secret with write permission.
  • max_file_bytes (optional): limit the maximum size (in bytes) of package. Default to 5368709119 (5 GB).
  • sync_index_interval (optional): the sleep time interval (in seconds) before taking the next local index file synchronization. Default to 60.

Example configuration:

[local-file-system]
type = "file_system"
read_secret = "foo"
write_secret = "bar"

To use the API, users must provide either the read_secret or the write_secret.
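
The relationship between the two secrets and the permission they grant can be sketched as follows. This is a simplified model and not pywharf's actual authentication code; the function name `permission_for` is hypothetical.

```python
import hmac


def permission_for(secret, read_secret, write_secret):
    """Return 'write', 'read', or None for the secret a client presented."""
    # Constant-time comparison avoids leaking secret contents through timing.
    if hmac.compare_digest(secret.encode(), write_secret.encode()):
        return "write"
    if hmac.compare_digest(secret.encode(), read_secret.encode()):
        return "read"
    return None


print(permission_for("foo", read_secret="foo", write_secret="bar"))
print(permission_for("bar", read_secret="foo", write_secret="bar"))
print(permission_for("baz", read_secret="foo", write_secret="bar"))
```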

Initialize the package repository

A folder will be created automatically to store the packages, with the path <ROOT>/cache/<pkg_repo_name>/storage.
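
For scripts that need to locate this folder (for backups, say), the path can be derived from the server ROOT and the pkg_repo_name. A minimal sketch, in which `storage_path` is a hypothetical helper:

```python
from pathlib import PurePosixPath


def storage_path(root, pkg_repo_name):
    """Return <ROOT>/cache/<pkg_repo_name>/storage."""
    return PurePosixPath(root) / "cache" / pkg_repo_name / "storage"


print(storage_path("/pywharf-root", "local-file-system"))
```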
