pywharf allows you to deploy a private PyPI server and keep your artifacts safe by leveraging the confidentiality, integrity, and availability of your storage backend. The backend mechanism is designed to be flexible, so developers can add support for a new storage backend at low cost.
Supported backends:
- GitHub. (Yes, you can now host your Python packages on GitHub by using pywharf.)
- File system.
- ... (Upcoming)
The pywharf server serves as an abstraction layer between Python package management tools (pip/poetry/twine) and the storage backends:
- Package management tools communicate with the pywharf server, following PEP 503 -- Simple Repository API for searching and downloading packages, and the Legacy API for uploading packages.
- The pywharf server then performs file search/download/upload operations against a specific storage backend.
pip install pywharf==0.2.3
This should bring the executable pywharf to your environment.
$ pywharf --help
SYNOPSIS
pywharf <command> <command_flags>
SUPPORTED COMMANDS
server
update_index
github.init_pkg_repo
github.gen_gh_pages
Docker image: pywharf/pywharf:0.2.3. The image tag is the same as the package version in PyPI.
$ docker run --rm pywharf/pywharf:0.2.3 --help
SYNOPSIS
pywharf <command> <command_flags>
SUPPORTED COMMANDS
server
update_index
github.init_pkg_repo
github.gen_gh_pages
To run the server, use the command pywharf server.
SYNOPSIS
pywharf server ROOT <flags>
POSITIONAL ARGUMENTS
ROOT (str):
Path to the root folder. This folder is for logging,
file-based lock and any other file I/O.
FLAGS
--config (Optional[str]):
Path to the package repository config (TOML),
or the file content if --config_or_admin_secret_can_be_text is set.
Default to None.
--admin_secret (Optional[str]):
Path to the admin secrets config (TOML) with read/write permission.
or the file content if --config_or_admin_secret_can_be_text is set.
This field is required for local index synchronization.
Default to None.
--config_or_admin_secret_can_be_text (Optional[bool]):
Enable passing the file content to --config or --admin_secret.
Default to False.
--auth_read_expires (int):
The expiration time (in seconds) for read authentication.
Default to 3600.
--auth_write_expires (int):
The expiration time (in seconds) for write authentication.
Default to 300.
--extra_index_url (str):
Extra index url for redirection in case package not found.
If set to empty string explicitly redirection will be suppressed.
Default to 'https://pypi.org/simple/'.
--debug (bool):
Enable debug mode.
Default to False.
--host (str):
The interface to bind to.
Default to '0.0.0.0'.
--port (int):
The port to bind to.
Default to 8888.
**waitress_options (Dict[str, Any]):
Optional arguments that `waitress.serve` takes.
Details in https://docs.pylonsproject.org/projects/waitress/en/stable/arguments.html.
Default to {}.
In short, the configuration passed to --config defines mappings from pkg_repo_name to backend-specific settings. In other words, a single server instance can be configured to connect to multiple backends.
Example of the configuration file passed to --config:
[pywharf-pkg-repo]
type = "github"
owner = "pywharf"
repo = "pywharf-pkg-repo"
[local-file-system]
type = "file_system"
read_secret = "foo"
write_secret = "bar"
Example of the admin secret file passed to --admin_secret:
[pywharf-pkg-repo]
type = "github"
raw = "<personal-access-token>"
[local-file-system]
type = "file_system"
raw = "foo"
Example run:
docker run --rm \
-v /path/to/root:/pywharf-root \
-v /path/to/config.toml:/config.toml \
-v /path/to/admin_secret.toml:/admin_secret.toml \
-p 8888:8888 \
pywharf/pywharf:0.2.3 \
server \
/pywharf-root \
--config=/config.toml \
--admin_secret=/admin_secret.toml
Users must provide the pkg_repo_name and its secret in most API calls so that the server can find which backend to operate on and determine whether the operation is permitted. The pkg_repo_name and the secret should be provided via basic access authentication.
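To make the basic access authentication step concrete, the sketch below builds the Authorization header a client would send, with the pkg_repo_name as the username and the secret as the password. The helper basic_auth_header is hypothetical and not part of pywharf; the sample values are taken from the file system example configuration above.

```python
import base64


def basic_auth_header(pkg_repo_name: str, secret: str) -> dict:
    """Build an HTTP Basic Authorization header.

    Hypothetical helper for illustration, not part of pywharf.
    """
    # Basic auth is "username:password", base64-encoded.
    credentials = f"{pkg_repo_name}:{secret}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + token}


headers = basic_auth_header("local-file-system", "foo")
```

Any HTTP client that accepts custom headers can then pass these headers along with its request.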
Some package management tools handle the authentication behind the scenes, for example:
- Twine: set the environment variables TWINE_USERNAME and TWINE_PASSWORD. (ref)
- Poetry: see Configuring credentials.
Some do not, for example:
- Pip: you need to manually prepend <pkg_repo_name>:<secret>@ to the hostname in the URL, like this: https://[username[:password]@]pypi.company.com/simple. (ref)
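For pip, the credentials have to be embedded in the index URL by hand. The following is a minimal sketch of doing that safely; the function index_url_with_credentials is a hypothetical helper, and the host, port, and secret values are only examples. Percent-encoding is applied because characters such as @ or : inside a secret would otherwise be misread as URL delimiters.

```python
from urllib.parse import quote


def index_url_with_credentials(host: str, port: int, pkg_repo_name: str, secret: str) -> str:
    """Return an index URL of the form http://<name>:<secret>@<host>:<port>/simple/.

    Hypothetical helper for illustration, not part of pywharf.
    """
    # Encode every reserved character so the URL stays parseable.
    user = quote(pkg_repo_name, safe="")
    password = quote(secret, safe="")
    return f"http://{user}:{password}@{host}:{port}/simple/"


url = index_url_with_credentials("localhost", 8888, "local-file-system", "foo")
# The result can be passed to pip, e.g. pip install --index-url <url> <package>
```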
In a browser, you need to visit the /login page to submit the pkg_repo_name and the secret, since most browsers today don't support prepending <username>:<password>@ to the hostname in the URL. The pkg_repo_name and the secret will be stored in the session cookies. To reset, visit /logout.
Example: http://localhost:8888/login/
The server follows PEP 503 -- Simple Repository API and the Legacy API to define APIs for searching, downloading, and uploading packages:
- GET /simple/: List all distributions.
- GET /simple/<distrib>/: List all packages in a distribution.
- GET /simple/<distrib>/<filename>: Download a package file.
- POST /simple/: Upload a package file.
In a nutshell, you need to set the "index url / repository url / ..." of the package management tool to http://<host>:<port>/simple/.
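The endpoint layout above can be sketched as a small URL builder. The name normalization follows the rule given in PEP 503 (runs of -, _ and . collapse to a single -, then lowercase); whether the pywharf server itself requires normalized names is an assumption here, and both function names are hypothetical.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def simple_urls(host: str, port: int, distrib: str, filename: str) -> dict:
    """Map each documented endpoint to a concrete URL (illustrative only)."""
    base = f"http://{host}:{port}/simple/"
    project = f"{base}{normalize(distrib)}/"
    return {
        "list_distributions": base,      # GET
        "list_packages": project,        # GET
        "download": project + filename,  # GET
        "upload": base,                  # POST
    }


urls = simple_urls("localhost", 8888, "My_Package", "my_package-0.1.0.tar.gz")
```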
Get the timestamp of the last index synchronization.
$ curl http://debug:foo@localhost:8888/index_mtime/
1584379892
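The response body looks like a Unix timestamp in seconds; that interpretation is an assumption based on the sample value, not something the API description states. Under that assumption, a client could turn it into a timezone-aware datetime like this (the function name is hypothetical):

```python
from datetime import datetime, timezone


def index_mtime_to_datetime(body: str) -> datetime:
    """Parse the /index_mtime/ response body into a UTC datetime.

    Assumes the body is a Unix timestamp in seconds.
    """
    return datetime.fromtimestamp(int(body.strip()), tz=timezone.utc)


synced_at = index_mtime_to_datetime("1584379892\n")
```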
Submit configuration and (re-)initialize the server. Users can change the package repository configuration on the fly with this API.
# POST the file content.
$ curl \
-d "config=${CONFIG}&admin_secret=${ADMIN_SECRET}" \
-X POST \
http://localhost:8888/initialize/
# Or, POST the file.
$ curl \
-F 'config=@/path/to/config.toml' \
-F 'admin_secret=@/path/to/admin_secret.toml' \
http://localhost:8888/initialize/
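The first curl form can also be reproduced from Python. The sketch below only builds the request and does not send it; build_initialize_request is a hypothetical helper, and the TOML strings reuse the file system example from earlier. Unlike a raw -d string, urlencode percent-encodes newlines and characters such as + or &, which avoids surprises when the TOML content contains them.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_initialize_request(base_url: str, config_text: str, admin_secret_text: str) -> Request:
    """Build (but do not send) a form-encoded POST to /initialize/."""
    body = urlencode({"config": config_text, "admin_secret": admin_secret_text})
    return Request(
        base_url.rstrip("/") + "/initialize/",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


CONFIG = '[local-file-system]\ntype = "file_system"\nread_secret = "foo"\nwrite_secret = "bar"\n'
ADMIN_SECRET = '[local-file-system]\ntype = "file_system"\nraw = "foo"\n'
request = build_initialize_request("http://localhost:8888", CONFIG, ADMIN_SECRET)
# To actually send it: urllib.request.urlopen(request)
```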
The index file is used to track all published packages at a specific point in time:
- Remote index file: the index file stored in the backend. By design, this file is only updated by a standalone update index service and will not be updated by the pywharf server.
- Local index file: the index file synchronized from the remote index file by the pywharf server.
To update the remote index file, use the command pywharf update_index:
SYNOPSIS
pywharf update_index TYPE NAME <flags>
POSITIONAL ARGUMENTS
TYPE (str):
Backend type.
NAME (str):
Name of config.
FLAGS
--secret (Optional[str]):
The secret with write permission.
--secret_env (Optional[str]):
Instead of passing the secret through --secret,
the secret could be loaded from the environment variable.
**pkg_repo_configs (Dict[str, Any]):
Any other backend-specific configs are allowed.
A backend developer can set up an update index service by invoking the pywharf update_index command.
The backend registration mechanism hooks up the backend-specific commands to pywharf. As illustrated, the commands github.init_pkg_repo and github.gen_gh_pages are registered by the github backend.
$ pywharf --help
SYNOPSIS
pywharf <command> <command_flags>
SUPPORTED COMMANDS
server
update_index
github.init_pkg_repo
github.gen_gh_pages
If no argument is passed, pywharf will try to load the arguments from environment variables. This mode is helpful when passing arguments in the shell is not possible.
The format:
- PYWHARF_COMMAND: to set <command>.
- PYWHARF_COMMAND_<FLAG>: to set the flag of <command>.
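As a rough sketch of this naming scheme, the helper below turns a command and its arguments into the corresponding environment variables. It assumes that flag names are simply upper-cased, which matches the PYWHARF_COMMAND_ROOT variable used elsewhere in this document but is otherwise unverified; to_env is a hypothetical name.

```python
def to_env(command: str, **flags) -> dict:
    """Translate a command and its arguments into PYWHARF_* variables.

    Assumption: each flag name is upper-cased after the
    PYWHARF_COMMAND_ prefix.
    """
    env = {"PYWHARF_COMMAND": command}
    for name, value in flags.items():
        env[f"PYWHARF_COMMAND_{name.upper()}"] = str(value)
    return env


env = to_env("server", root="/pywharf-root", port=8888)
```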
pywharf will help you set up a new GitHub repository to host your packages. Your packages will be published as repository releases and secured by a personal access token. Take https://github.com/pywharf/pywharf-pkg-repo and https://pywharf.github.io/pywharf-pkg-repo/ as an example.
Package repository configuration of the GitHub backend:
- type: must be set to github.
- owner: repository owner.
- repo: repository name.
- branch (optional): the branch to store the remote index file. Defaults to master.
- index_filename (optional): the name of the remote index file. Defaults to index.toml.
- max_file_bytes (optional): limit the maximum size (in bytes) of a package. Defaults to 2147483647, since each file included in a release must be under 2 GB, as restricted by GitHub.
- sync_index_interval (optional): the sleep time interval (in seconds) before the next local index file synchronization. Defaults to 60.
Example configuration of https://github.com/pywharf/pywharf-pkg-repo:
[pywharf-pkg-repo]
type = "github"
owner = "pywharf"
repo = "pywharf-pkg-repo"
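To show how the optional fields fall back to their documented defaults, here is a minimal sketch that models the GitHub backend settings as a dataclass. The class name GitHubPkgRepoConfig is hypothetical and does not necessarily mirror how pywharf represents its configuration internally; only the field names and default values come from the list above.

```python
from dataclasses import dataclass


@dataclass
class GitHubPkgRepoConfig:
    """Illustrative model of the GitHub backend settings."""

    owner: str
    repo: str
    type: str = "github"
    branch: str = "master"
    index_filename: str = "index.toml"
    max_file_bytes: int = 2147483647  # 2**31 - 1, just under 2 GiB
    sync_index_interval: int = 60     # seconds


# Only owner and repo are given, as in the example configuration.
config = GitHubPkgRepoConfig(owner="pywharf", repo="pywharf-pkg-repo")
```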
The GitHub backend accepts a personal access token (PAT) as the repository secret. The pywharf server calls the GitHub API with the PAT to operate on packages. You can authorize users with read or write permission based on team role.
To initialize a GitHub repository as the storage backend, run the command github.init_pkg_repo:
docker run --rm pywharf/pywharf:0.2.3 \
github.init_pkg_repo \
--name pywharf-pkg-repo \
--owner pywharf \
--repo pywharf-pkg-repo \
--token <personal-access-token> \
--pywharf_version 0.2.3
This will:
- Create a new repository <owner>/<repo>.
- Set up the GitHub workflow to update the remote index file when a new package is published.
- Print the configuration for you.
If you want to host the index on GitHub Pages, like https://pywharf.github.io/pywharf-pkg-repo/, add --enable_gh_pages to the command.
To use pywharf with a GitHub workflow, take this main.yml as an example.
Firstly, run the server as a job service:
services:
pywharf:
image: pywharf/pywharf:0.2.3
ports:
- 8888:8888
volumes:
- pywharf-root:/pywharf-root
env:
PYWHARF_COMMAND: server
PYWHARF_COMMAND_ROOT: /pywharf-root
Secondly, initialize the server with configuration and admin secret (Note: remember to add the admin secret to your repository before using it):
steps:
- name: Setup pywharf
run: |
curl \
-d "config=${CONFIG}&admin_secret=${ADMIN_SECRET}" \
-X POST \
http://localhost:8888/initialize/
env:
CONFIG: |
[pywharf-pkg-repo]
type = "github"
owner = "pywharf"
repo = "pywharf-pkg-repo"
ADMIN_SECRET: |
[pywharf-pkg-repo]
type = "github"
raw = "${{ secrets.PYWHARF_PKG_REPO_TOKEN }}"
Afterward, set http://localhost:8888/simple/ as the repository url, and you are good to go.
You can configure this backend to host the packages in the local file system.
Package repository configuration of the file system backend:
- type: must be set to file_system.
- read_secret: defines the secret with read-only permission.
- write_secret: defines the secret with write permission.
- max_file_bytes (optional): limit the maximum size (in bytes) of a package. Defaults to 5368709119 (5 GB).
- sync_index_interval (optional): the sleep time interval (in seconds) before the next local index file synchronization. Defaults to 60.
Example configuration:
[local-file-system]
type = "file_system"
read_secret = "foo"
write_secret = "bar"
To use the API, the user must provide either read_secret or write_secret.
A folder will be created automatically to store the packages, at the path <ROOT>/cache/<pkg_repo_name>/storage.
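For a quick check of where packages end up on disk, the path template can be expanded as follows. The function storage_path is a hypothetical helper; /pywharf-root and local-file-system are the example ROOT and pkg_repo_name used earlier in this document.

```python
from pathlib import PurePosixPath


def storage_path(root: str, pkg_repo_name: str) -> PurePosixPath:
    """Expand <ROOT>/cache/<pkg_repo_name>/storage (illustrative only)."""
    return PurePosixPath(root) / "cache" / pkg_repo_name / "storage"


path = storage_path("/pywharf-root", "local-file-system")
```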