feat: support Debian flat repos
Fixes issue #56

Follow-up and credit to @alexconrey (PR #55), @ericlchen1 (PR #64) and
@benmccown (PR #67) for their work on similar PRs, which I've reviewed
and drawn inspiration from to create "one 💍 PR to merge them all" 😅

Problem:

Debian has two types of repos: "canonical" and "flat". Each has a
different sources.list syntax:

"canonical":
```
deb uri distribution [component1] [component2] [...]
```
(see https://wiki.debian.org/DebianRepository/Format#Overview)

flat:
```
deb uri directory/
```
(see https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format)

A flat repository does not use the dists hierarchy of directories and
instead places the meta index and indices directly in the archive root
(or some directory below it).

Thus, the URL logic in _fetch_package_index(), which assumes the
canonical dists layout, is incorrect for these repos and always fails to
fetch the Packages index.
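
To make the difference concrete, here is a minimal sketch in plain Python (not the Starlark used by the rules) of where the Packages index lives in each layout. The helper name packages_index_url and the example.invalid URL are hypothetical; the path patterns follow the Debian repository format pages linked above. Real repos usually serve the index compressed (for example Packages.xz or Packages.gz), which this sketch ignores.

```python
def packages_index_url(base_url, arch, dist=None, comp=None, directory=None):
    """Return the URL of the uncompressed Packages index.

    Hypothetical helper, for illustration only.
    """
    base_url = base_url.rstrip("/")
    if directory is not None:
        # Flat repo: the index sits directly in the directory, no dists/ tree
        # and no per-architecture subdirectory.
        return "{}/{}/Packages".format(base_url, directory.rstrip("/"))
    # Canonical repo: dists/<dist>/<comp>/binary-<arch>/Packages
    return "{}/dists/{}/{}/binary-{}/Packages".format(base_url, dist, comp, arch)


print(packages_index_url(
    "https://cloud.r-project.org/bin/linux/debian",
    "amd64",
    directory="bullseye-cran40/",
))
print(packages_index_url(
    "https://example.invalid/debian",  # placeholder URL
    "amd64",
    dist="bullseye",
    comp="main",
))
```

Applying the canonical pattern to a flat repo would look for a dists/ directory that does not exist, which matches the failure described above.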

Solution:

Use the Debian sources.list convention in the 'sources' section of the
manifest to add both canonical and flat repos. A channel that is a
single directory ending in '/' is treated as a flat repo; a channel with
a (dist, component, ...) structure is treated as canonical. This is how
_fetch_package_index and the other internal logic know which kind of
repo a source is.

For example:
```
version: 1

sources:
  # canonical repo
  - channel: bullseye main contrib
    url: https://snapshot-cloudflare.debian.org/archive/debian/20240210T223313Z
  # flat repos, note the trailing '/' and the lack of distribution or components
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian
  - channel: ubuntu2404/x86_64/
    url: https://developer.download.nvidia.com/compute/cuda/repos

archs:
  - amd64

packages:
  - bash
  - r-mathlib
  - nvidia-container-toolkit-base
```

Setting aside that it mixes Ubuntu and Debian repos purely for the sake
of the example, this manifest shows that you can mix canonical and flat
repos, and that multi-arch flat repos, single-arch flat repos and
canonical repos can all coexist in one manifest.
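
As a rough illustration of how the channels in that manifest are told apart, here is a plain-Python sketch that mirrors the checks added to _from_dict in this commit. The function name parse_channel and the returned dict shape are hypothetical; the Starlark code calls fail() where this sketch raises ValueError.

```python
def parse_channel(channel):
    """Classify a manifest channel as canonical or flat (illustrative only)."""
    chunks = channel.split(" ")

    if len(chunks) > 1:  # canonical: "<dist> <comp1> [<comp2> ...]"
        dist, components = chunks[0], chunks[1:]
        if dist.endswith("/"):
            raise ValueError("Debian dist ends in '/' but this is not a flat repo")
        return {"kind": "canonical", "dist": dist, "components": components}

    # flat: a single "<directory>/" chunk
    directory = chunks[0]
    if not directory.endswith("/"):
        raise ValueError("Debian flat repo directory must end in '/'")
    return {"kind": "flat", "directory": directory}


for channel in ("bullseye main contrib", "bullseye-cran40/", "ubuntu2404/x86_64/"):
    print(channel, "->", parse_channel(channel))
```

A channel such as "bullseye" (one chunk, no trailing '/') is rejected, since it is neither a valid canonical entry nor a flat directory.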

You will still have the same problems as before with packages that only
exist for one architecture and/or repos that only support one
architecture. In those cases, simply separate the repos and packages
into their own manifests.
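
On the multi-arch side: a flat repo has one Packages index per directory, so a multi-arch flat repo can list entries for several architectures in the same index. The change to _parse_package_index in this commit therefore files each entry under its own Architecture field and only falls back to the requested architecture for "all" packages. A minimal plain-Python sketch of that keying follows; the helper name index_key and the sample entries are made up for illustration.

```python
def index_key(source_arch, pkg):
    """Build the (arch, name, version) key for a parsed Packages entry."""
    # "all" packages are architecture-independent, so they are filed under
    # the architecture that was requested for this source.
    arch = source_arch if pkg["Architecture"] == "all" else pkg["Architecture"]
    return (arch, pkg["Package"], pkg["Version"])


# Hypothetical entries, as they might appear in one multi-arch flat index.
entries = [
    {"Package": "tool", "Version": "1.0", "Architecture": "amd64"},
    {"Package": "tool", "Version": "1.0", "Architecture": "arm64"},
    {"Package": "tool-doc", "Version": "1.0", "Architecture": "all"},
]

packages = {}
for pkg in entries:
    packages[index_key("amd64", pkg)] = pkg

for key in sorted(packages):
    print(key)
```

Keying on the source architecture alone would file the arm64 entry under amd64, where it would collide with the real amd64 entry.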

NOTE:
The NVIDIA CUDA repos don't follow Debian specs and have issues with the
package filenames. This is addressed in a separate commit.
jjmaestro committed Oct 29, 2024
1 parent 1c735c5 commit 7a6222f
Showing 7 changed files with 139 additions and 8 deletions.
11 changes: 11 additions & 0 deletions WORKSPACE.bazel
@@ -46,6 +46,17 @@ load("@bullseye_nolock//:packages.bzl", "bullseye_nolock_packages")

 bullseye_nolock_packages()

+# bazel run @bullseye_rproject//:lock
+deb_index(
+    name = "bullseye_rproject",
+    lock = "//examples/debian_flat_repo:bullseye_rproject.lock.json",
+    manifest = "//examples/debian_flat_repo:bullseye_rproject.yaml",
+)
+
+load("@bullseye_rproject//:packages.bzl", "bullseye_rproject_packages")
+
+bullseye_rproject_packages()
+
 deb_index(
     name = "apt_security",
     manifest = "//examples/debian_snapshot_security:security.yaml",
38 changes: 31 additions & 7 deletions apt/private/manifest.bzl
@@ -37,8 +37,13 @@ def _source(src):

     index = "Packages"

-    index_path = "dists/{dist}/{comp}/binary-{arch}".format(**src)
-    output = "{dist}/{comp}/{arch}/{index}".format(index = index, **src)
+    if "directory" in src: # flat repo:
+        src["directory"] = src["directory"].rstrip("/")
+        index_path = src["directory"]
+        output = "{directory}/{arch}/{index}".format(index = index, **src)
+    else: # canonical
+        index_path = "dists/{dist}/{comp}/binary-{arch}".format(**src)
+        output = "{dist}/{comp}/{arch}/{index}".format(index = index, **src)

     return struct(
         arch = src["arch"],
@@ -72,12 +77,31 @@ def _from_dict(manifest, manifest_label):

     for arch in manifest["archs"]:
         for src in manifest["sources"]:
-            dist, components = src["channel"].split(" ", 1)
+            src["arch"] = arch

-            for comp in components.split(" "):
-                src["dist"] = dist
-                src["comp"] = comp
-                src["arch"] = arch
+            channel_chunks = src["channel"].split(" ")
+
+            # support both canonical and flat repos, see:
+            # canonical: https://wiki.debian.org/DebianRepository/Format#Overview
+            # flat repo: https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format
+            if len(channel_chunks) > 1: # canonical
+                dist, components = channel_chunks[0], channel_chunks[1:]
+
+                if dist.endswith("/"):
+                    fail("Debian dist ends in '/' but this is not a flat repo")
+
+                for comp in components:
+                    src["dist"] = dist
+                    src["comp"] = comp
+
+                    sources.append(_source(src))
+            else: # flat
+                directory = channel_chunks[0]
+
+                if not directory.endswith("/"):
+                    fail("Debian flat repo directory must end in '/'")
+
+                src["directory"] = directory

                 sources.append(_source(src))

6 changes: 5 additions & 1 deletion apt/private/package_index.bzl
@@ -93,9 +93,13 @@ def _parse_package_index(packages, contents, source):

         if len(pkg.keys()) != 0:
             pkg["Root"] = source.base_url
+
+            # NOTE: workaround for multi-arch flat repos
+            arch = source.arch if pkg["Architecture"] == "all" else pkg["Architecture"]
+
             _package_set(
                 packages,
-                keys = (source.arch, pkg["Package"], pkg["Version"]),
+                keys = (arch, pkg["Package"], pkg["Version"]),
                 package = pkg,
             )
         last_key = ""
48 changes: 48 additions & 0 deletions examples/debian_flat_repo/BUILD.bazel
@@ -0,0 +1,48 @@
load("@container_structure_test//:defs.bzl", "container_structure_test")
load("@rules_distroless//apt:defs.bzl", "dpkg_status")
load("@rules_oci//oci:defs.bzl", "oci_image", "oci_load")

PACKAGES = [
    "@bullseye//dpkg",
    "@bullseye//apt",
    "@bullseye_rproject//r-mathlib",
]

# Creates /var/lib/dpkg/status with installed package information.
dpkg_status(
    name = "dpkg_status",
    controls = [
        "%s/amd64:control" % package
        for package in PACKAGES
    ],
)

oci_image(
    name = "apt",
    architecture = "amd64",
    os = "linux",
    tars = [
        ":dpkg_status",
    ] + [
        "%s/amd64" % package
        for package in PACKAGES
    ],
)

oci_load(
    name = "tarball",
    image = ":apt",
    repo_tags = [
        "distroless/test:latest",
    ],
)

container_structure_test(
    name = "test",
    configs = ["test_linux_amd64.yaml"],
    image = ":apt",
    target_compatible_with = [
        "@platforms//cpu:x86_64",
        "@platforms//os:linux",
    ],
)
15 changes: 15 additions & 0 deletions examples/debian_flat_repo/bullseye_rproject.lock.json
@@ -0,0 +1,15 @@
{
    "packages": {
        "r-mathlib": {
            "amd64": {
                "arch": "amd64",
                "dependencies": [],
                "name": "r-mathlib",
                "sha256": "cbe3abbcc74261f2ad84159b423b856c1a0b4ebe6fef2de763d8783ff00245d5",
                "url": "https://cloud.r-project.org/bin/linux/debian/bullseye-cran40/r-mathlib_4.4.1-1~bullseyecran.0_amd64.deb",
                "version": "4.4.1-1~bullseyecran.0"
            }
        }
    },
    "version": 2
}
20 changes: 20 additions & 0 deletions examples/debian_flat_repo/bullseye_rproject.yaml
@@ -0,0 +1,20 @@
# Packages for examples/debian_flat_repo.
#
# Anytime this file is changed, the lockfile needs to be regenerated.
#
# To generate the bullseye_rproject.lock.json run the following command
#
# bazel run @bullseye_rproject//:lock
#
# See debian_package_index at WORKSPACE.bazel
version: 1

sources:
  - channel: bullseye-cran40/
    url: https://cloud.r-project.org/bin/linux/debian

archs:
  - amd64

packages:
  - r-mathlib
9 changes: 9 additions & 0 deletions examples/debian_flat_repo/test_linux_amd64.yaml
@@ -0,0 +1,9 @@
schemaVersion: "2.0.0"

commandTests:
  - name: "apt list --installed"
    command: "apt"
    args: ["list", "--installed"]
    expectedOutput:
      - Listing\.\.\.
      - r-mathlib/now 4.4.1-1~bullseyecran.0 amd64 \[installed,local\]
