Generalize Publish API #84

Open
esoterra opened this issue Mar 22, 2023 · 4 comments

Comments

@esoterra
Collaborator

Expand the publish API to use the content_needed mechanism so clients can automatically discover where and how to upload content.

esoterra added this to the Dogfood Registry milestone Mar 22, 2023
@peterhuene
Member

peterhuene commented May 25, 2023

This can possibly be closed with the recent updates to the REST API implementation in the client/server.

However, the API does not currently give the registry a way to tell the client to upload content to a location other than the registry itself. Clients can already inform the registry of known locations for the content, if the registry supports that. (warg-server currently supports only direct content uploads; I personally don't think it's a good idea for a registry to connect to user-supplied hosts in order to download content and check it against policy.)

I think we need to really consider why informing the client as to where it should upload its content is needed in the API.

At a surface level, it feels like it isn't buying us much when you take into consideration that the registry will want to stream new content anyway to validate it against content policy.

For example: the registry configures a storage bucket with a special access token that is valid only for one particular content upload and informs the client of the location (plus the access token). The client uploads the content and then notifies the registry that the upload is complete. The registry must then turn around and download the content from the storage bucket to check it against policy, and finally revoke the access token so the bucket is never writable again (ick to buckets that are even temporarily writable from public IPs, access token or not).

Does that really make sense over direct upload to the registry itself?
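To make the round trips in that example concrete, here is a minimal in-memory sketch of the delegated flow. Everything in it is an assumption for illustration: `Bucket`, `Registry`, `request_upload`, `upload_complete`, and the size-limit "policy" are hypothetical names, not part of warg or of any storage provider's API.

```python
import hashlib
import secrets


def digest_of(data):
    """Content digest in a "sha256:<hex>" form (format assumed for this sketch)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class Bucket:
    """Hypothetical in-memory stand-in for an object store."""

    def __init__(self):
        self.objects = {}
        self.write_tokens = set()

    def put(self, key, data, token):
        if token not in self.write_tokens:
            raise PermissionError("token is not valid for writes")
        self.objects[key] = data

    def get(self, key):
        return self.objects[key]


class Registry:
    """Hypothetical registry that delegates uploads to a bucket."""

    def __init__(self, bucket, max_size):
        self.bucket = bucket
        self.max_size = max_size
        self.pending = {}  # digest -> write token issued for it

    def request_upload(self, digest):
        # Step 1: mint a write token for this upload and tell the client where to go.
        token = secrets.token_hex(16)
        self.bucket.write_tokens.add(token)
        self.pending[digest] = token
        return {"key": digest, "token": token}

    def upload_complete(self, digest):
        # Step 3: the registry has to read the content back to check it.
        token = self.pending.pop(digest)
        try:
            data = self.bucket.get(digest)
            ok = digest_of(data) == digest and len(data) <= self.max_size
            if not ok:
                del self.bucket.objects[digest]  # discard rejected content
            return ok
        finally:
            # Step 4: revoke the token so the client can no longer write.
            self.bucket.write_tokens.discard(token)


content = b"example component bytes"
bucket = Bucket()
registry = Registry(bucket, max_size=1024)
grant = registry.request_upload(digest_of(content))
bucket.put(grant["key"], content, grant["token"])  # Step 2: client uploads directly
accepted = registry.upload_complete(digest_of(content))
```

Note that the content crosses the network twice (client to bucket, bucket to registry) and the bucket is writable by an outside party between steps 1 and 4.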

If a general purpose registry can't scale to being the go-between for the client and a content storage service like S3 (i.e. client uploads the content to the registry, the registry incrementally checks it against content policy, writes the stream out to a never-writable-by-anything-but-the-registry S3 bucket as it goes, and informs the client that the content is available via a public S3 or CDN download URL), then it probably shouldn't be in the "general purpose registry" business.
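As a rough sketch of that go-between role, the function below checks an upload incrementally while forwarding it to storage. The names (`stream_to_store`, `PolicyError`) and the "policy" itself (a size limit plus a SHA-256 digest match) are assumptions for illustration; a real registry would apply its own content policy and write to its actual storage backend rather than a file-like object.

```python
import hashlib
import io


class PolicyError(Exception):
    """Raised when uploaded content violates the (assumed) content policy."""


def stream_to_store(chunks, expected_digest, max_size, store):
    """Hash and size-check chunks as they arrive, writing each one to `store`.

    `store` is any object with a write(bytes) method. On PolicyError the
    caller is expected to discard whatever was partially written.
    """
    hasher = hashlib.sha256()
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > max_size:
            raise PolicyError("content exceeds the size limit")
        hasher.update(chunk)
        store.write(chunk)  # only the registry ever writes to the store
    actual = "sha256:" + hasher.hexdigest()
    if actual != expected_digest:
        raise PolicyError("digest mismatch")
    return actual


body = [b"example ", b"component ", b"bytes"]
expected = "sha256:" + hashlib.sha256(b"".join(body)).hexdigest()
staging = io.BytesIO()
stored_digest = stream_to_store(iter(body), expected, max_size=1024, store=staging)
```

The content is read once, never buffered whole in memory, and the storage location needs no write access from anyone but the registry.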

For scenarios like mirroring, the registry doing the mirroring gets the option to amend (or entirely replace) the content sources for records, since the client directly asks the registry for the sources prior to downloading content; the mirror could choose to mirror the content as well and replace the sources or simply return the original sources if it only mirrors the logs.
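A small sketch of that choice, assuming a hypothetical `content_sources` helper and a made-up `/content/<digest>` URL layout (neither is taken from the warg API):

```python
def content_sources(digest, upstream_sources, mirrored_digests,
                    mirror_base_url, keep_upstream=False):
    """Return the download sources a mirror reports for one piece of content."""
    if digest not in mirrored_digests:
        # Logs-only mirror (or content not yet copied): pass the originals through.
        return list(upstream_sources)
    own = f"{mirror_base_url}/content/{digest}"
    if keep_upstream:
        # Amend: prefer the mirror's copy, keep the originals as fallbacks.
        return [own, *upstream_sources]
    # Replace: serve only the mirror's copy.
    return [own]


upstream = ["https://registry.example.com/content/sha256:abc"]
replaced = content_sources("sha256:abc", upstream, {"sha256:abc"},
                           "https://mirror.example.com")
passed_through = content_sources("sha256:def", upstream, {"sha256:abc"},
                                 "https://mirror.example.com")
```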

@lann
Collaborator

lann commented May 25, 2023

I agree that any particular registry would likely only have one supported upload method, and that for a BA-operated registry that may be "upload to the registry directly". IIRC we were advised that "large cloud providers" wouldn't like that approach and would want clients to push directly to object stores, which matches what I have seen in some large cloud provider service designs.

Separately, several BA members (Fermyon included) would like to host packages with OCI instead of (plain) HTTP; there are certainly other ways to implement that feature but that was a design consideration.

@esoterra
Collaborator Author

As Lann said, there are reasons people may want to direct content uploads to different places. So we need to specify this (including in the OpenAPI) and should exercise it in our reference implementation, even if certain registry instances won't need it.

@calvinrp
Collaborator

calvinrp commented Jun 12, 2023

@macovedj and I are looking into a PR for the upload flow API endpoints and CLI support. We are thinking about blob store signed URL policies, multipart uploads, and digests, as well as OCI registry push.
