PoC: TAP 19 - Add support for Content Addressable Systems like IPFS in TUF #2415
Signed-off-by: Shubham Nazare <[email protected]>
Quite an elegant implementation: I like it!
This looks neat. A couple of quick comments for now (I hope to have time to review this next week):
I think the url should be https://shubham4443.github.io/tuf-ipfs...
That sounds a bit strange, likely isn't related to your patch as such... Does this repository really work for you (if you wipe your local metadata cache first to ensure you really get the same files)?
I have a question as well: I have never used IPFS and don't really know how it works. What additional software is this code expecting me to run?
ipfs_gateway_url = 'http://127.0.0.1:8081/ipfs/'
...
file_url = self.ipfs_gateway_url + self.cid
response = requests.get(file_url, timeout=5)
Thanks for correcting the URL! The code requires you to download an IPFS daemon, which can be found at https://ipfs.tech/#install. The code uses a local (private) gateway to fetch the IPFS content. Public gateways, which do not require anything to be downloaded, can also be used, but they are sometimes slow. How we should use these gateways to download files is yet to be discussed with my mentors.
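To make the local-versus-public gateway trade-off concrete, here is a minimal sketch of a fetch that tries a list of gateways in order. It is not the PR's code: `gateway_url`, `http_get` and `fetch_from_gateways` are hypothetical names, `https://ipfs.io/ipfs/` is only an example of a public gateway, and the standard library is used in place of `requests` so the block stays self-contained.

```python
import urllib.request
from typing import Callable, Iterable, Optional

LOCAL_GATEWAY = "http://127.0.0.1:8081/ipfs/"  # value hard coded in the PR snippet
PUBLIC_GATEWAY = "https://ipfs.io/ipfs/"       # assumption: one example public gateway


def gateway_url(gateway: str, cid: str) -> str:
    """Join a gateway base URL and a CID with exactly one slash between them."""
    return gateway.rstrip("/") + "/" + cid


def http_get(url: str, timeout: float = 5.0) -> bytes:
    """Plain HTTP GET; stands in for requests.get(...) in the PR snippet."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_from_gateways(
    cid: str,
    gateways: Iterable[str],
    get: Callable[[str], bytes] = http_get,
) -> bytes:
    """Try each gateway in order and return the first successful response."""
    last_error: Optional[Exception] = None
    for gateway in gateways:
        try:
            return get(gateway_url(gateway, cid))
        except OSError as error:  # URLError and socket timeouts are OSError subclasses
            last_error = error
    raise RuntimeError(f"no gateway could serve {cid}") from last_error
```

A caller would pass something like `[LOCAL_GATEWAY, PUBLIC_GATEWAY]`, so the gateway list becomes configuration rather than a constant inside the library. Whatever gateway answers, the client still has to verify the returned bytes itself.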
I think I got it now, it's expecting that there is an IPFS application running on localhost that serves as an HTTP-to-IPFS proxy: https://daniel.haxx.se/blog/2022/08/10/ipfs-and-their-gateways/ Maybe this has advantages that I don't see or is something IPFS users know to expect... but I think this is not a great idea for a client library. It's basically an undocumented runtime dependency on a webservice. Even if the URL was not hard coded it feels wrong. This may be a stupid question and IPFS just does not work like that but ... Is there no self-contained IPFS python module we could depend on?
@jku There seems to be no actively maintained IPFS python module (ref: https://discuss.ipfs.tech/t/why-there-is-no-python-working-library-to-work-with-ipfs/15871). However, I can see some recent activity in https://github.com/ipfs-shipyard/py-ipfs-http-client (see: ipfs-shipyard/py-ipfs-http-client#316)
I've taken a closer look now:
My hand wavy suggestion would be to
I'm available for a meeting (in EEST office hours) if the above sounds like I'm not making sense or there's a significant disagreement: like I said elsewhere, CAS and IPFS are not something I'm familiar with so I could be making assumptions...
@jku So on one hand, I think it should be possible to support both HTTP and IPFS downloading, and then the client can choose. Allowing the server side to easily do both (and possibly more content addressable stores) would be
On the other hand, I am very interested in trying to IPFS-ify the metadata itself as a follow-up task, and for that I think further differences from the way things are done today are desirable. E.g. I think explicit/TUF-level snapshots are no longer needed if the root always signs a single Merkle DAG containing everything --- consistency is effectively delegated to the IPFS layer below. More divergences of course mean less to leverage for
@jku I largely agree with your points. We've discussed having this be a standalone application (though we were also considering is
Could you elaborate so I'm not misunderstanding it? We do have this change proposed in the TAP for CAS, do you mean this can't be included until the TAP is approved and text merged into the spec? Also, a generic TUF client cannot choose the CAS, it must first be enabled on the server side by updating the metadata. A generic client that does not support the CAS at this point will of course fail.
@Ericson2314 I'm not sure if this is practical, though it depends on "root" in your message. Do you mean we remove the snapshot role and have the timestamp role identify the IPFS root node that contains the current set of all TUF metadata?
Today a TUF client that sees targetpath "ipfs:abcdef" will use that string to build a URL and will try to download that URL with HTTP. With this PR, the client would not do that and would instead connect to an IPFS gateway on localhost. It looks like a change in functionality that client apps should explicitly enable... Maybe this is not very important in practice but it has a bit of a smell. The more important point was in the previous paragraph: No existing repository can just start using IPFS targetpaths because the clients would just stop working even if the TUF library had the IPFS feature (since approximately no-one runs an IPFS gateway).
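The behaviour change described above, and the idea of making it opt-in, can be sketched as follows. This is an illustration only, not python-tuf's API: `resolve_download_url` and its parameters are hypothetical, and the `ipfs:` prefix is taken from the "ipfs:abcdef" example in the comment.

```python
from typing import Optional

IPFS_PREFIX = "ipfs:"  # prefix used in the "ipfs:abcdef" example above


def resolve_download_url(
    targetpath: str,
    target_base_url: str,
    ipfs_gateway_url: Optional[str] = None,
) -> str:
    """Return the URL a client would download a target from.

    With no gateway configured (the default), every targetpath is appended
    to the repository's base URL, which is today's behaviour. Only when the
    application passes a gateway URL are "ipfs:" targetpaths sent there.
    """
    if ipfs_gateway_url is not None and targetpath.startswith(IPFS_PREFIX):
        cid = targetpath[len(IPFS_PREFIX):]
        return ipfs_gateway_url.rstrip("/") + "/" + cid
    return target_base_url.rstrip("/") + "/" + targetpath
```

For example, `resolve_download_url("ipfs:abcdef", "https://example.com/targets/")` keeps the plain HTTP URL, while adding `ipfs_gateway_url="http://127.0.0.1:8081/ipfs/"` routes the same targetpath to the gateway. An explicit switch like this does not solve the second problem raised above: a repository that publishes IPFS targetpaths still breaks clients that have no gateway available.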
@shubham4443 the discussion is a bit removed from your PR, sorry about that. I know this is still marked a draft and I feel a bit bad about drowning the PR with these comments: if you'd rather continue in peace and quiet just say so, we can find another place to have these talks.
Yeah I have seen that mentioned but never expanded on: that would indeed change the equation... but I think that idea is quite far from the current PR and I'm not too keen on letting the existence of that idea affect the decision made here. To be more specific:
If this PR should be viewed more as ground work for metadata-over-IPFS then I think I'd like to see at least a brief design doc for that.
@jku So the GSOC as formally proposed is just about targets to keep the scope manageable. I personally am most interested in IPFS metadata (+ targets) and the conceptual refactoring that comes with it --- as I think that's when we cross the threshold from "adding misc features" to "reducing complexity via separating concerns" --- but I don't want to speak for the others. I wanted to include where things might go next to give you additional context, but yes the location of this work should probably be decided mostly/entirely based on the scope of the GSOC.
Closing since this now lives in https://github.com/theupdateframework/tap19-ipfs-poc
Fixes #2325
Description of the changes being introduced by the pull request:
This PR is part of a GSoC'23 project. It introduces support for content addressable systems like IPFS in TUF. A detailed implementation document can be found here.
In order to test with an actual content addressable system, I have created a sample repository which contains metadata with an IPFS target. Simply run the following commands to download the target file.
The PR is in draft status.
The hashes and length fields in targets.json are redundant in the case of content addressable systems and need to be removed.
cc: @adityasaky @Ericson2314 @mnm678
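The claim that a separate hashes entry is redundant rests on the address itself being derived from a hash of the content. The sketch below illustrates that for one narrow case, under stated assumptions: it builds a CIDv1 for a single raw block (raw codec, sha2-256 multihash, base32 multibase). Larger files that IPFS splits into chunks, or files added with other settings, get a different CID, so this is not a general CID calculator. `raw_cidv1` and `matches_cid` are hypothetical helper names.

```python
import base64
import hashlib


def raw_cidv1(data: bytes) -> str:
    """CIDv1 of a single raw block: version 1, raw codec, sha2-256 multihash."""
    digest = hashlib.sha256(data).digest()
    multihash = bytes([0x12, 0x20]) + digest     # 0x12 = sha2-256, 0x20 = 32-byte digest
    cid_bytes = bytes([0x01, 0x55]) + multihash  # 0x01 = CID version, 0x55 = raw codec
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded                         # "b" = multibase prefix for base32


def matches_cid(data: bytes, cid: str) -> bool:
    """Check downloaded bytes against the address they were requested by."""
    return raw_cidv1(data) == cid
```

Because the digest is embedded in the identifier, a client that recomputes the CID from the downloaded bytes has already performed the check a hashes entry would provide. The sketch says nothing about the length field, which a CID of this form does not encode.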
Please verify and check that the pull request fulfills the following
requirements: