Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Better functionality for ipfs add --to-files -w #10658

Closed
wants to merge 9 commits into from

Conversation

ProximaNova
Copy link

@ProximaNova ProximaNova commented Jan 10, 2025

Make ipfs add + --to-files + --wrap-with-directory work by adding the root CID of what was added to MFS. The is different from past official Kubo which did this: added a literal "." folder to MFS (bug). It's also different from current official Kubo: disallow add to be ran with to-files and wrap. Todo/ideas for this fork: make something like --to-files-cid which only adds root added CID to MFS, whether it's a file, folder, ran with wrap, ran without wrap. (Done: add+to-files+wrap always defaults to adding the CID to MFS and not literal "." and not disabled.)

This update makes perfect sense with --to-files; in other words it does the following:

  • Without wrap, the final thing is "[CID] [file/folder name]" -> file/folder name added to MFS.
  • With wrap, the final thing is "[CID]" -> CID added to MFS

If a user ran add + --to-files + -w then it's expected behavior to add the final CID to MFS. It being disabled would be unexpected. BTW, https://github.com/ipfs/community/blob/master/CONTRIBUTING_GO.md says something about changing branch name to something other than "master"; I didn't do that.

@ProximaNova ProximaNova requested a review from a team as a code owner January 10, 2025 05:17
@ProximaNova ProximaNova changed the title Better functionality for ipfs add --to-files -w fix: Better functionality for ipfs add --to-files -w Jan 10, 2025
@lidel lidel requested a review from guillaumemichel January 14, 2025 15:49
@guillaumemichel
Copy link

Related to #10611 and #10612

@ProximaNova
Copy link
Author

Related to 10611 and 10612

(yes it is related to those two. I didn't mention that for some reason.)

Some checks were not successful
4 failing and 12 successful checks

I see this message in regards to and on this pull request after logging in to GitHub. This is insignificant syntax pickiness such as "should omit type bool from declaration; it will be inferred from the right-hand side (stylecheck)" at https://github.com/ipfs/kubo/actions/runs/12703758495/job/35598233653?pr=10658 . I compiled everything before making this pull request and my modified Go code all worked. It compiled or was built with no errors or warnings showing up in the CLI. The resulting binary worked as I described. I am saying all this to point out that those minor "fails" should not detract much from this idea and this functional code.

Copy link

@guillaumemichel guillaumemichel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now passing all tests except sharness tests that need to be updated to reflect the change of behaviour.

I am not sure of the value of combining -w and --to-files, which results in adding an unnamed directory in MFS. Filesystems usually use human readable names, so I don't understand why it would be desirable to have a directory named after a CID. It would be like calling mkdir without providing a name for the new directory.

@lidel
Copy link
Member

lidel commented Jan 21, 2025

Triage notes:

  • using CID as the name in MFS is potential UX footgun
  • using UUID instead is betetr but still feels like a hack
  • at the end of the day, user that needs to adjust the name of parent dir can run ipfs files mkdir -p before adding
    • this makes this PR a net negative, because it introduces name generation that is outside of user control

@ProximaNova we also may miss the point of your PR. Do you mind sharing exact CLI commands you would use with this PR, and how you have to do things without it? We want to make sure existing UX is acceptable.

@lidel lidel added the need/author-input Needs input from the original author label Jan 21, 2025
@ProximaNova
Copy link
Author

ProximaNova commented Jan 22, 2025

@guillaumemichel

don't understand why it would be desirable to have a directory named after a CID.

The benefit would be that everything you add to IPFS also easily or quickly (in one command) gets added to MFS in an explicitly/very deduplicated way* with no bad collisions. This is true if you always run the same options. Like if you always run "ipfs add -wrH --to-files=/cid fileOrFolderName". My original thoughts were about doing things in an "archived by design" way. Everything added to ipfs should also be added to mfs (not later, but immediately and by design), even if it's in an imperfect starter-level state. This is because organizing things in a file system setup (mfs) is better than merely having a set of pins.

Without this pull request code, there is ways to always run --to-files and never hit annoying collisions (like "readme.txt already exists in the folder"). With Bash, you can always run "time=$(date +%Y)/...; ipfs files mkdir -p /cid/$time; ipfs add -rHw --to-files=/cid/$time fileOrFolderName # $time = 2025/01/09/02/03/04". That "messily" fixes the collision problem, but not the deduplication problem in mfs.

@lidel

using CID as the name in MFS is potential user experience footgun

I anticipated such an idea. The idea that whatever added to MFS by --to-files should only be a file name of folder name of something in the OS file system. However, having options is good, and just disabling -w + --to-files is not the best solution. Something has to happen if add + -w + --to-files is used. We could even have a thing in $IPFS_PATH/config JSON which does this:

  • toFilesWrap=false= it is just disabled
  • toFilesWrap=true= my solution

If users mainly have the idea that --to-files= should only add such OS filesystem filenames and folder names to MFS then my solution is not great. However, if users mainly think of --to-files as adding the final thing -- filename/dirname/hash/CID -- to MFS (my mindset which I explained as making sense in the OP), then my solution is a great one.

and how you have to do things without it? We want to make sure existing UX is acceptable.

Without this code, to do what this code does I have only two lame options (Bash):

  • cid=$(ipfs add -rHw -Q fileOrFolderName); ipfs files cp /ipfs/$cid /cid/$cid
  • ipfs add -rHw fileOrFolderName # then manually copy the final CID and run "ipfs files cp /ipfs/$cid /cid/$cid"

I dislike the first method because there isn't a list of things in the CLI. Also, there maybe is no progress indicator (so use -q instead of -Q?). The second method is annoying: too much clicking and copying and pasting.

*Deduplication by unique CIDs, of course you can have two different folders which have the same files (and different files).

@guillaumemichel
Copy link

Adding a directory to MFS without providing a name just feels wrong. Moreover, if a directory is named after its CID, nothing prevents the content within it to be mutated, and then the MFS directory name doesn't match the CID anymore, which breaks deduplication.

I don't think it should be kubo's responsibility to name user's files, just like it isn't mkdir responsibility to provide a directory name if the user omits it. If you want to programmatically derive a directory name for your content you are free to do it yourself (e.g using UUID).

Moreover deduplication would only work for single files, and identical directories, as long as they never mutate. If two distinct directories contain the same file, and that both directories are added, then the file will be duplicated anyway.

If you still want to name your MFS directories after the CID of the wrapping directory, and have the normal ipfs add output, you can use the following one-liner.

output=$(ipfs add -rHw <filename> 2>&1 | tee /dev/tty); cid=$(echo "$output" | grep -oP "(?<=added ).*$" | tail -1 | sed 's/[[:space:]]\+$//'); ipfs files cp /ipfs/$cid /$cid

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need/author-input Needs input from the original author
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants