Attached detached lite #390

Open
wants to merge 27 commits into
base: main

Conversation

Contributor

@ptsefton ptsefton commented Jan 10, 2025

This is another approach to clarifying Attached vs Detached that does not introduce much new terminology, conformsTo, etc. Again, this is a first pass to see if the approach makes sense; it will need checking.

I moved a bit of stuff around in structure.md, but there are no major changes there except to cover how to deal with a Detached RO-Crate Package. Also, I did not try to deal with how you'd link one to a website.

In the section on Data Entities I further tidied up the logic around @ids and contentUrls -- I think this has made it clearer, and I don't think it will be hard to implement.

Thanks for your feedback @simleo about the complexity in my last try.

@ptsefton ptsefton requested review from elichad, stain and simleo and removed request for elichad January 10, 2025 04:49
docs/_specification/1.2-DRAFT/data-entities.md (outdated)
* an absolute URI
* a local reference beginning with `#`

For an _Attached RO-Crate Package_:
* The `@id` MUST be a relative path that resolves to a directory that is present in the _RO-Crate Root_.
Contributor

A Dataset in an attached crate can also be web-based, with an absolute URI as its @id: this is already allowed in RO-Crate 1.1, and I don't think we should change that. The requirement that the directory be present is also absent in 1.1: adding it would basically force validators to perform a check on the file system for every Dataset in the crate, which could be quite expensive.
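
For reference, the three `@id` forms under discussion (absolute URI, local reference beginning with `#`, relative path) can be told apart from the string alone, without touching the file system. The sketch below is not code from this PR or from any RO-Crate library; the function name `classify_dataset_id` and its return labels are hypothetical, chosen only to illustrate a check that stays cheap for validators.

```python
from urllib.parse import urlsplit


def classify_dataset_id(entity_id: str) -> str:
    """Classify a Dataset @id by inspecting the string only.

    Returns "local-reference", "absolute-uri", or "relative-path".
    No file-system or network access is performed (hypothetical helper).
    """
    if entity_id.startswith("#"):
        return "local-reference"
    if urlsplit(entity_id).scheme:
        # Anything with a URI scheme (https:, ftp:, ...) is treated as absolute.
        return "absolute-uri"
    return "relative-path"
```

Whether a relative path must also exist in the _RO-Crate Root_ is a separate check, and it is the one that requires file-system access.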

Contributor Author

OK - I don't think I looked at this on this edit -- what should it say?

   - References to files and directories in the RO-Crate Metadata Document are present in the RO-Crate or available online as [Web-based Data Entities](data-entities.html#web-based-data-entities).
2. A _Detached RO-Crate Package_:
   - Is defined by a stand alone RO-Crate metadata document which may be stored in a file or distributed via an API.
   - If stored in a file, known as a _Detached RO-Crate Metadata File_, the filename SHOULD be `${slug}-ro-crate-metadata.json` where the variable `$slug` is a human readable version of the dataset's ID or name, to signal that the document should be interpreted as part of an _Attached RO-Crate Data Package_.
Contributor

This looks like something that could break the algorithm for finding the metadata file and root data entity. I also have trouble understanding the use case, which doesn't seem to be mentioned elsewhere: can a detached crate be part of an attached crate? What would that imply?

Contributor Author

It should not affect finding the RO-Crate Metadata description, as that is ALWAYS ro-crate-metadata.json in all cases -- it is essentially a magic string (this is already noted for the case where you get a crate over an API).

The intention of the proposed changes here is that, when you are dealing with a Detached crate, an algorithm should never have to find it: it will be passed in directly as a file, a string, or a URI to some endpoint. The point of recommending this slug is twofold: first, to distinguish files people might have in their downloads (at the moment you end up with a lot of ro-crate-metadata.json files), and second, to STOP detached crates from being accidentally treated as Attached RO-Crate Packages.

If this makes sense then I will look at making it clearer in the spec.
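
To make the naming proposal concrete, here is a minimal sketch of how a tool might build a `${slug}-ro-crate-metadata.json` filename and avoid mistaking it for the attached case. The draft only says the slug is a human-readable version of the dataset's ID or name, so the slug rule below (lowercase, other characters collapsed to `-`, fallback `crate`) is an assumption, and the names `slugify`, `detached_metadata_filename` and `looks_attached` are hypothetical.

```python
import re

# The exact filename used for the metadata file in an RO-Crate Root.
ATTACHED_METADATA_FILENAME = "ro-crate-metadata.json"


def slugify(name: str) -> str:
    """Assumed slug rule: lowercase, runs of non-alphanumerics become '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    # Fallback for names with no usable characters (an assumption).
    return slug or "crate"


def detached_metadata_filename(name: str) -> str:
    """Build the recommended filename for a Detached RO-Crate Metadata File."""
    return f"{slugify(name)}-{ATTACHED_METADATA_FILENAME}"


def looks_attached(filename: str) -> bool:
    """True only for the exact attached filename, never for a slugged one."""
    return filename == ATTACHED_METADATA_FILENAME
```

With an exact-match check like `looks_attached`, a downloaded `my-survey-data-2024-ro-crate-metadata.json` would not be picked up as the metadata file of an Attached RO-Crate Package.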

docs/_specification/1.2-DRAFT/structure.md (outdated)

At the basic level, an Attached RO-Crate is a collection of files and resources represented as a Schema.org [Dataset], that together form a meaningful unit for the purposes of communication, citation, distribution, preservation, etc. The _RO-Crate Metadata Document_ describes the RO-Crate, and MUST be stored in the _RO-Crate Root_.
In a _Detached RO-Crate Package_ the [root data entity](root-data-entity) SHOULD have an `@id` which is a URL that resolves to the _RO-Crate Metadata Document_.
Contributor

Why would the @id of the root data entity resolve to the metadata document? They are two separate entities with different ids.

Contributor Author

If you have a detached crate then in many cases it will have some kind of online home at an API or a website. This is saying that the @id should point at that, so it should say something like:

In a Detached RO-Crate Package the root data entity SHOULD have an @id which is a URL that resolves to an online source for the RO-Crate Metadata Document, which may be on the web or available over an API.

Does that make sense to you, @simleo?
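
As a reference point for this thread, the sketch below shows what such a document might look like and how the two entities stay distinct: the metadata descriptor keeps the `ro-crate-metadata.json` `@id`, while the root data entity has a URL `@id`. It follows the metadata descriptor convention from RO-Crate 1.1 (the 1.2 draft may differ); the URL `https://example.org/crates/demo`, the entity name, and the helper `find_root_data_entity` are placeholders invented for illustration.

```python
from typing import Optional

# Hypothetical Detached RO-Crate Metadata Document; the URL is a placeholder.
detached_crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            # Metadata descriptor: its @id is the fixed "magic string".
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "https://example.org/crates/demo"},
        },
        {
            # Root data entity: a separate entity with a URL as its @id.
            "@id": "https://example.org/crates/demo",
            "@type": "Dataset",
            "name": "Demo detached crate",
        },
    ],
}


def find_root_data_entity(crate: dict) -> Optional[dict]:
    """Follow the descriptor's `about` link to the root data entity."""
    by_id = {entity["@id"]: entity for entity in crate.get("@graph", [])}
    descriptor = by_id.get("ro-crate-metadata.json")
    if descriptor is None:
        return None
    return by_id.get(descriptor.get("about", {}).get("@id"))
```

Because the lookup goes through the descriptor's `about` property, it does not depend on the root `@id` being `./`, so a URL-valued root `@id` works the same way.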
