Sorting out reserved and open URIs for persons, places, works, etc. #950

Open
5 tasks done
wlpotter opened this issue Feb 2, 2022 · 2 comments

Comments


wlpotter commented Feb 2, 2022

@dlschwartz now that we have the csv transform working for persons and places, I was thinking we should make sure we've tracked down all the URIs or csv data that have not yet been converted and added as TEI to the app. (I may also make the works module, since @jsaintlauren and I are hoping to add some new hagiography records this spring. We are also getting ready to start tagging entities in the mss catalogue, some of which might need newly minted URIs.)

I know we had #865 as a discussion of places, which I need to follow up on.

I have no idea what the situation with persons or works is. I think I will proceed as follows, and we can confer with @davidamichelson, @jsaintlauren, and @nathangibson on other places to look for these data.

  • create a list, likely just a CSV, of all entity URIs on the master branch (include deprecated, but mark them as such)
  • create a similar list of data that's on dev but not on master (unique to dev)
  • sort out the situation with the tables and repositories cited in Aleppo (diocese) #865 for places
  • make sure the Syriac World Places all got moved to master
  • track down other 'draft data' sources, especially for persons and works

That's everything I can think of for right now. It might then be useful to formalize and centralize an intra-Syriaca method of reserving URIs and creating new records (e.g. a shared Google sheet that has URI lists for all entity types). As we discussed, the csv transform should ideally cut down on any wait time between URI reservation, record creation, and publishing on the server, but it would still be useful to have a centralized way of tracking that process.
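
As a rough illustration of the first two list items, here is a minimal sketch of comparing two per-branch URI lists to find what is unique to dev. It assumes each list is a CSV with a `uri` column; the column names, the sample rows, and the helper names `read_uris` and `unique_to` are hypothetical and not taken from any existing Syriaca script.

```python
import csv
import io


def read_uris(csv_text, column="uri"):
    """Return the set of URIs found in one column of a CSV report."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column].strip() for row in reader if row.get(column)}


def unique_to(branch_uris, baseline_uris):
    """URIs present on one branch but missing from the baseline, sorted."""
    return sorted(branch_uris - baseline_uris)


# Made-up sample reports; real ones would be read from files on disk.
MASTER_CSV = """uri,status
http://syriaca.org/place/1,published
http://syriaca.org/place/2,deprecated
"""
DEV_CSV = """uri,status
http://syriaca.org/place/1,published
http://syriaca.org/place/2,deprecated
http://syriaca.org/place/3,draft
"""

master = read_uris(MASTER_CSV)
dev = read_uris(DEV_CSV)
print(unique_to(dev, master))
```

Deprecated records stay in both sets here, since the plan above is to include them and only mark them as such.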


wlpotter commented Mar 4, 2022

Okay, so I've begun working on this. I have a script that outputs the URIs, and some related info, for all the records that are currently on a given branch (including deprecated records, which I mark as such). The results of this for "master", "dev", and "new-places-from-Syriac-World" are here. The good news is that all persons, places, and works that are on dev have a record on master.
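
For anyone who wants to reproduce this kind of report, here is a minimal sketch of pulling a URI and a deprecation flag out of one TEI record. It is not the actual script. It assumes the record URI sits in `publicationStmt/idno[@type='URI']` and that deprecated records carry `status="deprecated"` on `revisionDesc`; both are assumptions about the record layout, and `summarize_record` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def summarize_record(tei_xml):
    """Extract the URI and revision status from a single TEI record string."""
    root = ET.fromstring(tei_xml)
    # Assumed location of the record URI; adjust if records store it elsewhere.
    idno = root.find(".//tei:publicationStmt/tei:idno[@type='URI']", TEI_NS)
    rev = root.find(".//tei:revisionDesc", TEI_NS)
    uri = idno.text.strip() if idno is not None and idno.text else None
    status = rev.get("status") if rev is not None else None
    # "deprecated" as a status value is an assumption made for this sketch.
    return {"uri": uri, "status": status, "deprecated": status == "deprecated"}


# Made-up record, trimmed to the elements the sketch reads.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <publicationStmt>
        <idno type="URI">http://syriaca.org/place/2</idno>
      </publicationStmt>
    </fileDesc>
    <revisionDesc status="deprecated"/>
  </teiHeader>
</TEI>"""

print(summarize_record(SAMPLE))
```

Running this over every record file on a checked-out branch and writing the resulting dictionaries to CSV would give one report per branch.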

Syriac World places have not been pulled in, but see #873 which should be ready to go.

I think the next steps for tracking down potentially missing TEI records are as follows:

  1. Go through the spreadsheets and draft repositories mentioned in Aleppo (diocese) #865 and check them against the records that are on master (perhaps collecting all the various data locations into a central list on this issue, to verify that these are all the locations). Make a list of the ones that do not have a record.
    • if the URI is from a draft repository we can potentially move it over with the revisionDesc status of "uncorrected-draft"
    • if it's from a spreadsheet we may need to track down if a record was created elsewhere, or if we need to add data and run a csv2srophe transform.
  2. Create a list of URIs that are referenced by our current data (e.g., in @ref or @passive attributes). Check this list of URIs (for persons, places, and works at least) against the list of records on master. This will tell us which URIs have been created and used but do not yet have a record.
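
Step 2 could be sketched roughly as follows. This is an illustration under stated assumptions, not existing project code: it assumes entity URIs look like `http://syriaca.org/person/123`, treats @ref and @passive values as whitespace-separated lists, and uses made-up sample data. The names `referenced_uris` and `missing_records` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed URI shape for persons, places, and works.
ENTITY_URI = re.compile(r"^http://syriaca\.org/(person|place|work)/\d+$")


def referenced_uris(tei_xml):
    """Collect entity URIs found in @ref and @passive anywhere in a record."""
    root = ET.fromstring(tei_xml)
    found = set()
    for elem in root.iter():
        for attr in ("ref", "passive"):
            # An attribute may hold several URIs separated by whitespace.
            for token in elem.get(attr, "").split():
                if ENTITY_URI.match(token):
                    found.add(token)
    return found


def missing_records(referenced, on_master):
    """URIs that are referenced in the data but have no record on master."""
    return sorted(referenced - on_master)


# Made-up record and made-up list of URIs that have records on master.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p><persName ref="http://syriaca.org/person/10">A</persName> in
       <placeName ref="http://syriaca.org/place/1">B</placeName></p>
    <relation name="x" active="http://syriaca.org/place/1"
              passive="http://syriaca.org/place/1 http://syriaca.org/place/99"/>
  </body></text>
</TEI>"""
on_master = {"http://syriaca.org/place/1", "http://syriaca.org/person/10"}

print(missing_records(referenced_uris(SAMPLE), on_master))
```

Other pointer attributes (for example @active or @mutual) are left out here and could be added to the tuple of attribute names if they turn out to matter.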

So far I have left taxonomy and bibl records out of this report (and SPEAR, since its URIs work differently). We may want to circle back and check on these at some point as well.


wlpotter commented Mar 8, 2022

@dlschwartz @davidamichelson @jsaintlauren here is a fuller write-up of what I sent via email this morning.

I believe I've made an exhaustive list of all URIs for persons, places, and works, here.

There are a few duplicated columns I need to clean up, but otherwise this report should be up to date.

The following sources were used in compiling this report:

Places

Persons

Works

wlpotter removed the MSS label Jul 15, 2022