Fix: adds datacite tests for Zenodo records, moves _detag function fr… #75
…om JATSParser to BaseBeautifulSoupParser and renames tagsets from JATS_ to HTML_
modified: adsingestp/parsers/base.py
modified: adsingestp/parsers/datacite.py
modified: adsingestp/parsers/jats.py
new file: tests/stubdata/input/zenodo_test.xml
new file: tests/stubdata/input/zenodo_test2.xml
new file: tests/stubdata/input/zenodo_test3.xml
new file: tests/stubdata/input/zenodo_test4.xml
modified: tests/stubdata/output/datacite_schema3.1_example-full.json
modified: tests/stubdata/output/datacite_schema4.1_example-full.json
modified: tests/stubdata/output/datacite_schema4.1_example-software.json
modified: tests/stubdata/output/datacite_schema4_example-habanero-pdsdataset.json
new file: tests/stubdata/output/zenodo_test.json
new file: tests/stubdata/output/zenodo_test2.json
new file: tests/stubdata/output/zenodo_test3.json
new file: tests/stubdata/output/zenodo_test4.json
modified: tests/test_datacite.py
modified: adsingestp/parsers/datacite.py
modified: adsingestp/parsers/jats.py
modified: tests/stubdata/input/zenodo_test4.xml
Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##             main      #75      +/-   ##
==========================================
+ Coverage   88.54%   89.79%   +1.25%
==========================================
  Files          24       25       +1
  Lines        2496     2616     +120
==========================================
+ Hits         2210     2349     +139
+ Misses        286      267      -19
Also adds OA/license capture to the DataCite parser.
@@ -471,6 +471,35 @@ class BaseBeautifulSoupParser(IngestBase):
    out of the input XML stream.
    """

    fix_ampersand = re.compile(r"(&)(.*?)(;)")

I don't know if you have ever encountered this in this context: I have seen cases where ampersands got encoded as __amp__amp;
Do you remember which publisher(s) specifically? I'm looking for an example to make a unit test with.
This file has an example: /proj/ads/references/sources/MNRAS/0423/iss4.wiley2.xml
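For anyone building the unit test discussed in this thread, here is a minimal sketch of what the `fix_ampersand` pattern from the diff matches on its own. How the parser actually applies the pattern is not shown in this hunk, so `entity_like_spans` and the three sample strings are hypothetical and are not taken from the MNRAS file. The third sample assumes the `__amp__amp;` form should be read literally as written in the comment.

```python
import re

# Pattern copied from the diff hunk; its use inside the parser is not shown there.
fix_ampersand = re.compile(r"(&)(.*?)(;)")


def entity_like_spans(text):
    """Return every '&...;' span the pattern matches (hypothetical helper)."""
    return [m.group(0) for m in fix_ampersand.finditer(text)]


# Hypothetical sample strings for illustration only.
single = "Astronomy &amp; Astrophysics"
double = "Astronomy &amp;amp; Astrophysics"
placeholder = "Astronomy __amp__amp; Astrophysics"

print(entity_like_spans(single))       # one match: '&amp;'
print(entity_like_spans(double))       # one match: '&amp;', the trailing 'amp;' is left over
print(entity_like_spans(placeholder))  # no match, the string contains no '&'
```

Because the middle group is non-greedy, a doubly encoded ampersand only matches up to the first `;`, and a string with no literal `&` is not matched at all, so either form would likely need its own handling and its own test case.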
modified: adsingestp/parsers/base.py
new file: tests/test_base.py
modified: tests/test_base.py