Removing excess punctuation and unbalanced parentheses from affil strings #137

Merged
merged 8 commits into from
Oct 4, 2024

mugdhapolimera
Contributor

No description provided.

@codecov-commenter

codecov-commenter commented Oct 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.24%. Comparing base (6039d91) to head (b91d5e0).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #137      +/-   ##
==========================================
+ Coverage   91.19%   91.24%   +0.04%     
==========================================
  Files          25       25              
  Lines        2839     2854      +15     
==========================================
+ Hits         2589     2604      +15     
  Misses        250      250              

@mugdhapolimera
Contributor Author

mugdhapolimera commented Oct 2, 2024

@seasidesparrow RE: putting the parentheses normalization in base.py:
The function (_remove_unbalanced_parentheses) currently lives in the JatsAffils class, so the normalization happens while the affiliation string is being extracted. If we really want this in base.py, I could call it there, but we would have to cycle through each affiliation for each author before storing the author info.
Let me know if you think that is better.
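
For readers without the diff at hand, the trade-off described above can be sketched roughly as follows. This is an illustrative sketch only, not the code merged in this PR: the function bodies, the `clean_affiliation` and `normalize_author_affiliations` helpers, and the assumed author layout (a list of dicts with an `aff` list of strings) are all hypothetical.

```python
import re


def remove_unbalanced_parentheses(text):
    """Drop every '(' or ')' that has no matching partner; keep balanced pairs."""
    open_stack = []  # positions of '(' still waiting for a ')'
    drop = set()     # positions of characters to remove
    for i, ch in enumerate(text):
        if ch == "(":
            open_stack.append(i)
        elif ch == ")":
            if open_stack:
                open_stack.pop()
            else:
                drop.add(i)  # ')' with no opener before it
    drop.update(open_stack)  # any '(' that was never closed
    return "".join(ch for i, ch in enumerate(text) if i not in drop)


def strip_excess_punctuation(text):
    """Collapse repeated separators and trim stray ones at the ends."""
    text = re.sub(r"([,;:.])(?:\s*\1)+", r"\1", text)  # ',,' -> ','
    text = re.sub(r"\s{2,}", " ", text)                # squeeze whitespace
    return text.strip(" ,;:")


def clean_affiliation(text):
    """Option 1: clean a single string at extraction time."""
    return strip_excess_punctuation(remove_unbalanced_parentheses(text))


def normalize_author_affiliations(authors):
    """Option 2 (hypothetical base-level pass): loop over every
    affiliation of every author before the author info is stored."""
    for author in authors:
        author["aff"] = [clean_affiliation(a) for a in author.get("aff", [])]
    return authors
```

Under these assumptions, the extraction-time approach calls `clean_affiliation` once per string as it is parsed, while the base-level approach needs the nested pass in `normalize_author_affiliations` over all authors and their affiliations.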

@seasidesparrow
Member

@mugdhapolimera I'll accept your current solution -- I think we're getting the bulk of problematic records as jats. If other publisher formats continue to be a problem we can address it separately.

@mugdhapolimera mugdhapolimera self-assigned this Oct 3, 2024
Member

@seasidesparrow seasidesparrow left a comment

Looks good, merge when you're ready.

@seasidesparrow seasidesparrow merged commit ccb72f3 into adsabs:main Oct 4, 2024
4 checks passed
3 participants