This repository has been archived by the owner on Feb 18, 2021. It is now read-only.
Here is a typical parsed bibliography list item from the Open Typesetting Stack at http://pkp-udev.lib.sfu.ca/ in JATS format (authors omitted):
<ref id="R24">
<element-citation>
<article-title>
Randomized Controlled Trial of Family Therapy in Advanced Cancer Continued Into Bereavement
</article-title>
<source>Journal of Clinical Oncology</source>
<year>2016-apr</year>
<fpage>1921</fpage>
<lpage>1927</lpage>
</element-citation>
</ref>
And here is the same item from the Grobid transformation alone:
<biblStruct coords="7,103.10,142.06,449.34,10.80;7,103.10,155.86,449.44,10.80;7,103.10,167.17,449.72,13.30" xml:id="b12">
<analytic>
<title level="a" type="main">Randomized Controlled Trial of Family Therapy in Advanced Cancer Continued Into Bereavement</title>
</analytic>
<monogr>
<title level="j">J Clin Oncol</title>
<imprint>
<biblScope unit="volume">1</biblScope>
<biblScope unit="issue">16</biblScope>
<biblScope unit="page" from="34" to="1621" />
<date type="published" when="2016" />
</imprint>
</monogr>
</biblStruct>
As you can see, the volume and issue information is lost in the resulting JATS XML. I suppose the Grobid module parses this data from the DOI or PubMed links that are included in all our bibliographic citation list items, and that it is lost somewhere in the TEI-to-JATS transformation stage. This issue affects all the articles that I have processed with this online service so far (about 20).
The page, journal title, and year information also differs, so maybe the references come from another module. In that case, volume and issue could be taken from Grobid.
Hmm, as far as I can see from the Grobid TEI-to-JATS XSLT, Grobid is not used for reference rendering at all. That's a pity; it may parse references better than the other modules :)
That's correct, we don't use Grobid for reference parsing -- either Cermine or meTypeset is used to detect the reference section, which is then sent to CrossRef to match known-good data, and ParsCit is used to parse any references that didn't have a DOI and couldn't be looked up. ParsCit still outperforms all other local reference parsing solutions that we've tried.
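For readers following along, the lookup-then-fallback order described here could be sketched roughly as below. This is only an illustration of the control flow, not the stack's actual code: `resolve_references`, `lookup_crossref`, and `parse_locally` are hypothetical names, and the two callables stand in for the CrossRef matching step and the ParsCit step.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical record type: a parsed reference as a plain dict.
Record = dict


def resolve_references(
    raw_refs: Iterable[str],
    lookup_crossref: Callable[[str], Optional[Record]],
    parse_locally: Callable[[str], Record],
) -> List[Record]:
    """Try the remote lookup first; fall back to local parsing on a miss.

    Both callables are placeholders (assumptions), injected so the sketch
    does not depend on any real CrossRef or ParsCit API.
    """
    resolved = []
    for raw in raw_refs:
        record = lookup_crossref(raw)
        if record is None:
            # No known-good match was found, so parse the raw string instead.
            record = parse_locally(raw)
        resolved.append(record)
    return resolved


# Stub callables used only to demonstrate the flow.
def fake_crossref(raw: str) -> Optional[Record]:
    return {"source": "crossref", "raw": raw} if "doi" in raw.lower() else None


def fake_local_parser(raw: str) -> Record:
    return {"source": "local", "raw": raw}


results = resolve_references(
    ["Some article. doi:10.0000/example", "Another article without an identifier"],
    fake_crossref,
    fake_local_parser,
)
```

With the stubs above, the first reference is resolved by the lookup and the second falls through to the local parser.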
Sometimes Cermine does not detect the reference section correctly. For example, in an article that I tested with Cermine alone, the first 2 references were lost; they were parsed as article text. That was not the case with Grobid. I have not tested Grobid much, so I cannot say for sure which is better. I am also planning to parse all our articles with the Open Typesetting Stack, and I can compare the resulting reference sections with the Grobid equivalents to see the difference, if that would help you in development, of course.
Nevertheless, it would be great to add volume and issue tags to the JATS during transformation, because adding them is currently manual work for us. I think they are lost at the stage where the CrossRef data is rendered (this is the case when the referenced article has a DOI or PMID).
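To illustrate the kind of mapping I mean, here is a minimal Python sketch that carries `volume` and `issue` from a TEI `biblStruct` over to a JATS `element-citation`. It is not the stack's XSLT, only an assumption of how the mapping could look. The function name `tei_to_jats` is made up, the input is the Grobid snippet from above with its values copied verbatim (the `coords` attribute dropped), and it assumes the snippet carries no TEI namespace, as it does here; a full Grobid document would declare one.

```python
import xml.etree.ElementTree as ET

# The Grobid item quoted above, without the coords attribute.
TEI_SNIPPET = """<biblStruct xml:id="b12">
  <analytic>
    <title level="a" type="main">Randomized Controlled Trial of Family Therapy in Advanced Cancer Continued Into Bereavement</title>
  </analytic>
  <monogr>
    <title level="j">J Clin Oncol</title>
    <imprint>
      <biblScope unit="volume">1</biblScope>
      <biblScope unit="issue">16</biblScope>
      <biblScope unit="page" from="34" to="1621" />
      <date type="published" when="2016" />
    </imprint>
  </monogr>
</biblStruct>"""


def tei_to_jats(bibl: ET.Element) -> ET.Element:
    """Build a JATS element-citation from a namespace-free TEI biblStruct."""
    citation = ET.Element("element-citation")

    def add(tag, text):
        # Only emit a child element when there is a value to put in it.
        if text:
            ET.SubElement(citation, tag).text = text.strip()

    title = bibl.find("./analytic/title")
    add("article-title", title.text if title is not None else None)

    journal = bibl.find("./monogr/title[@level='j']")
    add("source", journal.text if journal is not None else None)

    for scope in bibl.findall("./monogr/imprint/biblScope"):
        unit = scope.get("unit")
        if unit in ("volume", "issue"):
            # JATS uses the same element names: <volume> and <issue>.
            add(unit, scope.text)
        elif unit == "page":
            add("fpage", scope.get("from"))
            add("lpage", scope.get("to"))

    date = bibl.find("./monogr/imprint/date")
    when = date.get("when") if date is not None else None
    # Keep only the four-digit year from values such as "2016-04".
    add("year", when[:4] if when else None)
    return citation


citation = tei_to_jats(ET.fromstring(TEI_SNIPPET))
print(ET.tostring(citation, encoding="unicode"))
```

The same idea applies whatever the data source is: if the CrossRef record carries volume and issue, they only need to be written out as `<volume>` and `<issue>` next to `<fpage>` and `<lpage>`.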