Add xsl for handling clinical trial numbers (#157)
* Add xsl for handling clinical trial numbers

* Update kitchen sink and all other test cases

* Add readme description

* tweak whitespace in output

* add missing text in templates

* Add test case for new xsl and update all case
fred-atherden authored Jan 15, 2025
1 parent bee27fd commit 30d9739
Showing 26 changed files with 1,181 additions and 92 deletions.
39 changes: 39 additions & 0 deletions README.md
@@ -166,6 +166,45 @@ This xsl accounts for the capture of `<code>` in XML. Encoda correctly decodes t

We need support for `CodeBlock` added to EPP client. And we need Encoda to properly decode `<preformat>`.

### [/src/related-object-workaround.xsl](/src/related-object-workaround.xsl)

The `<related-object>` tag is used to capture clinical trial numbers (see tagging guidance [here](https://jats4r.niso.org/clinical-trials/)). Encoda does not adequately support this element, either ignoring it or stripping the embedded link and attribute values when present.

For example, for a JATS4R-compliant tagged clinical trial number in a structured abstract:

```xml
<sec>
<title>Clinical trial number:</title>
<p><related-object content-type="pre-results" document-id="dummy-trial" document-id-type="clinical-trial-number" source-id="DRKS" source-id-type="registry-name" source-type="clinical-trials-registry" xlink:href="https://drks.de/search/en/trial/dummy-trial">dummy-trial</related-object>.</p>
</sec>
```

The output in Encoda is missing the link:

```json
[
...,
{
"type": "Heading",
"id": "",
"depth": 1,
"content": [
"Clinical trial number:"
]
},
{
"type": "Paragraph",
"content": [
"dummy-trial",
"."
]
},
...
]
```

Since `related-object` can be used in numerous places, this xsl replaces the element with a hyperlink (`<ext-link>`) and, if necessary, moves it to a different location in the text so that it can be surfaced by EPP.
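
The inline replacement described above can be sketched outside XSLT. The Python below is only an illustration using the standard library; it is not part of this repository, and the helper names `is_linked_trial` and `replace_inline_trials` are hypothetical. It assumes the same matching rule as the stylesheet: a `related-object` with a non-empty `xlink:href` and `document-id-type="clinical-trial-number"`, sitting directly inside a `p`, `th` or `td`.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK)
HREF = f"{{{XLINK}}}href"


def is_linked_trial(el):
    """True for a related-object that is a clinical trial number with a non-empty link."""
    return (
        el.tag == "related-object"
        and el.get("document-id-type") == "clinical-trial-number"
        and bool(el.get(HREF))
    )


def replace_inline_trials(root):
    """Swap matching related-objects inside p, th or td for ext-link elements, in place."""
    # Snapshot the elements first so the tree is not mutated while it is being walked.
    for parent in list(root.iter()):
        if parent.tag not in ("p", "th", "td"):
            continue
        for index, child in enumerate(list(parent)):
            if not is_linked_trial(child):
                continue
            link = ET.Element("ext-link", {"ext-link-type": "uri", HREF: child.get(HREF)})
            link.text = child.text
            link.tail = child.tail    # keep the text that follows, e.g. the full stop
            link.extend(list(child))  # keep any inline child elements
            parent[index] = link
    return root


sample = (
    '<sec xmlns:xlink="http://www.w3.org/1999/xlink"><title>Clinical trial number:</title>'
    '<p><related-object document-id-type="clinical-trial-number" '
    'xlink:href="https://drks.de/search/en/trial/dummy-trial">dummy-trial</related-object>.</p></sec>'
)
root = replace_inline_trials(ET.fromstring(sample))
print(ET.tostring(root, encoding="unicode"))
```

As in the stylesheet, only the link target and the element's content are carried over; the other `related-object` attributes (`source-id`, `content-type` and so on) are dropped because `ext-link` has no equivalent for them.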

# Modify bioRxiv XML in preparation for Encoda

Prerequisites:
117 changes: 117 additions & 0 deletions src/related-object-workaround.xsl
@@ -0,0 +1,117 @@
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xlink="http://www.w3.org/1999/xlink"
exclude-result-prefixes="xs"
version="3.0">

<xsl:output method="xml" encoding="UTF-8"/>

<xsl:template match="*|@*|text()|comment()|processing-instruction()">
<xsl:copy>
<xsl:apply-templates select="*|@*|text()|comment()|processing-instruction()"/>
</xsl:copy>
</xsl:template>

<!-- Only match on related-objects which are determined to be clinical trial numbers with a link -->
<xsl:template match="related-object[@xlink:href!='' and @document-id-type='clinical-trial-number']">
<xsl:choose>
<!-- This is in a structured abstract: replace with a link -->
<xsl:when test="parent::p/parent::sec and ancestor::abstract">
<xsl:element name="ext-link">
<xsl:attribute name="ext-link-type">uri</xsl:attribute>
<xsl:apply-templates select="@xlink:href|*|text()|comment()|processing-instruction()"/>
</xsl:element>
</xsl:when>
<!-- This is simply included in the narrative flow: replace with a link -->
<xsl:when test="parent::p or parent::th or parent::td">
<xsl:element name="ext-link">
<xsl:attribute name="ext-link-type">uri</xsl:attribute>
<xsl:apply-templates select="@xlink:href|*|text()|comment()|processing-instruction()"/>
</xsl:element>
</xsl:when>
<!-- else: do nothing, retain it as related-object -->
<xsl:otherwise>
<xsl:copy>
<xsl:apply-templates select="*|@*|text()|comment()|processing-instruction()"/>
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

<!-- Introduce related-objects placed within article-meta into additional information sections
This case handles when an additional information section already exists in back -->
<xsl:template match="article[descendant::article-meta/related-object[@xlink:href!='' and @document-id-type='clinical-trial-number']]/back/sec[@sec-type='additional-information' or matches(lower-case(title[1]),'^additional information$')]">
<xsl:copy>
<xsl:apply-templates select="*|@*|text()|comment()|processing-instruction()"/>
<xsl:element name="sec">
<xsl:text>&#xa;</xsl:text>
<xsl:element name="p">
<xsl:text>Clinical trial number: </xsl:text>
<xsl:for-each select="ancestor::article//article-meta/related-object[@xlink:href!='' and @document-id-type='clinical-trial-number']">
<xsl:choose>
<xsl:when test="position() = 1">
<xsl:element name="ext-link">
<xsl:attribute name="ext-link-type">uri</xsl:attribute>
<xsl:apply-templates select="@xlink:href|*|text()|comment()|processing-instruction()"/>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:text>; </xsl:text>
<xsl:element name="ext-link">
<xsl:attribute name="ext-link-type">uri</xsl:attribute>
<xsl:apply-templates select="@xlink:href|*|text()|comment()|processing-instruction()"/>
</xsl:element>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
<xsl:text>.</xsl:text>
</xsl:element>
<xsl:text>&#xa;</xsl:text>
</xsl:element>
<xsl:text>&#xa;</xsl:text>
</xsl:copy>
</xsl:template>

<!-- Introduce related-objects placed within article-meta into additional information sections
This case handles when an additional information section does not exist in back -->
<xsl:template match="article[descendant::article-meta/related-object[@xlink:href!='' and @document-id-type='clinical-trial-number']]/back[not(sec[@sec-type='additional-information' or matches(lower-case(title[1]),'^additional information$')])]">
<xsl:copy>
<xsl:apply-templates select="*|@*|text()|comment()|processing-instruction()"/>
<xsl:element name="sec">
<xsl:attribute name="sec-type">additional-information</xsl:attribute>
<xsl:text>&#xa;</xsl:text>
<xsl:element name="title">Additional information</xsl:element>
<xsl:text>&#xa;</xsl:text>
<xsl:element name="sec">
<xsl:text>&#xa;</xsl:text>
<xsl:element name="p">
<xsl:text>Clinical trial number: </xsl:text>
<xsl:for-each select="ancestor::article//article-meta/related-object[@xlink:href!='' and @document-id-type='clinical-trial-number']">
<xsl:choose>
<xsl:when test="position() = 1">
<xsl:element name="ext-link">
<xsl:attribute name="ext-link-type">uri</xsl:attribute>
<xsl:apply-templates select="@xlink:href|*|text()|comment()|processing-instruction()"/>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:text>; </xsl:text>
<xsl:element name="ext-link">
<xsl:attribute name="ext-link-type">uri</xsl:attribute>
<xsl:apply-templates select="@xlink:href|*|text()|comment()|processing-instruction()"/>
</xsl:element>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
<xsl:text>.</xsl:text>
</xsl:element>
<xsl:text>&#xa;</xsl:text>
</xsl:element>
<xsl:text>&#xa;</xsl:text>
</xsl:element>
<xsl:text>&#xa;</xsl:text>
</xsl:copy>
</xsl:template>

</xsl:stylesheet>
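
The last two templates are the harder part to follow, so here is a rough Python re-expression of their combined behaviour. It is a sketch under assumptions, not the project's code: `surface_meta_trials` is a hypothetical name, only the first matching additional-information section is reused, and the title comparison is a simplified stand-in for the `matches(lower-case(title[1]), ...)` test.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK)
HREF = f"{{{XLINK}}}href"


def surface_meta_trials(article):
    """Append a 'Clinical trial number' paragraph to back matter for trials in article-meta."""
    meta = article.find(".//article-meta")
    back = article.find("back")
    if meta is None or back is None:
        return article
    trials = [
        ro for ro in meta.findall("related-object")
        if ro.get("document-id-type") == "clinical-trial-number" and ro.get(HREF)
    ]
    if not trials:
        return article

    # Reuse an existing additional-information section, matched by sec-type or title.
    info = None
    for sec in back.findall("sec"):
        title = (sec.findtext("title") or "").lower()
        if sec.get("sec-type") == "additional-information" or title == "additional information":
            info = sec
            break
    # Otherwise create the section, with its title, at the end of back.
    if info is None:
        info = ET.SubElement(back, "sec", {"sec-type": "additional-information"})
        ET.SubElement(info, "title").text = "Additional information"

    para = ET.SubElement(ET.SubElement(info, "sec"), "p")
    para.text = "Clinical trial number: "
    for position, trial in enumerate(trials):
        link = ET.SubElement(para, "ext-link", {"ext-link-type": "uri", HREF: trial.get(HREF)})
        link.text = trial.text
        # Links are separated by "; " and the last one is followed by a full stop.
        link.tail = "." if position == len(trials) - 1 else "; "
    return article


sample = (
    '<article xmlns:xlink="http://www.w3.org/1999/xlink"><front><article-meta>'
    '<related-object document-id-type="clinical-trial-number" '
    'xlink:href="https://drks.de/search/en/trial/dummy-trial">dummy-trial</related-object>'
    '<related-object document-id-type="clinical-trial-number" '
    'xlink:href="https://drks.de/search/en/trial/dummy-trial-2">dummy-trial-2</related-object>'
    '</article-meta></front><back/></article>'
)
article = surface_meta_trials(ET.fromstring(sample))
print(ET.tostring(article.find("back"), encoding="unicode"))
```

The original `related-object` elements stay in `article-meta`, matching the stylesheet's `xsl:otherwise` branch; only a linked copy is added to the back matter.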
18 changes: 14 additions & 4 deletions test/all/kitchen-sink.xml
@@ -82,8 +82,14 @@
<license><license-p>The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission.</license-p></license>
</permissions>
<self-uri xlink:href="223354.pdf" content-type="pdf" xlink:role="full-text"/>
<related-object content-type="pre-results" document-id="dummy-trial-2" document-id-type="clinical-trial-number" source-id="DRKS" source-id-type="registry-name" source-type="clinical-trials-registry" xlink:href="https://drks.de/search/en/trial/dummy-trial-2">dummy-trial-2</related-object>
<abstract>
<title>Abstract</title><p>T cell receptors (TCRs) are formed by stochastic gene rearrangements, theoretically generating &#x003E;10<sup>19</sup> sequences. They are selected during thymopoiesis, which releases a repertoire of about 10<sup>8</sup> unique TCRs per individual. How evolution shaped a process that produces TCRs that can effectively handle a countless and evolving set of infectious agents is a central question of immunology. The paradigm is that a diverse enough repertoire of TCRs should always provide a proper, though rare, specificity for any given need. Expansion of such rare T cells would provide enough fighters for an effective immune response and enough antigen-experienced cells for memory. We show here that human thymopoiesis releases a large population of CD8<sup>&#x002B;</sup> T cells harboring &#x03B1;/&#x03B2; paired TCRs that (i) have high generation probabilities and (ii) a preferential usage of some V and J genes, (iii) are shared between individuals and (iv) can each recognize and be activated by multiple unrelated viral peptides, notably from EBV, CMV and influenza. These polyspecific T cells may represent a first line of defense that is mobilized in response to infections before a more specific response subsequently ensures viral elimination. Our results support an evolutionary selection of polyspecific &#x03B1;/&#x03B2; TCRs for broad antiviral responses and heterologous immunity.</p>
<title>Abstract</title>
<p>T cell receptors (TCRs) are formed by stochastic gene rearrangements, theoretically generating &#x003E;10<sup>19</sup> sequences. They are selected during thymopoiesis, which releases a repertoire of about 10<sup>8</sup> unique TCRs per individual. How evolution shaped a process that produces TCRs that can effectively handle a countless and evolving set of infectious agents is a central question of immunology. The paradigm is that a diverse enough repertoire of TCRs should always provide a proper, though rare, specificity for any given need. Expansion of such rare T cells would provide enough fighters for an effective immune response and enough antigen-experienced cells for memory. We show here that human thymopoiesis releases a large population of CD8<sup>&#x002B;</sup> T cells harboring &#x03B1;/&#x03B2; paired TCRs that (i) have high generation probabilities and (ii) a preferential usage of some V and J genes, (iii) are shared between individuals and (iv) can each recognize and be activated by multiple unrelated viral peptides, notably from EBV, CMV and influenza. These polyspecific T cells may represent a first line of defense that is mobilized in response to infections before a more specific response subsequently ensures viral elimination. Our results support an evolutionary selection of polyspecific &#x03B1;/&#x03B2; TCRs for broad antiviral responses and heterologous immunity.</p>
<sec>
<title>Clinical trial number:</title>
<p><ext-link ext-link-type="uri" xlink:href="https://drks.de/search/en/trial/dummy-trial">dummy-trial</ext-link>.</p>
</sec>
<sec>
<title>Graphical abstract</title><fig id="figa1" position="float" orientation="portrait" fig-type="figure">
<label>Figure</label>
@@ -106,8 +112,6 @@
</counts>
</article-meta>
<notes>
<notes notes-type="competing-interest-statement">
<title>COMPETING INTEREST STATEMENT</title><p>The authors have declared no competing interest.</p></notes>
<fn-group content-type="external-links">
<fn fn-type="dataset"><p>
<ext-link ext-link-type="uri" xlink:href="https://vdjdb.cdr3.net">https://vdjdb.cdr3.net</ext-link>.
Expand Down Expand Up @@ -579,13 +583,19 @@ return :shrug:
<sec id="s12">
<title>Existing Termination Methods</title>
<p>Existing termination methods include physical damage, endovascular occlusion, Rose Bengal mediated photothrombosis, and chemical lesioning.</p>
<p>There are several well established techniques for mechanically damaging a small area of cortex to create a lesion, including blade lesioning <bold><italic><xref ref-type="bibr" rid="c46">Horsley and Schafer (1888</xref></italic></bold>); <bold><italic><xref ref-type="bibr" rid="c38">Sherrington (1893</xref></italic></bold>), vacuum aspiration <bold><italic><xref ref-type="bibr" rid="c24">Darling et al. (2016</xref></italic></bold>), vascular cauterization <bold><italic><xref ref-type="bibr" rid="c75">Nudo et al. (2003</xref></italic></bold>), and vascular ligation <bold><italic><xref ref-type="bibr" rid="c23">Rumajogee et al. (2016</xref></italic></bold>). All of these techniques require surgical access to cortex, which would likely disrupt an existing implanted microelectode array. Additionally, the sedation necessary for the surgery would prevent behavioral testing on the day of the lesion, precluding measurements of acute inactivation. Further, these techniques often create large lesions, and do not offer sub-millimeter precision.</p>
<p>There are several well established techniques for mechanically damaging a small area of cortex to create a lesion, including blade lesioning <bold><italic><xref ref-type="bibr" rid="c46">Horsley and Schafer (1888</xref></italic></bold>); <bold><italic><xref ref-type="bibr" rid="c38">Sherrington (1893</xref></italic></bold>), vacuum aspiration <bold><italic><xref ref-type="bibr" rid="c24">Darling et al. (2016</xref></italic></bold>), vascular cauterization <bold><italic><xref ref-type="bibr" rid="c22">Nudo et al. (2003</xref></italic></bold>), and vascular ligation <bold><italic><xref ref-type="bibr" rid="c23">Rumajogee et al. (2016</xref></italic></bold>). All of these techniques require surgical access to cortex, which would likely disrupt an existing implanted microelectode array. Additionally, the sedation necessary for the surgery would prevent behavioral testing on the day of the lesion, precluding measurements of acute inactivation. Further, these techniques often create large lesions, and do not offer sub-millimeter precision.</p>
<p>Endovascular techniques are commonly used as models of stroke. For example, endovascular physical occlusion of the middle cerebral artery (MCA) of one hemisphere is a common rodent model of stroke <bold><italic><xref ref-type="bibr" rid="c32">Fluri et al. (2015</xref></italic></bold>), but it is challenging to precisely control the extent of cortical damage. MCA occlusion could cause indiscriminate injury to a large regions of cortex, due to continued, widespread neuronal death after the occlusion. It could potentially damage the area in which the microelectrode array is implanted, preventing meaningful recordings. As one descends into smaller branches of the MCA, survivability, localization, and reproducibility of ischemic results improve <bold><italic><xref ref-type="bibr" rid="c55">Kuraoka et al. (2009</xref></italic></bold>); <bold><italic><xref ref-type="bibr" rid="c19">Clark et al. (2019</xref></italic></bold>), but it is technically challenging to be precise with the occlusion without coming close to the implanted array, again risking disrupting the implantation site.</p>
<p>Another endovascular technique is photothrombosis, which does not require an additional surgery to implement, limiting disruption of the microelectrode array. <bold><italic><xref ref-type="bibr" rid="c42">Gulati et al. (2015</xref></italic></bold>); <bold><italic><xref ref-type="bibr" rid="c28">Ramanathan et al. (2018</xref></italic></bold>); <bold><italic><xref ref-type="bibr" rid="c50">Khateeb et al. (2019</xref></italic></bold>). This approach uses rose bengal, a photosenstive dye, injected intravenously into the circulatory system. When 561nm green light is shined over a blood vessel, the dye undergoes a local conformational change and generates singlet oxygen, damaging arterial endothelial cells and initiating the clotting cascade&#x2013;resulting in damage resembling an ischemic stroke <bold><italic><xref ref-type="bibr" rid="c10">Watson et al. (1985</xref></italic></bold>); <bold><italic><xref ref-type="bibr" rid="c11">Carmichael (2005</xref></italic></bold>). This approach can be used to deliver a well-localized lesional boundary. In rodents, this can be done entirely non-invasively because green light penetrates through the thin layer of skull. In larger animals, a method of light delivery is needed. If an optical fiber is chronically implanted at the time of electrode array insertion, light can be delivered without surgery and lesions can be made without disrupting the array, but this chronically implanted fiber may act as a route for infection. Alternatively, the fiber could be placed though a burr hole made at the time of the lesion, but this may compromise localization accuracy like other burr-hole techniques.</p>
<p>Chemical lesioning is done by injecting a damaging chemical into the cortical region. These chemicals can be excitotoxic pharmacologic agents like ibotenic acid that selectively and directly damage neuronal cell bodies <bold><italic><xref ref-type="bibr" rid="c8">Murata et al. (2008</xref></italic></bold>), or they can be vasoconstrictors like endothelin-1 that create anoxic cortical injury <bold><italic><xref ref-type="bibr" rid="c23">Dai et al. (2017</xref></italic></bold>). These chemicals have the same potential drawbacks of other injection-based methods: either a permanent pathway is added to allow precise injection in the area of the microelectrode array, creating a route for infection, or injection is done through a burr hole, making it difficult to localize to the region of the array and disrupting experimental continuity. It is also difficult to control the spread of the chemicals, preventing precision in lesion extent.</p>
</sec>
</app>
</app-group>
<sec><title>Abbreviations</title><list list-type="simple"><list-item><p><italic>I</italic><sub><italic>Ks</italic></sub>: delayed cardiac rectifier K<sup>&#x002B;</sup> current</p></list-item><list-item><p>LQTS: long QT syndrome</p></list-item><list-item><p>DIDS: 4,4&#x2019;-diisothiocyano-2,2&#x2019;-stilbenedisulfonic acid</p></list-item><list-item><p>SITS: 4-acetamido-4&#x2019;-isothiocyanatostilbene-2,2&#x2019;-disulfonic acid</p></list-item><list-item><p>HMR1556: (3R,4S)-(&#x002B;)-N-[3-hydroxy-2,2-dimethyl-6-(4,4,4-trifluorobutoxy) chroman-4-yl]-N-methylmethanesulfonamide</p></list-item><list-item><p>G-V: conductance-voltage relationship</p></list-item><list-item><p><italic>k</italic>: slope factor</p></list-item><list-item><p>V<sub>1/2</sub>: voltage at half-maximal activation</p></list-item><list-item><p>WT: wild type</p></list-item><list-item><p>ps-<italic>I</italic><sub><italic>Ks</italic></sub>: pseudo-KCNE1-KCNQ1</p></list-item><list-item><p>MD: molecular dynamics</p></list-item><list-item><p>MM: molecular mechanics</p></list-item><list-item><p>PBSA: Poisson-Boltzmann surface area</p></list-item><list-item><p>GBSA: generalized Born surface area.</p></list-item></list></sec>
<sec sec-type="additional-information">
<title>Additional information</title>
<sec>
<p>Clinical trial number: <ext-link ext-link-type="uri" xlink:href="https://drks.de/search/en/trial/dummy-trial-2">dummy-trial-2</ext-link>.</p>
</sec>
</sec>
</back>
</article>