att.declaring and att.declarable need constraints and better explanation #1981

Open
sydb opened this issue Mar 8, 2020 · 16 comments
Assignees
Labels
Status: Pending pending action described in a comment, to return to discussion before further action will be taken TEI: Guidelines & Documentation TEI: Schema TEI: Schematron

Comments

@sydb
Member

sydb commented Mar 8, 2020

Some of these issues are trivial; some may require tickets of their own.

  1. Typo in 15.3, “Associating Contextual Information with a Text” (#CCAH) — disagreement in number: “The TEI scheme allow for the following …”.
  2. I think the following bullet point should have articles, i.e. “There may be multiple occurrences of certain elements in either the corpus or a text header”.
  3. Similarly for the 2nd sentence of 15.3.2: “… a particular part of a text header or the corpus header by means of …”.
  4. The 3rd paragraph in #CCAS2 is problematic.
    1. The list of declarable elements should be generated.
    2. “All of the above elements may be multiply defined within a single header” should probably be “Each of the above elements is repeatable” or similar.
    3. “every declarable element must bear a unique identifier”
      • Does this really mean every declarable element must have an @xml:id (which would be nuts, but is what it says), or does it mean every element of the same type (which would make a lot of sense and is what 15.3.3 number 3 sort of implies), or does it mean every element of the same type that has a sibling of that type?
      • No matter which it means, it would be good to have a constraint enforcing it.
    4. “… must be specified as the default, by means of the default attribute”: The only 2 ways to indicate something is the default are for it to be its parent’s only child of the given element type or to have a @default specified as "true". Given that the former is precluded, I think the prose here should just be specific and say the latter: “… must be specified as the default by having a @default attribute with the value "true"”.
  5. “Here is the structure for a text which does state otherwise:”: Should be something like “Here is the structure of a text in which a division does state otherwise:”, no?
  6. “the contents of the divisions D1 and D3 … and … division D2”: The values of the @xml:ids of the divisions in question are "d1", "d3", and "d2".
  7. “The identifier or identifiers specified by the @decls attribute …”: They are not identifiers; what is specified on @decls are pointers (URIs, to be precise). Besides, it is not the values of @decls that are restricted, but rather the things being pointed at by an @decls.
  8. First bullet point for this, “An identifier specifying an element which contains multiple instances of one or more other elements should be interpreted as if it explicitly identified the elements identified as the default in each such set of repeated elements”: I am not sure what this means. I think it means that if a declaring element points to declarable element A, then for any set of children of A that are of the same declarable element type, the child that is the default applies. Whether I’m right or wrong, this should be rewritten. (And, I presume that if the @decls that points to A also points to a non-default child of A, that one applies instead.)
  9. (Switching to the XML) for a text specifying <att>decls</att> as <q><val>ED2</val></q>, correction C2A, and normalization N2B will apply. should be more like for a text specifying <att>decls</att> as <val>#ED2</val>, correction C2A and normalization N2B will apply..
  10. Next paragraph: all of the pointers are missing their # characters.
  11. CCAS3 Summary, item 3: Same issue about articles as above; I think it should read “Where there are multiple occurrences of declarable elements within a text’s header or its corpus header” or some such.
  12. Second sub-bullet of above (currently reads one only must bear a <att>default</att> attribute with the value <!-- JC: 2018-07-20: This should be changed to 'true' --><val>YES</val>.): Obviously, @jamescummings is correct, as "YES" is not one of the possible values of @default. (BTW, this is the only occurrence of either “YES” or “NO” as a word in the Guidelines.) But furthermore, I think the more standard wording (at least in American English) would be “one and only one must bear …”. Thus my suggestion is one and only one must bear a <att>default</att> attribute with the value <val>true</val> (or <val>1</val>)..
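To make items 4, 9, and 12 above concrete, here is a minimal sketch of a header in which declarable elements repeat. The identifiers ED2, C2A, and N2B come from the Guidelines passage quoted in item 9; everything else (ED1, C2B, N2A, the titles and prose) is hypothetical filler. It assumes the reading under which @xml:id is needed only where an element type repeats within one parent.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Hypothetical sample for att.declarable</title></titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Born digital.</p></sourceDesc>
    </fileDesc>
    <encodingDesc>
      <!-- Two editorialDecl siblings: each has an @xml:id, exactly one is the default -->
      <editorialDecl xml:id="ED1" default="true">
        <!-- Only one correction and one normalization here, so neither repeats -->
        <correction><p>Errors corrected silently.</p></correction>
        <normalization><p>Spelling left as found.</p></normalization>
      </editorialDecl>
      <editorialDecl xml:id="ED2">
        <!-- Repeated types: each has an @xml:id, and one per type is the default -->
        <correction xml:id="C2A" default="true"><p>Errors marked with corr.</p></correction>
        <correction xml:id="C2B"><p>Errors left as found.</p></correction>
        <normalization xml:id="N2A"><p>Spelling left as found.</p></normalization>
        <normalization xml:id="N2B" default="true"><p>Spelling modernized.</p></normalization>
      </editorialDecl>
    </encodingDesc>
  </teiHeader>
  <!-- The value of @decls is a pointer, hence the leading '#' -->
  <text decls="#ED2">
    <body>
      <p>Correction C2A and normalization N2B apply here.</p>
    </body>
  </text>
</TEI>
```

Without the @decls on the text element, the default editorialDecl (ED1) would apply instead.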

constraints for att.declarable

For validation, I think someday we would like a mechanism for building a list of elements from class membership dynamically at build time. Until then, we could make do with Schematron abstract rules.

In att.declarable.xml:

  <sch:pattern id="declarable" abstract="true">
    <!-- parameter 'tde' is for "this declarable element (type)" -->
    <sch:rule context="tei:*[child::$tde[2]]">
      <sch:let name="declarableGI" value="name( $tde[1] )"/>
      <sch:report test="child::$tde[ not( @xml:id ) ]">
        When there is more than one <sch:value-of select="$declarableGI"/>, each must have an @xml:id.
      </sch:report>
      <sch:assert test="count( child::$tde[ normalize-space( @default ) = ('1','true') ] ) eq 1">
        When there is more than one <sch:value-of select="$declarableGI"/>, one and only one must have a @default of 'true'.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

(Note that the declaration of the variable $declarableGI fails when I try this using probatron, but it works in oXygen. If you replace the variable reference with the value everywhere it works in both.)

And then in each element that has <memberOf key="att.declarable"/> (see below for the current list), something like the following:

  <sch:pattern id="declarable_xenoData" is-a="declarable">
    <sch:param name="tde" value="tei:xenoData"/>
  </sch:pattern>
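Instantiating an abstract pattern amounts to a textual substitution of its parameter, so the xenoData instance above should expand to roughly the concrete pattern below. This is only a sketch: the sch:schema wrapper, the queryBinding value, and the sch:ns declaration are assumptions added to make the fragment self-contained, and the rule body is copied from the abstract pattern with every $tde replaced.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <!-- Assumption: the usual TEI namespace binding for the 'tei' prefix -->
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <!-- Concrete form of "declarable_xenoData": each $tde has become tei:xenoData -->
  <sch:pattern id="declarable_xenoData">
    <!-- Fires on any TEI element that has at least two xenoData children -->
    <sch:rule context="tei:*[child::tei:xenoData[2]]">
      <sch:let name="declarableGI" value="name( tei:xenoData[1] )"/>
      <sch:report test="child::tei:xenoData[ not( @xml:id ) ]">
        When there is more than one <sch:value-of select="$declarableGI"/>, each must have an @xml:id
      </sch:report>
      <sch:assert test="count( child::tei:xenoData[ normalize-space( @default ) = ('1','true') ] ) eq 1">
        When there is more than one <sch:value-of select="$declarableGI"/>, one and only one must have a @default of 'true'.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Note that $declarableGI is an ordinary sch:let variable, not a pattern parameter, so it survives the expansion unchanged.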

constraints for att.declaring

Probably in att.declaring.xml:

  <sch:pattern id="decls" abstract="false">
    <!--
      Use element as context, as some processors inappropriately barf when a context
      is an attribute node:
    -->
    <sch:rule context="tei:*[@decls]">
      <!-- sequence of decls pointers: -->
      <sch:let name="dptrs" value="tokenize( normalize-space( @decls ), '&#x20;')"/>
      <!-- sequence of the @xml:ids pointed at by each decls pointer: -->
      <sch:let name="dxids" value="for $dptr in $dptrs return substring-after( $dptr, '#')"/>
      <!-- sequence of the target elements of the decls pointers: -->
      <sch:let name="dtars" value="for $dptr in $dptrs return
        if ( starts-with( $dptr, '#') )
        then id( substring-after( $dptr, '#') )
        else if ( contains( $dptr, '#') )
        then doc( $dptr )/id( substring-after( $dptr, '#') )
        else current()"/>
      <!-- sequence of the element types of the target elements of the decls ptrs: -->
      <sch:let name="dtGIs" value="for $dtar in $dtars return name( $dtar )"/>
      <!-- sequence of the children of the targets of the decls pointers: -->
      <sch:let name="dtars_kids" value="$dtars/tei:*"/>
      <!-- sequence of the names of the children: -->
      <sch:let name="dtars_kids_GIs" value="for $kid in $dtars_kids return name( $kid )"/>
      <!-- sequence of all the GIs (target and target children): -->
      <sch:let name="decls_GIs" value="( $dtGIs, $dtars_kids_GIs )"/>
      <sch:assert test="count( $decls_GIs ) eq count( distinct-values( $decls_GIs ) )">
        Two or more of the elements referred to either explicitly or implicitly by the @decls of this <sch:name/> (<sch:value-of
          select="$dptrs"/>) are the same kind of metadata element. 
      </sch:assert>
    </sch:rule>
  </sch:pattern>

Note that if a local element (i.e., same file) pointed to by @decls is not found, it is simply ignored; if a remote element (i.e., different file) pointed to by @decls is not found this does not fail gracefully, rather a 404 is raised.
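A small test document may help show what the rule above is meant to catch. In this sketch (all identifiers and prose are hypothetical) a div points at two sibling correction elements, so the sequence $decls_GIs would contain the name "correction" twice and the assertion should fail.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Hypothetical test case for the @decls constraint</title></titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Born digital.</p></sourceDesc>
    </fileDesc>
    <encodingDesc>
      <editorialDecl>
        <correction xml:id="C2A" default="true"><p>Errors marked with corr.</p></correction>
        <correction xml:id="C2B"><p>Errors left as found.</p></correction>
      </editorialDecl>
    </encodingDesc>
  </teiHeader>
  <text>
    <body>
      <!-- Both pointers resolve to a correction element: same kind of metadata twice -->
      <div decls="#C2A #C2B">
        <p>Some text.</p>
      </div>
    </body>
  </text>
</TEI>
```

Read literally, the draft also adds the names of all TEI children of each target to $decls_GIs (here, two p elements), so a revised rule would presumably need to limit that step to declarable children to avoid false positives.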


list of declarable elements

tei:availability | tei:bibl | tei:biblFull | tei:biblStruct | tei:broadcast | tei:correction | tei:correspDesc | tei:editorialDecl | tei:equipment | tei:geoDecl | tei:hyphenation | tei:interpretation | tei:langUsage | tei:listApp | tei:listBibl | tei:listEvent | tei:listNym | tei:listObject | tei:listOrg | tei:listPerson | tei:listPlace | tei:metDecl | tei:normalization | tei:particDesc | tei:projectDesc | tei:punctuation | tei:quotation | tei:recording | tei:refsDecl | tei:samplingDecl | tei:scriptStmt | tei:segmentation | tei:settingDesc | tei:sourceDesc | tei:stdVals | tei:styleDefDecl | tei:textClass | tei:textDesc | tei:xenoData

@ebeshero
Member

VF2F: Council greenlights a first stage of edits to clean up the prose, and then revisit to discuss what more may need to be done.

@martinascholger
Member

Council at VF2F suggests cleaning up the typos and unclear explanations first. @raffazizzi, @ju -- please open separate issues if necessary.

@raffazizzi
Contributor

@sydb's point 4.iii

“every declarable element must bear a unique identifier”

  • Does this really mean every declarable element must have an @xml:id (which would be nuts, but is what it says), or does it mean every element of the same type (which would make a lot of sense and is what 15.3.3 number 3 sort of implies), or does it mean every element of the same type that has a sibling of that type?

I think it means what it says: every element must have an identifier. The bullet point that follows the one @sydb refers to indicates very specifically that a default should be indicated "for each different type of declarable element which occurs more than once within the same parent element". So, if the first bullet point meant elements of the same type it would have been as explicit as the second.

Having said that, I agree with @sydb that it's overkill to require ids everywhere and that it makes more sense to only enforce them for elements of the same type. But I think we need to discuss this as a group.

Otherwise, I'm working on the rest on a branch.

@raffazizzi
Contributor

VF2F agrees with adjusting the language so that not every declarable element must have @xml:id. See new wording in branch.

@raffazizzi
Contributor

Updated and merged branch. 124531b

@raffazizzi
Contributor

Branch was very behind. Reverted the merge; will try to fix the branch before attempting the merge again.

@raffazizzi
Contributor

Merged in only prose changes (f4c625d). @sydb to revise Schematron constraints soon.

@sydb
Member Author

sydb commented Mar 22, 2023

@raffazizzi and I just had a long chat about this ticket. It is almost ready to be closed, but we cannot implement it because Schematron abstract patterns are not processed correctly in P5/antbuilder.xml (steps 8 & 9a only call iso_svrl_for_xslt2.xsl, but should call iso_dsdl_include.xsl and iso_abstract_expand.xsl, too).

So we should either

  1. fix antbuilder to use the skeleton implementation properly; OR
  2. replace the skeleton implementation with mausatron; OR
  3. give up on using abstract patterns, and instead invent a mechanism for having the @context of a rule be all the members of a class; OR
  4. give up on having any mechanism for applying a constraint specification to an entire class, and just enumerate the members of the class in the @context.
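As a rough illustration of what option 1 involves, the three skeleton stylesheets named above are normally chained in this order: inclusion, abstract-pattern expansion, then compilation to a stylesheet that produces SVRL. The Ant sketch below is not taken from P5/antbuilder.xml; the project and target names, the skeleton.dir property, and every file path are hypothetical, and it assumes Saxon is available on Ant's classpath.

```xml
<project name="schematron-skeleton-sketch" default="compile-schematron">
  <!-- Hypothetical location of the ISO Schematron skeleton stylesheets -->
  <property name="skeleton.dir" value="lib/iso-schematron"/>
  <target name="compile-schematron">
    <mkdir dir="build"/>
    <!-- Stage 1: resolve inclusions -->
    <xslt in="p5.isosch" out="build/p5.stage1.sch"
          style="${skeleton.dir}/iso_dsdl_include.xsl" force="true">
      <factory name="net.sf.saxon.TransformerFactoryImpl"/>
    </xslt>
    <!-- Stage 2: expand abstract patterns into concrete ones -->
    <xslt in="build/p5.stage1.sch" out="build/p5.stage2.sch"
          style="${skeleton.dir}/iso_abstract_expand.xsl" force="true">
      <factory name="net.sf.saxon.TransformerFactoryImpl"/>
    </xslt>
    <!-- Stage 3: compile the expanded schema to a validating XSLT that emits SVRL -->
    <xslt in="build/p5.stage2.sch" out="build/p5.validate.xsl"
          style="${skeleton.dir}/iso_svrl_for_xslt2.xsl" force="true">
      <factory name="net.sf.saxon.TransformerFactoryImpl"/>
    </xslt>
  </target>
</project>
```

Skipping stages 1 and 2, as steps 8 and 9a are described as doing, would hand unexpanded abstract patterns to the compiler.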

@sydb
Member Author

sydb commented Aug 7, 2023

Having thought about this a bit (not a lot) I have decided that I like (2) the best and (1) next; (4) is not really that much worse than (2) or (1), but feels a lot worse, so I don’t like it; and (3) is at least very very hard if not outright impossible (it doesn’t seem that hard when you are dealing with a single customization of a given language, but if you have a customization chain it would get out of hand, I think).
So I am planning to start poking at implementing (2) sometime.
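For comparison, option (4) might look roughly like the sketch below, which needs no abstract patterns at all. It is an assumption about how such a rule could be written, not something from the branch: the pattern id is hypothetical, only three of the class members are enumerated to keep it short, and the schema wrapper is added so the fragment stands alone.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <sch:pattern id="declarable_enumerated">
    <!-- A real rule would list every member of att.declarable here -->
    <sch:rule context="tei:correction | tei:normalization | tei:xenoData">
      <!-- expanded name of the context element -->
      <sch:let name="gi" value="node-name(.)"/>
      <!-- the context element plus any siblings of the same type -->
      <sch:let name="sameType" value="../*[ node-name(.) eq $gi ]"/>
      <sch:assert test="count( $sameType ) eq 1 or @xml:id">
        When there is more than one <sch:name/>, each must have an @xml:id.
      </sch:assert>
      <sch:assert test="count( $sameType ) eq 1
                        or count( $sameType[ normalize-space( @default ) = ('1','true') ] ) eq 1">
        When there is more than one <sch:name/>, one and only one must have a @default of 'true'.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Because the context here is each declarable element rather than its parent, the @default message would be reported once per sibling instead of once per group; that is one practical cost of this approach, on top of having to keep the list in sync with class membership by hand.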

@sydb
Member Author

sydb commented Aug 7, 2023

I went to poke at antbuilder.xml a bit, and discovered (somewhat to my horror) that I have already implemented (1). However, it does not work, in that abstract patterns still cause problems. (The rest of the Schematron probably works fine, I did not test much. But the abstract patterns work so badly that the Guidelines won’t build.) So unless I did something wrong, (1) is not going to work, anyway.

@sydb
Member Author

sydb commented Aug 10, 2023

I have now implemented (2), above, in branch issue1981bis. It passes all the current tests in a Docker environment. I have not actually checked in the abstract patterns for testing att.declarable, yet. (Remember, that was the reason we wanted to update the build process to use a more modern Schematron processor.)
I encourage anyone and everyone to check out this branch and see if it builds on your system.
The work so far is reflected in 3132022.

@sydb
Member Author

sydb commented Aug 12, 2023

I have implemented (2) — use mausatron (aka schxslt) as our Schematron processor so we can use abstract patterns; and also checked the constraints on att.declarable elements expressed using an abstract pattern. (See commit 6044db6.)
However, this ticket is blocked by #2455, because the output of the test for @default is different depending on which version of rnv you use.

@sydb sydb mentioned this issue Aug 12, 2023
@raffazizzi raffazizzi modified the milestones: Guidelines 4.7.0, Guidelines 4.8.0 Nov 16, 2023
@sydb
Member Author

sydb commented Nov 27, 2023

As #2455 has now been dealt with, I have merged dev into branch issue1981bis (which was a lot of work). So I think this may be ready to merge. To be honest, I do not even remember where we are in actually getting better constraints for att.declaring and att.declarable. At this point, the main item of interest is that the branch for this ticket includes updating our build process from the Schematron Skeleton implementation to @dmj’s SchXslt (which I call “mausatron”). That change is required for CMC to move forward.

@raffazizzi
Contributor

@sydb I removed Status: Blocked since #2455 was resolved. I merged dev into issue1981bis without issue and pushed. Would you consider opening a PR?

@ebeshero
Member

ebeshero commented Jul 1, 2024

@raffazizzi @sydb See this comment on the existing PR (#2509) , though: #2509 (comment)

In May Council decided we needed to consult someone about fixing NVDL...I don't think that has happened yet?

@raffazizzi
Contributor

This issue got orphaned a little, but it's not too late to update the branch. @ebeshero do you think we need more discussion or can I (or @sydb) attempt a PR? Marking as Pending for now.

@raffazizzi raffazizzi added the Status: Pending pending action described in a comment, to return to discussion before further action will be taken label Sep 30, 2024
7 participants