How to control serialization options in outputs from pipelines? #95

Open
3 tasks
wendellpiez opened this issue Jan 9, 2024 · 0 comments
Labels
enhancement New feature or request

wendellpiez commented Jan 9, 2024

User Story:

Currently, pipelines mainly emit outputs using method='xml' or method='text' as appropriate, and some also indent their outputs. In particular, encoding=utf-8 is typically in effect by default, so outputs are written as UTF-8.

Can and should any of this be exposed better or parameterized? Either of these scenarios is easy to imagine:

  • A schema or XSLT is wanted in ASCII, with character references standing in for characters outside ASCII, such as &#x2014; for the em dash (Unicode U+2014), so that the schema is easier to consume with tools that handle ASCII but not UTF-8
  • A schema is wanted in UTF-16, simply because UTF-16 can be more compact than UTF-8 for content in some non-European languages
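
To make the two scenarios concrete, here is a minimal Python sketch. It is not part of any pipeline in this repository: the helper names and sample strings are hypothetical, and it only illustrates what an ASCII serialization does to an em dash and how byte counts compare between UTF-8 and UTF-16.

```python
def to_ascii_with_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a hexadecimal character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)


def encoded_sizes(text: str) -> dict:
    """Byte counts for the same text in UTF-8 and UTF-16 (big-endian, no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-be")),
    }


# Scenario 1: an em dash (U+2014) survives as an ASCII-only character reference.
escaped = to_ascii_with_char_refs("required \u2014 see remarks")
print(escaped)  # required &#x2014; see remarks

# Scenario 2: hypothetical Japanese sample text. Each of these characters
# takes three bytes in UTF-8 but two in UTF-16.
print(encoded_sizes("スキーマ定義"))  # {'utf-8': 18, 'utf-16': 12}

# ASCII-heavy content (element names, most markup) goes the other way.
print(encoded_sizes("<xs:element/>"))  # {'utf-8': 13, 'utf-16': 26}
```

The last line is why any UTF-16 saving depends on the mix of markup and text: a schema that is mostly ASCII markup may well come out larger in UTF-16.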

If not with switches or controls at the interface, these settings can be made in XSLTs or XProcs.

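E.g. calling Saxon with a serialization parameter on the command line, such as !encoding=US-ASCII (Saxon takes these as !name=value, with no leading hyphen; the ! may need quoting in some shells).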
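
As a rough sketch of how such a call could be scripted, the Python below assembles a Saxon command line with serialization parameters appended in Saxon's !name=value form. The jar name and all file names are placeholders rather than paths from this project, and the sketch assumes the Saxon jar can be run with `java -jar`.

```python
import subprocess


def saxon_command(source, stylesheet, output, serialization, jar="saxon-he.jar"):
    """Build an argv list for Saxon's command-line transformer.

    `jar` and the file names passed in are placeholders, not project paths.
    """
    args = ["java", "-jar", jar, f"-s:{source}", f"-xsl:{stylesheet}", f"-o:{output}"]
    # Serialization parameters follow the other options as !name=value.
    args += [f"!{name}={value}" for name, value in serialization.items()]
    return args


def run_saxon(args):
    """Run the transformation; needs Java and a Saxon jar to be available."""
    subprocess.run(args, check=True)


command = saxon_command(
    "model.xml", "make-schema.xsl", "schema.xsd",
    {"encoding": "US-ASCII", "indent": "yes"},
)
print(" ".join(command))
```

Passing the arguments as a list, without going through a shell, sidesteps the question of whether `!` needs escaping.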

Additionally, and separate from the question of how to (re)set the encoding, should the output encoding be 'ASCII' by default? ASCII-amenability could be a big plus, with little downside in expressiveness.

Goals:

Minimally: test and document the presently available options for controlling serialization settings, including character encoding and markup indenting.
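
One possible shape for such a test, sketched in Python under stated assumptions: the helper names are hypothetical, the sample bytes stand in for real pipeline outputs, and the declaration check only works for ASCII-compatible encodings (it would not read a UTF-16 file).

```python
import re


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    match = re.match(rb'<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']', data)
    return match.group(1).decode("ascii") if match else None


def is_ascii_only(data: bytes) -> bool:
    """True when every byte of the serialized output is in the ASCII range."""
    return data.isascii()


# Stand-ins for two serializations of the same hypothetical schema fragment.
ascii_output = b'<?xml version="1.0" encoding="US-ASCII"?>\n<schema>a &#x2014; b</schema>'
utf8_output = '<?xml version="1.0" encoding="UTF-8"?>\n<schema>a \u2014 b</schema>'.encode("utf-8")

for data in (ascii_output, utf8_output):
    print(declared_encoding(data), is_ascii_only(data))
```

A check for indenting would need a different approach, for example comparing the line count of an output against an unindented run of the same pipeline.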

Consider also updating to 'ASCII' outputs for some artifacts such as XML schemas and XSLTs produced by XSLT or XProc.

Dependencies:

None known. It is also not known whether, or where, any improvement would have much impact.

Acceptance Criteria

  • All website and readme documentation affected by the changes in this issue has been updated. Changes to the website can be made in the docs/content directory of your branch.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

{The items above are general acceptance criteria for all User Stories. Please describe anything else that must be completed for this issue to be considered resolved.}

@wendellpiez wendellpiez added the enhancement New feature or request label Jan 9, 2024