tree-sitter-xpath: initial XPath grammar/parser #2

Merged
merged 7 commits into main from init/tree-sitter-xpath
Oct 18, 2023

eyelidlessness
Member

Branched from #1.

  • The grammar attempts to follow the XPath 1.0 spec as closely as possible. There are some notes mainly around accommodations for tree-sitter behaviors. In some cases syntax rules are named where they may not have been in the spec, mainly to aid actual usage of parse trees.
  • Adds an initial corpus of parse tests, exercised against all of the expressions which are evaluated in Enketo's openrosa-xpath-evaluator integration tests.
  • Adds a set of tests exercising the original use case for the grammar: identifying sub-expressions within parsed expressions, e.g. for the purpose of building a dependency graph.

Nearly all of this is aped from Astro. Several modifications were made, mainly to enhance strictness of available checks.
There are a bunch of places where @typescript-eslint is making incorrect determinations of types in `update-example-versions.js`; will need to revisit!
Notes on the tree-sitter test corpus:

- The text file format is tree-sitter’s standard test suite format. It’s weird!
- All of the tested expressions are from openrosa-xpath-evaluator’s integration tests.
- Some effort was made to spot check the syntax tree representations (roughly, tree-sitter test’s assertions), but they *are currently* automatically generated by the `tree-sitter test --update` command.
- A more thoroughly checked test suite will be added in a subsequent commit, which exercises an early use case which inspired development of the grammar (querying expressions for sub-expression references, to be used in building an XForms dependency graph). That suite will include all of the same expressions from ORXE’s test suite, as well as several others which caught issues with the grammar during development.
This should be… automated! And checked in CI!
This exercises the use case which originally inspired development of `tree-sitter-xpath`. The query/filtering logic is one way we might approach identifying dependencies in XForms bindings’ computation expressions. For now, at least, it exercises the grammar’s ability to support that approach across a wide range of expressions.
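As a rough illustration of where that approach could lead, here is a minimal sketch of ordering bindings once their references are known. It assumes the referenced paths have already been extracted from each expression (the part the grammar's queries would provide); the paths, the `References` type and `computationOrder` are all hypothetical and not part of this PR.

```typescript
// Hypothetical input: for each binding's node path, the node paths its
// computation expression references (hard-coded here for illustration).
type References = Map<string, string[]>;

// Depth-first topological sort: dependencies come before their dependents.
function computationOrder(references: References): string[] {
  const order: string[] = [];
  const state = new Map<string, 'visiting' | 'done'>();

  const visit = (node: string): void => {
    const current = state.get(node);
    if (current === 'done') return;
    if (current === 'visiting') {
      throw new Error(`Cycle detected at ${node}`);
    }
    state.set(node, 'visiting');
    // Nodes with no binding of their own are treated as having no dependencies.
    for (const dependency of references.get(node) ?? []) {
      visit(dependency);
    }
    state.set(node, 'done');
    order.push(node);
  };

  for (const node of references.keys()) {
    visit(node);
  }
  return order;
}

const example: References = new Map<string, string[]>([
  ['/data/total', ['/data/price', '/data/quantity']],
  ['/data/price', []],
  ['/data/quantity', []],
  ['/data/summary', ['/data/total']],
]);

const order = computationOrder(example);
console.log(order.join('\n'));
```

A cycle between bindings throws rather than looping, which is one possible policy; a real implementation might report it differently.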
@eyelidlessness
Member Author

Whoops, looks like I forgot to include the pertinent build artifacts between the CI build and test phases. On it shortly.

const { describe } = test;
const it = test;

// TODO: This is currently exercising the WASM build of the grammar/parser, but
Member Author

I pushed this up as-is mostly for expediency in showing progress. If we do end up using this technique, it’ll be implemented and probably tested against actual XForms domain logic in a separate package. The tests for now are just intended to provide some reassurance that the grammar works and has potential utility in so doing.

It does have the handy effect of testing the parser’s portability, which I’ve been ignoring for the most part. That said, there’s also a Node binding in the build step, which could in theory be used for server-side use cases (and would ostensibly be faster than WASM, if it matters).

@eyelidlessness force-pushed the init/tree-sitter-xpath branch 5 times, most recently from 5effb22 to 3f229ec, October 12, 2023 22:18
- Includes necessary build artifacts in cache for testing to proceed
- Fixes shell compatibility issue with typegen
@@ -0,0 +1,3 @@
# ODK Web Forms

Placeholder for now, more to come!(?)
Member

😄

@eyelidlessness
Member Author

As with #1, we agreed to go ahead and merge for now.

@eyelidlessness merged commit 45132c6 into main Oct 18, 2023
16 checks passed
expected: ['div'],
},
{
expression: 'div div div',
Member

UGH

Member

So this is a relative ref to a node named div, the div operator, another relative node ref?
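For reference, XPath 1.0's lexical rules (section 3.7) settle this by position: a name such as `div` is read as an operator only when a preceding token exists and that token is not `@`, `::`, `(`, `[`, `,` or another operator. Below is a minimal sketch of that rule, restricted to whitespace-separated names. It is not how the grammar in this PR implements it, and `classifyNames` is a hypothetical helper.

```typescript
type TokenKind = 'name' | 'operator';

interface Token {
  kind: TokenKind;
  text: string;
}

// The operator names defined by XPath 1.0.
const OPERATOR_NAMES = new Set(['and', 'or', 'mod', 'div']);

// Simplified: only handles expressions made of whitespace-separated names,
// so the only "preceding token" cases are: none, a name, or an operator.
function classifyNames(expression: string): Token[] {
  const words = expression.split(/\s+/).filter((word) => word.length > 0);
  const tokens: Token[] = [];

  for (const text of words) {
    const previous = tokens.length > 0 ? tokens[tokens.length - 1] : undefined;
    // An operator name is an operator only when it follows a non-operator token.
    const isOperator =
      previous !== undefined &&
      previous.kind !== 'operator' &&
      OPERATOR_NAMES.has(text);

    tokens.push({ kind: isOperator ? 'operator' : 'name', text });
  }

  return tokens;
}

const kinds = classifyNames('div div div').map((token) => token.kind);
console.log(kinds.join(' '));
```

Under this rule the first `div` is a relative node reference (no preceding token), the second is the operator, and the third is a node reference again because it follows an operator.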

(filter_expr
(function_call
name: (function_name) @function-name)
(#eq? @function-name "id"))) @path.filter
Member

For XForms/ODK needs, I think we'd change id to instance. The id function is not part of our XPath subset. The instance function comes from XForms.
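One way to make that swap cheap would be to build the query text from the function name. The sketch below assumes the node names from the fragment above (`filter_expr`, `function_call`, `function_name`) and simplifies the pattern so that the capture sits on `filter_expr` itself; the enclosing pattern from the diff is not reproduced, and `functionCallFilterQuery` is a hypothetical helper.

```typescript
// Builds tree-sitter query source matching a filter expression whose
// function call has the given name. Simplified from the fragment above.
function functionCallFilterQuery(functionName: string): string {
  // Reject anything that could break out of the quoted string in the query.
  if (!/^[A-Za-z_][\w.-]*$/.test(functionName)) {
    throw new Error(`Unexpected function name: ${functionName}`);
  }

  return [
    '(filter_expr',
    '  (function_call',
    '    name: (function_name) @function-name)',
    `  (#eq? @function-name "${functionName}")) @path.filter`,
  ].join('\n');
}

const instanceQuery = functionCallFilterQuery('instance');
console.log(instanceQuery);
```

The resulting string would still need to be compiled against the grammar before use, which this sketch does not do.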
