Regex generator generates invalid unicode sequences (and is unmaintained) #1848
Comments
Generex itself doesn't seem to be directly responsible for this problem; it appears to originate in automaton: cs-au-dk/dk.brics.automaton#15. However, that issue has also been open for more than 5 years, and generex doesn't even use the currently published automaton version.
Just a note on the use of |
Yes, I tried that, but the problem is that \w wouldn't generate any non-ASCII characters, which is what I want.
I can try \w or other character classes as a workaround again, but I think I had issues with those too.
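For illustration, here is a minimal sketch of what a replacement generator for a bounded-length pattern could look like if it worked on whole code points instead of individual UTF-16 chars. This is not how Pact-JVM or Generex work; the class name ValidStringSketch and the method randomString are hypothetical, and it assumes the length limit is counted in code points and that control characters should be skipped.

```java
import java.util.Random;

// Hypothetical helper, not part of Pact-JVM or Generex.
final class ValidStringSketch {
    private ValidStringSketch() {}

    /**
     * Builds a random string of 0..maxCodePoints code points.
     * Code points are appended whole, so any surrogates in the result
     * always come as a valid high/low pair.
     */
    static String randomString(Random rnd, int maxCodePoints) {
        int target = rnd.nextInt(maxCodePoints + 1);
        StringBuilder sb = new StringBuilder();
        int count = 0;
        while (count < target) {
            int cp = rnd.nextInt(Character.MAX_CODE_POINT + 1);
            int type = Character.getType(cp);
            // Reject the surrogate block itself, unassigned code points,
            // and control characters (the last one is an assumption).
            if (type == Character.SURROGATE
                    || type == Character.UNASSIGNED
                    || type == Character.CONTROL) {
                continue;
            }
            sb.appendCodePoint(cp);
            count++;
        }
        return sb.toString();
    }
}
```

Calling ValidStringSketch.randomString(new Random(), 24) would give a value that can contain non-ASCII characters while staying within 24 code points. Note that the result may be longer than 24 Java chars, since code points above U+FFFF take two chars each.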
We're using regular expressions like .{0,24} to limit the size of string fields, and we rely on the generator to produce matching example values. However, we're having trouble uploading the contracts to the Pact Broker, because the CLI (Ruby) reports incomplete surrogate pairs (like \uDAA5) in the contract.

A month ago I reported this on the Pact Slack and in the Ruby repository as a suspected defect in the CLI. However, having had some time to look more closely at how surrogates work, I now believe the CLI's behavior is valid. In my opinion the problem actually lies in the Pact-JVM library, which generates invalid Unicode sequences, or more precisely in the Generex library. The latter seems to have been unmaintained for at least 5 years. Maybe it's time to switch to another generator, or to implement our own?
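To make the failure concrete: a value like \uDAA5 is a high surrogate (range U+D800 to U+DBFF) and is only well-formed when a low surrogate (U+DC00 to U+DFFF) follows it directly. The sketch below shows one way a generated example value could be checked before it ends up in a contract. The class SurrogateCheck and its method are hypothetical names for illustration, not an existing Pact-JVM API.

```java
// Hypothetical helper, not part of Pact-JVM or Generex.
final class SurrogateCheck {
    private SurrogateCheck() {}

    /** Returns the index of the first unpaired surrogate in s, or -1 if there is none. */
    static int firstUnpairedSurrogate(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
                    i++; // valid pair: skip its low half
                } else {
                    return i; // high surrogate with no low surrogate after it
                }
            } else if (Character.isLowSurrogate(c)) {
                return i; // low surrogate with no high surrogate before it
            }
        }
        return -1;
    }
}
```

A generated value could then be rejected or regenerated whenever SurrogateCheck.firstUnpairedSurrogate(value) is not -1.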