[Needs help] Replace in-house lexer by HoaCompiler #712
Hello @theofidry, Sorry for the late reply. I have a new job and am not really free for open source right now. I will try to find time. I am not forgetting you at all :-). Please feel free to ping me at any time if I stay silent for too long. Thanks! |
No worries @Hywan, I'm not that free either lately :P, congrats on your new job 😄 |
Hello @theofidry :-), I have a little bit more free time right now. And as promised, I will help on this PR. So here is a little bit of vocabulary to speak the same language:
This is a very classical front-end and middle-end compilation workflow. There are other approaches, of course, but let's stick with this one. Based on that, what do you want?
You want to replace the lexer of Symfony Expression Language by
What is VO? I have started to implement your DSL grammar. However, before going further, I need clarifications here about your exact needs. If you already have an object model defined, then we can “map”/transform an AST to this object model, and everything else will roll. |
Thanks for looking into it @Hywan :) So basically right now we have a very simple ParserInterface:
interface ParserInterface
{
/**
* Parses a value, e.g. 'foo' or '$username' to determine if is a regular value (like 'foo') or is a value that
* must be processed (like '$username'). If the value must be processed, it will be parsed to generate a value (a
* ValueInterface instance) ready for processing.
*
* @param string $value
*
* @throws ExpressionLanguageParseThrowable
*
* @return ValueInterface|string|array
*/
public function parse(string $value);
}
which is what I called "Expression Language", although it's not exactly one and it's not the Symfony one either. VO stands for Value Object, which here are
So right now that parser is broken down into two parts:
The part I'm trying to get rid of by using HoaCompiler is the Lexer: the goal is to replace this custom regex-based Lexer and its tokens. Once done, the Parser needs to be adapted to consume the generated AST instead of tokens, but I expect that to be rather trivial. The main issue, I would say, is the grammar itself. The Alice DSL is described here, but it's something that has been defined over time and grew organically, so I wouldn't be surprised if there are still a few edge cases where things are ambiguous. (But that's also a reason to move to HoaCompiler: it would expose such cases once and for all.) Basically, what I did in this PR is the starting point: replacing the lexer itself, i.e. starting to define the grammar that transforms the input string into an AST to be consumed later. Let me know if there's a point that is still a bit unclear |
OK, let's give it a try :-). |
I have a few questions so far (I will have more later 😉):
|
From the doc:
and I can see a
Yes. Also as you can guess,
That's a risk we have with moving to HoaCompiler, I guess, as before the expression would likely have been evaluated wrongly and considered invalid (simply because of the implementation's limitations, not because of an invalid syntax). If we identify such cases, I think it depends on what we have, as there are multiple actions that can be taken: handle it in the grammar so that it does not happen, or ensure the system bails out gracefully so that the user can easily identify the issue and fix it. Ask as many questions as needed :P |
OK so basically, we can have any strings with the Alice language inside, is that correct? So |
Back from vacation with more free time. Working on this issue again :-). |
@Hywan I've updated the PR based on the latest state of the work you pushed to the GitLab repository. You should see the
Without any more faff |
Thanks! I appreciate your help :-). |
@theofidry @Hywan is this still the "hot" PR or was this topic dropped or superseded by something else? |
No, the topic and PR are still relevant. The branch might be outdated but
the rebase should be relatively easy.
|
I've been absent for many months, but now I'm back, and I'm catching up on everything. This project is far down on my todo list, but the list is getting smaller every day ;-). |
I'll be closing this one since the HoaCompiler has been archived. |
Note: this message has been edited so that you don't have to read the long discussion here and in the original issue (#601).
Alice ships with an Expression Language, which allows it to interpret values such as @user* or <current()>. This is currently done by an in-house Expression Language, which uses a Lexer to tokenize the string input and then a Parser which goes through those tokens to transform them into an understandable expression. For example:
input:
'@user*'
tokens returned by the Lexer:
value returned by the Parser:
As explained in greater detail here, the fixtures are first built into an understandable structure of objects and then evaluated to generate the objects we want.
The plan now is to replace the in-house lexer with HoaCompiler. The current one relies on regexes and several passes to tokenize the input, after which the parser tries to create objects from the tokens. HoaCompiler, however, works with a better structure: a "grammar", which is a set of rules on how to parse strings, is written, and a list of nodes is returned from it. That approach is much more robust and would avoid a lot of edge cases where regexes are just not the right tool for the job.
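To make the regex-based approach concrete, here is a minimal sketch of such a lexer in plain PHP. It is only an illustration: the function name `tokenize`, the token types and the patterns are assumptions made up for this example and are not Alice's actual ones.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical regex-based lexer. Token names and patterns are illustrative
 * assumptions, not the ones used by Alice.
 *
 * @return array<int, array{type: string, value: string}>
 */
function tokenize(string $input): array
{
    // Ordered map of pattern => token type; the first matching pattern wins.
    $patterns = [
        '/^@[A-Za-z0-9_]+\*/' => 'WILDCARD_REFERENCE',
        '/^@[A-Za-z0-9_]+/' => 'SIMPLE_REFERENCE',
        '/^<[A-Za-z_][A-Za-z0-9_]*\(\)>/' => 'FUNCTION',
        '/^[^@<]+/' => 'STRING',
    ];

    $tokens = [];
    while ('' !== $input) {
        foreach ($patterns as $pattern => $type) {
            if (1 === preg_match($pattern, $input, $matches)) {
                $tokens[] = ['type' => $type, 'value' => $matches[0]];
                $input = substr($input, strlen($matches[0]));
                continue 2; // Restart matching on the remaining input.
            }
        }

        // No pattern matched: a regex-based lexer can only bail out here.
        throw new InvalidArgumentException(sprintf('Cannot tokenize "%s".', $input));
    }

    return $tokens;
}

// tokenize('@user*') returns [['type' => 'WILDCARD_REFERENCE', 'value' => '@user*']]
```

Note how the result depends on the order of the patterns: swapping the first two entries would tokenize '@user*' as a simple reference and then fail on the trailing '*'. This is the kind of fragility a grammar is meant to remove.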
Implementation-wise, it's not all too complex. Right now most of the Expression Language is tagged as internal, so we can afford BC breaks on that part of the library, which gives enough freedom to do that work without the need for another major release.
The current PR provides a start of an implementation. The major work to do is to write the grammar, which @Hywan has already started, to match our needs. Once done, we can update the Parser to process those Node objects instead of the tokens from the in-house lexer (that part should be relatively easy).
Most of the rules that the Expression Language implements are documented here, in addition to the tests.
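As a rough idea of what processing nodes instead of tokens could look like, here is a small sketch for PHP 8. Everything in it is hypothetical: the `Node` class is a stand-in written for this example (it is not HoaCompiler's node class), and the node ids and the returned array shapes are assumptions, not Alice's value objects.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for an AST node; not HoaCompiler's real node class.
final class Node
{
    /** @param Node[] $children */
    public function __construct(
        public string $id,
        public ?string $value = null,
        public array $children = []
    ) {
    }
}

/**
 * Hypothetical parser step: walks a node tree recursively and returns plain
 * PHP values describing it. Node ids and result shapes are assumptions.
 */
function buildValue(Node $node): array|string
{
    switch ($node->id) {
        case '#string':
            return (string) $node->value;
        case '#wildcard_reference':
            return ['reference' => (string) $node->value, 'wildcard' => true];
        case '#list':
            // Composite node: build each child in order.
            return array_map('buildValue', $node->children);
        default:
            throw new InvalidArgumentException(sprintf('Unknown node "%s".', $node->id));
    }
}
```

With this shape, the parser becomes a single dispatch on the node id, and an unknown node fails loudly rather than being silently misread.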