-
-
Notifications
You must be signed in to change notification settings - Fork 156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
recursive-leak: fix leak by using middle-man #494
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat trick!
src/recursive.rs
Outdated
pub fn define<P: Parser<'a, I, O, E> + Clone + MaybeSync + 'a + 'b>( | ||
parser: P, | ||
declaration: Recursive<Indirect<'a, 'b, I, O, E>, Declared>, | ||
) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this function be on Recursive<..., Declared>
and return a Recursive<..., Defined>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you could do that too. For type inference it doesn’t matter, but it could make more sense logically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say so, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completed with 4572bb5
While this method seems nice, it also makes more complicated recursive patterns harder - you can't have a single context struct on which you both use and define the recursive parser, since the two things are separate types. While you may be able to re-architect code in most cases to avoid this, I'm curious if that's an intended limitation? |
No, it's not intended. We could wrap them both in an enum (somewhat like |
Given that those checks fail for reasons outside of this PR, how would you like me to proceed? |
You should be able to rebase to fix the failing CI now :) |
Finally got around to it. My bad for the mess that is the commit history in this pull request 😅 One would think that after using it for four+ years that I would be able to keep it somewhat clean |
I totally forgot about this PR, I'll take a final look at it tomorrow. |
Was the issue mentioned by CraftSpider above resolved? |
Is there any way I could help on this issue? |
I think it probably just needs a rebase and then it's ready to go. That said, I am a bit concerned about the issue @CraftSpider mentioned above: it will probably break the way some people are using |
I think that you can probably change it to always store one or the other in the state, but I haven't really gotten around to trying it out with this change. I'm not too worried personally. |
Right, in that case it's probably fine. I would appreciate the commit history being cleaned up a bit here before merging too (otherwise I think I'll squash). |
I need to look at my use of recursive tomorrow to see whether whether I can understand the limitation and whether it hurts my usage, I don't get it yet without staring at the code. |
I believe it's specifically for the case of |
So, my apologies for dropping off of the face of this earth for so long. If I get this rebased in the next coming days, would it be good to be merged, or is there anything else that it is needing? |
No worries, there's no time constraint here :) I think this change is probably the way to go, on reflection. It's unfortunate that it breaks certain uses of |
So... I'll be honest. I tried to rebase this thing for like an hour :') Merging took all of two seconds, so obviously I suck at rebasing. Edit: It seems to fail with an issue that is unrelated to this PR. Should I fix it up anyway and push the change? |
Maybe it would be better to just take the diff from Don't worry about unrelated CI stuff, I'll address that elsewhere :) |
Sounds good! I’ll take care of that after I get outta work! |
@zesterer Could we have a new alpha release once this gets merged? |
Wanted to ask, isn't it the same as using "Squash and merge" option when merging on GitHub? It seems like @Zij-IT got busy and it's a bit sad that this PR is essentially ready, but not merged. We want to use this library in production in our company, but this memory leak is a problem in our use case (long-running process that can parse thousands of inputs). |
I believe the behaviour is the same, yes: with the exception that the commit message will be a concatenation of every intermediate commit, which is undesirable. I would do it myself, but I don't want to be taking credit for @Zij-IT's work in the commit log. |
b1a9004
to
32c97f9
Compare
32c97f9
to
30a48ca
Compare
My apologies to everyone that has been waiting for me to make the change. I just did what was recommended by @zesterer, so this should be good to roll. The test that fails fails because what was a runtime error became a compiler error in the tests case. There isn't AFAIK a way to test that something fails to compile, so I can't write an equivalent test. As such, I just deleted the test :/ |
You can test something that fails to compile by writing a doctest: /// ```compile_fail
/// // ...
/// ```
fn test_name() {} A specific error code can be provided, separated by a comma: There might be some more setup required, but this is a method I've used in the past. This has the drawback of not being able to specify an expected error (other than its code, and not all errors have codes). A better solution would likely use |
src/recursive.rs
Outdated
pub fn define<P: Parser<'a, I, O, E> + Clone + MaybeSync + 'a + 'b>( | ||
parser: P, | ||
declaration: Self, | ||
) -> Recursive<Indirect<'a, 'b, I, O, E>, Defined> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for not having the declaration
argument be self
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly nothing in particular. I'll swap it over to having it be a method!
Edit: Actually, there is a small factor for why. If it's a method that consumes self, than you end up having to do something like:
let mut chain = Recursive::declare();
// Note the clone vvvvvvv
let result = chain.clone().define(just::<_, _, extra::Err<Simple<char>>>('+')
.then(chain.clone())
.map(|(c, chain)| Chain::Link(c, Box::new(chain)))
.or_not()
.map(|chain| chain.unwrap_or(Chain::End)));
// As opposed to:
let result = Recursive::define(just::<_, _, extra::Err<Simple<char>>>('+')
.then(chain.clone())
.map(|(c, chain)| Chain::Link(c, Box::new(chain)))
.or_not()
.map(|chain| chain.unwrap_or(Chain::End)), chain);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zesterer Which one do you prefer? I'll do it whichever way you'd prefer. I personally like the symmetry between Recursive::declare
and Recursive::define
and the second way makes it more obvious that Recursive::define
consumes the parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think I prefer the first, personally. The extra parameter way off at the end of the definition feels very isolated and random if you're not already sure what it is. Also, putting it at the front makes clear what the target of the definition is.
I'm pretty sure GitHub decided the 7 commits before mine are different because the hashes are different... I try and do the right thing with a rebase I manage to create clutter :D Anyhow, this should be the final commit for the PR to be merged, right? |
I'll try to find time to take another pass of this today. Hopefully, yep! |
This introduces a middle man to prevent both memory leaks (#486) in recursive parsers, and prevents the user from foot-gunning themselves by calling the wrong
clone
(referring to craftspider'sweak_clone
) suggestion.