You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
provide a helper that strips an environment of a formula of all variables that are not referred to by the formula.
useful for i.e., PipeOpMutate and #410
The text was updated successfully, but these errors were encountered:
This is a good idea. One thing in particular I was worried about was things where the user constructs a formula where he refers to a certain variable, and then changes the variable value before executing the pipeop.
A few pointers:
this should be a helper for the user, I don't think we should be doing this internally
there is the all.vars() function that determines what variables are used inside an expression. The problem is, it does not recognize if the reference is as a function call or as an ordinary variable. This makes trouble in case the user has a function and a variable with both the same name (typical candidate here is c). I think we won't get around having to walk the AST to do this well.
a possible problem is that even if we recognize a value is being used inside a formula, it could be that the value is "masked" by the task. So the user does ~ x == y, and y occurs both in the environment of the formula, but also in the task, the task's y will have precedence, but the function won't know that at the time of packing
there is the general problem that variables in a formula could be "surprise masked" by the task, e.g. if the user doesn't expect there to be a y column in the task. Or if the task changes and suddenly the user has to rename his variables just because of name clashes. Maybe we want to offer a ..() function (like cd .. in a shell) that lets the user access things outside the formula. variables in ..() get precedence before columns in a task. Variables that are not in a ..() are dropped by pack_formula() and are only available if they are in a package (so function calls still work).
I think a way to implement this would be to walk the AST, check for ..()-calls, check if they are at the place of a function call or a value access, and get the respective value. The formula itself should not be changed, instead the ..() function should be an actual function that gets a value from the environment above the task column environment. This way
the pack_formula() doesn't actually modify the formula
it is not necessary to use pack_formula(), e.g. when the user doesn't worry about formula environments
the ..() thing is still useful even if pack_formula() is not used (i.e. to avoid surprise variable name clashes)
What do you think of the design?
Should I write this or do you feel up to the task?
provide a helper that strips an environment of a formula of all variables that are not referred to by the formula.
useful for i.e.,
PipeOpMutate
and #410The text was updated successfully, but these errors were encountered: