Does functor have the right semantics for Flux? #49
Comments
The approach in Functors was chosen to make conservative estimates about the expected behavior of Julia code. In many cases, custom types use multiple dispatch and overload definitions exported by Base or other packages to suit their needs. This includes overloading functions that may be used for different reasons but show up in ML frequently. If we remove the consideration of when to stop recursing, we would not be dispatching to the right methods by default, which can produce very hard-to-debug cases. Also, to be flexible, we certainly need to be able to distinguish leaves from non-leaves, and replacing that need with a different API would require something close to the current definition. It is possible to do this by assuming that every object can be recursed into, but that requires us to mark things differently; see https://blog.ploeh.dk/2018/08/06/a-tree-functor/ for examples of how reading collections, enumerations and so on require different implementations.
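To make the point about where recursion stops concrete, here is a minimal sketch. `map_leaves` and `describe` are hypothetical names invented for this illustration and are not part of Functors.jl; the only thing it tries to show is that moving the leaf boundary changes which method ends up being called.

```julia
# Hypothetical helper, not the Functors.jl API: map `f` over the leaves of a
# nested structure, where `isleaf` decides where recursion stops.
function map_leaves(f, x; isleaf)
    isleaf(x) && return f(x)
    # Anything that is neither a leaf nor a known container is passed through untouched.
    x isa Union{Tuple, NamedTuple, AbstractArray} || return x
    return map(y -> map_leaves(f, y; isleaf = isleaf), x)
end

# A function with different methods for containers and for their elements.
describe(x::AbstractVector) = "vector of length $(length(x))"
describe(x::Number) = "number"

nested = (a = [1.0, 2.0], b = (3.0, [4.0]))

# Stop at arrays: `describe` dispatches on its vector method.
stop_at_arrays = map_leaves(describe, nested; isleaf = x -> x isa Union{Number, AbstractArray})

# Recurse into arrays as well: `describe` only ever sees numbers.
stop_at_numbers = map_leaves(describe, nested; isleaf = x -> x isa Number)
```

With the first predicate the vector method runs once per array; with the second it is never reached, even though the input is the same.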
I hope you're ready, Carlo, because this is (both technically and philosophically) one heck of a rabbit hole ;) To start, let me say that many of the questions around the design of Functors are reminiscent of those encountered during the design of ChainRules. For example, how to represent the functored form of a value is almost the same question as how to represent the tangent. You may be able to get more out of the wonderful documentation there than my ramblings below. In short, "functors" in Functors.jl really ought to be base functors of the values they represent. Why do we need base functors? Because not all custom types in Julia are fully generically parameterized and many algorithms for working with functor tree (or DAG, in our case) traversal require more type fluidity. You can see that similar libraries like Flatten.jl require fully generic types for this reason. However, making a proxy type for everything is both inefficient and unnecessary. Just like many primitive and array types are perfect natural tangent representations of themselves, so too are many of the same types already valid functors. Hence we can make the distinction between structural and natural functors, i.e. some hypothetical All that said, how do we create
Which brings us back to the topic of this issue. Originally, I too thought that doing anything but option 1 would be too risky. However, the success and relative lack of fires popping up with ecosystem-wide adoption of ChainRules seems to counter that idea. Here I would highly recommend skimming the epic, multi-issue discussion around natural vs structural tangents in ChainRulesCore, culminating in @willtebbutt's proposal in JuliaDiff/ChainRulesCore.jl#449.
Thanks Brian, that was highly informative
@mcabbott @darsnack and I are generally in favor of this, although it should be carefully tested before release. @ToucheSir?
I'd be in favour of automatic synthesis as well. One worry I had at the time was structured arrays, but with #33 we appear to have a plan for those now.
Structured arrays do seem like a concern, as #33 required a hand-written inverse, impossible for some types. We could easily exclude them from a traverse-anything scheme.
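A rough sketch of what "excluding them" could look like, using `Diagonal` from the LinearAlgebra standard library as the structured array. `auto_children` and `safe_children` are hypothetical names made up for this illustration; they are not how Functors.jl or #33 actually handle it.

```julia
using LinearAlgebra  # stdlib; provides Diagonal

# Hypothetical traverse-anything default (not the Functors.jl API):
# any struct exposes its fields as children.
function auto_children(x::T) where {T}
    names = fieldnames(T)
    return NamedTuple{names}(map(n -> getfield(x, n), names))
end

# Exclusion rule: treat every array, structured or not, as a leaf.
safe_children(x::AbstractArray) = NamedTuple()
safe_children(x) = auto_children(x)

D = Diagonal([1.0, 2.0])

taken_apart = keys(auto_children(D))  # the wrapper is decomposed into its `diag` field
left_whole = keys(safe_children(D))   # the wrapper has no children and stays intact
```

Because the exclusion dispatches on `AbstractArray`, it covers wrappers whose reconstruction from fields would need a hand-written inverse, at the cost of never looking inside them.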
Due to the default fallback in Functors.jl, every custom type is considered a leaf (i.e. it has no children) and we have to sprinkle @functor MyType everywhere in Flux and in user code. We could remove all this boilerplate by having by default what @functor MyType currently does. Then 99% of people could live their life completely unaware of @functor/functor (historically poorly documented and poorly understood) and only use the much clearer trainable(x::MyType) in case they need to customize the parameter collection.
Besides the transition, which I think could be made rather smooth, does anyone see any counterindication in changing the default?
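The difference between the two defaults can be sketched with toy stand-ins. Everything below (`Dense`, `Chain`, `children_optin`, `children_auto`, `collect_arrays`) is hypothetical and defined only for this illustration; it does not use or reproduce the Functors.jl or Flux implementations.

```julia
# Toy layer types, defined here only for the example.
struct Dense
    weight::Matrix{Float64}
    bias::Vector{Float64}
end

struct Chain
    layers::Tuple
end

# Default 1 (opt-in): every type is a leaf unless it declares its children.
children_optin(x) = ()
children_optin(t::Tuple) = t
children_optin(c::Chain) = (layers = c.layers,)  # Chain opted in, Dense did not

# Default 2 (automatic): derive the children of any struct from its fields.
children_auto(x::AbstractArray) = ()             # arrays stay leaves
children_auto(t::Tuple) = t
function children_auto(x::T) where {T}
    names = fieldnames(T)
    return NamedTuple{names}(map(n -> getfield(x, n), names))
end

# Collect every array reachable through the given `children` function.
function collect_arrays(children, x, acc = Any[])
    x isa AbstractArray && push!(acc, x)
    foreach(c -> collect_arrays(children, c, acc), children(x))
    return acc
end

model = Chain((Dense(rand(2, 3), zeros(2)), Dense(rand(1, 2), zeros(1))))

n_optin = length(collect_arrays(children_optin, model))  # Dense is a leaf, so no arrays are found
n_auto = length(collect_arrays(children_auto, model))    # all four arrays are found
```

Under the opt-in default, forgetting a single declaration silently hides that layer's parameters; under the automatic default, the user only has to intervene when they want fewer children than the fields provide.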