-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP+RFC add chunked mode #570
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@@ -7,7 +7,7 @@ using Compat: hasfield, hasproperty | |||
|
|||
export frule, rrule # core function | |||
# rule configurations | |||
export RuleConfig, HasReverseMode, NoReverseMode, HasForwardsMode, NoForwardsMode | |||
export RuleConfig, HasReverseMode, NoReverseMode, HasForwardsMode, NoForwardsMode, HasChunkedMode, NoChunkedMode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[JuliaFormatter] reported by reviewdog 🐶
export RuleConfig, HasReverseMode, NoReverseMode, HasForwardsMode, NoForwardsMode, HasChunkedMode, NoChunkedMode | |
export RuleConfig, | |
HasReverseMode, | |
NoReverseMode, | |
HasForwardsMode, | |
NoForwardsMode, | |
HasChunkedMode, | |
NoChunkedMode |
function frule(::RuleConfig{>:HasChunkedMode}, | ||
(Δf, Δx)::Tuple{Any,ProductTangent}, f, args...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[JuliaFormatter] reported by reviewdog 🐶
function frule(::RuleConfig{>:HasChunkedMode}, | |
(Δf, Δx)::Tuple{Any,ProductTangent}, f, args...) | |
function frule( | |
::RuleConfig{>:HasChunkedMode}, (Δf, Δx)::Tuple{Any,ProductTangent}, f, args... | |
) |
function rrule(::RuleConfig{>:HasChunkedMode}, args...) | ||
y, back = rrule(args...) | ||
return y, ApplyBack(back) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I suggested this RuleConfig
idea, but I'm not sure it fits. If there is an un-chunked rrule which needs the configuration, then this rrule(args...)
can't call it.
I think it might be cleanest to have a new function, whose fallback is chunked_rrule(config, f, args...) = rrule(config, f, args...)
. Any AD which may make chunks should always call this, and any AD rule to exploit chunks will overload it.
Not sure how that intersects with @opt_out
. Maybe there are other weird corners to think about too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah. We might need it to be separate. It feels clunky, but this method also doesn't seem great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm ideally we want to hit the version with this particular trait dropped, and other traits preserved (like HasReverseMode etc)
We should be able to achieve that with invoke
, but can we do it elegantly?
struct ProductTangent{P} | ||
partials::P | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much is gained from this kind of chunking? I mean storing several partials in some tuple. For reverse mode, nothing, I think. For forward mode, maybe derivatives_using_output
has the same benefits.
The big gain seems to be from making a solid matrix to represent many vectors, and e.g. getting matrix multiplication. So I think the initial design ought to make that work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This design does this (I think). partials
here can either be a Tuple{Tuple}
or an eachrow(Matrix)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I guess a vector of vectors can have the same meaning as an EachCol
. Just a different path, by dispatch on ProductTangent{<:EachSlice}
.
Having some tests would prove illuminating I think |
based on discussion JuliaDiff/Diffractor.jl#54 and in slack. This is an initial implementation of forward and reverse chunked mode AD.