-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AbstractArray Dilemma docs #347
base: main
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #347 +/- ##
==========================================
- Coverage 90.01% 88.86% -1.16%
==========================================
Files 14 13 -1
Lines 541 530 -11
==========================================
- Hits 487 471 -16
- Misses 54 59 +5
Continue to review full report at Codecov.
|
Random though: I'm currently chopping and changing between "structured array type" and "sparse array type" quite a bit. |
Thanks for making a start. My essay on this is JuliaDiff/ChainRules.jl#337 (comment), and I'd appreciate your comments on what, if any, of that you disagree with. I wonder, on first skim, whether the parallel to the forward evaluation should be stressed more. On a generic AbstractMatrix, Re names, |
It's the next thing on my to-read list :) Wanted to get my thoughts down before working through it.
Agreed --
I'm not quite sure what you're saying here. When you say "the parallel to the forward" are you referring to the usual function evaluation, or the forwards pass of reverse-mode? I guess I'm not quite sure what you have in mind here. |
I mean "usual function evaluation", which is also usually the forward pass. The sparse matrix types you describe all know about |
Ahh I see. Yes, we should talk about that more. My position here is that all I want from AD is for it to be efficient if I've bothered to write the code to specialise the function, and I've written it in such a way that is "AD friendly". I'm perfectly happy for AD to produce slow rules if the function itself hits a slow fallback. I'm also happy to write rules which deal with AD-unfriendly, but specialised code. We have to do this quite a bit with I'm happy to opt-in to generic rrules by declaring "this array is the kind of array for which generic rules are likely to be a good idea", but I really don't want to have to opt out of them, and I really don't want someone to write a type-pirate-y rule that gets in the way of AD doing it's job when I load a particular package. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a few typos
If the semantics of AD are such that it is necessary to assume that the tangent of `D` | ||
might be non-diagonal, it is necessary for AD to treat `D` like a `Matrix` which happens to | ||
be diagonal, as discussed above. | ||
This code will clearly not run if that is the case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code will clearly not run if that is the case. | |
This code will clearly not run if that is the case, because N is so large. |
this only became clear to me after reading the next paragraph, I didn't really internalise the size of N
This property seems to arise from AD considering only considering a function at a single | ||
point, rather than the neighbourhood of a point. | ||
|
||
We've not used any contraversial `frule`s here, so this property must be something to do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've not used any contraversial `frule`s here, so this property must be something to do | |
We've not used any controversial `frule`s here, so this property must be something to do |
end | ||
``` | ||
When the input `A` is a `Matrix{Float64}` and the cotangent `ȳ` a `Vector{Float64}`, it | ||
seems uncontraversial. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems uncontraversial. | |
seems uncontroversial. |
|
||
### Takeaway | ||
|
||
There are a at least two possible things you could think to do as a consequence of this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a at least two possible things you could think to do as a consequence of this. | |
There are at least two possible things you could think to do as a consequence of this. |
This is probably going to need to go through a few rounds of redrafting before it's good to go.
I've tried to produce a statement of the facts, without imposing my opinion on the right solution. Inevitably, I won't have achieved this, and I'm very happy to rephrase things that aren't sufficiently unbiased.
@mcabbott @mzgubic @oxinabox
edit:
Things to include in the next pass:
norm
ChainRules.jl#337, Explain the Abstract Primals Problem #343 -- there are more