-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Docs on ProjectTo and embedded subspaces #412
base: main
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #412 +/- ##
=======================================
Coverage 92.12% 92.12%
=======================================
Files 14 14
Lines 775 775
=======================================
Hits 714 714
Misses 61 61 Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagine there is more coming?
# Types that represent embedded subspaces | ||
_Taking Types Representing Embedded Subspaces Seriously™_ | ||
|
||
To paraphrase Stefan Karpinski: _"This does mean treating sparse matrixes not just as a representation of dense matrixes, but the alternative is just too horrible."_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is phrased strangely and hard to follow.
To paraphrase Stefan Karpinski: _"This does mean treating sparse matrixes not just as a representation of dense matrixes, but the alternative is just too horrible."_ | |
To paraphrase Stefan Karpinski: _"This does mean treating sparse matrixes not just as a representation of dense matrixes, but the alternative is just too horrible."_ |
_Taking Types Representing Embedded Subspaces Seriously™_ | ||
|
||
To paraphrase Stefan Karpinski: _"This does mean treating sparse matrixes not just as a representation of dense matrixes, but the alternative is just too horrible."_ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To paraphrase Stefan Karpinski: _"This does mean treating sparse matrixes not just as a representation of dense matrixes, but the alternative is just too horrible."_ | ||
|
||
|
||
Consider the following possible `rrule` for `*` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider the following possible `rrule` for `*` | |
Consider the following possible `rrule` for `*`: |
Consider the following possible `rrule` for `*` | ||
```julia | ||
function rrule(::typeof(*), a, b) | ||
mul_pullback(ȳ) = (NoTangent(), ȳ*b', a'*ȳ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should you use the Δ
convention here?
julia> pb(1.0) | ||
(NoTangent(), 3.0, 2.0) | ||
``` | ||
and for matrixes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and for matrixes | |
and for matrices: |
This seems even worse though: no only has it escaped the `Diagonal` type, it has even escaped the subspace it represents. | ||
|
||
Further, it is inconsistent with what we would get had we AD'd the function for `*(::Diagonal, ::Matrix)` directly. | ||
[That primal function](https://github.com/JuliaLang/julia/blob/f7f46af8ff39a1b4c7000651c680058e9c0639f5/stdlib/LinearAlgebra/src/diagonal.jl#L224-L245) boils down to: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this refer to a specific Julia version for simplicity?
*(a::Diagonal, b::AbstractMatrix) = a.diag .* b | ||
``` | ||
By reading that primal method, we know that that ADing that method would have zeros on the off-diagonals, because they are never even accessed | ||
(a similar argument can be made for the complex part of a real number). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(a similar argument can be made for the complex part of a real number). | |
(a similar argument can be made for the imaginary part of a real number). |
|
||
--- | ||
|
||
Consider the following possible `rrule` for `sum` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider the following possible `rrule` for `sum` | |
Consider the following possible `rrule` for `sum`: |
(NoTangent(), [1.0 1.0; 1.0 1.0]) | ||
``` | ||
That's not right -- not if us saying this was `Diagonal` meant anything. | ||
If you try and use that dense matrix, to do gradient descent on your `Diagonal` input, you will get a non-diagonal result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you try and use that dense matrix, to do gradient descent on your `Diagonal` input, you will get a non-diagonal result: | |
If you try and use that dense matrix to do gradient descent on your `Diagonal` input, you will get a non-diagonal result: |
That's not right -- not if us saying this was `Diagonal` meant anything. | ||
If you try and use that dense matrix, to do gradient descent on your `Diagonal` input, you will get a non-diagonal result: | ||
`[2.0 1.0; 1.0 2.0]`. | ||
You have escape the subspace that the diagonal type represents. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have escape the subspace that the diagonal type represents. | |
You have escaped the subspace that the diagonal type represents. |
Yep. |
This is kinda the follow up to https://github.com/JuliaDiff/ChainRulesCore.jl/pull/347/files
but including our actual resolution of it.
It still needs to move some content over from that PR.