Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Docs on ProjectTo and embedded subspaces #412

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

oxinabox
Copy link
Member

This is kinda the follow up to https://github.com/JuliaDiff/ChainRulesCore.jl/pull/347/files
but including our actual resolution of it.
It still needs to move some content over from that PR.

@codecov-commenter
Copy link

codecov-commenter commented Jul 26, 2021

Codecov Report

Merging #412 (67e95af) into master (d68a5b8) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #412   +/-   ##
=======================================
  Coverage   92.12%   92.12%           
=======================================
  Files          14       14           
  Lines         775      775           
=======================================
  Hits          714      714           
  Misses         61       61           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d68a5b8...67e95af. Read the comment docs.

@sethaxen sethaxen self-requested a review July 26, 2021 16:07
Copy link
Member

@sethaxen sethaxen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine there is more coming?

# Types that represent embedded subspaces
_Taking Types Representing Embedded Subspaces Seriously™_

To paraphrase Stefan Karpinski: _"This does mean treating sparse matrixes not just as a representation of dense matrixes, but the alternative is just too horrible."_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is phrased strangely and hard to follow.

Suggested change
To paraphrase Stefan Karpinski: _"This does mean treating sparse matrixes not just as a representation of dense matrixes, but the alternative is just too horrible."_
To paraphrase Stefan Karpinski: _"This does mean treating sparse matrixes not just as a representation of dense matrixes, but the alternative is just too horrible."_

_Taking Types Representing Embedded Subspaces Seriously™_

To paraphrase Stefan Karpinski: _"This does mean treating sparse matrixes not just as a representation of dense matrixes, but the alternative is just too horrible."_

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

To paraphrase Stefan Karpinski: _"This does mean treating sparse matrixes not just as a representation of dense matrixes, but the alternative is just too horrible."_


Consider the following possible `rrule` for `*`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Consider the following possible `rrule` for `*`
Consider the following possible `rrule` for `*`:

Consider the following possible `rrule` for `*`
```julia
function rrule(::typeof(*), a, b)
mul_pullback(ȳ) = (NoTangent(), ȳ*b', a'*ȳ)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you use the Δ convention here?

julia> pb(1.0)
(NoTangent(), 3.0, 2.0)
```
and for matrixes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
and for matrixes
and for matrices:

This seems even worse though: no only has it escaped the `Diagonal` type, it has even escaped the subspace it represents.

Further, it is inconsistent with what we would get had we AD'd the function for `*(::Diagonal, ::Matrix)` directly.
[That primal function](https://github.com/JuliaLang/julia/blob/f7f46af8ff39a1b4c7000651c680058e9c0639f5/stdlib/LinearAlgebra/src/diagonal.jl#L224-L245) boils down to:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this refer to a specific Julia version for simplicity?

*(a::Diagonal, b::AbstractMatrix) = a.diag .* b
```
By reading that primal method, we know that that ADing that method would have zeros on the off-diagonals, because they are never even accessed
(a similar argument can be made for the complex part of a real number).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(a similar argument can be made for the complex part of a real number).
(a similar argument can be made for the imaginary part of a real number).


---

Consider the following possible `rrule` for `sum`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Consider the following possible `rrule` for `sum`
Consider the following possible `rrule` for `sum`:

(NoTangent(), [1.0 1.0; 1.0 1.0])
```
That's not right -- not if us saying this was `Diagonal` meant anything.
If you try and use that dense matrix, to do gradient descent on your `Diagonal` input, you will get a non-diagonal result:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If you try and use that dense matrix, to do gradient descent on your `Diagonal` input, you will get a non-diagonal result:
If you try and use that dense matrix to do gradient descent on your `Diagonal` input, you will get a non-diagonal result:

That's not right -- not if us saying this was `Diagonal` meant anything.
If you try and use that dense matrix, to do gradient descent on your `Diagonal` input, you will get a non-diagonal result:
`[2.0 1.0; 1.0 2.0]`.
You have escape the subspace that the diagonal type represents.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
You have escape the subspace that the diagonal type represents.
You have escaped the subspace that the diagonal type represents.

@oxinabox
Copy link
Member Author

I imagine there is more coming?

Yep.
Was hoping on getting time for this today.
But instead did the last few things required to Zygote onto CRC 1.0, Nabla too 🎉

@mcabbott mcabbott added documentation Improvements or additions to documentation ProjectTo related to the projection functionality labels Nov 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation ProjectTo related to the projection functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants