Add QR for non tall-skinny matrices and split=0
#1744
Thank you for the PR!
…=0' of github.com:helmholtz-analytics/heat into features/1736-QR_for_non-tall-skinny_matrices_and_split=0
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1744      +/-   ##
==========================================
+ Coverage   92.26%   92.27%   +0.01%
==========================================
  Files          84       84
  Lines       12445    12471      +26
==========================================
+ Hits        11482    11508      +26
  Misses        963      963
needs to be merged before #1697
Thanks a lot @mrfh92, great to have this. I only have a few minor comments.
raise ValueError(
    "A is split along the rows and the local chunks of data are rectangular with more rows than columns. \n Applying TS-QR in this situation is not reasonable w.r.t. runtime and memory consumption. \n We recommend to split A along the columns instead. \n In case this is not an option for you, please open an issue on GitHub."
# check that data distribution is reasonable for TS-QR
# we regard a matrix with split = 0 as suitable for TS-QR is largest local chunk of data has at least as many rows as columns
- # we regard a matrix with split = 0 as suitable for TS-QR is largest local chunk of data has at least as many rows as columns
+ # we regard a matrix with split = 0 as suitable for TS-QR if largest local chunk of data has at least as many rows as columns
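The criterion stated in the corrected comment can be illustrated without Heat or MPI. The following is a minimal sketch, assuming the per-process local shapes are available as a list of `(rows, columns)` pairs (similar in spirit to the rows of `A.lshape_map`). The helper name `is_suitable_for_tsqr` is hypothetical and not part of Heat, and "largest" is interpreted here as "most rows", which is an assumption.

```python
from typing import List, Tuple


def is_suitable_for_tsqr(local_shapes: List[Tuple[int, int]]) -> bool:
    """Return True if the largest local chunk has at least as many rows as columns.

    `local_shapes` is a hypothetical stand-in for the per-process (rows, cols)
    shapes of a matrix that is distributed along the rows (split=0).
    """
    if not local_shapes:
        raise ValueError("local_shapes must not be empty")
    # With split=0 every process holds all columns, so the largest chunk
    # is taken to be the one with the most rows (assumption).
    rows, cols = max(local_shapes, key=lambda shape: shape[0])
    return rows >= cols


# A 1000 x 10 matrix on 4 processes: local chunks are 250 x 10.
tall_skinny = [(250, 10)] * 4
# A 40 x 100 matrix on 4 processes: local chunks are 10 x 100.
short_fat = [(10, 100)] * 4

print(is_suitable_for_tsqr(tall_skinny))  # True
print(is_suitable_for_tsqr(short_fat))    # False
```

In the second case the local chunks have fewer rows than columns, which is the situation this PR handles with a different algorithm.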
column_idx = torch.cumsum(A.lshape_map[:, -2], 0)
column_idx = column_idx[column_idx < A.shape[-1]]
column_idx = torch.cat(
    (
        torch.tensor([0], device=column_idx.device),
        column_idx,
        torch.tensor([A.shape[-1]], device=column_idx.device),
    )
You can use the DNDarray.counts_displs() method here:
_, column_idx = A.counts_displs()
(this returns a tuple though, and I think the final item [A.shape[-1]] needs to be added)
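To make the index bookkeeping under discussion concrete, here is a plain-Python sketch of what the quoted `torch.cumsum` / `torch.cat` lines compute. It assumes the local row counts are given as a list of integers; the helper name `block_boundaries` is hypothetical. It does not call `counts_displs()`, so whether the displacements returned by that method match these boundaries is left to the PR authors.

```python
from itertools import accumulate
from typing import List


def block_boundaries(local_rows: List[int], n_cols: int) -> List[int]:
    """Column indices that delimit the column groups, mirroring the quoted code.

    `local_rows` stands in for A.lshape_map[:, -2] (rows held by each process),
    `n_cols` for A.shape[-1].
    """
    # Cumulative row counts: the global row index at which each local chunk ends.
    ends = accumulate(local_rows)
    # Keep only values that lie strictly inside the column range.
    inner = [end for end in ends if end < n_cols]
    # Add the first and the final boundary explicitly.
    return [0] + inner + [n_cols]


print(block_boundaries([3, 3, 2], 20))  # [0, 3, 6, 8, 20]
print(block_boundaries([3, 3, 2], 5))   # [0, 3, 5]
```

The second call shows the effect of the filter `column_idx < A.shape[-1]`: cumulative row counts at or beyond the number of columns are dropped before the final boundary is appended.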
R = A.copy()
# Block-wise Gram-Schmidt orthogonalization, applied to groups of columns
offset = 1 if A.shape[-1] <= A.shape[-2] else 2
for k in range(len(column_idx) - offset):
If I understand correctly, each iteration needs A_copy[..., :, column_idx[k] :] only.
Would it make sense to free up memory progressively here by only keeping the necessary slice of A_copy?
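The memory idea in this comment can be sketched on a single process. The example below is not the PR's block-wise, distributed algorithm: it is a column-by-column modified Gram-Schmidt in plain Python, with the matrix stored as a list of columns, and it only illustrates how the working copy can shrink to the trailing columns after each step. The name `mgs_qr_shrinking` is hypothetical, and linearly independent columns are assumed.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]  # a matrix stored as a list of columns


def mgs_qr_shrinking(columns: Matrix) -> Tuple[Matrix, Matrix]:
    """QR via modified Gram-Schmidt, keeping only the trailing columns.

    Returns (q_columns, r_rows): Q as a list of columns, R as a list of rows.
    """
    n = len(columns)
    work = [list(col) for col in columns]  # working copy, shrinks every step
    q_columns: Matrix = []
    r_rows: Matrix = []
    for k in range(n):
        head = work[0]
        norm = math.sqrt(sum(x * x for x in head))
        if norm == 0.0:
            raise ValueError("columns must be linearly independent")
        q = [x / norm for x in head]
        row = [0.0] * n
        row[k] = norm
        # Orthogonalize the remaining columns against q.
        trailing: Matrix = []
        for j, col in enumerate(work[1:], start=k + 1):
            r = sum(qi * ci for qi, ci in zip(q, col))
            row[j] = r
            trailing.append([ci - r * qi for qi, ci in zip(q, col)])
        q_columns.append(q)
        r_rows.append(row)
        # Columns 0..k are no longer referenced and can be freed.
        work = trailing
    return q_columns, r_rows


cols = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
q, r = mgs_qr_shrinking(cols)
```

After step `k` the working copy holds `n - k - 1` columns, so peak memory for the copy decreases as the loop proceeds. Whether the same saving is worthwhile for the distributed array in this PR depends on how slicing and reallocation behave there.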
Due Diligence
benchmarks: created for new functionality (currently not available)
benchmarks: performance improved or maintained (currently not available)
Description
see title
Issue/s resolved: #1736
Does this change modify the behaviour of other functions? If so, which?
no