Bug fix: printing non-distributed data #1756

ClaudiaComito · 2024-12-30T07:38:48Z

Due Diligence

General:
- title of the PR is suitable to appear in the Release Notes
Implementation:
- unit tests: all split configurations tested
- unit tests: multiple dtypes tested
- ~~benchmarks: created for new functionality~~ does not apply
- benchmarks: performance improved or maintained
- documentation updated where needed

Description

I noticed that printing moderately large DNDarrays in non-distributed mode (i.e. in an interactive session or with split=None) takes a disproportionately long time compared to printing the underlying torch tensor (see the timings below). The culprit is the Formatter call.

I've changed the printing module to bypass Formatter if the input DNDarray is not distributed. Torch then takes care of the data formatting, and the tests pass.
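To illustrate the idea, here is a minimal, library-free sketch of the dispatch. It is not the code from this PR: `ToyArray`, `slow_formatter`, and `to_string` are hypothetical names, and a plain Python sequence stands in for the local torch tensor. It only shows the control flow of skipping the custom formatter when the data is not distributed.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class ToyArray:
    """Hypothetical stand-in for a DNDarray: local data plus a split axis."""

    local_data: Sequence[float]
    split: Optional[int] = None  # None means the data is not distributed
    n_procs: int = 1             # number of processes holding the data

    def is_distributed(self) -> bool:
        # Distributed only if split along an axis AND more than one process
        return self.split is not None and self.n_procs > 1


def slow_formatter(data: Sequence[float]) -> str:
    """Placeholder for an expensive, element-by-element formatting pass."""
    return "[" + ", ".join(f"{x:.4f}" for x in data) + "]"


def to_string(
    arr: ToyArray,
    backend_str: Callable[[Sequence[float]], str] = str,
) -> str:
    """Format arr, skipping the custom formatter for non-distributed data."""
    if not arr.is_distributed():
        # Fast path: delegate to the backend's own string conversion
        return backend_str(arr.local_data)
    # Distributed path: keep the custom formatting
    return slow_formatter(arr.local_data)
```

With this sketch, `to_string(ToyArray([1.0, 2.0]))` takes the fast path, while `to_string(ToyArray([1.0, 2.0], split=0, n_procs=2))` still goes through the custom formatter.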

Example:

import heat as ht

data = ht.random.randn(91392, 52, 4)
print(data)

On 1 process:
main branch: 84 seconds
this PR: 0.01 seconds
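For anyone who wants to reproduce such a comparison, a small timing helper along the following lines could be used. This is a sketch, not the script behind the numbers above; `time_call` is a hypothetical name, and only the standard library is assumed.

```python
import time
from typing import Callable, Tuple


def time_call(fn: Callable[[], str]) -> Tuple[str, float]:
    """Run fn once and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    return result, elapsed
```

Calling `text, seconds = time_call(lambda: str(data))` on each branch would then measure only the string conversion, leaving out the creation of the array.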

Issue/s resolved: #

Changes proposed:

let torch handle formatting of non-distributed data

Type of change

Bug fix (non-breaking change which fixes an issue)

Memory requirements

NA

Performance

see above

Does this change modify the behaviour of other functions? If so, which?

no

github-actions · 2024-12-30T07:43:53Z

Thank you for the PR!

codecov · 2024-12-30T08:16:46Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.25%. Comparing base (87f2812) to head (3edaca4).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1756      +/-   ##
==========================================
- Coverage   92.26%   92.25%   -0.01%     
==========================================
  Files          84       84              
  Lines       12445    12447       +2     
==========================================
+ Hits        11482    11483       +1     
- Misses        963      964       +1

Flag	Coverage Δ
unit	`92.25% <100.00%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown.

github-actions · 2025-01-13T09:01:16Z

Thank you for the PR!

ClaudiaComito added 4 commits December 30, 2024 07:05

make 1-proc print great again

9df7d4e

fix tabs size

bb18872

skip formatter on non-distr data

dac1457

remove time import

2fb1618

ClaudiaComito added the bug Something isn't working label Dec 30, 2024

ClaudiaComito added this to the 1.5.1 milestone Dec 30, 2024

github-actions bot added features backport release core labels Dec 30, 2024

ClaudiaComito added the PR talk label Jan 13, 2025

ClaudiaComito requested a review from mrfh92 January 13, 2025 08:55

Merge branch 'main' into features/default_local_print

3edaca4
