Update torchdata.nodes docs, use sphinx for API #1396
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/data/1396
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 054e6a6 with merge base e316c5c. This comment was automatically generated by Dr. CI and updates every 15 minutes.
lets goo!!!!
@andrewkho actually, since you have this PR already open, do you want to add a quick section about perf in migrate_to_nodes_from_utils.rst?
There's a section in "why torchdata nodes", do you think we should add a link to it? Also is your WP down?
@andrewkho yeah, there seems to be an outage.
@andrewkho we discussed yesterday to include some perf related info in the docs, so that all the "informational pieces" are visible together in the docs. yeah we can include a pointer to the readme, or include a brief blurb here as well. up to you?
torchdata/nodes/base_node.py
Outdated
Must call super().reset(initial_state)
* next() -> T - logic for returning the next value in the sequence, or throw StopIteration
* get_state(self) -> dict: returns a dictionary representing state that may be passed to reset()
"""BaseNodes are the base class for creating composable dataloading dags in ``torchdata.nodes``.
Question, should it be DAGs?
I think it's pretty unambiguous, but you're correct, I'll update it
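For readers following the docstring under review, the three-method protocol it describes (``reset``, ``next``, ``get_state``) can be sketched roughly as below. This is a minimal stand-in that deliberately does not import ``torchdata``: ``CountingNode``, its ``limit`` argument, and the ``"next_value"`` state key are hypothetical names chosen for illustration, and a real subclass would inherit from ``BaseNode`` and call ``super().reset(initial_state)`` as the docstring requires.

```python
from typing import Any, Dict, List, Optional


class CountingNode:
    """Hypothetical stand-in that mirrors the reset/next/get_state protocol.

    Assumption: this is NOT a torchdata class. A real node would subclass
    BaseNode and call super().reset(initial_state) inside reset().
    """

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._next_value = 0

    def reset(self, initial_state: Optional[Dict[str, Any]] = None) -> None:
        # Start from scratch, or resume from a previously saved state.
        if initial_state is None:
            self._next_value = 0
        else:
            self._next_value = initial_state["next_value"]

    def next(self) -> int:
        # Return the next value in the sequence, or signal exhaustion.
        if self._next_value >= self.limit:
            raise StopIteration()
        value = self._next_value
        self._next_value += 1
        return value

    def get_state(self) -> Dict[str, Any]:
        # The returned dict is what a later reset() call accepts.
        return {"next_value": self._next_value}


# Consume two items, snapshot the state, then resume in a fresh node.
node = CountingNode(limit=5)
node.reset()
first_two = [node.next(), node.next()]
snapshot = node.get_state()

resumed = CountingNode(limit=5)
resumed.reset(snapshot)
remaining: List[int] = []
while True:
    try:
        remaining.append(resumed.next())
    except StopIteration:
        break
```

The point of the round trip is that whatever ``get_state()`` returns must be sufficient for ``reset()`` to continue exactly where the first node stopped.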
We'll demonstrate how to achieve the most common DataLoader features, re-use existing samplers and datasets,
and load/save dataloader state. It performs at least as well as ``DataLoader`` and ``StatefulDataLoader``,
see :ref:`how-does-nodes-perform`.
@divyanshk added xref here
IterableDatasets, each worker needs to figure out (through
``torch.utils.data.get_worker_info``) what data it should be returning.

.. _how-does-nodes-perform:
@divyanshk it links here
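As a side note on the ``IterableDataset`` sentence quoted in the snippet above, one common way for a worker to decide which data is its own is strided sharding. The sketch below is an assumption-laden illustration, not code from this PR: ``shard_for_worker`` is a hypothetical helper, and it takes ``worker_id`` and ``num_workers`` as plain arguments so that it runs without PyTorch. Inside a real ``IterableDataset.__iter__``, those two values would typically come from the ``id`` and ``num_workers`` attributes of the object returned by ``torch.utils.data.get_worker_info()``.

```python
from typing import Iterator, List, Sequence


def shard_for_worker(
    items: Sequence[int], worker_id: int, num_workers: int
) -> Iterator[int]:
    """Hypothetical helper: yield only the items owned by one worker.

    Strided sharding: worker k takes indices k, k + num_workers, ...
    """
    for index in range(worker_id, len(items), num_workers):
        yield items[index]


# Simulate three workers splitting ten items between them.
data: List[int] = list(range(10))
shards = [list(shard_for_worker(data, worker, 3)) for worker in range(3)]
```

Every item lands in exactly one shard, which is the property each worker has to guarantee on its own when the dataset does the sharding.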
# also provides state_dict and load_state_dict methods.
return tn.Loader(node)

Now let's test this out with a useless dataset, and demonstrate how state management works.
nit: replace `useless` with `simple` or something else :D ?
* update docstrings for sphinx
* update docstrings for sphinx
* add migration from torch.utils.data
* add performance section
* add xref to performance in migrate guide
* minor pr comments
---------
Co-authored-by: Andrew Ho <[email protected]>
Use Sphinx for the torchdata.nodes API reference
Split the existing torchdata.nodes doc into "What is torchdata.nodes?" and "Getting started"
This looks like a lot of line changes, but it's mostly docstring updates and splitting the existing doc file into two
Tested locally:
API page:
Getting started:
Migrating from torch.utils.data
What is torchdata.nodes?