Set `epoch_seed_change` attribute on `SimulationDataset` #840
c78a1c7 to 64cd184
Hey @srstevenson, we actually don't want to call StreamingDataset's constructor here. SimulationDataset is meant to run on a single process, unlike StreamingDataset, so some of the logic differs between the two init methods. The |
Thanks, @snarayan21, that makes sense. I've marked this PR as draft for now, and will update it to set |
64cd184 to 148157b
@snarayan21 I've updated this to just set |
Hey @srstevenson, just getting back to this PR now. Before I approve, can you ensure that you can correctly run the simulator with this change now? Just to make sure that we can resolve issue #831 |
This was added to the `StreamingDataset` which the `SimulationDataset` inherits, so also needed to be added here. Without this, the code attempts to access the missing attribute when running a simulation:

```
AttributeError: 'SimulationDataset' object has no attribute 'epoch_seed_change'
Traceback:
File "/home/scott/projects/streaming/.venv/lib64/python3.12/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 88, in exec_func_with_error_handling
    result = func()
             ^^^^^^
File "/home/scott/projects/streaming/.venv/lib64/python3.12/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 579, in code_to_exec
    exec(code, module.__dict__)
File "/home/scott/projects/streaming/simulation/interfaces/sim_ui.py", line 409, in <module>
    submit_jobs(shuffle_quality, dataset, time_per_sample, node_internet_bandwidth,
File "/home/scott/projects/streaming/simulation/interfaces/sim_ui.py", line 110, in submit_jobs
    for output in gen_sim:
                  ^^^^^^^
File "/home/scott/projects/streaming/simulation/core/main.py", line 110, in simulate
    samples_per_node = dataset.get_samples_per_node(epoch, 0)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/scott/projects/streaming/simulation/core/sim_dataset.py", line 367, in get_samples_per_node
    partition = generate_work(self.batching_method, self, self.world, epoch, sample_in_epoch)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/scott/projects/streaming/streaming/base/batching/__init__.py", line 45, in generate_work
    return get(dataset, world, epoch, sample_in_epoch)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/scott/projects/streaming/streaming/base/batching/random.py", line 49, in generate_work_random_batching
    shuffle_units, small_per_big = dataset.resample_streams(epoch)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/scott/projects/streaming/streaming/base/dataset.py", line 878, in resample_streams
    epoch, self.epoch_seed_change)
           ^^^^^^^^^^^^^^^^^^^^^^
```

Closes #831
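For readers unfamiliar with this failure mode, here is a minimal, self-contained sketch of it. It is not the library's code: `BaseDataset`, `BrokenSimDataset`, `FixedSimDataset`, and `try_resample` are hypothetical stand-ins, and the `False` default for `epoch_seed_change` is an assumption made for illustration. The sketch only shows why a subclass that skips the parent constructor has to set, by itself, any attribute that inherited methods read.

```python
from __future__ import annotations


class BaseDataset:
    """Hypothetical stand-in for a parent dataset class."""

    def __init__(self, epoch_seed_change: bool = False) -> None:
        # Assumed default; the attribute is only ever set in this constructor.
        self.epoch_seed_change = epoch_seed_change

    def resample_streams(self, epoch: int) -> tuple[int, bool]:
        # Inherited method that reads the attribute set in __init__.
        return epoch, self.epoch_seed_change


class BrokenSimDataset(BaseDataset):
    """Skips the parent constructor and forgets the attribute."""

    def __init__(self) -> None:
        # Deliberately no super().__init__() call here.
        self.batch_size = 1


class FixedSimDataset(BaseDataset):
    """Skips the parent constructor but sets the attribute itself."""

    def __init__(self, epoch_seed_change: bool = False) -> None:
        # Still no super().__init__() call; the attribute is set directly.
        self.batch_size = 1
        self.epoch_seed_change = epoch_seed_change


def try_resample(dataset: BaseDataset, epoch: int) -> tuple[int, bool] | str:
    """Return the resample result, or the error message if it fails."""
    try:
        return dataset.resample_streams(epoch)
    except AttributeError as err:
        return str(err)


if __name__ == "__main__":
    print(try_resample(BrokenSimDataset(), 3))
    print(try_resample(FixedSimDataset(), 3))
```

In this sketch the broken subclass fails with an `AttributeError` naming `epoch_seed_change`, while the fixed subclass returns normally without ever calling the parent constructor.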
148157b to ec9c4c4
Yes, the simulator is running without errors from this branch (rebased on main as of 63e8907): |
@srstevenson perfect thanks! lgtm |
lgtm