fault_parameter_name: | This is a prompt. This is a prompt. This is a prompt. This is a prompt. This is a prompt.
Despite not changing anything, dvc sees the step as changed, but then realizes the step was cached and loads it from cache.
# Running for the first time, expected
bash-5.2$ dvc repro -s faulty_stage
Running stage 'faulty_stage':
> echo imruneverytime > faulty.txt
Updating lock file 'dvc.lock'
Use `dvc push` to send your updates to remote storage.

# Nothing changes here, yet it tries to run, but finally does not, because of the previous run being cached
bash-5.2$ dvc repro -s faulty_stage
Stage 'faulty_stage' is cached - skipping run, checking out outputs
Updating lock file 'dvc.lock'
Use `dvc push` to send your updates to remote storage.
bash-5.2$
Why it occurs:
When long lines are dumped to dvc.lock the line gets wrapped.
The issue happens (only sometimes, I think) when the wrap occurs right after a "\n" character.
When the YAML is then loaded, the string contains an additional space.
You can see this happen directly in ruamel.yaml.
This can be worked around by adding this line, which makes the string get dumped on a single line.
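The round trip described above can be checked with a small script. This is only a sketch: `first_difference`, `roundtrip`, and `demo` are hypothetical helper names, the `prompt` key and the test string are made up for illustration, and whether this particular string triggers the extra space depends on where the wrap lands and on the installed ruamel.yaml version.

```python
from io import StringIO


def first_difference(a: str, b: str) -> int:
    """Return the index of the first differing character, or -1 if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One string is a prefix of the other (or they are identical).
    return -1 if len(a) == len(b) else min(len(a), len(b))


def roundtrip(value: str) -> str:
    """Dump {"prompt": value} with ruamel.yaml defaults and load it back."""
    # Imported lazily so first_difference() works without ruamel.yaml installed.
    from ruamel.yaml import YAML

    buf = StringIO()
    YAML().dump({"prompt": value}, buf)
    return str(YAML().load(buf.getvalue())["prompt"])


def demo() -> None:
    # A long string with an embedded newline, similar in spirit to the
    # params.yaml value in this report (the exact content is an assumption).
    original = "This is a prompt. " * 10 + "\n" + "This is a prompt. " * 10
    loaded = roundtrip(original)
    print("round trip preserved the string:", loaded == original)
    print("first differing index (-1 means none):", first_difference(original, loaded))
```

Calling `demo()` prints whether the loaded string matches the original and, if not, where the two first diverge, which makes a stray space easy to locate.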
Setting the width might change the existing dvc.lock files. Although setting width may seem to fix it, the issue is really with ruamel-yaml not round-tripping properly.
Correct, although since there currently seems to be no work being done on ruamel-yaml's side, I would still consider simply changing the YAML settings to avoid this issue.
Especially since I don't see a way to work around it.
Every time I run dvc commit, I need to confirm that I want to commit because params.yaml changed, when in reality it didn't. This adds a lot of confusion.
As for existing dvc.lock files:
I'm pretty sure this change will not affect loading existing dvc.lock files.
The first time you recreate dvc.lock with this change, it will create a different dvc.lock, however when loaded, both old and new files should yield identical objects.
So I think it should not cause any troubles?
Let me know if you have doubts, so I can test for them.
Bug Report
repro: long strings dumped to dvc.lock contain extra space upon load
Edit:
In the PR, I replaced `float("inf")` with `sys.maxsize`, as suggested in #9397 (comment).
Description
The actual bug seems to originate from `ruamel.yaml`, but we should mitigate it here.
How it manifests:
I have this stage in `dvc.yaml` and this in `params.yaml`
How it can be solved:
This can probably be solved by adding this line here:
dvc/dvc/utils/serialize/_yaml.py
Line 48 in 7d14acb
This solves my issue, and while I didn't do any real testing of this, I don't think this should cause any issues elsewhere.
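The exact line is not reproduced here. As a rough sketch only, assuming the fix amounts to raising the dumper's line width (which is what the discussion about "setting the width" and the `sys.maxsize` edit note point to), the configuration might look like the following. `make_yaml`, `dumps`, and `max_line_length` are hypothetical names for illustration, not DVC functions.

```python
import sys
from io import StringIO


def make_yaml(width: int = sys.maxsize):
    """Hypothetical helper: a ruamel.yaml instance that does not wrap long lines."""
    # Imported lazily so max_line_length() works without ruamel.yaml installed.
    from ruamel.yaml import YAML

    yaml = YAML()
    yaml.default_flow_style = False
    # Assumption: a very large width keeps each scalar on one line, so no
    # wrap can land right after a "\n" inside the string.
    yaml.width = width
    return yaml


def dumps(data, width: int = sys.maxsize) -> str:
    """Dump data to a string using the wide dumper."""
    buf = StringIO()
    make_yaml(width).dump(data, buf)
    return buf.getvalue()


def max_line_length(text: str) -> int:
    """Length of the longest line, handy for checking that nothing was wrapped."""
    return max((len(line) for line in text.splitlines()), default=0)
```

Dumping a long string with `dumps` and comparing `max_line_length` of the result against the length of the string is one way to confirm that the dumper no longer wraps it.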
Output of `dvc doctor`:
This will hopefully be a one-line fix, and I can open a PR for it today.
Best regards!