Incorrect line and column numbering in validation errors #10109

Closed
hqdncw opened this issue Nov 24, 2023 · 1 comment
Labels
bug (Did we break something?), p2-medium (Medium priority, should be done, but less important)

hqdncw commented Nov 24, 2023

Bug Report

Description

The line and column numbering feature introduced in #6285 is unreliable: in certain cases, validation error messages point to the wrong location in the file.

Reproduce

  1. Initialize DVC.
dvc init --no-scm
  2. Create dvc.yaml
tee -a dvc.yaml << END
stages:
  train:
    cmd:
      - python train.py
    deps:
      - config.cfg
    outs:
      models/
END
  3. List the contents of the repository
$ dvc ls .
'./dvc.yaml' validation failed.

expected a list, in stages -> train -> outs, line 3, column 5
  2   train:
  3 │   cmd:
  4 │     - python train.py

As you can see, DVC reports the error at line 3, column 5, which is the position of the `cmd` key. The actual error is at line 7, column 5, where `outs` is given a plain value instead of a list. The reported line is off by four, which makes it difficult to pinpoint the source of the error.
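
The two positions can be checked directly against the file from step 2. The following is only a minimal sketch for verifying the numbers; `key_position` is a hypothetical helper written for this illustration and is not part of DVC.

```python
# The dvc.yaml content from the reproduction steps above.
DVC_YAML = """\
stages:
  train:
    cmd:
      - python train.py
    deps:
      - config.cfg
    outs:
      models/
"""


def key_position(text, key):
    """Return the 1-based (line, column) of the first line that starts with `key:`.

    Hypothetical helper for this illustration; it does a plain text scan
    and is not a YAML parser.
    """
    for line_no, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip()
        if stripped.startswith(f"{key}:"):
            # Column is the count of leading spaces plus one.
            return line_no, len(line) - len(stripped) + 1
    raise KeyError(key)


print(key_position(DVC_YAML, "outs"))  # (7, 5): where the invalid value lives
print(key_position(DVC_YAML, "cmd"))   # (3, 5): where DVC reports the error
```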

Expected

When validating a YAML file, DVC should provide accurate line and column numbers for any errors that occur. This ensures that users can quickly identify and fix problems in their YAML files.

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 3.30.1
-------------------
Platform: Python 3.11.2 on <REDACTED>
Subprojects:
        dvc_data = 2.22.0
        dvc_objects = 1.1.0
        dvc_render = 0.6.0
        dvc_task = 0.3.0
        scmrepo = 1.4.1
Supports:
        azure (adlfs = 2023.10.0, knack = 0.11.0, azure-identity = 1.15.0),
        gdrive (pydrive2 = 1.17.0),
        gs (gcsfs = 2023.9.2),
        hdfs (fsspec = 2023.9.2, pyarrow = 14.0.1),
        http (aiohttp = 3.9.0, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.9.0, aiohttp-retry = 2.8.3),
        oss (ossfs = 2021.8.0),
        s3 (s3fs = 2023.9.2, boto3 = 1.28.17),
        ssh (sshfs = 2023.10.0),
        webdav (webdav4 = 0.9.8),
        webdavs (webdav4 = 0.9.8),
        webhdfs (fsspec = 2023.9.2)
Config:
        Global: /home/sid/.config/dvc
        System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/sda9
Caches: local
Remotes: None
Workspace directory: ext4 on /dev/sda9
Repo: dvc (no_scm)
Repo.site_cache_dir: /var/tmp/dvc/repo/f3aaa8fd4f4e85f507ddd8996bd9016b

Additional Information (if any):

Related to #10102

hqdncw added a commit to hqdncw/dvc that referenced this issue Nov 24, 2023
Some data types lack an `lc` property, so we use the `key()` and `item()` methods of the parent's `lc` property to obtain line and column details for them. We still have to navigate to the parent mapping or sequence to reach its `lc` property, but these methods let us handle diverse data types and display accurate line and column information.

Fixes iterative#10109

Signed-off-by: hqdncw <[email protected]>
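
The idea in the commit message can be sketched as follows. This is an illustration under stated assumptions, not DVC's code or the YAML library's API: `PositionedMap`, `locate_key`, and `locate_container` are hypothetical names, and the zero-based positions are entered by hand from the `dvc.yaml` in the report. Resolving the position through the parent's per-key table gives the location of `outs`. The coarser fallback to the enclosing mapping's start happens to give the location DVC reported, which is only a plausible reading of the bug.

```python
class PositionedMap(dict):
    """Hypothetical stand-in for a parsed YAML mapping.

    `start` is the zero-based (line, col) where the mapping begins and
    `key_pos` maps each key to its own zero-based (line, col). Plain
    values such as strings carry no position of their own.
    """

    def __init__(self, items, start, key_pos):
        super().__init__(items)
        self.start = start
        self.key_pos = key_pos


def to_one_based(pos):
    line, col = pos
    return line + 1, col + 1


def parent_of(root, path):
    """Walk every path element except the last and return that container."""
    node = root
    for key in path[:-1]:
        node = node[key]
    return node


def locate_key(root, path):
    """Ask the parent mapping where the last key of `path` is written."""
    return to_one_based(parent_of(root, path).key_pos[path[-1]])


def locate_container(root, path):
    """Coarser fallback: the start of the nearest enclosing mapping."""
    return to_one_based(parent_of(root, path).start)


# Positions entered by hand for the dvc.yaml in the report (zero-based).
train = PositionedMap(
    {"cmd": ["python train.py"], "deps": ["config.cfg"], "outs": "models/"},
    start=(2, 4),
    key_pos={"cmd": (2, 4), "deps": (4, 4), "outs": (6, 4)},
)
stages = PositionedMap({"train": train}, start=(1, 2), key_pos={"train": (1, 2)})
root = PositionedMap({"stages": stages}, start=(0, 0), key_pos={"stages": (0, 0)})

path = ["stages", "train", "outs"]
print(locate_key(root, path))        # (7, 5)
print(locate_container(root, path))  # (3, 5)
```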
dberenbaum added the bug and p2-medium labels on Dec 1, 2023
hqdncw added a commit to hqdncw/dvc that referenced this issue Dec 26, 2023
skshetry (Member) commented

Closing, as we don't have capacity to work on this.

skshetry closed this as not planned on Mar 25, 2024