-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Alpine parsing to actually extract commit instead of the checksum #241
Conversation
wantResultMetric: stats.FileExtractedResultSuccess, | ||
wantInventory: nil, | ||
wantErr: cmpopts.AnyError, | ||
wantResultMetric: stats.FileExtractedResultErrorUnknown, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a reason invalid apk installed files still gets successfully extracted?
This does change the behavior to make this extractor fail on invalid files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some cases like NPM extraction we opted to skip invalid formats and not return an error, e.g. https://github.com/google/osv-scalibr/blob/main/extractor/filesystem/language/javascript/packagejson/extractor.go#L177
The reasoning was that it's not the extractor that's the problem but the file, so the scan shouldn't be marked "failed" just because there was an invalid file somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be handled at the scanner level? The extraction still failed, it should be up to the caller to determine what to do in those cases. We can add a "ErrInvalidFile" error to be returned that can be handled by the caller?
This is something we would want to configure in osv-scanner, where if there is an invalid file we generally want to notify the user that we didn't fully complete the scan and let them correct the file. (Though I also do see how having a log entry also kind of accomplishes this, it would be nice to have more control on the caller side)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Historically the idea was that the plugin is the best place to determine if this is maybe not the intended file or the file itself is invalid.
Example for a file which FileRequired says yes, but Extract should ignore and not through an error:
- For go binaries FileRequired returns true based on some heuristics, ignoring the file content to be efficient. Extract then parses the file and might figure it's not a go binary. Thus this plugin should just ignore it.
- package.json is used by multiple applications (e.g. vscode extensions). FileRequired just checks the filename. Extract can then determine if this is actually a javascript package or not.
Example for invalid file, which should error:
- DPKG location is
/var/lib/dpkg/status
. If the parser finds an error, it's very likely because the file is corrupted. It's unlikely that there is any other app using that path for something else.
So in case of APK, which also has a quite unique location (lib/apk/db/installed
), I'd say an invalid file should fail, yes.
c77c726
to
5128d01
Compare
// check both that line is empty and we have filled out data in group | ||
// this avoids double empty lines returning early | ||
if line == "" && len(group) > 0 { | ||
// scanner.Err() could only be non nil when Scan() returns false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd remove this comment. I don't see why it's here. It's the expected behavior from my point of view. The Scan=false and Err=nil on EOF is more unexpected. But the code handles this nicely
Migrate parts of osv-scanner's alpine extractor logic here to replace textproto, as textproto MIMEHeaders assumes the key is case insensitive which is not the case.
(There doesn't seem like there's a way to preserve the case with textproto from what I can find)