-
-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updater pipeline and initial work on the breadth-first update #374
Conversation
- At the start of the update process, an error message used to be shown saying that the default branch could not be determined. There is not need to attempt to determine the default branch if the repository is not yet cloned. - During the update process, the updater clones the remote reposiory to a temp folder and this temp folder's name used to be used when logging. E.g. 327fshkewrew/law would get set as the repository's name. In order to be able to reference a repository by a different name when logging, added a new attribute called alias. So now we can reference this temp repository by a more descriptive name.
… appropriate error
…ture 1. Introduced a `Pipeline` framework to structure the update process as a series of distinct stages. 2. Implemented the following pipeline stages: - `clone_remote_and_run_tuf_updater` - `validate_out_of_band_and_update_type` - `clone_or_fetch_users_auth_repo` - `load_target_repositories` - `get_targets_data_from_auth_repo`
…atino of setting targets_data
…commits and pipeline fixes
…nitial state is out-of-sync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the pipeline steps and the codebase looks a lot more readable! I've left some comments that I've found when testing.
CHANGELOG.md
Outdated
@@ -1045,7 +1035,6 @@ and this project adheres to [Semantic Versioning][semver]. | |||
[0.13.0]: https://github.com/openlawlibrary/taf/compare/v0.12.0...v0.13.0 | |||
[0.12.0]: https://github.com/openlawlibrary/taf/compare/v0.11.2...v0.12.0 | |||
[0.11.1]: https://github.com/openlawlibrary/taf/compare/v0.11.1...v0.11.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[0.11.1]: https://github.com/openlawlibrary/taf/compare/v0.11.1...v0.11.2 | |
[0.11.1]: https://github.com/openlawlibrary/taf/compare/v0.11.0...v0.11.1 |
taf/updater/updater_pipeline.py
Outdated
@log_on_start( | ||
INFO, "Cloning target repositories which are not on disk...", logger=taf_logger | ||
) | ||
@log_on_start(INFO, "Finished cloning target repositories", logger=taf_logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be?
@log_on_start(INFO, "Finished cloning target repositories", logger=taf_logger) | |
@log_on_end(INFO, "Finished cloning target repositories", logger=taf_logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
taf/updater/updater_pipeline.py
Outdated
if self.only_validate: | ||
taf_logger.warning( | ||
"Target repositories must already exist when only validating repositories" | ||
) | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran taf repo validate --path .
on a repository which doesn't have a target repository. I got this error:
Traceback (most recent call last):
File "d:\oll\taf\taf\updater\updater.py", line 581, in validate_repository
_update_named_repository(
File "d:\oll\taf\taf\updater\updater.py", line 366, in _update_named_repository
) = _update_current_repository(
File "d:\oll\taf\taf\updater\updater.py", line 532, in _update_current_repository
updater_pipeline.run()
File "d:\oll\taf\taf\updater\updater_pipeline.py", line 107, in run
self.handle_error(e)
File "d:\oll\taf\taf\updater\updater_pipeline.py", line 118, in handle_error
raise e
File "d:\oll\taf\taf\updater\updater_pipeline.py", line 101, in run
update_status = step()
File "C:\Users\nikol\Envs\oll\lib\site-packages\logdecorator\decorator.py", line 19, in wrapper
return self.execute(fn, *args, **kwargs)
File "C:\Users\nikol\Envs\oll\lib\site-packages\logdecorator\decorator.py", line 89, in execute
return super().execute(fn, *args, **kwargs)
File "C:\Users\nikol\Envs\oll\lib\site-packages\logdecorator\decorator.py", line 13, in execute
return fn(*args, **kwargs)
File "d:\oll\taf\taf\updater\updater_pipeline.py", line 486, in get_target_repositories_commits
repository.fetch(branch=branch)
File "d:\oll\taf\taf\git.py", line 930, in fetch
self._git("fetch {} {}", remote, branch, log_error=True)
File "d:\oll\taf\taf\git.py", line 226, in _git
raise error
taf.exceptions.GitError: Repo tmchippewa/law-static-assets: D:\OLL\oll-test-repos\tmchippewa\law-static-assets does not exist or is not a git repository
So it seems to be failing at repository.fetch(branch=branch)
on L486 (different pipeline step).
I think we could either turn this warning into an error and handle the error in this pipeline step, or remove if self.only_validate
altogether since error is raised in another pipeline step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, when validating, we should not proceed if the repository does not exit.
fetched_commits = repository.all_commits_on_branch( | ||
branch=f"origin/{branch}" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I managed to trigger an error but didn't get a stacktrace when the error occured, so I put breakpoint to figure out where the issue popped up. Looks like it's in all_commits_on_branch
. Current updater output:
An error occurred while running step get_target_repositories_commits: 'NoneType' object has no attribute 'target'
Update of tmchippewa/law failed due to error: 'NoneType' object has no attribute 'target'
Could we add error handling in this pipeline step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you remember which repository you were updating when this happened? This is old code that was not updated as a part of this rework
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sent via slack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, the problem was caused by the incorrect fetching logic
taf/updater/updater_pipeline.py
Outdated
repository.name | ||
].items(): | ||
last_validated_commit = validated_commits[-1] | ||
# TODO what to do if an error occurred while validating that branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we create issues for leftover TODOs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this TODO and I think that the check if the update was fully successful is sufficient. If this part of the code is reached, there were no error and the branch was validated successfully
for ( | ||
branch, | ||
validated_commits, | ||
) in self.state.validated_commits_per_target_repos_branches[ | ||
repository.name | ||
].items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly validation is done BFS, but merging is DFS. Is it possible that during DFS merge a target repository merge fails (because of some conflict)? If so, will the next time we run updater, the target repository state affect target validation? Since one target repo would have multiple merged commits but other targets wouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not modify the merge logic. If, for any reason, the state of one or more target repositories is not synced with the authentication repository's state, the validation should start from scratch:
taf/taf/updater/updater_pipeline.py
Line 366 in cd24900
def determine_start_commits(self): |
@log_on_start( | ||
INFO, "Validating out of band commit and update type", logger=taf_logger | ||
) | ||
def _validate_out_of_band_and_update_type(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran taf repo update --url https://github.com/openlawlibrary/open-law
to test out clone for cloning dependencies as well. It mostly worked, but got the following errors at the end. Here's the interesting output:
Validating out of band commit and update type
Update of test/test failed. One or more referenced authentication repositories could not be validated:
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
Called LifecycleStage.REPO handler. Event: failed
Called LifecycleStage.UPDATE handler. Event: failed
Update of test/test failed due to error: Update of test/test failed. One or more referenced authentication repositories could not be validated:
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
1d244b9
to
8918757
Compare
f9ca062
to
5974c0c
Compare
Description (e.g. "Related to ...", etc.)
Refactored the updater by implementing a pipeline. The biggest advantage of this approach is that this pipeline is a class, meaning that it maintains a state attribute and that it is not necessary to pass a large number of variables to various functions. Also split the process into a bigger number of separate steps. The idea is to greatly improve the readability of the code. Also improved error messages and logging.
Partly implemented breadth-first update. The authentication repository will be validated first (and validation will fail if it is only valid up to a certain commit), but target repositories will then be validated in a breath-first manner. So, iterate over authentication repository's commit and for each commit, validate each of the target repositories. If an error is detected, partially update all repositories (so up to the last valid authentication commit and target commits specified in that commit of the authentication repository)
Closes #366
Closes #364
Closes #152
See #281
Code review checklist (for code reviewer to complete)