Attempt to create sandbox data release in nightly builds. #3158
Merged
Changes from all commits (2 commits)
```diff
@@ -78,6 +78,11 @@ function copy_outputs_to_distribution_bucket() {
     aws s3 cp "$PUDL_OUTPUT/" "s3://intake.catalyst.coop/$GITHUB_REF" --recursive
 }
 
+function zenodo_data_release() {
+    echo "Creating a new PUDL data release on Zenodo."
+    ~/devtools/zenodo/zenodo_data_release.py --publish --env sandbox --source-dir $PUDL_OUTPUT
+}
+
 function notify_slack() {
     # Notify pudl-builds slack channel of deployment status
@@ -125,9 +130,14 @@ if [[ $ETL_SUCCESS == 0 ]]; then
     if [ $GITHUB_ACTION_TRIGGER = "push" ] || [ $GITHUB_REF = "dev" ]; then
         copy_outputs_to_distribution_bucket
         ETL_SUCCESS=${PIPESTATUS[0]}
+        zenodo_data_release 2>&1 | tee -a $LOGFILE
+        ETL_SUCCESS=${PIPESTATUS[0]}
     fi
 fi
 
+# This way we also save the logs from latter steps in the script
+gsutil cp $LOGFILE ${PUDL_GCS_OUTPUT}
+
 # Notify slack about entire pipeline's success or failure;
 # PIPESTATUS[0] either refers to the failed ETL run or the last distribution
 # task that was run above
```

Review comment on the `gsutil cp` line: Okay, I added a re-copy of the logfile after everything.
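An aside on the `${PIPESTATUS[0]}` pattern used above (an illustrative sketch, not part of the PR): because `zenodo_data_release` is piped through `tee`, a plain `$?` would report `tee`'s exit status rather than the release step's, so a failed release would look like a success.

```bash
#!/usr/bin/env bash
# Illustrative only: ${PIPESTATUS[0]} preserves the exit status of the first
# pipeline stage, which $? (the status of the last stage) would mask.

false | tee -a /tmp/example.log          # first stage fails, tee succeeds
echo "status via \$?: $?"                # prints 0 -- tee's exit status

false | tee -a /tmp/example.log
# PIPESTATUS must be read immediately; the next command overwrites it.
echo "status via PIPESTATUS: ${PIPESTATUS[0]}"   # prints 1 -- false's status
```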
Given our current logic, the logs from the release script will show up in the log file that is sent to Slack, but they won't show up in the log file copied to the GCS bucket. I think `copy_outputs_to_gcs` should be called towards the end of the script.
There's a problem here now that we're doing some post-processing for distribution -- removing files we don't want to distribute, gzipping the SQLite DBs. What we copy to GCS is for forensic purposes, which is different from what we're currently shipping to AWS, Kaggle, Zenodo, etc.
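For concreteness, a hypothetical sketch of the post-processing being described; the file patterns here are invented for illustration and are not taken from the PUDL scripts:

```bash
# Hypothetical post-processing sketch: prune files we don't distribute,
# then gzip the SQLite DBs before shipping to AWS/Kaggle/Zenodo.
rm -f "$PUDL_OUTPUT"/*.log               # example pattern, not the real rule
for db in "$PUDL_OUTPUT"/*.sqlite; do
    gzip --keep "$db"                    # keep originals for the forensic GCS copy
done
```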
Ah, that's right. I guess it's not a huge deal if we don't include the Zenodo release logs in the GCS bucket. We could also change the script to dump the outputs to GCS after the ETL runs, then dump the logs after most of the post-processing/distribution logic has happened, as sketched below.

P.S. Would love to rework this system so all of the logs are collected for us.
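A rough sketch of that ordering (`copy_outputs_to_gcs` and the `gsutil` log copy are from the script; splitting them this way is just the suggestion above):

```bash
# Sketch of the suggested ordering, not the code as merged:
copy_outputs_to_gcs                       # snapshot ETL outputs right after the run
# ... post-processing / distribution steps, all appending to $LOGFILE ...
gsutil cp "$LOGFILE" "$PUDL_GCS_OUTPUT"   # re-copy the logs at the very end
```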
Do you think it'd be better to capture the additional steps in the logs, or only save the outputs we distribute to the nightly build buckets?
I think an overhaul of the nightly build system is definitely on the docket for next year.
I don't think I understand your question: are you asking which logs should be written to the log file?
Never mind, I was seeing it as either/or based on the ordering, but re-saving the logfile fixes everything.