Export additional columns from requests collection #52
base: develop
Update projection for the requests query to pull additional fields
remove blank spaces
run linter to format the changes
@dav3r Here is the PR with the updated projection.
I believe that with this pull request the projection covers every field in the schema. If that is the case, we should consider using an empty projection.
Since we are pulling all columns from the collection we can just pass an empty projection
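To illustrate why an empty projection is equivalent to listing every field, here is a minimal pure-Python sketch that emulates how an inclusion projection selects fields from a document. It is an illustration only, not the code in cyhy-data-extract.py and not the MongoDB implementation: the `project` helper and the sample values are hypothetical, and the sketch ignores details such as the default handling of `_id` and of arrays.

```python
from typing import Any, Dict


def project(document: Dict[str, Any], projection: Dict[str, bool]) -> Dict[str, Any]:
    """Emulate an inclusion projection on a plain dict (simplified sketch)."""
    # An empty projection selects nothing in particular, so every field is returned.
    if not projection:
        return dict(document)

    result: Dict[str, Any] = {}
    for path, include in projection.items():
        if not include:
            continue
        parts = path.split(".")

        # Walk the dotted path to find the value; skip paths that do not exist.
        value: Any = document
        found = True
        for part in parts:
            if isinstance(value, dict) and part in value:
                value = value[part]
            else:
                found = False
                break
        if not found:
            continue

        # Rebuild the nested structure in the result.
        target = result
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        target[parts[-1]] = value
    return result


# Hypothetical sample document; field names follow the diff in this PR.
request = {
    "_id": "EXAMPLE",
    "key": "not-a-real-key",
    "agency": {
        "name": "Example Agency",
        "type": "FEDERAL",
        "contacts": [{"name": "Jane Doe"}],
    },
}

narrow = project(request, {"agency.name": True, "agency.type": True})
everything = project(request, {})
```

In this sketch `narrow` holds only the two agency fields, while `everything` also carries `key` and `agency.contacts`, which is the behavior the review comments below are concerned about.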
aws_jobs/cyhy-data-extract.py
Outdated
@@ -390,6 +390,8 @@ def main():
"agency.location": True,
"agency.name": True,
"agency.type": True,
"agency.contacts": True,
"key": True,
The key is sensitive data, which is why it was never exported in the past. Please provide confirmation via email that the CyHy data owner and the owner of the receiving system (the Analytics Environment) understand and accept the risks of sharing this data.
With e1f73a9, the key field is no longer explicitly mentioned, but the concern above still stands.
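One way to make this concern visible in code is a small guard that reports which sensitive fields a given projection would export. The sketch below is only an assumption about how such a check could look: `SENSITIVE_FIELDS` and `sensitive_fields_exported` are hypothetical names, and the set contains only `key` because that is the field flagged in this thread.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical list of sensitive fields; "key" is the one flagged in this review.
SENSITIVE_FIELDS: FrozenSet[str] = frozenset({"key"})


def _covers(path: str, field: str) -> bool:
    """True if a projection path names the field or one of its parents."""
    return field == path or field.startswith(path + ".")


def sensitive_fields_exported(
    projection: Dict[str, bool],
    sensitive: FrozenSet[str] = SENSITIVE_FIELDS,
) -> Set[str]:
    """Return the sensitive fields that a projection would export."""
    included = {path for path, flag in projection.items() if flag}
    excluded = {path for path, flag in projection.items() if not flag}

    if not included:
        # Empty or exclusion-only projection: everything not excluded is exported.
        return {
            field
            for field in sensitive
            if not any(_covers(path, field) for path in excluded)
        }

    # Inclusion projection: a field is exported if it, a parent, or a child is listed.
    return {
        field
        for field in sensitive
        if any(_covers(path, field) or _covers(field, path) for path in included)
    }


empty_result = sensitive_fields_exported({})
explicit_result = sensitive_fields_exported({"agency.name": True, "key": True})
safe_result = sensitive_fields_exported({"agency.name": True})
excluded_result = sensitive_fields_exported({"key": False})
```

A check like this could run before the query is issued and fail the job, or log a warning, whenever the returned set is not empty.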
I've reached out to get that approval.
@DJensen94 Do you have any updates on this?
I'm requesting that a comment be added.
Add comment to note that the empty projection exports all fields, which can contain sensitive data
Co-authored-by: Shane Frasier <[email protected]>
update projection to exclude data that wasn't necessary in the mini data lake
Alphabetize projection fields
reformat spacing via the linter
@DJensen94 - You needn't run the linter manually. See the instructions for setting up
extract the snapshots to give historical counts to the data lake
fix spacing in dictionary
@DJensen94 Please update the PR description to clearly specify which new fields and collections are being added in this PR, as well as the reason(s) why they are now required. Thanks!
we are removing snapshots pull and putting it into its own PR
Regarding 9b4879e, why are you putting the snapshots change in a separate PR?
I was asked to move this to a separate pull request because we are still awaiting approval for the key and contacts fields, and the snapshots pull is more time-sensitive.
@DJensen94 @climber-girl @jessiebeals - Is this PR still needed?
🗣 Description
To keep CyHy data in sync with the mini data lake, we need to pull additional fields. This PR adds fields to the projection used when querying the requests collection.
💭 Motivation and context
This will allow the mini data lake to stay in sync with the CyHy database. It will also allow the ASM Visibility Dashboard to use these additional fields when generating dashboards and reports.
🧪 Testing
Verified that every field added to the projection exists in the requests collection.