Wastewater proxy merge #29
base: main
…y and summarized the data that feed into the emis
…ote: there is an issue where the proxy emis do not match with the GHGI emis
…ote: there is an issue where the proxy emis do not match with the GHGI emis
… code is integrated into existing ECHO / GHGRP data allocation
… up final proxy creation step.
…ndle proxy allocation. meat and poultry is the last remaining proxy sector to finalize
…y complete other than potential geospatial de-deduplication improvements.
I went through the updated script to address merge conflicts with the existing script. It should be ready to merge now.
Looks good, Chris. Some of the helper functions would be great to pull out and put in utils, and I'd like to clean up the geospatial operations to use geopandas with meter-based distances.
return industry_df


def convert_state_names_to_codes(df, state_column):
Check out us_state_to_abbrev. I think John wrote it, and this might be a duplicate of that function.
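To illustrate the de-duplication idea, here is a minimal sketch of a single shared lookup that a column-conversion helper could delegate to. Everything in it is an assumption for illustration: the mapping is truncated to three states, the names `US_STATE_TO_ABBREV`, `state_name_to_code`, and `state_names_to_codes` are hypothetical, and it works on plain dict rows rather than a DataFrame to stay dependency-free. The actual signature of us_state_to_abbrev is not shown in this thread.

```python
# Hypothetical shared lookup; truncated to three states for illustration only.
US_STATE_TO_ABBREV = {
    "California": "CA",
    "New York": "NY",
    "Texas": "TX",
}

# Case-insensitive index built once from the single source of truth above.
_LOOKUP = {name.casefold(): code for name, code in US_STATE_TO_ABBREV.items()}


def state_name_to_code(name):
    """Return the two-letter code for a state name, or None if it is unknown."""
    if name is None:
        return None
    # Collapse repeated whitespace and ignore case before looking up.
    cleaned = " ".join(str(name).split()).casefold()
    return _LOOKUP.get(cleaned)


def state_names_to_codes(rows, state_column):
    """Return copies of the row dicts with the state column converted to codes."""
    return [
        {**row, state_column: state_name_to_code(row.get(state_column))}
        for row in rows
    ]
```

With one lookup in utils, any script-level wrapper reduces to a one-line call, so the two functions cannot drift apart.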
})

# Extract coordinates
echo_coords = np.array([(lat, lon) for lat, lon in
I'd like to revise this using the geospatial functions in geopandas instead of doing it manually. I can advise on it.
dist = np.sqrt((ghgrp_row['latitude'] - echo_row['Facility Latitude'])**2 +
               (ghgrp_row['longitude'] - echo_row['Facility Longitude'])**2)
if dist < 0.025 and ghgrp_row['Year'] == echo_row['Year']:
dist = np.sqrt((ghgrp_row['latitude'] - echo_row['latitude'])**2 +
I think it's fine to continue to use RSS (root sum of squares) here, but for readability I wonder whether it makes sense to use the CRS defined in config.py (EQ_AREA_CRS = "ESRI:102003") or an equidistant CRS to do the calculations in meters. Doing it in meters instead of decimal degrees (DD) might also clarify the ECHO vs FRS distance tolerance / filtering above.
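To make the case for meters concrete, here is a minimal, dependency-free sketch. It is not the geopandas/CRS approach suggested above: it swaps in the haversine great-circle formula as a stand-in so the example runs with the standard library only. The function names and the 2,500 m tolerance are hypothetical; the row keys follow the diff excerpt above. The point it shows is that a 0.025 degree threshold is not a fixed distance: at 45° N it spans about 2.8 km north-south but only about 2.0 km east-west.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_match(ghgrp_row, echo_row, max_dist_m=2_500):
    """True if two facility records share a year and lie within max_dist_m meters.

    max_dist_m is a hypothetical tolerance chosen for illustration.
    """
    if ghgrp_row["Year"] != echo_row["Year"]:
        return False
    dist_m = haversine_m(ghgrp_row["latitude"], ghgrp_row["longitude"],
                         echo_row["latitude"], echo_row["longitude"])
    return dist_m < max_dist_m


# The same 0.025 degree offset covers different ground distances by direction.
north_south_m = haversine_m(45.0, -100.0, 45.025, -100.0)
east_west_m = haversine_m(45.0, -100.0, 45.0, -99.975)
```

A projected CRS such as the one in config.py would give the same benefit with vectorized geopandas operations; the stand-in above only demonstrates why a tolerance stated in meters is easier to read and reason about.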
return result_df


def geocode_address(df, address_column):
Love this. Could you move it to utils.py?
This is a complex one. Let's see how much we can extract into common functions and get the outputs wrapped into a set of functions with defined input and output files for clarity.
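One possible shape for "functions with defined input and output files" is sketched below. It is an illustration only: the function name `write_facility_proxy`, the column names `facility_id` and `emis`, and the share calculation are hypothetical placeholders for the real proxy logic, and it uses the standard-library csv module instead of pandas so it runs on its own.

```python
import csv
import tempfile
from pathlib import Path


def write_facility_proxy(input_path: Path, output_path: Path) -> Path:
    """Read facility emissions from input_path and write each facility's share
    of the total to output_path. Returns output_path."""
    with open(input_path, newline="") as f:
        rows = list(csv.DictReader(f))
    total = sum(float(r["emis"]) for r in rows)
    if total <= 0:
        raise ValueError("total emissions must be positive to compute shares")
    with open(output_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["facility_id", "share"])
        writer.writeheader()
        for r in rows:
            writer.writerow({"facility_id": r["facility_id"],
                             "share": float(r["emis"]) / total})
    return output_path


# Usage: both files are explicit arguments, so the data flow is visible at the call site.
with tempfile.TemporaryDirectory() as tmp:
    in_file = Path(tmp) / "facilities.csv"
    in_file.write_text("facility_id,emis\nA,1\nB,3\n")
    out_file = write_facility_proxy(in_file, Path(tmp) / "proxy.csv")
    with open(out_file, newline="") as f:
        shares = {r["facility_id"]: float(r["share"]) for r in csv.DictReader(f)}
```

Keeping paths out of the function body means each proxy step can be run, tested, or re-pointed at different files without editing the function itself.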
@@ -1,18 +1,25 @@
# %%
Let's get ahead of John's code review and add a header for this file.
Pull completed wastewater proxy script into main. This script writes all of the industrial wastewater proxies and the nonseptic domestic proxy.