Data Updates
- Python (2.x.x)
- Pip (bundled with Python 2.7.9 and later)
- csvfix
- Add the Python and csvfix installation directories to your path (mac, windows)
- Two Python scripts from Development Seed: `csv-manipulation.py` & `es_populator.py`
- Two utility `.csv` files: `aggregation_commodity.csv` & `aggregation_region.csv`
- One small text file: `requirements.txt`
- IMPACT Output data prepared in the format described below
- All of these commands should be run in the `raw` folder of this repository
Run `python csv-manipulation.py [filename]` in the command line, where `[filename]` is a CSV output of multiple scenarios in the following format:
impactparameter | scenario | commodity | region | year | Val |
---|---|---|---|---|---|
PopXAgg -- Population | SSP2_GFDL | cbeef | VEN | 2050 | 46.2749 |
Data should be aggregated so as not to include the production type variable.
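
Before running the script, it can help to confirm that a file has the expected columns and no leftover production type column. The following is a minimal sketch, not part of the Development Seed scripts: the column names are copied from the example table above, `check_header` is a hypothetical helper, and `productiontype` is a made-up column name used only for illustration.

```python
import csv

# Column names copied from the example table above. The exact header text
# that csv-manipulation.py expects is an assumption.
EXPECTED_COLUMNS = ["impactparameter", "scenario", "commodity", "region", "year", "Val"]


def check_header(csv_text):
    """Return (missing, unexpected) column names for the CSV header row."""
    reader = csv.reader(csv_text.splitlines())
    header = [name.strip() for name in next(reader, [])]
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    unexpected = [name for name in header if name not in EXPECTED_COLUMNS]
    return missing, unexpected


# "productiontype" is a hypothetical name for a column that should have
# been aggregated away before this step.
sample = (
    "impactparameter,scenario,commodity,region,productiontype,year,Val\n"
    "PopXAgg -- Population,SSP2_GFDL,cbeef,VEN,air,2050,46.2749\n"
)
missing, unexpected = check_header(sample)
print("missing columns: %s" % missing)
print("unexpected columns: %s" % unexpected)
```

An empty `missing` list together with an empty `unexpected` list suggests the file matches the layout shown in the table.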
The console will show the following (or similar based on the file) while running:
separating target file: 4DevSeed.csv
creating 5 files
creating file: SSP2-GFDL.csv
creating file: SSP2-HGEM.csv
creating file: SSP2-IPSL.csv
creating file: SSP2-MIROC.csv
creating file: SSP2-NoCC.csv
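
The separation step reported above can be pictured as grouping rows by their `scenario` column. This is a rough sketch rather than the actual logic of `csv-manipulation.py`: `split_by_scenario` and `scenario_filename` are hypothetical helpers, the underscore-to-hyphen file naming is inferred from the console output, and the second sample row is made up for illustration.

```python
import csv
from collections import OrderedDict


def split_by_scenario(lines):
    """Group CSV rows by their scenario column, keeping first-seen order."""
    groups = OrderedDict()
    for row in csv.DictReader(lines):
        groups.setdefault(row["scenario"], []).append(row)
    return groups


def scenario_filename(scenario):
    # Assumption: inferred from the console output, where the scenario
    # SSP2_GFDL appears as SSP2-GFDL.csv.
    return scenario.replace("_", "-") + ".csv"


sample = [
    "impactparameter,scenario,commodity,region,year,Val",
    "PopXAgg -- Population,SSP2_GFDL,cbeef,VEN,2050,46.2749",
    # Made-up row, present only so the sample has two scenarios.
    "PopXAgg -- Population,SSP2_NoCC,cbeef,VEN,2050,46.2749",
]
groups = split_by_scenario(sample)
print("creating %d files" % len(groups))
for name in groups:
    print("creating file: " + scenario_filename(name))
```
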
The processing slightly transforms the data to fit a specific schema:
- Only the first word of the `impactparameter` is used
- In all other fields, any space (` `) is replaced with an underscore (`_`)
- All text is converted to lowercase
- Commodity and region group/aggregate names are added according to `aggregate_commodity.csv` and `aggregate_region.csv`
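
The first three rules can be sketched on a single row as follows. This is an illustration under assumptions, not the script's own code: `normalize_row` is a hypothetical helper that treats each row as a dict of column name to string, and the group/aggregate lookup is left out because the layout of the two utility files is not described here.

```python
def normalize_row(row):
    """Apply the three text rules to one row (a dict of column -> string)."""
    out = {}
    for column, value in row.items():
        if column == "impactparameter":
            # Keep only the first whitespace-separated word.
            parts = value.split()
            value = parts[0] if parts else ""
        else:
            # In every other field, spaces become underscores.
            value = value.replace(" ", "_")
        # All text is lowercased; column names are left untouched.
        out[column] = value.lower()
    return out


row = {
    "impactparameter": "PopXAgg -- Population",
    "scenario": "SSP2_GFDL",
    "commodity": "cbeef",
    "region": "VEN",
    "year": "2050",
    "Val": "46.2749",
}
normalized = normalize_row(row)
print(normalized["impactparameter"])  # popxagg
print(normalized["region"])           # ven
```
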
The created scenario CSVs are automatically moved to the `scenarios/` folder.
Run the next two commands in the command line, where `[username]` and `[password]` are the credentials provided for editing the Elasticsearch cluster.
pip install -r requirements.txt
python es_populator.py [username] [password]
Expect the process to take ~10 minutes per scenario. Once the script is complete, all of the scenarios will be uploaded to the Heroku instance at https://ad21a5a8cb0789e9b73c2142d3c83e43.us-east-1.aws.found.io:9243
Scenarios can be deleted from the cluster using `delete.py`:
python delete.py --scenario [scenario_name] [username] [password]
or to delete all scenarios:
python delete.py --delete-all [username] [password]