Data Updates

Requirements

Python (2.x.x)
pip (bundled with Python 2.7.9 and later)
csvfix
Add the Python and csvfix installation directories to your PATH (Mac, Windows)
Two Python scripts from Development Seed: csv-manipulation.py & es_populator.py
Two utility .csv files: aggregation_commodity.csv & aggregation_region.csv
One small text file: requirements.txt
IMPACT Output data prepared in the format described below
All of these commands should be run in the raw folder of this repository

Data Processing

Run python csv-manipulation.py [filename] on the command line, where [filename] is a CSV output covering multiple scenarios in the following format:

impactparameter	scenario	commodity	region	year	Val
PopXAgg -- Population	SSP2_GFDL	cbeef	VEN	2050	46.2749

Data should be aggregated so that it does not include the production type variable.
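
Before running the script, it can help to confirm that the header matches the format above. The following is a minimal sketch, not part of csv-manipulation.py: check_header is a hypothetical helper, the expected names are copied from the sample header, and "productiontype" in the usage note is only an assumed name for a leftover production type column.

```python
# Column names copied from the sample header shown above.
EXPECTED_COLUMNS = ["impactparameter", "scenario", "commodity", "region", "year", "Val"]


def check_header(columns):
    """Return a list of problems found in a header row (empty if none).

    `columns` is the list of column names read from the first line of the file.
    """
    problems = []
    missing = [c for c in EXPECTED_COLUMNS if c not in columns]
    extra = [c for c in columns if c not in EXPECTED_COLUMNS]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    if extra:
        # For example, a production type column that was not aggregated away.
        problems.append("unexpected columns: " + ", ".join(extra))
    return problems
```

For example, check_header(EXPECTED_COLUMNS + ["productiontype"]) reports one unexpected column, while a header that matches the sample returns an empty list.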

While running, the console shows output like the following (details vary with the input file):

separating target file: 4DevSeed.csv
creating 5 files
creating file: SSP2-GFDL.csv
creating file: SSP2-HGEM.csv
creating file: SSP2-IPSL.csv
creating file: SSP2-MIROC.csv
creating file: SSP2-NoCC.csv
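
The separation step shown in this output can be pictured as grouping rows by their scenario value, one group per output file. This is a rough sketch under that assumption, not the actual logic of csv-manipulation.py; split_by_scenario is a hypothetical name, and rows are assumed to be dictionaries keyed by the column names above.

```python
from collections import OrderedDict


def split_by_scenario(rows):
    """Group rows by their 'scenario' field, keeping first-seen order."""
    groups = OrderedDict()
    for row in rows:
        groups.setdefault(row["scenario"], []).append(row)
    return groups


# Illustrative rows only; the first mirrors the sample row in the format above.
example_rows = [
    {"scenario": "SSP2_GFDL", "commodity": "cbeef", "region": "VEN"},
    {"scenario": "SSP2_HGEM", "commodity": "cbeef", "region": "VEN"},
    {"scenario": "SSP2_GFDL", "commodity": "cbeef", "region": "COL"},
]
example_groups = split_by_scenario(example_rows)
print("creating {0} files".format(len(example_groups)))
```

Each key of the returned mapping would correspond to one scenario file, and its value holds the rows that belong in it.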

The processing slightly transforms the data to fit a specific schema:

Only the first word of the impactparameter is used
In all other fields, any space ( ) or dash (-) is replaced with an underscore (_)
All text is converted to lowercase
Commodity and region group/aggregate names are added according to aggregation_commodity.csv and aggregation_region.csv
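
The first three rules above can be sketched as a small function. This is an illustration rather than the code in csv-manipulation.py: normalize_row is a hypothetical name, and it assumes that only the scenario, commodity, and region fields are rewritten while year and Val are left untouched. The aggregate-name lookup is omitted because the layout of the two utility files is not described here.

```python
import re

# Assumption: these are the text fields that receive the underscore rule.
TEXT_FIELDS = ("scenario", "commodity", "region")


def normalize_row(row):
    """Return a copy of `row` with the schema rules applied."""
    out = dict(row)
    # Rule 1: keep only the first word of impactparameter (lowercased per rule 3).
    out["impactparameter"] = row["impactparameter"].split()[0].lower()
    for field in TEXT_FIELDS:
        # Rule 2: spaces and dashes become underscores. Rule 3: lowercase.
        out[field] = re.sub(r"[ -]", "_", row[field]).lower()
    return out
```

Applied to the sample row in the format above, this would turn "PopXAgg -- Population" into "popxagg", "SSP2_GFDL" into "ssp2_gfdl", and "VEN" into "ven".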

Upload to Elasticsearch

The created scenario CSVs are moved to the scenarios/ folder automatically.

Run the following two commands on the command line, where [username] and [password] are the credentials provided for editing the Elasticsearch cluster.

pip install -r requirements.txt
python es_populator.py [username] [password]

Expect the process to take about 10 minutes per scenario. When the script completes, all of the scenarios will have been uploaded to the Heroku instance at https://ad21a5a8cb0789e9b73c2142d3c83e43.us-east-1.aws.found.io:9243

Deleting from Elasticsearch

To delete a single scenario from the cluster, use delete.py:

python delete.py --scenario [scenario_name] [username] [password]

To delete all scenarios:

python delete.py --delete-all [username] [password]
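
The two invocations above share one argument shape: either --scenario with a name or --delete-all, followed by the username and password. The sketch below models that shape with Python's argparse. It is a hypothetical reconstruction for illustration only; the real delete.py may parse its arguments differently, and build_parser is an assumed name.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the two delete.py invocations shown above."""
    parser = argparse.ArgumentParser(prog="delete.py")
    # Exactly one of the two targets must be given.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--scenario", metavar="scenario_name",
                        help="name of the single scenario to delete")
    target.add_argument("--delete-all", action="store_true",
                        help="delete every scenario in the cluster")
    parser.add_argument("username")
    parser.add_argument("password")
    return parser
```

With this shape, passing both --scenario and --delete-all, or neither, is rejected before any request would be made to the cluster.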
