add functionality to add more aggregate regions (like EU27) to historical mif #136

Open
fschreyer opened this issue Oct 18, 2021 · 6 comments
@fschreyer
Contributor

fschreyer commented Oct 18, 2021

Dear RSE,

it would be great to have more aggregate regions in the historical mif that are not REMIND regions. The reason is that we increasingly look at REMIND output that is aggregated from several REMIND regions. Most importantly, we need this for all of our EU analysis with the aggregated region EU27 (the aggregation of all EU subregions). This EU27 region already exists in the reporting but not in the historical mif, so we currently cannot compare it to historical data. Could you add functionality to add this region to the historical mif, and potentially further aggregated regions in the future?

You might have had some chat with @Renato-Rodrigues about this already. Let us know if/how we can help.

Thanks,
Felix

@LaviniaBaumstark @pfuehrlich-pik

@Renato-Rodrigues
Member

David and I briefly talked about this some time ago. From the tests I made, the current framework (creating additional columns in the region mapping to define further region aggregations) works for creating historical mif files that include EU28, EU27, and so on.

The main problem is that this change also changes the hash code of the region mapping file, so any change to the aggregation columns requires many changes across the code and the input data handling.

If possible, I would request either:

(1) to split the region aggregation columns and the model region mapping columns into two different files

or

(2) to calculate the hash of the region mapping file based only on the first three columns of the mapping file, which remain unchanged whether or not we add region aggregation columns.
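
Option (2) could be sketched roughly as follows. This is only an illustration in base R: the helper `core_mapping_hash`, the semicolon separator, and the use of an MD5 checksum are assumptions, and it is not how REMIND or madrat computes the region hash. The idea is to write only the first three columns to a temporary file and checksum that file.

```r
# Hypothetical helper (assumption, not REMIND/madrat code):
# checksum of only the first `n_cols` columns of a mapping file.
core_mapping_hash <- function(path, n_cols = 3, sep = ";") {
  # Read everything as text so that no value is reformatted on the way through
  mapping <- read.csv(path, sep = sep, colClasses = "character",
                      na.strings = character(0), check.names = FALSE)
  core_file <- tempfile(fileext = ".csv")
  on.exit(unlink(core_file))
  # Write the core columns to a temporary file and checksum that file
  write.table(mapping[, seq_len(n_cols), drop = FALSE], core_file,
              sep = sep, row.names = FALSE, quote = FALSE)
  unname(tools::md5sum(core_file))
}

# Two illustrative mapping files: one plain, one with an extra aggregation column
base_map <- data.frame(X = c("Afghanistan", "Albania"),
                       CountryCode = c("AFG", "ALB"),
                       RegionCode = c("MEA", "NES"),
                       stringsAsFactors = FALSE)
extended_map <- cbind(base_map, missingH12 = c("rest", "NEU"))

f_base <- tempfile(fileext = ".csv")
f_extended <- tempfile(fileext = ".csv")
write.table(base_map, f_base, sep = ";", row.names = FALSE, quote = FALSE)
write.table(extended_map, f_extended, sep = ";", row.names = FALSE, quote = FALSE)

# The whole-file checksums differ, the core-column checksums do not
same_core_hash <- core_mapping_hash(f_base) == core_mapping_hash(f_extended)
same_file_hash <- unname(tools::md5sum(f_base)) == unname(tools::md5sum(f_extended))
```

With a hash like this, adding or changing aggregation columns would leave the value untouched, at the cost of one extra read and write per hash.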

@dklein-pik
Member

dklein-pik commented Nov 30, 2021

Dear Felix and Renato,

as Renato described above, there is already the option to add new columns with aggregated regions to the regionmapping.csv. See, for example, the column "missingH12" in /p/projects/rd3mod/inputdata/mappings/regional/regionmapping_21_EU11.csv

As Renato described, adding such a column changes the resulting hash of the regionmapping.csv, which is used in a few places in the REMIND code.

Unfortunately, we can only generate a hash for an entire file, not for parts of a file. Splitting the region aggregation columns and the model region mapping columns into two different files would require a new mapping between these two files, and also some kind of hashing to keep the files consistent with each other.

Assuming that the additional region columns do not get updated often and that only a few files need to be updated manually, we suggest including all additional region mappings we know of so far in the regionmapping.csv and changing the REMIND code accordingly. After adding new columns to the regionscode.csv the following files need to get updated:

Would you agree with this approach?

@Renato-Rodrigues
Member

Renato-Rodrigues commented Nov 30, 2021

Hi David, thank you for the reply!

To me, there is a first-best solution that is a bit different from what you mention.
Let me try to explain it better below:

(1) I would have a single file, valid for all region aggregations, that defines the possible and useful additional regional aggregations of the results.

It could either follow the structure we have now, using the extra columns in the region mapping file, or, preferably, use a simplified formulation like the one the IIASA database uses here, adapted to R instead of Python: https://github.com/openENTRANCE/openentrance/blob/main/mappings/remind_2.1.yaml

For the first alternative, an Excel file with the country codes (or names) in the first column and additional columns defining the desired region aggregations (e.g. EU27, EU28, ...) would suffice.

For the second alternative, we can simplify this file so that it only contains lists of REMIND regions that can sum up to another desired region.
Example (just illustrative):

EU27 = c(DEU, ECE, ECS, ENC, ESC, ESW, EWN, FRA)
EU28 = c(EUR)
EU28 = c(DEU, ECE, ECS, ENC, ESC, ESW, EWN, FRA, UKI)
# AR6 region aggregations
R5Asia = c(CHA, IND, OAS)
R5MiddleEastandAfrica = c(SSA, MEA)
R5OECDandEU = c(DEU, ECE, ENC, ECS, ESC, ESW, EWN, FRA, UKI, JPN, USA, CAZ, NEN, NES)

(2) In the calc output, for any region aggregation you use, you would load this single additional region aggregation file and test whether the aggregation can be applied. This is a simple test that checks whether all elements of the aggregation are present as independent native regions in REMIND. If that is not the case, you would just ignore the aggregation; otherwise, the historical mif file would also include the additional region aggregations defined in this auxiliary file.
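
A minimal sketch of that test in base R, just to make the idea concrete. The helper name `applicable_aggregations` and the small set of native regions are hypothetical, and the aggregation definitions are taken from the illustrative lists in (1); none of this is existing madrat or mrremind code.

```r
# Hypothetical helper: keep only the aggregations whose members are all
# native regions of the current run; all other aggregations are ignored.
applicable_aggregations <- function(aggregations, native_regions) {
  is_applicable <- vapply(
    aggregations,
    function(members) all(members %in% native_regions),
    logical(1)
  )
  aggregations[is_applicable]
}

# Illustrative aggregation definitions (quoted versions of the lists above)
aggregations <- list(
  EU27   = c("DEU", "ECE", "ECS", "ENC", "ESC", "ESW", "EWN", "FRA"),
  EU28   = c("DEU", "ECE", "ECS", "ENC", "ESC", "ESW", "EWN", "FRA", "UKI"),
  R5Asia = c("CHA", "IND", "OAS")
)

# Assumed native regions of a run without EU subregions (illustrative only)
native_regions <- c("CHA", "IND", "OAS", "EUR", "USA")

usable <- applicable_aggregations(aggregations, native_regions)
names(usable)  # only "R5Asia" passes the test here
```

For a run with the EU subregions and UKI among its native regions, EU27 and EU28 would pass the same test without any change to the definitions file.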

The advantages of that solution from my point of view are:

  • We would never need to change a region mapping file again, so the hash values and all references to previous calibration files stay unchanged.
  • It would become easier to include useful region aggregations directly in our results report without changing anything else in the model or in the run, for example creating historical mifs compatible with the AR6 R5 region aggregations without effort (see the R5 regions in the example above).
  • In the next couple of months I will be working on further extending the disaggregation of the EU regions. The proposed method would simplify adding support for these upcoming region mappings and their possible aggregations.
  • It would also become easier to use our compare scenarios code to check the changes that further disaggregated model runs create, because it would be easy to define new regional aggregations, even if we use them only temporarily to validate a newer version against an existing one that has fewer regions.

@dklein-pik
Member

Hi Renato,

thanks for this suggestion. I just rediscovered that madrat has the option of providing an additional mapping, called extramappings, to the calcOutput function via the cfg. I guess this should do the job you described under (2), but I am not sure. In my understanding, you provide the regular region mapping via regionmapping in the config, and you provide the additional columns that so far were also in the regionmapping file separately, in the file specified via extramappings in the cfg. If you want to give it a try, feel free. I will try to run a test today or tomorrow.

@dklein-pik
Member

Hi Renato,

OK, I tried out the extramapping (see above), but so far it does not work in madrat. I have started a discussion with Jan about it.

@dklein-pik
Member

dklein-pik commented Mar 10, 2022

As of mrremind 0.117.0 it will be possible to provide additional regional mappings for the validation data (historical.mif). The additional mappings are supplied via the extramappings argument in

retrieveData(model="VALIDATIONREMIND", regionmapping = mapping[["regionmapping"]], extramappings = mapping[["extramappings_historic"]], rev = revision)

As the code above shows, this is already part of the REMIND pre-processing. So all you need to do is add the name of your additional mapping to the extramappings column in the mappingslist in start.R.

Please note:

  1. Duplicated columns will be ignored: please use unique names (i.e. not RegionCode) for the columns containing the additional regions. Columns in the additional mapping whose names also exist in the main mapping will be ignored, and a warning will be thrown.
  2. Currently, only the main region mapping (given in regionmapping) is used to calculate the region hash. The extramapping is ignored. This will change soon: the resulting input data files will then also reflect the region hash of the additional mapping.

Simple example:

You could have a regular regionmapping named A.csv

X              CountryCode  RegionCode
Afghanistan    AFG          MEA       
Aland Islands  ALA          ENC       
Albania        ALB          NES       
Algeria        DZA          MEA       

and an additional mapping B.csv

X              CountryCode  missingH12
Afghanistan    AFG          rest
Aland Islands  ALA          EUR
Albania        ALB          NEU
Algeria        DZA          rest

and supply them as follows in the REMIND preprocessing:

mappinglist <- list(c(regionmapping = "A.csv", extramappings_historic = "B.csv"))

The result will be the same as for a single regionmapping file that looks like this:

X              CountryCode  RegionCode   missingH12
Afghanistan    AFG          MEA          rest
Aland Islands  ALA          ENC          EUR
Albania        ALB          NES          NEU
Algeria        DZA          MEA          rest
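
To make the equivalence and note 1 concrete, here is a small base R sketch. The helper `combine_mappings` is hypothetical and is not the madrat implementation; treating X and CountryCode as identifier columns and matching rows by CountryCode are assumptions made for this illustration.

```r
# Hypothetical helper (not the madrat implementation): append the columns of
# an additional mapping to the main mapping, matching rows by country code.
combine_mappings <- function(main, extra, key = "CountryCode",
                             id_cols = c("X", "CountryCode")) {
  # Columns that exist in both mappings (apart from the identifiers) are ignored
  duplicated_cols <- setdiff(intersect(names(main), names(extra)), id_cols)
  if (length(duplicated_cols) > 0) {
    warning("Ignoring columns that already exist in the main mapping: ",
            paste(duplicated_cols, collapse = ", "))
  }
  new_cols <- setdiff(names(extra), names(main))
  idx <- match(main[[key]], extra[[key]])
  combined <- main
  for (col in new_cols) {
    combined[[col]] <- extra[[col]][idx]
  }
  combined
}

# The two mappings from the example above
A <- data.frame(
  X = c("Afghanistan", "Aland Islands", "Albania", "Algeria"),
  CountryCode = c("AFG", "ALA", "ALB", "DZA"),
  RegionCode = c("MEA", "ENC", "NES", "MEA"),
  stringsAsFactors = FALSE
)
B <- data.frame(
  X = c("Afghanistan", "Aland Islands", "Albania", "Algeria"),
  CountryCode = c("AFG", "ALA", "ALB", "DZA"),
  missingH12 = c("rest", "EUR", "NEU", "rest"),
  stringsAsFactors = FALSE
)

combined <- combine_mappings(A, B)
```

Here `combined` has the columns X, CountryCode, RegionCode and missingH12, like the single file shown above. If B also contained a RegionCode column, the sketch would warn and keep the RegionCode of A.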
