-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add .tools.sankey
and tutorial
#770
Conversation
Thanks for this PR :) |
Thanks @daymontas1 for stepping up here, much appreciated :) You should be able to test the PR as is and see how you like it's usefulness, but there are a few things that I would like to see happen before we merge it:
Please feel free to ask about any of these steps if you need help :) |
Hi all, to simplify here, I would simply limit this functionality to pyam > 2 - we can do a version check for this on the fly and raise an error. |
message_ix/report/__init__.py
Outdated
@@ -141,6 +141,7 @@ | |||
"message::costs", | |||
"message::emissions", | |||
), | |||
("message::sankey", "concat", "out::pyam", "in::pyam"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would get rid of this, and simply do the concatenation outside of reporter (e.g., in the jupyter notebook here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept this for now as I don't know what the Reporter actually does and it seems useful to just get the dataframe format we need from the Reporter. But please elaborate :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have created a new file, sankey.py, within the report directory, which extracts Sankey diagram functionality from init.py. Fridolin (@glatterf42), could you please confirm if this is what we want? Additionally, I have updated the westeros_sankey.ipynb file to ensure compatibility with the new sankey.py file in the report directory. I have annotated these changes in the westeros_sankey.ipynb below. Also, in this case, the following line inside the init.py file must be either removed or commented out:
("message::sankey", "concat", "out::pyam", "in::pyam")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @daymontas1, these changes sound good to me. For the line that has become redundant, please remove it rather than commenting it out, this keeps the code clean.
Could you please push your changes to the branch so that we can take a look and see how the tests are doing?
If you don't have write access to @mabudz's fork, please let us know so that we can figure out a solution :)
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #770 +/- ##
=====================================
Coverage 95.6% 95.6%
=====================================
Files 46 48 +2
Lines 4338 4394 +56
=====================================
+ Hits 4148 4204 +56
Misses 190 190
|
be56f01
to
c45ad62
Compare
In today's team meeting, @SiddharthJoshi-Git mentioned that he is fine with merging this PR as is and providing feedback later on. So I'm happy to merge this once the tests pass, which ... seem kind of confused about which versions of Python they should be using. For message_ix, that is still 3.8 and not 3.13, so maybe we'll have to merge #881 first. |
* In recent pyam versions, get_variable_components moved to pyam.str
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this is a welcome contribution. Several things that I will “request” as changes but actually myself implement on the branch:
- The additional steps between
rep.get("message::sankey")
andfig.show()
that appear in the tutorial can be added to theReporter.add_sankey()
method and controlled with keyword arguments. This saves the user having to manually re-write these steps for every such diagram they want to create. - For (1), but also generally,
plotly
as an added dependency appears required to use the feature(s) as intended/recommended. So I would make a new group of optional dependenciesmessage_ix[sankey] = [plotly]
, and then makemessage_ix[sankey]
a recursive dependency in thetutorials
group. - The tests should be expanded to cover the whole range of usage.
- The new methods and functions must appear in the documentation; currently from the preview build they do not. (A tutorial does not substitute for API documentation.)
- The release note line should link directly to the tutorial and to the documentation of the new function(s).
Other comments:
-
The history of the branch appears to have some duplicate commits (probably due to failure to rebase/inconsistent merging with and rebasing on
main
) and commit names. We should thus squash-and-merge once this is ready to go. -
In order to take advantage of the sankey features via pyam, the data appear to be used via the IAMC structure in which "variable" names like "in|secondary|electricity|grid|standard" are constructed by mashing together labels for distinct dimensions, and then these are again deconstructed using
.util.sankey.set_start_and_end_index()
/get_variable_components()
. This is both complicated and fragile: if the user chooses to construct "variable" names differently, it will not work.For other purposes we've discussed, like drawing RES diagrams using NetworkX/graphviz/others (FYI @macflo8), it will be simpler to use the data at their full/original resolution and construct node IDs using unique combinations of labels for certain dimensions, like (level, commodity), (technology, mode), or others. This may also end up providing a more direct way to construct Sankey diagrams, but we will see once we get there.
- Add 'sankey' optional dependencies set. - Reporter.add_sankey() - Sort methods in alpha order. - Include all steps for figure generation. - Expand docstring. - .tools.sankey —rename from .util.sankey - Sort methods in order. - Simplify type hints. - Remove year= parameter from map_for_sankey() - Add warning if map_for_sankey() gives an empty result. - Reorganize tutorial to align with simplified interface. - Simplify tests - Update docs - Add doc/tools/sankey.rst - Add plotly to intersphinx config. - Remove trailing whitespace in tutorial/README.rst - Link docs, tutorial in release notes.
Create sankey diagram and add tutorial.
The approach here creates the sankey diagram by using the sankey plot function of the
pyam
package. This function requires a mapping dictionary. The mapping dict is created by using the newly created automatic reportmessage::sankey
and a utility functionsankey_mapper()
map_for_sankey
.This pull request is related to the draft pull request in message_data , which can be removed, since the sankey diagram has been implemented in the
pyam
package.Note
The new key is now only added when calling
Reporter.add_sankey()
.How to review
General approach should be reviewed. Also, I am not sure if the util function
sankey_mapper()
map_for_sankey()
is at the right place.PR checklist