PROM DG advection #18

Merged
siuwuncheung merged 34 commits into main from prom_dg_advection
Oct 30, 2023

Conversation

siuwuncheung
Member

@siuwuncheung siuwuncheung commented Sep 21, 2023

  • Add the signatures for BasisGenerator
  • Add the example for PROM DG advection

Sidenotes:

  • Unlike C++, it seems that the conversion from libROM.Matrix to mfem.DenseMatrix does not need a transpose
  • loadSamples seems much slower in Python (built with Docker) than in C++. In the parametric case, it takes 2000 seconds.

Update:
On an M1 MacBook, the merge phase in pylibROM (built with the C++ Docker container for ARM) takes 62 seconds, while the C++ version takes 64 seconds. The earlier overhead was probably due to using a Docker image built for a different architecture.
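
One possible reading of the transpose sidenote, stated as an assumption rather than something confirmed in this PR: if libROM.Matrix is stored row-major and mfem.DenseMatrix column-major, a raw buffer copy in C++ produces the transpose, while a shape-aware conversion in Python does not. The pure-Python sketch below uses hypothetical helper names and no pylibROM or PyMFEM calls; it only illustrates the layout effect.

```python
def as_row_major(rows):
    """Flatten a nested list into a row-major (C-order) buffer."""
    return [x for row in rows for x in row]


def read_col_major(buf, n_rows, n_cols):
    """Interpret a flat buffer as an n_rows x n_cols column-major matrix."""
    return [[buf[j * n_rows + i] for j in range(n_cols)] for i in range(n_rows)]


def transpose(rows):
    """Return the transpose of a nested-list matrix."""
    return [list(col) for col in zip(*rows)]


# A 2x3 matrix, stored row-major (the layout assumed here for libROM.Matrix).
a = [[0.0, 1.0, 2.0],
     [3.0, 4.0, 5.0]]
buf = as_row_major(a)

# Reading the same buffer as a column-major 3x2 matrix gives the transpose,
# which is why a raw copy between the two layouts would need an explicit fix.
reinterpreted = read_col_major(buf, 3, 2)
print(reinterpreted == transpose(a))
```

If the Python conversion passes shape and stride information along with the data, no such reinterpretation happens, which would be consistent with the observation above.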

@siuwuncheung siuwuncheung added the WIP Work in progress label Sep 21, 2023
@siuwuncheung siuwuncheung self-assigned this Sep 21, 2023
@siuwuncheung siuwuncheung added RFR Ready for review and removed WIP Work in progress labels Oct 12, 2023
@@ -23,7 +23,7 @@ RUN sudo git clone https://github.com/LLNL/libROM.git
 WORKDIR ./libROM
 RUN sudo git pull
 # pylibROM is currently based on a specific commit of the libROM.
-RUN sudo git checkout mfem_fix
+RUN sudo git checkout 0809d7d09dc24f0963c38fc8c0a2649948142ba0
Collaborator

It looks like this change was actually part of #17, but should this be changed to be the head of the main branch of libROM? This commit SHA seems to be outdated since a few PRs have been merged in libROM. Maybe this should be changed in a separate PR to make sure the two repos stay in sync.

Member Author

I believe we pin libROM to a fixed commit to keep pylibROM stable and avoid unexpected breakage from the actively changing C++ libROM. From time to time I have seen Kevin change the SHA after necessary changes were made in C++ libROM. @dreamer2368, may I ask if that is accurate?

Collaborator

This is correct. We'd like to update the SHA occasionally, but at the same time specify it explicitly, which is useful information for debugging. I think the mfem_fix branch is now merged into the main branch of libROM, so it's okay to update this SHA now.
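
As a small illustration of using the explicit pin for debugging, the sketch below extracts the pinned commit from Dockerfile text so it can be logged next to a bug report. The helper name `pinned_commits` is hypothetical, and the snippet is copied from the hunk under review rather than read from the real file.

```python
import re

# Lines copied from the Dockerfile hunk under review.
DOCKERFILE_SNIPPET = """\
WORKDIR ./libROM
RUN sudo git pull
# pylibROM is currently based on a specific commit of the libROM.
RUN sudo git checkout 0809d7d09dc24f0963c38fc8c0a2649948142ba0
"""


def pinned_commits(dockerfile_text):
    """Return every full 40-character SHA passed to `git checkout`."""
    pattern = re.compile(r"git checkout\s+([0-9a-f]{40})\b")
    return pattern.findall(dockerfile_text)


pins = pinned_commits(DOCKERFILE_SNIPPET)
print(pins)
```

A branch name such as mfem_fix does not match the pattern, so a Dockerfile that checks out a moving branch yields an empty list.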

@ckendrick
Collaborator

@siuwuncheung Do you have any ideas why loadSamples might be so much slower in Python than in C++? I wonder if there is an issue with the bindings for that method. I noticed the test case for the loadSamples method was commented out. Since this seems to be much slower than the C++ version, it might be good to open an issue to investigate this further.

Collaborator

@ckendrick ckendrick left a comment

Looks good to me and seems to reproduce the C++ example.

@siuwuncheung
Member Author

> @siuwuncheung Do you have any ideas why loadSamples might be so much slower in Python than in C++? I wonder if there is an issue with the bindings for that method. I noticed the test case for the loadSamples method was commented out. Since this seems to be much slower than the C++ version, it might be good to open an issue to investigate this further.

Thanks for the review and for asking the question, @ckendrick. I have not checked what exactly causes it to be slow, but I also suspect it is due to the bindings. I saw that Kevin's Poisson PROM example sets a maximum of 5 snapshots used in loadSamples. @dreamer2368, may I ask whether you had the same issue with loadSamples being slow, and whether the maximum number of snapshots is there to work around it?

@dreamer2368
Collaborator

dreamer2368 commented Oct 18, 2023

> Thanks for the review and for asking the question, @ckendrick. I have not checked what exactly causes it to be slow, but I also suspect it is due to the bindings. I saw that Kevin's Poisson PROM example sets a maximum of 5 snapshots used in loadSamples.

Was it on your local MacBook? Performance can fall short of optimal, especially because the Docker image is based on a different architecture.

> @dreamer2368, may I ask whether you had the same issue with loadSamples being slow, and whether the maximum number of snapshots is there to work around it?

I don't remember exactly how the maximum number was set there. I should check again, but it should simply be a translation of the C++ example.

@siuwuncheung
Member Author

> > Thanks for the review and for asking the question, @ckendrick. I have not checked what exactly causes it to be slow, but I also suspect it is due to the bindings. I saw that Kevin's Poisson PROM example sets a maximum of 5 snapshots used in loadSamples.
>
> Was it on your local MacBook? Performance can fall short of optimal, especially because the Docker image is based on a different architecture.
>
> > @dreamer2368, may I ask whether you had the same issue with loadSamples being slow, and whether the maximum number of snapshots is there to work around it?
>
> I don't remember exactly how the maximum number was set there. I should check again, but it should simply be a translation of the C++ example.

It was on my M1 MacBook. Could you run the parametric predictive example on your machine and see how long the merge phase takes?

  Arguments of parametric predictive case:
  Offline phase: dg_advection_global_rom.py -offline -ff 1.0 -id 0
                 dg_advection_global_rom.py -offline -ff 1.1 -id 1
                 dg_advection_global_rom.py -offline -ff 1.2 -id 2
  Merge phase: dg_advection_global_rom.py -merge -ns 3
  FOM solution: dg_advection_global_rom.py -fom -ff 1.15
  Online phase: dg_advection_global_rom.py -online -ff 1.15
  Outputs of parametric predictive case:
  Relative L2 error of ROM solution = 4.33318E-04
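
The argument list above can be scripted so each phase is timed separately. This is a minimal sketch: the `python` launcher, the serial invocation (no MPI launcher), and the helper names `parametric_commands` and `timed_run` are assumptions; only the script name and flags come from the listing above.

```python
import shlex
import subprocess
import time

SCRIPT = "dg_advection_global_rom.py"


def parametric_commands(train_ffs, predict_ff, script=SCRIPT):
    """Build the command line for each phase, in the order listed above."""
    cmds = []
    for sample_id, ff in enumerate(train_ffs):
        cmds.append(["python", script, "-offline", "-ff", str(ff), "-id", str(sample_id)])
    cmds.append(["python", script, "-merge", "-ns", str(len(train_ffs))])
    cmds.append(["python", script, "-fom", "-ff", str(predict_ff)])
    cmds.append(["python", script, "-online", "-ff", str(predict_ff)])
    return cmds


def timed_run(cmd):
    """Run one command and return its wall time in seconds (not called here)."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


commands = parametric_commands([1.0, 1.1, 1.2], 1.15)
for cmd in commands:
    # Dry run: print the commands instead of executing them.
    print(shlex.join(cmd))
```

Calling `timed_run(cmd)` for each entry from the directory containing the example would give per-phase wall times, which isolates the merge phase from the offline runs.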

@dreamer2368
Collaborator

dreamer2368 commented Oct 23, 2023

Running on Quartz with Singularity (Docker container) returned the same result. The merge took about 120 seconds, which is shorter than 2000 seconds but still longer than expected for 3 samples. I will run without the container as well.

Update:
I also ran without the container on Quartz. The merge phase still takes ~180 seconds. @siuwuncheung, can you confirm whether the C++ libROM version takes a similar time?

If not, we should narrow down which Python-wrapped function incurs this overhead and investigate whether it can be reduced.
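
One way to narrow this down is the standard-library profiler. The sketch below profiles two stand-in functions; `merge_phase_stand_in` and `load_samples_stand_in` are hypothetical placeholders, not pylibROM calls, and would be replaced by the real merge code.

```python
import cProfile
import io
import pstats


def load_samples_stand_in(n):
    """Hypothetical placeholder for the wrapped call under suspicion."""
    return sum(i * i for i in range(n))


def merge_phase_stand_in():
    """Hypothetical placeholder for the merge phase of the example."""
    return [load_samples_stand_in(20000) for _ in range(3)]


profiler = cProfile.Profile()
profiler.enable()
merge_phase_stand_in()
profiler.disable()

# Sort by cumulative time so the most expensive call chains come first.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

Calls into compiled extensions typically show up as single entries in such a report, so this can indicate which wrapped call dominates but not what happens inside it.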

@siuwuncheung
Member Author

> Running on Quartz with Singularity (Docker container) returned the same result. The merge took about 120 seconds, which is shorter than 2000 seconds but still longer than expected for 3 samples. I will run without the container as well.
>
> Update:
> I also ran without the container on Quartz. The merge phase still takes ~180 seconds. @siuwuncheung, can you confirm whether the C++ libROM version takes a similar time?
>
> If not, we should narrow down which Python-wrapped function incurs this overhead and investigate whether it can be reduced.

It takes ~60 seconds in C++ libROM on Quartz. I guess it's safe to say that some Python-wrapped function incurs the extra cost, but the syntax in this PR is correct? It seems to me that we simply have not had a case with enough snapshots to reveal that loadSamples is more time-consuming in Python than in C++.

@siuwuncheung
Member Author

siuwuncheung commented Oct 27, 2023

Update:
On an M1 MacBook, the merge phase in pylibROM (built with the C++ Docker container for ARM) takes 62 seconds, while the C++ version takes 64 seconds. The earlier overhead was probably due to using a Docker image built for a different architecture.
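
For reference, the merge-phase timings reported in this conversation can be put side by side as Python-to-C++ ratios. Pairing both Quartz Python runs with the single ~60 second C++ figure is an assumption, since the thread does not say how the C++ run was built.

```python
# (Python seconds, C++ seconds) for the merge phase, as reported in this thread.
timings = {
    "Quartz, Singularity container": (120.0, 60.0),
    "Quartz, no container": (180.0, 60.0),
    "M1 MacBook, ARM container": (62.0, 64.0),
}

ratios = {name: py / cpp for name, (py, cpp) in timings.items()}
for name, ratio in ratios.items():
    print(f"{name}: Python/C++ = {ratio:.2f}")
```

A ratio near 1 on the M1 MacBook is consistent with the update above, while the Quartz runs remain two to three times slower than the C++ figure.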

@siuwuncheung siuwuncheung merged commit 0c0ca7c into main Oct 30, 2023
10 checks passed
@siuwuncheung siuwuncheung deleted the prom_dg_advection branch November 12, 2023 22:04
Labels
RFR Ready for review
3 participants