Tblakes tutorial #241

Open
wants to merge 25 commits into
base: devel
2 changes: 2 additions & 0 deletions _pkgdown.yml
@@ -45,6 +45,8 @@ navbar:
href: articles/hydro.html
- text: "Priors exploration"
href: articles/priors.html
- text: Tibetan Lakes
href: articles/tibetan_lakes.html
- text: ---
- text: "FAQ"
- text: INLA Crash FAQ
89 changes: 89 additions & 0 deletions vignettes/covid.bib
@@ -1,3 +1,92 @@
@article{hersbach2020era5,
title={The ERA5 global reanalysis},
author={Hersbach, Hans and Bell, Bill and Berrisford, Paul and Hirahara, Shoji and Hor{\'a}nyi, Andr{\'a}s and Mu{\~n}oz-Sabater, Joaqu{\'\i}n and Nicolas, Julien and Peubey, Carole and Radu, Raluca and Schepers, Dinand and others},
journal={Quarterly Journal of the Royal Meteorological Society},
volume={146},
number={730},
pages={1999--2049},
year={2020},
publisher={Wiley Online Library},
doi={10.1002/qj.3803}
}

@article{lehner2013global,
title={Global river hydrography and network routing: baseline data and new approaches to study the world's large river systems},
author={Lehner, Bernhard and Grill, G{\"u}nther},
journal={Hydrological Processes},
volume={27},
number={15},
pages={2171--2186},
year={2013},
publisher={Wiley Online Library},
doi={10.1002/hyp.9740}
}

@article{pekel2016high,
title={High-resolution mapping of global surface water and its long-term changes},
author={Pekel, Jean-Fran{\c{c}}ois and Cottam, Andrew and Gorelick, Noel and Belward, Alan S},
journal={Nature},
volume={540},
number={7633},
pages={418--422},
year={2016},
publisher={Nature Publishing Group UK London},
doi={10.1038/nature20584}
}

@article{brun2020limited,
title={Limited contribution of glacier mass loss to the recent increase in Tibetan Plateau lake volume},
author={Brun, Fanny and Treichler, D{\'e}sir{\'e}e and Shean, David and Immerzeel, Walter W},
journal={Frontiers in Earth Science},
volume={8},
pages={582060},
year={2020},
publisher={Frontiers Media SA},
doi={10.3389/feart.2020.582060}
}

@article{twentythreeproblems,
author = {Günter Blöschl and Marc F.P. Bierkens and Antonio Chambel and Christophe Cudennec and Georgia Destouni and Aldo Fiori and James W. Kirchner and Jeffrey J. McDonnell and Hubert H.G. Savenije and Murugesu Sivapalan and Christine Stumpp and Elena Toth and Elena Volpi and Gemma Carr and Claire Lupton and Josè Salinas and Borbála Széles and Alberto Viglione and Hafzullah Aksoy and Scott T. Allen and Anam Amin and Vazken Andréassian and Berit Arheimer and Santosh K. Aryal and Victor Baker and Earl Bardsley and Marlies H. Barendrecht and Alena Bartosova and Okke Batelaan and Wouter R. Berghuijs and Keith Beven and Theresa Blume and Thom Bogaard and Pablo Borges de Amorim and Michael E. Böttcher and Gilles Boulet and Korbinian Breinl and Mitja Brilly and Luca Brocca and Wouter Buytaert and Attilio Castellarin and Andrea Castelletti and Xiaohong Chen and Yangbo Chen and Yuanfang Chen and Peter Chifflard and Pierluigi Claps and Martyn P. Clark and Adrian L. Collins and Barry Croke and Annette Dathe and Paula C. David and Felipe P. J. de Barros and Gerrit de Rooij and Giuliano Di Baldassarre and Jessica M. Driscoll and Doris Duethmann and Ravindra Dwivedi and Ebru Eris and William H. Farmer and James Feiccabrino and Grant Ferguson and Ennio Ferrari and Stefano Ferraris and Benjamin Fersch and David Finger and Laura Foglia and Keirnan Fowler and Boris Gartsman and Simon Gascoin and Eric Gaume and Alexander Gelfan and Josie Geris and Shervan Gharari and Tom Gleeson and Miriam Glendell and Alena Gonzalez Bevacqua and María P. González-Dugo and Salvatore Grimaldi and A. B. Gupta and Björn Guse and Dawei Han and David Hannah and Adrian Harpold and Stefan Haun and Kate Heal and Kay Helfricht and Mathew Herrnegger and Matthew Hipsey and Hana Hlaváčiková and Clara Hohmann and Ladislav Holko and Christopher Hopkinson and Markus Hrachowitz and Tissa H. Illangasekare and Azhar Inam and Camyla Innocente and Erkan Istanbulluoglu and Ben Jarihani and Zahra Kalantari and Andis Kalvans and Sonu Khanal and Sina Khatami and Jens Kiesel and Mike Kirkby and Wouter Knoben and Krzysztof Kochanek and Silvia Kohnová and Alla Kolechkina and Stefan Krause and David Kreamer and Heidi Kreibich and Harald Kunstmann and Holger Lange and Margarida L. R. Liberato and Eric Lindquist and Timothy Link and Junguo Liu and Daniel Peter Loucks and Charles Luce and Gil Mahé and Olga Makarieva and Julien Malard and Shamshagul Mashtayeva and Shreedhar Maskey and Josep Mas-Pla and Maria Mavrova-Guirguinova and Maurizio Mazzoleni and Sebastian Mernild and Bruce Dudley Misstear and Alberto Montanari and Hannes Müller-Thomy and Alireza Nabizadeh and Fernando Nardi and Christopher Neale and Nataliia Nesterova and Bakhram Nurtaev and Vincent O. Odongo and Subhabrata Panda and Saket Pande and Zhonghe Pang and Georgia Papacharalampous and Charles Perrin and Laurent Pfister and Rafael Pimentel and María J. Polo and David Post and Cristina Prieto Sierra and Maria-Helena Ramos and Maik Renner and José Eduardo Reynolds and Elena Ridolfi and Riccardo Rigon and Monica Riva and David E. Robertson and Renzo Rosso and Tirthankar Roy and João H.M. Sá and Gianfausto Salvadori and Mel Sandells and Bettina Schaefli and Andreas Schumann and Anna Scolobig and Jan Seibert and Eric Servat and Mojtaba Shafiei and Ashish Sharma and Moussa Sidibe and Roy C. Sidle and Thomas Skaugen and Hugh Smith and Sabine M. Spiessl and Lina Stein and Ingelin Steinsland and Ulrich Strasser and Bob Su and Jan Szolgay and David Tarboton and Flavia Tauro and Guillaume Thirel and Fuqiang Tian and Rui Tong and Kamshat Tussupova and Hristos Tyralis and Remko Uijlenhoet and Rens van Beek and Ruud J. van der Ent and Martine van der Ploeg and Anne F. Van Loon and Ilja van Meerveld and Ronald van Nooijen and Pieter R. van Oel and Jean-Philippe Vidal and Jana von Freyberg and Sergiy Vorogushyn and Przemyslaw Wachniew and Andrew J. Wade and Philip Ward and Ida K. Westerberg and Christopher White and Eric F. Wood and Ross Woods and Zongxue Xu and Koray K. Yilmaz and Yongqiang Zhang},
title = {Twenty-three unsolved problems in hydrology (UPH) – a community perspective},
journal = {Hydrological Sciences Journal},
volume = {64},
number = {10},
pages = {1141--1158},
year = {2019},
publisher = {Taylor \& Francis},
doi = {10.1080/02626667.2019.1620507},
}


@article{immerzeel2020importance,
title={Importance and vulnerability of the world’s water towers},
author={Immerzeel, Walter W and Lutz, Arthur F and Andrade, Marcos and Bahl, A and Biemans, Hester and Bolch, Tobias and Hyde, Sam and Brumby, S and Davies, BJ and Elmore, AC and others},
journal={Nature},
volume={577},
number={7790},
pages={364--369},
year={2020},
publisher={Nature Publishing Group UK London},
doi={10.1038/s41586-019-1822-y}
}

@article{zhang2017grl,
author = {Zhang, Guoqing and Yao, Tandong and Shum, C. K. and Yi, Shuang and Yang, Kun and Xie, Hongjie and Feng, Wei and Bolch, Tobias and Wang, Lei and Behrangi, Ali and Zhang, Hongbo and Wang, Weicai and Xiang, Yang and Yu, Jinyuan},
title = {Lake volume and groundwater storage variations in Tibetan Plateau's endorheic basin},
journal = {Geophysical Research Letters},
volume = {44},
number = {11},
pages = {5550--5560},
keywords = {lake volume, mass balance, groundwater storage, Tibetan Plateau},
doi = {10.1002/2017GL073773},
url = {https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1002/2017GL073773},
eprint = {https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1002/2017GL073773},
abstract = {Abstract The Tibetan Plateau (TP), the highest and largest plateau in the world, with complex and competing cryospheric-hydrologic-geodynamic processes, is particularly sensitive to anthropogenic warming. The quantitative water mass budget in the TP is poorly known. Here we examine annual changes in lake area, level, and volume during 1970s–2015. We find that a complex pattern of lake volume changes during 1970s–2015: a slight decrease of −2.78 Gt yr−1 during 1970s–1995, followed by a rapid increase of 12.53 Gt yr−1 during 1996–2010, and then a recent deceleration (1.46 Gt yr−1) during 2011–2015. We then estimated the recent water mass budget for the Inner TP, 2003–2009, including changes in terrestrial water storage, lake volume, glacier mass, snow water equivalent (SWE), soil moisture, and permafrost. The dominant components of water mass budget, namely, changes in lake volume (7.72 ± 0.63 Gt yr−1) and groundwater storage (5.01 ± 1.59 Gt yr−1), increased at similar rates. We find that increased net precipitation contributes the majority of water supply (74\%) for the lake volume increase, followed by glacier mass loss (13\%), and ground ice melt due to permafrost degradation (12\%). Other term such as SWE (1\%) makes a relatively small contribution. These results suggest that the hydrologic cycle in the TP has intensified remarkably during recent decades.},
year = {2017}
}



@misc{who2020world,
title={World {H}ealth {O}rganization: {C}oronavirus {D}isease 2019 ({COVID}-19) {S}ituation {R}eport},
196 changes: 196 additions & 0 deletions vignettes/tibetan_lakes.Rmd
@@ -0,0 +1,196 @@
---
title: "Tibetan Lakes Example"
output:
  bookdown::html_document2:
    base_format: rmarkdown::html_vignette
    fig_caption: yes
bibliography: "covid.bib"
link-citations: yes
vignette: >
  %\VignetteIndexEntry{Tibetan Lakes Example}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

The Tibetan Plateau and the wider High Mountain Asia region supply water to approximately 1 billion people, supporting their health, economies, and agriculture. These natural "water towers" are highly sensitive to climate change, so changes in regional water resources must be understood [@immerzeel2020importance]. Indeed, understanding how climate change will alter cold-region water resources has been described as a "grand challenge" of hydrology [@twentythreeproblems]. Lakes in the endorheic basin of the Tibetan Plateau have recently grown in area. This growth has conventionally been attributed to increased glacier melt under climate change, but a recent investigation demonstrated that glacier mass loss contributes only a small fraction (19%) of the lake volume increase [@brun2020limited]. An unresolved question for these lakes is therefore: which physical processes drive lake growth in the region, and in what proportions?

```{r error=TRUE}
fdmr::retrieve_tutorial_data(dataset = "tibetan_lakes")

data_filepath <- fdmr::get_tutorial_datapath(dataset = "tibetan_lakes", file = "tibetan_lakes_data.csv")
data <- read.csv(data_filepath)

summary(data)
```

We have compiled a dataset covering 20 years of lake growth using the global surface water survey [@pekel2016high], the HydroBASINS catchment dataset [@lehner2013global], and ERA5-Land precipitation data [@hersbach2020era5]. The global surface water survey uses Landsat imagery to classify, for each year, which pixels are lake. We converted this into a time series by looking at how lakes changed between years. This was done in Google Earth Engine [using the following code](https://code.earthengine.google.com/3304560943f796c82f935403dcc54909). We then aggregated the changes within each catchment using the HydroBASINS catchment definitions (level 6), and did the same for the ERA5-Land precipitation data. The result is compiled into the CSV file loaded above.

Let's look at this data and see what is going on. First, let's look at the overall changes in lake areas.

```{r error=TRUE, fig.width=8, fig.height=4, fig.align = "center"}
library(tidyverse)

# Split data into positive and negative, then summarize
summary_data <- data %>%
  mutate(sign = ifelse(water_balance_m3 > 0, "Lake Growth", "Lake Decline")) %>%
  group_by(year, sign) %>%
  summarise(total = sum(water_balance_m3, na.rm = TRUE)) %>%
  ungroup()

# Summarize the data by year
sums <- data %>%
  group_by(year) %>%
  summarise(
    precip_sum = sum(precip, na.rm = TRUE),
    growth_ratio_sum = sum(growth_ratio, na.rm = TRUE),
    decline_ratio_sum = sum(decline_ratio, na.rm = TRUE)
  )

# TODO : combine these plots
ggplot() +
  geom_col(data = summary_data, aes(x = year, y = total, fill = sign), position = "stack")

ggplot() +
  geom_line(data = sums, aes(x = year, y = precip_sum * 100), color = "cyan") # Scale
```
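
The TODO in the chunk above asks for the two plots to be combined. One possible approach, sketched here on made-up toy values rather than the tutorial data, is to draw both layers in a single `ggplot()` call and add a secondary axis that undoes the display scaling. The data frames `toy_balance` and `toy_precip` and the name `scale_factor` are hypothetical and only serve the illustration.

```r
library(ggplot2)

# Toy yearly totals (illustrative values only, not the tutorial data)
toy_balance <- data.frame(
  year = rep(2001:2004, each = 2),
  sign = rep(c("Lake Growth", "Lake Decline"), times = 4),
  total = c(5, -2, 8, -1, 6, -3, 9, -2)
)
toy_precip <- data.frame(
  year = 2001:2004,
  precip_sum = c(0.04, 0.05, 0.07, 0.06)
)

# Assumed display scaling, mirroring the factor of 100 used in the chunk above
scale_factor <- 100

combined_plot <- ggplot() +
  # Stacked bars: growth above zero, decline below zero
  geom_col(
    data = toy_balance,
    aes(x = year, y = total, fill = sign),
    position = "stack"
  ) +
  # Precipitation drawn on the same panel after scaling
  geom_line(
    data = toy_precip,
    aes(x = year, y = precip_sum * scale_factor),
    color = "cyan"
  ) +
  # The secondary axis divides by the same factor so it reads in original units
  scale_y_continuous(
    name = "Total water balance",
    sec.axis = sec_axis(~ . / scale_factor, name = "Total precipitation")
  )

combined_plot
```

The scaling factor would need to be chosen so that both series occupy a similar vertical range for the real data.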

We can see here that while lakes both grow and decline, the overall trend is that lakes on the Tibetan Plateau are growing more than they are shrinking. Why is this? Perhaps precipitation is the cause. Our ERA5-Land dataset gives us precipitation per year (after aggregation). However, the year with the largest lake growth is not the year with the most precipitation. What is going on here?

```{r error=TRUE, fig.width=8, fig.height=8, fig.align = "center"}
ggplot(data, aes(x = precip, y = water_balance_m3, color = as.factor(year))) +
geom_point(alpha = 0.5) +
scale_color_viridis_d() +
theme_minimal()
```

Across the 20-year observation period there is a clear spatial distinction in lake growth; however, the pattern is not as clear-cut for precipitation.

```{r error=TRUE, fig.width=8, fig.height=8, fig.align='center'}
# TODO : this should include plotting the polygons of the catchments
library(leaflet)

# Summarize the data
summary_data <- data %>%
  group_by(centroid_lon, centroid_lat) %>%
  summarise(
    total_balance = sum(water_balance_m3, na.rm = TRUE),
    total_precip = sum(precip, na.rm = TRUE)
  )

# Create a map
leaflet(summary_data) %>%
  addTiles() %>%
  addCircles(
    lng = ~centroid_lon,
    lat = ~centroid_lat,
    radius = ~ 50 * total_balance,
    color = "blue",
    stroke = FALSE,
    fillOpacity = 0.6
  ) %>%
  addCircles(
    lng = ~centroid_lon,
    lat = ~centroid_lat,
    radius = ~ 500 * total_precip,
    color = "red",
    stroke = FALSE,
    fillOpacity = 0.6
  )
```

Let's try and model this. We will use 4DModeller and inlabru.

[inlabru](https://inlabru-org.github.io/inlabru/index.html) is designed to estimate fixed and random effects, as well as continuous spatially and temporally distributed processes, using SPDEs (stochastic partial differential equations). To make these computationally solvable, the continuous processes are discretized on a finite element mesh. This mesh represents the spatial distribution of the process under study. At each node of the mesh, the model calculates the outcome variable (in this case, lake area changes). The mesh gives the model its spatial awareness. Different processes can have different length scales, i.e., how far away a process must occur before it no longer affects the process at the current location. These are controlled by priors (see below). One major assumption of this method is that the process being modeled is continuous. The benefit is that the modeled spatial field can be calculated for regions where there is no data.

4DModeller is designed to make these model design decisions visual, interpretable, and accessible to users who may not have a strong background in R programming or Bayesian spatiotemporal modeling. 4DModeller has a set of Shiny apps that help the user write the code necessary to implement the full model and evaluate its performance.

First, we will use `mesh_builder` to build a reasonable mesh.

> **_NOTE:_** Due to the size of the mesh being created the initial load of the app may take some time

```{r error=TRUE, eval=FALSE}
fdmr::mesh_builder(spatial_data = data, obs_data = data, crs = "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs", longitude_column = "centroid_lon", latitude_column = "centroid_lat")
```

The `mesh_builder` app gives us the code to build the mesh for the inlabru model.

```{r error=TRUE}
mesh <- fmesher::fm_mesh_2d_inla(
  loc = data[, c("centroid_lon", "centroid_lat")],
  max.edge = c(3, 7.9),
  cutoff = 0.9,
  offset = c(0.2, 4)
)

fdmr::plot_mesh(mesh)
```
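
To get a feel for how `max.edge` (the largest allowed triangle edge length, inside and outside the data region) and `cutoff` (the minimum allowed distance between mesh points) change the mesh, the sketch below builds two meshes on randomly generated toy coordinates and compares their vertex counts. The coordinates, the bounding box, and the names `toy_loc`, `coarse_mesh`, and `fine_mesh` are assumptions made for illustration; they are not taken from the tutorial data.

```r
library(fmesher)

# Toy point locations in a lon/lat box (made-up values, not the tutorial data)
set.seed(1)
toy_loc <- cbind(
  runif(50, min = 80, max = 90), # hypothetical longitudes
  runif(50, min = 30, max = 35)  # hypothetical latitudes
)

# Coarse mesh: same settings as the tutorial chunk above
coarse_mesh <- fm_mesh_2d_inla(
  loc = toy_loc,
  max.edge = c(3, 7.9),
  cutoff = 0.9,
  offset = c(0.2, 4)
)

# Finer mesh: smaller inner max.edge and cutoff, everything else unchanged
fine_mesh <- fm_mesh_2d_inla(
  loc = toy_loc,
  max.edge = c(1, 7.9),
  cutoff = 0.3,
  offset = c(0.2, 4)
)

# Number of mesh vertices in each case
c(coarse = coarse_mesh$n, fine = fine_mesh$n)
```

A finer mesh resolves shorter length scales but increases the computational cost of fitting the model, so the choice is a trade-off.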

This mesh now represents where the model will calculate the lake changes for the Tibetan lakes region. Before using the priors app to specify the model, we standardise each numeric variable to z-scores within each catchment (`HYBAS_ID`).

```{r error=TRUE}
library(dplyr)
# Define a function to calculate z-scores, except for specified columns
calculate_zscore <- function(df, group_col, ignore_cols) {
  df %>%
    group_by(across(all_of(group_col))) %>%
    mutate(across(
      where(is.numeric) & !all_of(ignore_cols),
      list(zscore = ~ (. - mean(., na.rm = TRUE)) / sd(., na.rm = TRUE)),
      .names = "zscore_{col}"
    )) %>%
    ungroup()
}

# Specify columns to ignore
ignore_cols <- c("centroid_lon", "centroid_lat", "year")

# Apply the function to the data
data_with_zscores <- calculate_zscore(data, "HYBAS_ID", ignore_cols)
sp::coordinates(data_with_zscores) <- c("centroid_lon", "centroid_lat")
```
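
As a quick check of what the per-catchment z-score does, here is a minimal sketch on a toy data frame with two hypothetical catchments. The values are invented for illustration; only the column names `HYBAS_ID`, `year`, and `precip` mirror the tutorial data.

```r
library(dplyr)

# Toy data: two hypothetical catchments with three yearly values each
toy <- data.frame(
  HYBAS_ID = c(1, 1, 1, 2, 2, 2),
  year = c(2001, 2002, 2003, 2001, 2002, 2003),
  precip = c(1, 2, 3, 10, 20, 30)
)

# Standardise precip within each catchment: subtract the catchment mean
# and divide by the catchment standard deviation
toy_z <- toy %>%
  group_by(HYBAS_ID) %>%
  mutate(zscore_precip = (precip - mean(precip)) / sd(precip)) %>%
  ungroup()

toy_z$zscore_precip
# → -1  0  1 -1  0  1
```

Although the second catchment has ten times the precipitation of the first, both produce the same z-scores. Grouping by catchment therefore removes differences in magnitude between catchments and leaves only each catchment's year-to-year variation.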

Now we're ready to use `model_builder` to investigate how changing priors affects model predictions.

```{r error=TRUE, eval=FALSE}
fdmr::model_builder(spatial_data = data_with_zscores, measurement_data = data_with_zscores, mesh = mesh, time_variable = "year")
```

Now we use the code from the model builder app to make the model:

```{r error=TRUE}
group_index <- data_with_zscores$year
n_groups <- length(unique(data_with_zscores$year))
spde <- INLA::inla.spde2.pcmatern(
  mesh = mesh,
  prior.range = c(0.05, 0.1),
  prior.sigma = c(0.05, 0.2)
)

alphaprior <- list(theta = list(
  prior = "pccor1",
  param = c(-0.2, 0.8)
))

# Pass the SPDE object defined above (not the string "spde") to the spatial component
formula2 <- zscore_water_balance_m3 ~ 0 + Intercept(1) +
  precipitation(data_with_zscores$zscore_precip, model = "linear") +
  temperature(data_with_zscores$zscore_t2m) +
  residuals(
    data_with_zscores[, c("centroid_lon", "centroid_lat")],
    model = spde,
    group = group_index,
    ngroup = n_groups,
    control.group = list(model = "ar1", hyper = alphaprior)
  )

model_output <- inlabru::bru(formula2,
  data = data_with_zscores,
  family = "gaussian",
  E = NULL,
  control.family = NULL,
  options = list(
    verbose = FALSE
  )
)

summary(model_output)
```
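
For reference, this is how we read the priors in the chunk above, assuming INLA's penalised-complexity (PC) parameterisation; the symbols are our own notation, and the reading should be checked against the INLA documentation for your installed version. Let $\rho$ be the spatial range of the field (in the units of the mesh coordinates, which are degrees here), $\sigma$ its marginal standard deviation, and $a$ the AR(1) correlation between consecutive years. Each prior is then specified through a tail probability:

$$
P(\rho < 0.05) = 0.1, \qquad P(\sigma > 0.05) = 0.2, \qquad P(a > -0.2) = 0.8
$$

The AR(1) group model links the spatial field in year $t$ to the field in the previous year:

$$
x_t = a \, x_{t-1} + \varepsilon_t, \qquad |a| < 1
$$

where $\varepsilon_t$ is a spatially correlated innovation. Under this reading, a value of $a$ close to 1 means the spatial pattern persists from year to year, while a value near 0 means each year's pattern is largely independent of the last.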

Now we can use the model viewer to view the output of the model and create some plots.

```{r error=TRUE, eval=FALSE}
fdmr::model_viewer(model_output = model_output, mesh = mesh, measurement_data = data_with_zscores, data_distribution = "Gaussian")
```

```{r error=TRUE}
hist(model_output$summary.fitted.values$mean, breaks = seq(-0.3, 0.3, length.out = 100))
```