4DModeller · gareth-j · Nov 20, 2023 · Nov 3, 2023 · Nov 3, 2023 · Nov 3, 2023
diff --git a/vignettes/data/Bayelva_snow.nc b/vignettes/data/Bayelva_snow.nc
diff --git a/vignettes/data/Bayelva_snow_geo.nc b/vignettes/data/Bayelva_snow_geo.nc
diff --git a/vignettes/snow.Rmd b/vignettes/snow.Rmd
@@ -0,0 +1,312 @@
+---
+title: "Spatial distribution of snow - exploratory tutorial"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Spatial distribution of snow - exploratory tutorial}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+## Introduction
+
+This tutorial explores whether 4Dmodeller can be used to model the spatial distribution of snow in the Bayelva area on Svalbard, based on once-a-year point-based manual snow depth measurements for several years and snow cover duration derived from time lapse camera imagery (Aalstad et al., 2020). This data has been used before for snow data assimilation (Aalstad et al., 2018) and permafrost modeling (Zweigel et al., 2021) using modeling and data assimilation methods (see references). Additionally, we included elevation and other derived topographic parameters (slope, aspect, topographic position index) as these are known to influence the snow depth distribution. These topographic parameters were derived from a high resolution digital elevation model (Boike et al., 2019). The dataset used in this tutorial has been prepared by Kristoffer Aalstad as a netCDF by combining and processing the following publicly available datasets that have been used in several studies (see literature references):
+
+Westermann, S. (2020).Snow depth and ground surface temperature at Bayelva research site, Svalbard [Data set]. Norstore. <https://doi.org/10.11582/2020.00024>
+
+Aalstad, K., and Westermann. S. (2021). ACS_Bayelva_class: 302 high-resolution snow cover maps covering the 2012-2017 snowmelt seasons in the Bayelva catchment (Svalbard, Norway) (2.0) [Data set]. Zenodo. <https://doi.org/10.5281/zenodo.5010944>
+
+Boike, J., Juszak, I., Lange, S., Chadburn, S., Burke, E., Overduin, P. P., Roth, K., Ippisch, O., Bornemann, N., Stern, L., Gouttevin, I., Hauber, E., & Westermann, S. (2018). HRSC-AX data products (DEM and multi channel) from aerial overflights in 2008 over Bayelva (Brøggerhalvøya peninsula, Spitsbergen). (Version 1) [Data set]. Zenodo. <https://doi.org/10.1594/PANGAEA.884730>
+
+The dataset is originally projected (UTM zone 33X) and available in the data sub-folder in this repository (data/Bayelva_snow.nc). As UTM was not supported for all steps we anticipated while developing this tutorial (see below), we also created also a lat/lon version of the same data (data/Bayelva_snow_geo.nc).
+
+### Literature references
+
+Aalstad, K., Westermann, S., Schuler, T. V., Boike, J., and Bertino, L. (2018). Ensemble-based assimilation of fractional snow-covered area satellite retrievals to estimate the snow distribution at Arctic sites. The Cryosphere, 12, 247--270, <https://doi.org/10.5194/tc-12-247-2018>
+
+Boike, J., Nitzbon, J., Anders, K., Grigoriev, M., Bolshiyanov, D., Langer, M., Lange, S., Bornemann, N., Morgenstern, A., Schreiber, P., Wille, C., Chadburn, S., Gouttevin, I., Burke, E., and Kutzbach, L.: A 16-year record (2002--2017) of permafrost, active-layer, and meteorological conditions at the Samoylov Island Arctic permafrost research site, Lena River delta, northern Siberia: an opportunity to validate remote-sensing data and land surface, snow, and permafrost models, Earth Syst. Sci. Data, 11, 261--299, <https://doi.org/10.5194/essd-11-261-2019>, 2019.
+
+Aalstad, K., Westermann, S., and Bertino, L. (2020). Evaluating satellite retrieved fractional snow-covered area at a high-Arctic site using terrestrial photography. Remote Sensing of Environment, 239, <https://doi.org/10.1016/j.rse.2019.111618>
+
+Zweigel, R. B., Westermann, S., Nitzbon, J., Langer, M., Boike, J., Etzelmüller, B., and Vikhamar Schuler, T. (2021). Simulating snow redistribution and its effect on ground surface temperature at a high-Arctic site on Svalbard. Journal of Geophysical Research: Earth Surface, 126, e2020JF005673. <https://doi.org/10.1029/2020JF005673>
+
+## Import and load the data
+
+We use the library ncdf4 to open the snow dataset:
+
+```{r modules}
+# load modules used in this tutorial
+library(ncdf4)
+```
+
+```{r dataload}
+# Find where your data is stored locally
+ncpath <- dirname(rstudioapi::getSourceEditorContext()$path)
+ncname <- "Bayelva_snow" 
+ncfname <- paste(ncpath,'/data/', ncname, ".nc", sep="")
+
+#open data 
+ncin <- nc_open(ncfname) 
+#head(ncin)
+```
+
+### Get relevant features
+
+The netcdf file has both point measurements of snow depth (ds, or translated into SWE: Ds) for several years, and continuous raster data with snow depth duration for several years and topographic features. We start with the snow depth points only:
+
+```{r dataselect}
+# specify the longitude and latitude columns. Note that the coordinates are utm!
+longitude <- ncvar_get(ncin,"xs") 
+latitude <-ncvar_get(ncin,"ys")
+
+# Get the first year of snow depth (ds)
+ds <- ncvar_get(ncin,"ds")[,1]
+
+# Create a data frame
+df <- data.frame(LONG = longitude,LAT = latitude,ds) 
+head(df)
+```
+
+### Build the mesh
+
+When trying to build a mesh whit our UTM projected dataset, we don't get a result as the shiny app does not seem to work with UTM:
+
+```{r, eval = FALSE}
+fdmr::mesh_builder(spatial_data = df)
+```
+
+Using fmesher directly (non-interactively, without the fdmr app added functionalities) works and we can plot the resulting mesh outside the geographical context that would have been provided by mesh_builder.
+
+```{r plotmesh}
+# Define a proj string for our UTM zone
+crs <- "+proj=utm +zone=33" 
+
+# use fmesher directly rather than the fdmr wrapper:
+mesh <- fmesher::fm_mesh_2d_inla(loc = df[,c(1,2)], 
+                                 max.edge = c(20,40),
+                                 crs = crs ) 
+
+# plot the result
+plot(mesh)
+```
+
+When trying to plot the mesh with fdmr::plot_mesh, we run into the same problem as above: The plot_mesh function seems to assume lat/lon coordinates and thus zooms out to max.
+
+```{r, eval = FALSE}
+
+fdmr::plot_mesh(mesh=mesh, spatial_data = df[,c(1,2)])
+```
+
+***TO DO:*** *We would like to also show our data points and see this in a geographical context to check whether everything is looking OK. mesh_builder and plot_mesh suffer from the same problem and can't deal with UTM coordinates. fdmr should be updated to beable to deal with our UTM coordinates. This is especially important in the Arctic, where geographical coordinates result in highly distorted maps in most map grids. (Issue created: <https://github.com/4DModeller/fdmr/issues/248#issue-1974643323>)*
+
+## Redo this with geographical coordinates
+
+Let's see whether our approach works if our data is in geographical coordinates instead. Redo all the steps from above:
+
+```{r latlonload}
+ncname_ll <- "Bayelva_snow_geo"
+ncfname_ll <- paste(ncpath, '/data/', ncname_ll, ".nc", sep="")
+# Open data
+ncin_ll <- nc_open(ncfname_ll)
+
+# Create a data frame with the  first year of snow depth (ds)
+df_ll <- data.frame(LONG = ncvar_get(ncin_ll,"lons"),
+                 LAT = ncvar_get(ncin_ll,"lats"),
+                 #elev = ncvar_get(ncin_ll,"zs"),
+                 DS = ncvar_get(ncin_ll,"ds")[,1],
+                 DS_log = log(ncvar_get(ncin_ll,"ds")[,1]))
+
+## Store the point locations in a separate data frame:
+location_data_ll = df_ll[,c(1,2)]
+```
+
+```{r, eval = FALSE}
+# Try with the interactive mesh_builder
+fdmr::mesh_builder(location_data_ll)
+```
+
+The mesh_builder loads the shiny widget and creates a mesh, however there are several issues:
+
+-   On several machines where we tried this and the covid tutorial, the map part of the window covers the entire window, so the rulers disappear. ***TO DO:*** *fix this bug (Issue created: <https://github.com/4DModeller/fdmr/issues/243#issue-1974424767>)*
+
+-   The mesh is far too coarse for the small area we are looking at (1km2). ***TO DO:*** *Make standard mesh size dependent on area.*
+
+-   On one machine, closing the widget window doesn't stop the underlying processes to run, consequently RStudio is blocked and needs to be restarted, as ctrl + c or interrupting by clicking the STOP button don't have any effect. *(Bug report filed: <https://github.com/4DModeller/fdmr/issues/255#issue-1976392121>)*
+
+Let's try to specify the mesh manually using values for max.edge and cutoff that are reasonable for our domain and length scales.
+
+```{r latlonmesh_manually}
+# look at the range of the data to derive the mesh parameters
+initial_range <- diff(range(location_data_ll[, "LONG"])) / 5
+max_edge <- initial_range / 8
+offset <- initial_range/2
+
+# make the mesh from the locations
+mesh <- fmesher::fm_mesh_2d(loc = location_data_ll[, c("LONG", "LAT")],
+                                 max.edge = c(1, 2) * max_edge,
+                                 offset = c(1,1.5)*offset,
+                                 cutoff = max_edge/10
+                                 )
+fdmr::plot_mesh(mesh,spatial_data = location_data_ll) 
+```
+
+This mesh is already a better starting point. There are two things we are not happy about:
+
+-   The mesh is obviously distorted in N/S direction: With latitude/longitude units representing substantially different spatial distances at polar latitudes, edge lengths should be defined differently in E/W and N/S. **\
+    *TO DO:*** *Account for unequal distances in x/y. And as above, make it possible to use UTM (which is currently not supported for the visualisation tools).*
+
+-   Is the mesh dense enough to represent snow depth variability between the measurement points, and is the mesh in physically useful locations? Terrain and derivatives are strong drivers of spatial snow depth distribution, and they should thus ideally be considered for the mesh generation.
+
+## Start defining the priors
+
+Let's try to predict a spatially distributed snow depth map from our data points.
+
+We create a prior with the Matérn function, a penalized complexity prior defined by range and standard deviation *(**TO DO**: [cite Finn])*. The range parameter defines the length of the influence of each observation.
+
+In this small experiment, we have experimented by creating two different priors with two different range. The model is a stochastic partial differential equation (SPDE).
+
+```{r}
+# The prior range is the distance that the process should stop effecting, so in this case it is currently 20km away from the node center (initial_range defined above).
+
+spde1 <- INLA::inla.spde2.pcmatern(
+  mesh = mesh,
+  prior.range = c(initial_range, 0.5),
+  prior.sigma = c(1, 0.01)
+)
+
+# Try with a different prior range - the entire initial_range
+spde2 <- INLA::inla.spde2.pcmatern(
+  mesh = mesh,
+  prior.range = c(initial_range*5, 0.5),
+  prior.sigma = c(1, 0.01)
+)
+
+```
+
+We need to convert our data frame into a SpatialPointsDataFrame.
+
+```{r}
+sp::coordinates(df_ll) <- c("LONG", "LAT")
+```
+
+Define two models (for the two spde versions defined above, respectively):
+
+```{r}
+
+# Model 1 is linear and uses the gaussian model 1 defined above:
+formula1 <- DS ~ 0 + Intercept(1)+
+  f(
+    main = df_ll,
+    model = spde1
+  )
+
+inlabru_model1 <- inlabru::bru(formula1,
+                               data = df_ll,
+                               family = "gaussian",
+                               options = list(
+                                 verbose = FALSE
+                               )
+)
+
+# The only difference for Model 2 is the different initial prior range set above (spde2):
+
+formula2 <- DS ~ 0 + Intercept(1)   +
+  f(
+    main = df_ll,
+    model = spde2
+  )
+
+inlabru_model2 <- inlabru::bru(formula2,
+                               data = df_ll,
+                               family = "gaussian",
+                               options = list(
+                                 verbose = FALSE
+                               )
+)
+```
+
+Show the output of the first model, and plot the fit:
+
+```{r inlabruoutput1}
+summary(inlabru_model1)
+
+
+plot(exp(inlabru_model1$summary.fitted.values$mean[1:nrow(df_ll@data)]),exp(df_ll@data$DS))
+
+
+plot(exp(inlabru_model1$summary.fitted.values$mean[1:nrow(df_ll@data)]),exp(df_ll@data$DS))
+
+```
+
+In the environment where this tutorial was rendered, the model_viewer widget window unfortunately crashes RStudio.
+
+```{r modelviewer1, eval=FALSE}
+
+fdmr::model_viewer(
+  model_output = inlabru_model1,
+  mesh = mesh,
+  measurement_data = df_ll,
+  data_distribution = "Gaussian"
+)
+
+
+```
+
+Show the output of the second model, and plot the fit:
+
+```{r inlabruoutput2}
+summary(inlabru_model2)
+
+plot(exp(inlabru_model2$summary.fitted.values$mean[1:nrow(df_ll@data)]),exp(df_ll@data$DS))
+
+plot(exp(inlabru_model2$summary.fitted.values$mean[1:nrow(df_ll@data)]),exp(df_ll@data$DS))
+```
+
+```{r modelviewer2, eval=FALSE}
+
+fdmr::model_viewer(
+  model_output = inlabru_model2,
+  mesh = mesh,
+  measurement_data = df_ll,
+  data_distribution = "Gaussian"
+)
+
+
+```
+
+# TO DO: planned outcome of the hackathon/tutorial structure
+
+## Design the prior and the model
+
+-   Use the UTM-projected data and do everything in UTM coordinates
+
+-   Load both the point measurements and the rasters. There are topographic descriptors and snow duration data. Snow data is available for several years.
+
+-   Use both the snow depth measurements and the different raster data to generate the mesh: we want to account for spatial variability of snow depth in the mesh. In areas where the terrain doesn't vary much the mesh doesn't need to be that fine, and the contrary is the case in rough topography with greatly varying snow depth. This information is indirectly available in the topography feature rasters or in a different way in the snow duration rasters.
+
+## Fitting the model
+
+Get model results considering some or all observation data and topographic predictors. Consider that snow duration could be seen as both a predictor and observation.
+
+-\> do this for one year, and/or several years.
+
+How do the results look like? Change parameters and adjust until the snow depth map matches snow duration (best validation we have, spatially) and locally the snow depth point measurements.
+
+Do the results for all years look different, or the same?
+
+## How to deal with real zeros? 
+
+What happens when/where there is zero snow depth? The model needs to distinguish between no data and zero snow depth, and for temporal evolution also be able to predict zero snow depth after snow melt-out.
+
+## Spatial covariance as prior for data assimilation
+
+We think 4DM could be a nice complementary tool for SWE retrievals, in particular for the prior covariance in spatio-temporal problems. For this
+
+-   we want the prior to be anisotropic. What we mean with this: the correlations should be allowed to behave differently in each dimension of the feature space where we define distances between points. Features/predictors in this example/tutorial are the provided terrain parameters.
+
+-   we wanted to export the spatial covariance matrix resulting from this tutorial and test wheter we can use it to define the prior covariance in a modelling study, for example within our data assimilation framework MuSA (<https://github.com/ealonsogzl/MuSA>, see Alonso-González et al., 2022; <https://doi.org/10.5194/gmd-15-9127-2022> ). That would necessarily be a separate study from this tutorial but results could eventually be added if this works and results are promising.
+
+## Temporal evolution: snow map time series
+
+How can we get temporal evolution in here? Snow depth/SWE distribution map is a result that's only useful if snow depth measurements are available at maximum snow time (which is the case for this data, but most often unfeasible). If it should be able to compete against other (modelling) approaches, we expect 4Dmodeller to be able to make map time series, as we do with our data assimilation approach. For this to work, what other predictor data needs to be added in addition to meltout date? Adding the day of the first snow fall may lead to linear interpolation until max snow, and then linear interpolation until meltout...? Here, we would expect that some physical/meachanistic model would come in, at least a simple degree day model, using precipitation/temperature from ERA5-land, for example..?
+
+##