
Commit

Chapter 7 (#5)
* add content

* Added more content to chapter 7

* updated content for meeting
AmandaRP authored Nov 23, 2023
1 parent 016dbc0 commit d0c2d51
Showing 1 changed file with 118 additions and 8 deletions.
126 changes: 118 additions & 8 deletions 07_modules.Rmd

**Learning objectives:**

Learn about *modules*, with a focus on `nn_linear()`, `nn_sequential()`, and `nn_module()`.

## Built-in modules

**What are modules?**

- an object that encapsulates state
- can be of any complexity (e.g., a single layer, or a model consisting of layers)

Examples of `{torch}` modules:

- linear: `nn_linear()`
- convolutional: `nn_conv1d()`, `nn_conv2d()`, `nn_conv3d()`
- recurrent: `nn_lstm()`, `nn_gru()`
- embedding: `nn_embedding()`
- multi-head attention: `nn_multihead_attention()`
- See [torch documentation](https://torch.mlverse.org/docs/reference/#neural-network-modules) for others

## Linear Layer: `nn_linear()`

Consider the [linear layer](https://torch.mlverse.org/docs/reference/nn_linear):

```{r}
library(torch)
l <- nn_linear(in_features = 5, out_features = 16) # bias = TRUE is the default
l
```

A note on size: for the matrix multiplication $X_{50 \times 5} \, \beta_{5 \times 16}$, we might expect the weight matrix of `l` to be $5 \times 16$. As shown below, it is actually $16 \times 5$. This is due to the underlying C++ implementation in `libtorch`, which stores the transpose for performance reasons.

```{r}
l$weight$size()
```
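
To make the transposed storage concrete, the chunk below is a minimal sketch that reproduces the layer's output by hand, assuming the layer computes the input times the transposed weight plus the bias. The names `lin`, `x_demo`, and `manual` are made up for this illustration and are not from the book.

```{r}
library(torch)

# Hypothetical names (lin, x_demo, manual), used only for this illustration
lin <- nn_linear(in_features = 5, out_features = 16)
x_demo <- torch_randn(50, 5)

# The weight is stored as 16 x 5, so transpose it before multiplying
manual <- x_demo$mm(lin$weight$t()) + lin$bias

manual$size()
# Compare the manual result with the module's own output (small float tolerance)
torch_allclose(manual, lin(x_demo), atol = 1e-5)
```

If the two results agree, the stored $16 \times 5$ weight is simply the transpose of the $5 \times 16$ matrix we would write on paper.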

Apply the module:

```{r}
# Generate data: 50 observations of 5 features, drawn from a standard normal distribution
x <- torch_randn(50, 5)
# Feed x into the layer:
output <- l(x)
output$size()
```

When we use built-in modules, we do [*not*]{.underline} need to set `requires_grad = TRUE` when creating tensors (unlike in previous chapters). The module's own parameters already track gradients, so it's taken care of for us.
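
A quick way to see this is to inspect the `requires_grad` field of the module's parameters and of a plain input tensor. This is a small sketch; `lin`, `x_demo`, and `out` are illustrative names only.

```{r}
library(torch)

lin <- nn_linear(5, 16)        # illustrative layer
x_demo <- torch_randn(50, 5)   # plain input tensor, created without requires_grad

names(lin$parameters)          # parameters registered by the module
lin$weight$requires_grad       # set by the module, not by us
lin$bias$requires_grad
x_demo$requires_grad           # the input itself does not track gradients

out <- lin(x_demo)
out$requires_grad              # the output depends on tracked parameters
```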

## Sequential Models: `nn_sequential()`

[`nn_sequential()`](https://torch.mlverse.org/docs/reference/nn_sequential) can be used for models that propagate straight through the layers. A Multi-Layer Perceptron (MLP) is an example (i.e., a network consisting only of linear layers, with activations in between). Below we build an MLP using this method:

```{r}
mlp <- nn_sequential( # all arguments should be modules
  nn_linear(10, 32),
  nn_relu(),
  nn_linear(32, 64),
  nn_relu(),
  nn_linear(64, 1)
)
```

Apply this model to random data:

```{r}
output <- mlp(torch_randn(50, 10))
```
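
As a sanity check on the architecture, the sketch below rebuilds the same MLP under the illustrative name `mlp_demo`, looks at the output shape, and counts the trainable values. By hand, the three linear layers contribute $(10 \cdot 32 + 32) + (32 \cdot 64 + 64) + (64 \cdot 1 + 1) = 2529$ weights and biases.

```{r}
library(torch)

# Same layers as above, under a hypothetical name so the chunk is self-contained
mlp_demo <- nn_sequential(
  nn_linear(10, 32),
  nn_relu(),
  nn_linear(32, 64),
  nn_relu(),
  nn_linear(64, 1)
)

out <- mlp_demo(torch_randn(50, 10))
out$size()   # one prediction per input row

# Count trainable values: weights plus biases of the three linear layers
n_params <- sum(sapply(mlp_demo$parameters, function(p) p$numel()))
n_params
```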

## General Models: `nn_module()`

[`nn_module()`](https://torch.mlverse.org/docs/reference/nn_module) is a "factory function" for building models of arbitrary complexity. It is more flexible than the sequential model. Use it to define:

- weight initialization

- model structure (forward pass), including identification of model parameters using `nn_parameter()`.

Example:

```{r}
my_linear <- nn_module(
  initialize = function(in_features, out_features) {
    self$w <- nn_parameter(torch_randn(in_features, out_features)) # random normal
    self$b <- nn_parameter(torch_zeros(out_features)) # zeros
  },
  forward = function(input) {
    input$mm(self$w) + self$b
  }
)
```

Next, instantiate the model with input and output dimensions:

```{r}
l <- my_linear(7, 1)
l
```
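
To illustrate what wrapping a tensor in `nn_parameter()` does, here is a small sketch with a made-up module (`demo_module`, not from the book). It stores one tensor via `nn_parameter()` and one as a plain tensor, on the assumption that only the former is registered as a model parameter.

```{r}
library(torch)

# Hypothetical module: `w` is wrapped in nn_parameter(), `scale` is not
demo_module <- nn_module(
  initialize = function() {
    self$w <- nn_parameter(torch_randn(3, 2))
    self$scale <- torch_ones(1) # plain tensor, not registered as a parameter
  },
  forward = function(input) {
    input$mm(self$w) * self$scale
  }
)

m <- demo_module()
names(m$parameters)     # only the tensor wrapped in nn_parameter() shows up
m$w$requires_grad       # tracked for gradients
m$scale$requires_grad   # not tracked
```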

Apply the model to random data (just like we did in the previous section):

```{r}
output <- l(torch_randn(5, 7))
output
```

That was the forward pass. Let's define a (dummy) loss function and compute the gradient:

```{r}
loss <- output$mean()
loss$backward() # compute gradient
l$w$grad # inspect result
```
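
The gradient can also be checked by hand. With `loss` equal to the mean of the 5 outputs $x_i^\top w + b$, the derivative with respect to $w_j$ is $\frac{1}{5}\sum_i x_{ij}$ (the column means of the input), and the derivative with respect to $b$ is 1. The sketch below repeats the steps above while keeping the input tensor around for comparison; `model`, `x_check`, and `expected_w_grad` are illustrative names.

```{r}
library(torch)

# Same module as in the example above, redefined so this chunk is self-contained
my_linear <- nn_module(
  initialize = function(in_features, out_features) {
    self$w <- nn_parameter(torch_randn(in_features, out_features))
    self$b <- nn_parameter(torch_zeros(out_features))
  },
  forward = function(input) {
    input$mm(self$w) + self$b
  }
)

model <- my_linear(7, 1)
x_check <- torch_randn(5, 7) # keep the input so we can compare against it

loss <- model(x_check)$mean()
loss$backward()

# Expected gradient w.r.t. w: column means of the input, as a 7 x 1 tensor
expected_w_grad <- x_check$mean(dim = 1, keepdim = TRUE)$t()

torch_allclose(model$w$grad, expected_w_grad, atol = 1e-6)
model$b$grad # each of the 5 outputs contributes 1/5 to the bias gradient
```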


## Meeting Videos {.unnumbered}

### Cohort 1 {.unnumbered}

`r knitr::include_url("https://www.youtube.com/embed/URL")`

<details>
<summary>Meeting chat log</summary>

```
LOG
```

</details>
