Commit d1f67ff

Added helper function for initialising a neural estimator

msainsburydale committed Dec 4, 2023
1 parent 38c5ed7 commit d1f67ff

Showing 8 changed files with 131 additions and 7 deletions.
2 changes: 1 addition & 1 deletion docs/src/API/architectures.md
@@ -9,7 +9,7 @@ Pages = ["architectures.md"]

## Architectures

Although the user is free to construct their neural estimator however they see fit, `NeuralEstimators` provides several useful architectures described below.
Although the user is free to construct their neural estimator however they see fit (i.e., using arbitrary `Flux` code), `NeuralEstimators` provides several useful architectures described below that are specifically relevant to neural estimation. See also the convenience constructor [`initialise_estimator`](@ref).
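
A minimal sketch of the two routes (a simple DNN-based point estimator; the layer widths and depths below are arbitrary choices):

```
using NeuralEstimators, Flux

p = 2  # number of parameters in the statistical model
d = 1  # dimension of each data replicate

# Manual construction with arbitrary Flux code
ψ = Chain(Dense(d => 32, relu), Dense(32 => 32, relu))  # inner (summary) network
ϕ = Chain(Dense(32 => 32, relu), Dense(32 => p))        # outer (inference) network
θ̂ = DeepSet(ψ, ϕ)

# Convenience constructor (uses its own default depth and width)
θ̂ = initialise_estimator(p, architecture = "DNN", d = d)
```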

```@docs
DeepSet
2 changes: 1 addition & 1 deletion docs/src/API/core.md
@@ -22,7 +22,7 @@ The data should be stored as a `Vector{A}`, where each element of the vector is
## Types of estimators

See also [Architectures and activations functions](@ref) that are often used
when constructing neural estimators.
when constructing neural estimators, and the convenience constructor [`initialise_estimator`](@ref).

```@docs
NeuralEstimator
2 changes: 2 additions & 0 deletions docs/src/API/utility.md
@@ -31,6 +31,8 @@ estimateinbatches
expandgrid
initialise_estimator
loadbestweights
removedata
2 changes: 1 addition & 1 deletion docs/src/workflow/examples.md
@@ -27,7 +27,7 @@ Next, we implicitly define the statistical model with simulated data. In `Neural
simulate(parameters, m) = [θ[1] .+ θ[2] .* randn(1, m) for θ ∈ eachcol(parameters)]
```

We now design architectures for the inner and outer neural networks, $\boldsymbol{\psi}(\cdot)$ and $\boldsymbol{\phi}(\cdot)$ respectively, in the [`DeepSet`](@ref) framework, and initialise the neural estimator as a [`PointEstimator`](@ref) object.
We now design architectures for the inner and outer neural networks, $\boldsymbol{\psi}(\cdot)$ and $\boldsymbol{\phi}(\cdot)$ respectively, in the [`DeepSet`](@ref) framework, and initialise the neural estimator as a [`PointEstimator`](@ref) object. Note that this can be done directly using `Flux` code (as below), or with the helper function [`initialise_estimator`](@ref).
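
For instance, a minimal sketch of the helper-based route (assuming the two unknown parameters and univariate replicates of this example) is:

```
θ̂ = initialise_estimator(2, architecture = "DNN", d = 1)
```

The manual `Flux` construction is as follows.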

```
d = 1 # dimension of each replicate
2 changes: 1 addition & 1 deletion docs/src/workflow/overview.md
@@ -5,7 +5,7 @@ To develop a neural estimator with `NeuralEstimators.jl`,

- Sample parameters from the prior distribution: the parameters are stored as $p \times K$ matrices, with $p$ the number of parameters in the model and $K$ the number of samples (i.e., parameter configurations) in the given parameter set (i.e., training, validation, or test set).
- Simulate data from the assumed model over the parameter sets generated above. These data are stored as a `Vector{A}`, with each element of the vector associated with one parameter configuration, and where `A` depends on the representation of the neural estimator (e.g., an `Array` for CNN-based estimators, or a `GNNGraph` for GNN-based estimators).
- Initialise a neural network, `θ̂`, that will be trained into a neural Bayes estimator.
- Initialise a neural network, `θ̂`, that will be trained into a neural Bayes estimator (see, e.g., convenience constructor [`initialise_estimator`](@ref)).
- Train `θ̂` under the chosen loss function using [`train`](@ref).
- Assess `θ̂` using [`assess`](@ref). The resulting object of class [`Assessment`](@ref) can be used to assess the estimator with respect to the entire parameter space by estimating the risk function with [`risk`](@ref), or used to plot the empirical sampling distribution of the estimator.
- Apply `θ̂` to observed data (once its performance has been checked in the above step). Bootstrap-based uncertainty quantification is facilitated with [`bootstrap`](@ref) and [`interval`](@ref).
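
A rough sketch of the first three steps, using the univariate Gaussian model from the examples (the prior draws and sample sizes below are arbitrary placeholders):

```
using NeuralEstimators

p, K = 2, 1000                  # number of parameters; number of prior samples
parameters = rand(p, K)         # p × K matrix of prior draws (placeholder prior)
m = 30                          # number of independent replicates per data set
Z = [θ[1] .+ θ[2] .* randn(1, m) for θ ∈ eachcol(parameters)]  # data: Vector of 1 × m arrays
θ̂ = initialise_estimator(p, architecture = "DNN")  # network to be trained into a neural Bayes estimator
```

The estimator `θ̂` is then passed to [`train`](@ref) and [`assess`](@ref) as described above.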
107 changes: 107 additions & 0 deletions src/Estimators.jl
@@ -346,3 +346,110 @@ end
# Clean printing:
Base.show(io::IO, pe::PiecewiseEstimator) = print(io, "\nPiecewise estimator with $(length(pe.estimators)) estimators and sample size change-points: $(pe.breaks)")
Base.show(io::IO, m::MIME"text/plain", pe::PiecewiseEstimator) = print(io, pe)


# ---- Helper function for initialising an estimator ----

"""
initialise_estimator(p::Integer; ...)
Initialise a neural estimator for a statistical model with `p` unknown parameters.
The estimator is couched in the DeepSets framework so that it can be applied to data with an arbitrary number of independent replicates (including the special case of a single replicate).
# Keyword arguments
- `architecture::String`: for unstructured data, one may use a densely-connected neural network ("DNN"); for data collected over a grid, a convolutional neural network ("CNN"); and for data collected over a graph or at irregular spatial locations, a graph neural network ("GNN").
- `d::Integer = 1`: dimension of the response variable (e.g., `d = 1` for univariate processes).
- `estimator_type::String = "point"`: the type of estimator; either `"point"` or `"interval"`.
- `depth = 3`: the number of hidden layers. Either a single integer or an integer vector of length two specifying the depth of the inner (summary) and outer (inference) networks of the DeepSets framework. Since there is also an input and an output layer, the total number of layers is `sum(depth) + 2`.
- `width = 32`: a single integer or an integer vector of length `sum(depth)` specifying the width (or number of convolutional filters/channels) in each layer.
- `activation::Function = relu`: the (non-linear) activation function of each hidden layer.
- `activation_output::Function = identity`: the activation function of the output layer.
- `kernel_size`: (applicable only to CNNs) a vector of length `depth[1]` containing integer tuples of length `D`, where `D` is the dimension of the convolution (e.g., `D = 2` for two-dimensional convolution).
- `weight_by_distance::Bool = false`: (applicable only to GNNs) flag indicating whether the estimator will weight by spatial distance; if true, a `WeightedGraphConv` layer is used in the propagation module; otherwise, a regular `GraphConv` layer is used.
# Examples
```
p = 2
initialise_estimator(p, architecture = "DNN")
initialise_estimator(p, architecture = "GNN")
initialise_estimator(p, architecture = "CNN", kernel_size = [(10, 10), (5, 5), (3, 3)])
```
"""
function initialise_estimator(
p::Integer;
architecture::String,
d::Integer = 1,
estimator_type::String = "point",
depth::Union{Integer, Vector{<:Integer}} = 3,
width::Union{Integer, Vector{<:Integer}} = 32,
activation::Function = relu,
activation_output::Function = identity,
kernel_size = nothing,
weight_by_distance::Bool = false
)

@assert p > 0
@assert d > 0
@assert architecture ∈ ["DNN", "CNN", "GNN"]
@assert estimator_type ∈ ["point", "interval"]
@assert all(depth .> 0)
@assert length(depth) == 1 || length(depth) == 2
if isa(depth, Integer) depth = [depth] end
if length(depth) == 1 depth = repeat(depth, 2) end
@assert all(width .> 0)
@assert length(width) == 1 || length(width) == sum(depth)
if isa(width, Integer) width = [width] end
if length(width) == 1 width = repeat(width, sum(depth)) end
# henceforth, depth and width are integer vectors of length 2 and sum(depth), respectively

if architecture == "CNN"
@assert !isnothing(kernel_size) "The argument `kernel_size` must be provided when `architecture = 'CNN'`"
@assert length(kernel_size) == depth[1]
kernel_size = coercetotuple(kernel_size)
end

L = sum(depth) # total number of hidden layers

# mapping (outer) network
ϕ = Chain(
[Dense(width[l-1] => width[l], activation) for l ∈ (depth[1]+1):L]...,
Dense(width[L] => p, activation_output)
)

# summary (inner) network
if architecture == "DNN"
ψ = Chain(
Dense(d => width[1], activation),
[Dense(width[l] => width[l+1], activation) for l ∈ 1:(depth[1]-1)]... # hidden layers 2, …, depth[1]
)
elseif architecture == "CNN"
ψ = Chain(
Conv(kernel_size[1], d => width[1], activation),
[Conv(kernel_size[l+1], width[l] => width[l+1], activation) for l ∈ 1:(depth[1]-1)]..., # convolutional layers 2, …, depth[1]
Flux.flatten
)
elseif architecture == "GNN"
propagation = weight_by_distance ? WeightedGraphConv : GraphConv
# propagation_module = GNNChain(
# propagation(d => width[1], activation),
# [propagation(width[l] => width[l+1], relu) for l ∈ 1:depth[1]]...
# )
# readout_module = GlobalPool(mean)
# return GNN(propagation_module, readout_module, ϕ) # return more-efficient GNN object
ψ = GNNChain(
propagation(d => width[1], activation, aggr = mean),
[propagation(width[l] => width[l+1], relu, aggr = mean) for l ∈ 1:(depth[1]-1)]..., # propagation layers 2, …, depth[1]
GlobalPool(mean) # readout module
)
end

θ̂ = DeepSet(ψ, ϕ)

if estimator_type == "interval"
θ̂ = IntervalEstimator(θ̂, θ̂)
end

return θ̂
end

coercetotuple(x) = (x...,)
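
A quick sketch of how the new constructor might be exercised (the keyword values and data shapes below are arbitrary choices; the point estimator is applied to data following the DeepSet convention of one array per data set):

```
using NeuralEstimators, Flux

p = 2

# Point estimator with unequal inner/outer depths
θ̂ = initialise_estimator(p, architecture = "DNN", d = 1, depth = [2, 3], width = 64)

# Interval estimator variant
q̂ = initialise_estimator(p, architecture = "DNN", estimator_type = "interval")

# Apply the point estimator to a single data set of 50 independent replicates
Z = [randn(Float32, 1, 50)]
θ̂(Z)  # p × 1 matrix of point estimates
```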
5 changes: 2 additions & 3 deletions src/NeuralEstimators.jl
@@ -41,7 +41,7 @@ export CholeskyCovariance, CovarianceMatrix, CorrelationMatrix
export vectotril, vectotriu
include("Architectures.jl")

export NeuralEstimator, PointEstimator, IntervalEstimator, IntervalEstimatorCompactPrior, PointIntervalEstimator, QuantileEstimator, PiecewiseEstimator
export NeuralEstimator, PointEstimator, IntervalEstimator, IntervalEstimatorCompactPrior, PointIntervalEstimator, QuantileEstimator, PiecewiseEstimator, initialise_estimator
include("Estimators.jl")

export GNN, UniversalPool, adjacencymatrix, WeightedGraphConv, maternclusterprocess
@@ -69,14 +69,13 @@ include("bootstrap.jl")
export stackarrays, expandgrid, loadbestweights, loadweights, numberreplicates, nparams, samplesize, drop, containertype, estimateinbatches
include("utility.jl")


export NeuralEM, removedata, encodedata # TODO unit testing for NeuralEM
include("missingdata.jl")

end

#TODO
# - Add helper functions for censored data and write an example in the documentation.
# - Add helper functions for censored data and write an example in the documentation.
# - Plotting from Julia (which can act directly on the object of type assessment).
# - Examples:
# o Add some figures to the examples in the documentation (e.g., show the sampling distribution in univariate example).
16 changes: 16 additions & 0 deletions test/runtests.jl
@@ -668,3 +668,19 @@ end
ci = interval(estimator, Z, parameter_names = parameter_names)
@test size(ci[1]) == (p, 2)
end


@testset "initialise_estimator" begin
p = 2
initialise_estimator(p, architecture = "DNN")
initialise_estimator(p, architecture = "GNN")
initialise_estimator(p, architecture = "CNN", kernel_size = [(10, 10), (5, 5), (3, 3)])

@test typeof(initialise_estimator(p, architecture = "DNN", estimator_type = "interval")) <: IntervalEstimator
@test typeof(initialise_estimator(p, architecture = "GNN", estimator_type = "interval")) <: IntervalEstimator
@test typeof(initialise_estimator(p, architecture = "CNN", kernel_size = [(10, 10), (5, 5), (3, 3)], estimator_type = "interval")) <: IntervalEstimator

@test_throws Exception initialise_estimator(0, architecture = "DNN")
@test_throws Exception initialise_estimator(p, d = 0, architecture = "DNN")
@test_throws Exception initialise_estimator(p, architecture = "CNN")
end
