64 simple mpi support #65

Merged · 21 commits · Feb 12, 2024

Commits
2d06968
#64 Allow not using a t-mask with non-periodic boundary conditions.
hiker Feb 14, 2022
4500abd
#64 Added support for gathering a distributed field into a Fortran ar…
hiker Feb 24, 2022
28a5647
#64 Updated documentation.
hiker Feb 24, 2022
319129f
#64 Moved MPI into parallel_utils_mod, so that field is independent o…
hiker Feb 25, 2022
e2af613
#64 Reduce buffer size to optimise communication, added more comments.
hiker Feb 25, 2022
0f77d94
#1623 Ignore build directory.
hiker Aug 21, 2023
483a0a3
#64 Use only the .o files for the required library version (not all .…
hiker Aug 23, 2023
129beaa
#64 Added TODO back in.
hiker Jan 7, 2024
0bd0905
#64 Checked allocate return status.
hiker Jan 7, 2024
35f556f
#64 Added tests for new reduction and initialisation functionality.
hiker Jan 8, 2024
33f1abd
#64 Added tests for new reduction and initialisation functionality.
hiker Jan 8, 2024
18b3ae4
Merge branch '64_simple_mpi_support' of github.com:stfc/dl_esm_inf in…
hiker Jan 8, 2024
86c442a
#64 Improved documentation.
hiker Jan 8, 2024
b6a727f
#64 omp-parallelise initialisation loop.
hiker Jan 29, 2024
be87b09
#64 Updated comments.
hiker Jan 29, 2024
f3ce7f6
#64 Added license to grid_mod, and updated years.
hiker Jan 29, 2024
dd9eab1
#64 Abort early if PBC and no tmask is specified. Create dummy tmask …
hiker Jan 31, 2024
a7d7479
#64 Allow periodic boundary condition in non-distributed memory setti…
hiker Jan 31, 2024
79bc758
#64 Updated comments.
hiker Jan 31, 2024
685f19f
#64 Updated documentation and comments.
hiker Jan 31, 2024
18fed02
#64 Updated documentation conf.py file to remove warnings.
hiker Jan 31, 2024
1 change: 1 addition & 0 deletions doc/.gitignore
@@ -0,0 +1 @@
_build
29 changes: 23 additions & 6 deletions doc/api.rst
@@ -49,13 +49,19 @@ Three types of boundary condition are currently supported:
Name Description
=============== =========================================
GO_BC_NONE No boundary conditions are applied.
GO_BC_EXTERNAL Some external forcing is applied. This must be implemented by a kernel. The domain must be defined with a T-point mask (see :ref:`gocean1.0-grid-init`).
GO_BC_EXTERNAL Some external forcing is applied. This must be implemented by a kernel.
The domain can be specified using a ``tmask``, but if no ``tmask`` is
specified, a dummy ``tmask`` is created that will define an all ocean
domain.
GO_BC_PERIODIC Periodic boundary conditions are applied.
=============== =========================================

The infrastructure requires this information in order to determine the
extent of the model grid.

Note that at this stage ``GO_BC_PERIODIC`` is not supported when
using distributed memory. This is tracked in issue #54.

The index offset is required because a model (kernel) developer has
choice in how they actually implement the staggering of variables on a
grid. This comes down to a choice of which grid points in the vicinity
@@ -98,11 +104,12 @@ object. This is done via a call to the ``grid_init`` subroutine::
!! wet (1), dry (0) or external (-1).
integer, dimension(m,n), intent(in), optional :: tmask


If no T-mask is supplied then this routine configures the grid
appropriately for an all-wet domain with periodic boundary conditions
in both the *x*- and *y*-dimensions. It should also be noted that
currently only grids with constant resolution in *x* and *y* are
supported by this routine.
appropriately for an all-wet domain by allocating a default
T-mask. It should also be noted that currently only grids with
constant resolution in *x* and *y* are supported by this routine.
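
For illustration, a minimal sketch of such a call follows; the grid-creation
arguments, domain size and grid spacing shown here are assumptions for this
example rather than values prescribed by the API::

    ! No tmask argument is passed, so grid_init allocates a default,
    ! all-wet T-mask internally.
    model_grid = grid_type(GO_ARAKAWA_C,                                   &
                           (/GO_BC_EXTERNAL, GO_BC_EXTERNAL, GO_BC_NONE/), &
                           GO_OFFSET_NE)
    call grid_init(model_grid, 100, 100, 1000.0_go_wp, 1000.0_go_wp)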


.. _gocean1.0-fields:

@@ -128,11 +135,21 @@ constructor::
sshn_v = r2d_field(model_grid, GO_V_POINTS)
sshn_t = r2d_field(model_grid, GO_T_POINTS)

The constructor takes two arguments:
The constructor takes two mandatory and two optional arguments:

1. The grid on which the field exists
2. The type of grid point at which the field is defined
(``GO_U_POINTS``, ``GO_V_POINTS``, ``GO_T_POINTS`` or ``GO_F_POINTS``)
3. ``do_tile``: whether the field should be tiled among all threads or
allocated as a single field (the latter is not currently supported
by PSyclone).
4. ``init_global_data``: an optional global 2D Fortran array, which must be
provided on each rank. On each rank the field will be initialised with
the data from its corresponding subdomain. This is provided as a
convenience for users with a small problem size, since for large data
sets using a global array will typically create scalability problems. In
general, it is the user's responsibility to initialise an array with the
required local data; a brief usage sketch is given below.
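
As an illustration, a minimal sketch of a constructor call that uses both
optional arguments; the variable names and the all-zero initial values are
assumptions for this example, and ``model_grid`` is assumed to have been
fully configured via ``grid_init``::

    real(go_wp), allocatable :: global_init(:,:)
    type(r2d_field) :: sshn_t

    ! The global array must be provided on every rank.
    allocate(global_init(model_grid%global_nx, model_grid%global_ny))
    global_init(:,:) = 0.0_go_wp

    ! Tiled T-point field, initialised on each rank from its own
    ! subdomain of the global array.
    sshn_t = r2d_field(model_grid, GO_T_POINTS, do_tile=.true., &
                       init_global_data=global_init)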

Note that the grid object must have been fully configured (by a
call to ``grid_init`` for instance) before it is passed into this
Expand Down
6 changes: 3 additions & 3 deletions doc/conf.py
@@ -62,7 +62,7 @@
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
# language = None

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
@@ -89,7 +89,7 @@
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# html_static_path = ['_static']

# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
@@ -182,7 +182,7 @@
# -- Options for intersphinx extension ---------------------------------------

# Example configuration for intersphinx: refer to the Python standard library.
intersphinx_mapping = {'https://docs.python.org/': None}
intersphinx_mapping = {'python': ('https://docs.python.org/', None)}
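# The previous un-named mapping form is deprecated in newer Sphinx releases
# and produces a warning; named entries map to (target, inventory) tuples.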

# -- Options for todo extension ----------------------------------------------

2 changes: 1 addition & 1 deletion finite_difference/src/Makefile
@@ -89,7 +89,7 @@ all: ${API_LIB}

# Create the archive.
${API_LIB}: ${MODULES}
${AR} ${ARFLAGS} ${API_LIB} *.o
${AR} ${ARFLAGS} ${API_LIB} $(MODULES)
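# Archive only the objects listed in $(MODULES) so that .o files left over
# from builds of other library versions in this directory are not included.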

install:
${MAKE} -C .. install
112 changes: 108 additions & 4 deletions finite_difference/src/field_mod.f90
@@ -26,6 +26,7 @@
! OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
!------------------------------------------------------------------------------
! Author: A. R. Porter, STFC Daresbury Laboratory
! Modified: J. Henrichs, Australian Bureau of Meteorology

!> Module for describing all aspects of a field (which exists on some
!! grid).
@@ -161,6 +162,7 @@ end subroutine write_to_device_f_interface
procedure, pass :: read_from_device
procedure, pass :: write_to_device
procedure, public :: halo_exchange
procedure, public :: gather_inner_data
end type r2d_field

!> Interface for the copy_field operation. Overloaded to take
@@ -237,9 +239,10 @@ end subroutine write_to_device_f_interface

!===================================================

function r2d_field_constructor(grid, &
function r2d_field_constructor(grid, &
grid_points, &
do_tile) result(self)
do_tile, &
init_global_data) result(self)
use parallel_mod, only: go_decompose, on_master
!$ use omp_lib, only : omp_get_max_threads
implicit none
@@ -252,6 +255,9 @@ function r2d_field_constructor(grid, &
!> a single field should be allocated (which is not currently
!> supported by PSyclone)
logical, intent(in), optional :: do_tile
!> An optional global array with which the field in each domain
!> will be initialised
real(go_wp), dimension(:,:), intent(in), optional :: init_global_data
! Local declarations
type(r2d_field), target :: self
logical :: local_do_tiling = .false.
Expand All @@ -261,7 +267,7 @@ function r2d_field_constructor(grid, &
!> The upper bounds actually used to allocate arrays (as opposed
!! to the limits carried around with the field)
integer :: upper_x_bound, upper_y_bound
integer :: itile, nthreads, ntilex, ntiley
integer :: itile, nthreads, ntilex, ntiley, dx, dy

if (present(do_tile)) then
local_do_tiling = do_tile
@@ -349,7 +355,7 @@ function r2d_field_constructor(grid, &
end if

! Since we're allocating the arrays to be larger than strictly
! required we explicitly set all elements to -999 in case the code
! required we explicitly set all elements to 0 in case the code
! does access 'out-of-bounds' elements during speculative
! execution. If we're running with OpenMP this also gives
! us the opportunity to do a 'first touch' policy to aid with
@@ -368,6 +374,19 @@
else
self%data(:,:) = 0.0_go_wp
end if

if (present(init_global_data)) then
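! Offsets mapping this rank's local (internal) indices onto the
! corresponding indices in the global initialisation array.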
dx = grid%subdomain%global%xstart - self%internal%xstart
dy = grid%subdomain%global%ystart - self%internal%ystart

!$OMP PARALLEL DO schedule(runtime), default(none), &
!$OMP private(ji,jj), shared(self, grid, init_global_data, dx, dy)
do jj = grid%subdomain%internal%ystart, grid%subdomain%internal%ystop
do ji = grid%subdomain%internal%xstart, grid%subdomain%internal%xstop
self%data(ji, jj) = init_global_data(ji+dx, jj+dy)
end do
end do
end if
end function r2d_field_constructor

!===================================================
@@ -506,6 +525,8 @@ subroutine write_to_device(self, startx, starty, nx, ny, blocking)
end subroutine write_to_device


!===================================================

function get_data(self) result(dptr)
!> Getter for the data associated with a field. If the data is on a
! device it ensures that the host copy is up-to-date with that on
@@ -1287,6 +1308,89 @@ end function array_checksum

!===================================================

!> Collect a (distributed) field into a global array
!! on the master node.
subroutine gather_inner_data(self, global_data)
use parallel_utils_mod, only: get_num_ranks, gather
use parallel_mod, only: on_master

class(r2d_field), intent(in) :: self
real(go_wp), dimension(:,:), &
allocatable, intent(out) :: global_data

real(go_wp), dimension(:), allocatable :: send_buffer, recv_buffer

integer :: dx, dy, ji, jj, i, n, rank, halo_x, halo_y
integer :: x_start, x_stop, y_start, y_stop, ierr

allocate(global_data(self%grid%global_nx, self%grid%global_ny), &
stat=ierr)
if(ierr /= 0)then
call gocean_stop('gather_inner_data failed to allocate global result array')
end if

! No MPI (or single process), just copy the data out
if (get_num_ranks() == 1) then
! Compute size of inner area only
dx = self%internal%xstart - 1
dy = self%internal%ystart - 1
do jj= self%internal%ystart, self%internal%ystop
do ji = self%internal%xstart, self%internal%xstop
global_data(ji-dx,jj-dy) = self%data(ji,jj)
end do
end do
return
endif

! Determine maximum size of data to be sent. We don't need
! to send the halo, so reduce max_width and max_height by
! 2*halo size.
halo_x = self%internal%xstart-1
halo_y = self%internal%ystart-1
n = (self%grid%decomp%max_width - 2*halo_x) * &
(self%grid%decomp%max_height - 2*halo_y)
allocate(send_buffer(n), stat=ierr)
if(ierr /= 0)then
call gocean_stop('gather_inner_data failed to allocate send buffer')
end if
allocate(recv_buffer(n*get_num_ranks()), stat=ierr)
if(ierr /= 0)then
call gocean_stop('gather_inner_data failed to allocate receive buffer')
end if

! Copy data into 1D send buffer.
i = 0
do jj= self%internal%ystart, self%internal%ystop
do ji = self%internal%xstart, self%internal%xstop
i = i + 1
send_buffer(i) = self%data(ji,jj)
end do
end do

! Collect all send_buffers on the master:
call gather(send_buffer, recv_buffer)

if (on_master()) then
! Copy the data from each process into the global array
do rank=1, get_num_ranks()
x_start = self%grid%decomp%subdomains(rank)%global%xstart
x_stop = self%grid%decomp%subdomains(rank)%global%xstop
y_start = self%grid%decomp%subdomains(rank)%global%ystart
y_stop = self%grid%decomp%subdomains(rank)%global%ystop
i = (rank-1) * n
do jj= y_start, y_stop
do ji = x_start, x_stop
i = i + 1
global_data(ji, jj) = recv_buffer(i)
end do
end do
enddo
endif

end subroutine gather_inner_data
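
! An illustrative usage sketch only; the field and variable names below
! are assumptions for this example and are not part of this module:
!
!    real(go_wp), dimension(:,:), allocatable :: global_ssh
!    ! Every rank takes part in the gather, but only the master rank
!    ! receives the assembled global array.
!    call sshn_t%gather_inner_data(global_ssh)
!    if (on_master()) write(*,*) 'checksum = ', sum(global_ssh)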

!===================================================

subroutine init_periodic_bc_halos(fld)
implicit none
class(field_type), intent(inout) :: fld