TRIQS/nda 1.3.0
Multi-dimensional array library for C++
MPI support for nda::basic_array and nda::basic_array_view objects.
nda uses the TRIQS/mpi library to provide functions to broadcast, gather, reduce and scatter arrays and views over MPI processes.
See Example 6: MPI support for a more in-depth example.
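As a minimal illustration (a sketch, not the library's own example: it assumes TRIQS/nda is built with MPI support and that the TRIQS/mpi header `<mpi/mpi.hpp>` is available), gathering one row per process might look like this:

```cpp
// Sketch only: compile against TRIQS/nda and TRIQS/mpi,
// then run with e.g. `mpirun -n 4 ./example`.
#include <nda/nda.hpp>
#include <nda/mpi.hpp>
#include <mpi/mpi.hpp>
#include <iostream>

int main(int argc, char **argv) {
  mpi::environment env(argc, argv);
  mpi::communicator comm;

  // Each rank fills one row with its rank number ...
  nda::array<int, 2> a(1, 4);
  a() = comm.rank();

  // ... and the rows are joined along the first dimension on the root.
  auto b = nda::mpi_gather(a, comm);
  if (comm.rank() == 0) std::cout << b << std::endl; // 4x4 array with 4 cores
  return 0;
}
```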
Functions
template<typename A> requires(is_regular_or_view_v<A>)
void nda::mpi_broadcast(A &&a, mpi::communicator comm = {}, int root = 0)
    Implementation of an MPI broadcast for nda::basic_array or nda::basic_array_view types.

template<typename A> requires(is_regular_or_view_v<A> and std::decay_t<A>::is_stride_order_C())
auto nda::mpi_gather(A const &a, mpi::communicator comm = {}, int root = 0, bool all = false)
    Implementation of an MPI gather for nda::basic_array or nda::basic_array_view types.

template<typename A1, typename A2> requires(is_regular_or_view_v<A1> and std::decay_t<A1>::is_stride_order_C() and is_regular_or_view_v<A2> and std::decay_t<A2>::is_stride_order_C())
void nda::mpi_gather_into(A1 const &a_in, A2 &&a_out, mpi::communicator comm = {}, int root = 0, bool all = false)
    Implementation of an MPI gather for nda::basic_array or nda::basic_array_view types that gathers directly into an existing array/view.

template<typename A> requires(is_regular_or_view_v<A>)
auto nda::mpi_reduce(A const &a, mpi::communicator comm = {}, int root = 0, bool all = false, MPI_Op op = MPI_SUM)
    Implementation of an MPI reduce for nda::basic_array or nda::basic_array_view types.

template<typename A1, typename A2> requires(is_regular_or_view_v<A1> && is_regular_or_view_v<A2>)
void nda::mpi_reduce_into(A1 const &a_in, A2 &&a_out, mpi::communicator comm = {}, int root = 0, bool all = false, MPI_Op op = MPI_SUM)
    Implementation of an MPI reduce for nda::basic_array or nda::basic_array_view types that reduces directly into an existing array/view.

template<typename A> requires(is_regular_or_view_v<A> and std::decay_t<A>::is_stride_order_C())
auto nda::mpi_scatter(A const &a, mpi::communicator comm = {}, int root = 0)
    Implementation of an MPI scatter for nda::basic_array or nda::basic_array_view types.

template<typename A1, typename A2> requires(is_regular_or_view_v<A1> and std::decay_t<A1>::is_stride_order_C() and is_regular_or_view_v<A2> and std::decay_t<A2>::is_stride_order_C())
void nda::mpi_scatter_into(A1 const &a_in, A2 &&a_out, mpi::communicator comm = {}, int root = 0)
    Implementation of an MPI scatter for nda::basic_array or nda::basic_array_view types that scatters directly into an existing array/view.
void nda::mpi_broadcast(A &&a, mpi::communicator comm = {}, int root = 0)
#include <nda/mpi/broadcast.hpp>
Implementation of an MPI broadcast for nda::basic_array or nda::basic_array_view types.
For the root process, the array/view is broadcast to all other processes. For non-root processes, the array/view is resized/checked to match the broadcast dimensions and the data is written into the given array/view. The actual broadcasting is done by calling mpi::broadcast_range.
It throws an exception if a given view does not have the correct shape.
See Broadcasting an array/view for an example.
Note: If the array/view is contiguous in memory, the data is broadcast in a single MPI_Bcast call. Otherwise, the data is broadcast element-wise, which can have considerable performance implications. Consider copying the data into a contiguous array/view before broadcasting.
Template Parameters:
    A - nda::basic_array or nda::basic_array_view type.
Parameters:
    a - Array/view to be broadcast from/into.
    comm - mpi::communicator object.
    root - Rank of the root process.
Definition at line 46 of file broadcast.hpp.
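A hedged sketch of a broadcast (assumes TRIQS/nda and TRIQS/mpi are installed; run under mpirun):

```cpp
#include <nda/nda.hpp>
#include <nda/mpi/broadcast.hpp>
#include <mpi/mpi.hpp>

int main(int argc, char **argv) {
  mpi::environment env(argc, argv);
  mpi::communicator comm;

  // Only the root fills the array; non-root ranks start empty and are
  // resized by the broadcast (a view would have to match the shape already).
  nda::array<int, 2> a;
  if (comm.rank() == 0) a = nda::array<int, 2>{{1, 2}, {3, 4}};

  nda::mpi_broadcast(a, comm, 0);
  // Every rank now holds {{1, 2}, {3, 4}}.
  return 0;
}
```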
auto nda::mpi_gather(A const &a, mpi::communicator comm = {}, int root = 0, bool all = false)
#include <nda/mpi/gather.hpp>
Implementation of an MPI gather for nda::basic_array or nda::basic_array_view types.
The function gathers C-ordered input arrays/views from all processes in the given communicator and makes the result available on the root process (all == false) or on all processes (all == true). The arrays/views are joined along the first dimension.
It simply constructs an empty array and then calls nda::mpi_gather_into.
See Gathering an array/view for examples.
Template Parameters:
    A - nda::basic_array or nda::basic_array_view type with C-layout.
Parameters:
    a - Array/view to be gathered.
    comm - mpi::communicator object.
    root - Rank of the root process.
    all - Should all processes receive the result of the gather.
Definition at line 127 of file gather.hpp.
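A short sketch of the value-returning gather (assumes TRIQS/nda + TRIQS/mpi; run under mpirun):

```cpp
#include <nda/nda.hpp>
#include <nda/mpi/gather.hpp>
#include <mpi/mpi.hpp>
#include <iostream>

int main(int argc, char **argv) {
  mpi::environment env(argc, argv);
  mpi::communicator comm;

  // A 1d array of length 2 on each rank, filled with the rank number.
  nda::array<int, 1> a(2);
  a() = comm.rank();

  // Joined along the first dimension: on root, b has length 2 * comm.size().
  auto b = nda::mpi_gather(a, comm, 0, /*all=*/false);
  if (comm.rank() == 0) std::cout << b << std::endl;
  return 0;
}
```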
void nda::mpi_gather_into(A1 const &a_in, A2 &&a_out, mpi::communicator comm = {}, int root = 0, bool all = false)
#include <nda/mpi/gather.hpp>
Implementation of an MPI gather for nda::basic_array or nda::basic_array_view types that gathers directly into an existing array/view.
The function gathers C-ordered input arrays/views from all processes in the given communicator and makes the result available on the root process (all == false) or on all processes (all == true). The arrays/views are joined along the first dimension.
It is expected that all input arrays/views have the same shape on all processes except for the first dimension. The function throws an exception if this requirement is violated.
The actual gathering is done by calling mpi::gather_range. The input arrays/views are simply concatenated along their first dimension. The content of the output array/view depends on the MPI rank and whether it receives the data or not.
Template Parameters:
    A1 - nda::basic_array or nda::basic_array_view type with C-layout.
    A2 - nda::basic_array or nda::basic_array_view type with C-layout.
Parameters:
    a_in - Array/view to be gathered.
    a_out - Array/view to gather into.
    comm - mpi::communicator object.
    root - Rank of the root process.
    all - Should all processes receive the result of the gather.
Definition at line 85 of file gather.hpp.
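A sketch of gathering into an existing array (assumes TRIQS/nda + TRIQS/mpi; run under mpirun). The pre-sizing of the output is an assumption for clarity; a regular array on receiving ranks may also be resizable by the call, as for the broadcast:

```cpp
#include <nda/nda.hpp>
#include <nda/mpi/gather.hpp>
#include <mpi/mpi.hpp>

int main(int argc, char **argv) {
  mpi::environment env(argc, argv);
  mpi::communicator comm;

  // Each rank contributes one row filled with its rank number.
  nda::array<int, 2> a(1, 2);
  a() = comm.rank();

  // Pre-sized output: rows from all ranks joined along the first dimension.
  nda::array<int, 2> out(comm.size(), 2);
  nda::mpi_gather_into(a, out, comm, 0, /*all=*/true);
  // With all == true, every rank now holds the full (comm.size() x 2) result.
  return 0;
}
```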
auto nda::mpi_reduce(A const &a, mpi::communicator comm = {}, int root = 0, bool all = false, MPI_Op op = MPI_SUM)
#include <nda/mpi/reduce.hpp>
Implementation of an MPI reduce for nda::basic_array or nda::basic_array_view types.
The function reduces input arrays/views from all processes in the given communicator and makes the result available on the root process (all == false) or on all processes (all == true).
It first default constructs an nda::basic_array object on the heap with its value type equal to the return type of reduce(std::declval<get_value_t<A>>()) and the same rank and algebra as the input array/view. On receiving ranks, the output array is then resized to the shape of the input array/view.
The actual reduction is done by calling nda::mpi_reduce_into with the input array/view and the constructed output array. The content of the returned array depends on the MPI rank and whether it receives the data or not.
See Reducing an array/view for an example.
Note: If the input arrays/views are contiguous in memory, the reduction is done in a single MPI_Reduce or MPI_Allreduce call. Otherwise, the data is reduced element-wise, which can have considerable performance implications. Consider copying the data into a contiguous array/view before reducing.
Template Parameters:
    A - nda::basic_array or nda::basic_array_view type.
Parameters:
    a - Array/view to be reduced.
    comm - mpi::communicator object.
    root - Rank of the root process.
    all - Should all processes receive the result of the reduction.
    op - MPI reduction operation.
Definition at line 129 of file reduce.hpp.
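A sketch of an element-wise sum over ranks (assumes TRIQS/nda + TRIQS/mpi; run under mpirun):

```cpp
#include <nda/nda.hpp>
#include <nda/mpi/reduce.hpp>
#include <mpi/mpi.hpp>
#include <iostream>

int main(int argc, char **argv) {
  mpi::environment env(argc, argv);
  mpi::communicator comm;

  // The same array on every rank.
  nda::array<double, 1> a{1.0, 2.0, 3.0};

  // Element-wise sum over all ranks, available on the root only.
  auto s = nda::mpi_reduce(a, comm, 0, /*all=*/false, MPI_SUM);
  // On root, s(i) == comm.size() * a(i).
  if (comm.rank() == 0) std::cout << s << std::endl;
  return 0;
}
```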
void nda::mpi_reduce_into(A1 const &a_in, A2 &&a_out, mpi::communicator comm = {}, int root = 0, bool all = false, MPI_Op op = MPI_SUM)
#include <nda/mpi/reduce.hpp>
Implementation of an MPI reduce for nda::basic_array or nda::basic_array_view types that reduces directly into an existing array/view.
The function reduces input arrays/views from all processes in the given communicator and makes the result available on the root process (all == false) or on all processes (all == true).
It is expected that all input arrays/views have the same shape on all processes. The function throws an exception if an output view does not have the correct shape on receiving ranks.
The actual reduction is done by calling mpi::reduce_range. The content of the output array/view depends on the MPI rank and whether it receives the data or not.
Note: If the input arrays/views are contiguous in memory, the reduction is done in a single MPI_Reduce or MPI_Allreduce call. Otherwise, the data is reduced element-wise, which can have considerable performance implications. Consider copying the data into a contiguous array/view before reducing.
Template Parameters:
    A1 - nda::basic_array or nda::basic_array_view type.
    A2 - nda::basic_array or nda::basic_array_view type.
Parameters:
    a_in - Array/view to be reduced.
    a_out - Array/view to reduce into.
    comm - mpi::communicator object.
    root - Rank of the root process.
    all - Should all processes receive the result of the reduction.
    op - MPI reduction operation.
Definition at line 69 of file reduce.hpp.
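A sketch of reducing into a pre-shaped output with a non-default operation (assumes TRIQS/nda + TRIQS/mpi; run under mpirun):

```cpp
#include <nda/nda.hpp>
#include <nda/mpi/reduce.hpp>
#include <mpi/mpi.hpp>

int main(int argc, char **argv) {
  mpi::environment env(argc, argv);
  mpi::communicator comm;

  // Each rank holds a 2x2 array filled with its own rank number.
  nda::array<double, 2> a(2, 2);
  a() = static_cast<double>(comm.rank());

  // The output must have the correct shape on receiving ranks; with
  // all == true, every rank receives the element-wise maximum.
  nda::array<double, 2> out(2, 2);
  nda::mpi_reduce_into(a, out, comm, 0, /*all=*/true, MPI_MAX);
  // out(i, j) == comm.size() - 1 on every rank.
  return 0;
}
```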
auto nda::mpi_scatter(A const &a, mpi::communicator comm = {}, int root = 0)
#include <nda/mpi/scatter.hpp>
Implementation of an MPI scatter for nda::basic_array or nda::basic_array_view types.
The function scatters a C-ordered input array/view from a root process across all processes in the given communicator. The array/view is chunked into equal parts along the first dimension using mpi::chunk_length.
It simply constructs an empty array and then calls nda::mpi_scatter_into.
See Scattering an array/view for an example.
Template Parameters:
    A - nda::basic_array or nda::basic_array_view type with C-layout.
Parameters:
    a - Array/view to be scattered.
    comm - mpi::communicator object.
    root - Rank of the root process.
Definition at line 121 of file scatter.hpp.
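A sketch of the value-returning scatter (assumes TRIQS/nda + TRIQS/mpi; run under mpirun):

```cpp
#include <nda/nda.hpp>
#include <nda/mpi/scatter.hpp>
#include <mpi/mpi.hpp>

int main(int argc, char **argv) {
  mpi::environment env(argc, argv);
  mpi::communicator comm;

  // Only the root's data matters; it is chunked along the first dimension.
  nda::array<int, 2> a;
  if (comm.rank() == 0) {
    a.resize(2 * comm.size(), 3);
    a() = 1;
  }

  auto chunk = nda::mpi_scatter(a, comm, 0);
  // Each rank receives a (2 x 3) piece of the root's array.
  return 0;
}
```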
void nda::mpi_scatter_into(A1 const &a_in, A2 &&a_out, mpi::communicator comm = {}, int root = 0)
#include <nda/mpi/scatter.hpp>
Implementation of an MPI scatter for nda::basic_array or nda::basic_array_view types that scatters directly into an existing array/view.
The function scatters a C-ordered input array/view from a root process across all processes in the given communicator. The array/view is chunked into equal parts along the first dimension using mpi::chunk_length.
It is expected that all input arrays/views have the same rank on all processes. The function throws an exception if this requirement is violated.
The actual scattering is done by calling mpi::scatter_range. The input array/view on the root process is chunked along the first dimension into equal (as much as possible) parts using mpi::chunk_length. If the extent of the input array along the first dimension is not divisible by the number of processes, processes with lower ranks will receive more data than processes with higher ranks.
Template Parameters:
    A1 - nda::basic_array or nda::basic_array_view type with C-layout.
    A2 - nda::basic_array or nda::basic_array_view type with C-layout.
Parameters:
    a_in - Array/view to be scattered.
    a_out - Array/view to scatter into.
    comm - mpi::communicator object.
    root - Rank of the root process.
Definition at line 85 of file scatter.hpp.
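A sketch of scattering into a pre-sized output; the evenly divisible extent is an assumption for simplicity (with a non-divisible extent, lower ranks receive more data, as described above). Assumes TRIQS/nda + TRIQS/mpi; run under mpirun:

```cpp
#include <nda/nda.hpp>
#include <nda/mpi/scatter.hpp>
#include <mpi/mpi.hpp>

int main(int argc, char **argv) {
  mpi::environment env(argc, argv);
  mpi::communicator comm;

  int const n = 4 * comm.size(); // divisible, so every rank gets 4 elements
  nda::array<int, 1> a;
  if (comm.rank() == 0) {
    a.resize(n);
    for (int i = 0; i < n; ++i) a(i) = i;
  }

  // Pre-sized output for this rank's chunk of the first dimension.
  nda::array<int, 1> chunk(4);
  nda::mpi_scatter_into(a, chunk, comm, 0);
  // Rank r now holds {4r, 4r+1, 4r+2, 4r+3}.
  return 0;
}
```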