TRIQS/nda 1.3.0
Multi-dimensional array library for C++
In this example, we show how to write/read nda arrays and views to/from HDF5 files.
nda uses the h5 library, in particular its h5::array_interface, to provide HDF5 support.
All the following code snippets are part of the same main function:
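A minimal skeleton for such a main function could look as follows; the exact set of includes is an assumption for this sketch (nda's HDF5 support is provided by nda/h5.hpp, and the h5 library itself provides h5/h5.hpp):

```cpp
#include <h5/h5.hpp>
#include <nda/nda.hpp>
#include <nda/h5.hpp>

#include <iostream>
#include <string>
#include <tuple>

int main() {
  // all of the snippets below go here
}
```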
Before we dive into the HDF5 capabilities of nda, let us first specify the array that we will be working with:
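For instance, a 5-by-5 integer array (the concrete values are placeholders chosen for this sketch):

```cpp
// create a 5x5 integer array and fill it with the values 0, 1, ..., 24
auto A = nda::array<int, 2>(5, 5);
for (int i = 0; i < 5; ++i)
  for (int j = 0; j < 5; ++j) A(i, j) = i * 5 + j;
std::cout << A << std::endl;
```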
Output:
Note: In the examples below, we will restrict ourselves to arrays and views in C-order. The reason is that when writing/reading to/from HDF5, the interface always checks if the arrays/views are in C-order. If this is not the case, it will use a temporary C-order array to perform the writing/reading.
Writing an array to an HDF5 file is as simple as
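For example (the file name example.h5 is a placeholder):

```cpp
// create an HDF5 file and write the array A to the dataset "A"
h5::file file("example.h5", 'w');
h5::write(file, "A", A);
```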
Dumping the HDF5 file gives
The array is written into a newly created dataset with a dataspace that has the same dimensions and the same shape as the original array. In this case, it is a 5-by-5 dataspace.
When writing to HDF5, the interface doesn't distinguish between arrays and views. We could have done the same with a view:
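For instance, writing a view of every other row and column of A under the name A_v:

```cpp
// write a 3x3 view of every other row and column of A to the dataset "A_v"
h5::write(file, "A_v", A(nda::range(0, 5, 2), nda::range(0, 5, 2)));
```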
Dumping the corresponding HDF5 dataspace gives
Here, we write a view of every other row and column of the original array. Again, the created dataset A_v has a dataspace with the same dimensions and shape as the object being written, in this case a 3-by-3 view.
Note: nda::h5_write takes a fourth parameter which determines whether the data should be compressed before it is written. By default, it is set to true. To turn compression off, one can specify it in the h5::write call, e.g. h5::write(file, "A", A, /* compression off */ false);
Reading a full dataset into an array is straightforward:
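For example, reading the dataset "A" back into a default-constructed array B:

```cpp
// read the dataset "A" into a new array B (h5_read resizes it as needed)
auto B = nda::array<int, 2>();
h5::read(file, "A", B);
std::cout << B << std::endl;
```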
Output:
As you can see, the array does not need to have the same shape as the dataset, since the nda::h5_read function will resize it if needed.
The same is not true for views. Views cannot be resized, so when we read into a view, we have to make sure that it has the correct shape (otherwise an exception will be thrown):
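A sketch of such a read, using a 3-by-3 view of the 5-by-5 array B from above (the particular rows and columns match the description below):

```cpp
// take a 3x3 view B_v of every other column and the rows 1, 2 and 3 of B
auto B_v = B(nda::range(1, 4), nda::range(0, 5, 2));

// read the 3x3 dataset "A_v" into the view (the shapes have to match)
h5::read(file, "A_v", B_v);
std::cout << B << std::endl;
```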
Output:
Here, we read the 3-by-3 dataset A_v into a view B_v consisting of every other column and the rows 1, 2 and 3 of the underlying 5-by-5 array B.
So far we have only written to an automatically created dataset with exactly the same size and shape as the array/view that is being written. It is also possible to write to a slice of an existing dataset as long as the selected slice has the same shape and size as the array/view.
To demonstrate this, let us first create a dataset and zero it out (in production code, one would probably call the HDF5 C library directly to create a dataspace and a dataset but this is not needed for this simple example):
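One simple way to do this from nda is to write a 5-by-5 array of zeros to a new dataset, here named "B" to match the description below:

```cpp
// create a 5x5 array of zeros and write it to a new dataset "B"
auto zero = nda::array<int, 2>(5, 5);
zero      = 0;
h5::write(file, "B", zero);
```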
Dumping this dataset gives
Then, we can take a slice of this dataset, e.g. by specifying every other row:
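One way to express such a slice is as a tuple with one range per dimension; the exact form expected by nda's slice-aware h5_write/h5_read overloads is an assumption in this sketch:

```cpp
// slice selecting every other row (and all columns) of the 5x5 dataset "B";
// the std::tuple-of-ranges representation is an assumption of this sketch
auto slice = std::make_tuple(nda::range(0, 5, 2), nda::range(0, 5));
```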
Let's write the first 3 rows of A to this slice:
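Assuming that h5::write forwards the slice to the corresponding nda::h5_write overload, this could look like:

```cpp
// write the first 3 rows of A into every other row of the dataset "B"
h5::write(file, "B", A(nda::range(0, 3), nda::range::all), slice);
```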
Dumping the dataset gives
Under the hood, nda takes the slice and transforms it into an HDF5 hyperslab to which the data is then written.
If we write the remaining last 2 rows of A to the still empty rows in B (see the sketch below), we get the fully filled dataset.
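A sketch of this second write, under the same assumptions as above:

```cpp
// write the last 2 rows of A into the rows 1 and 3 of the dataset "B"
h5::write(file, "B", A(nda::range(3, 5), nda::range::all),
          std::make_tuple(nda::range(1, 5, 2), nda::range(0, 5)));
```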
Instead of reading the full dataset as we have done before, it is possible to specify a slice of the dataset that should be read.
We can reuse the slices from above to first read the rows 0, 2, and 4 and then the rows 1 and 3 of A into the first 3 and the last 2 rows of a 5-by-5 array, respectively:
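A possible implementation, again assuming that the slice argument is forwarded to nda's h5_read:

```cpp
// read every other row of the dataset "A" into the first 3 rows of a new 5x5 array C
auto C     = nda::array<int, 2>(5, 5);
auto C_top = C(nda::range(0, 3), nda::range::all);
h5::read(file, "A", C_top, slice);

// read the rows 1 and 3 of the dataset "A" into the last 2 rows of C
auto C_bot = C(nda::range(3, 5), nda::range::all);
h5::read(file, "A", C_bot, std::make_tuple(nda::range(1, 5, 2), nda::range(0, 5)));
std::cout << C << std::endl;
```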
Output:
For the user, writing and reading a 1-dimensional array/view of strings works exactly the same way as with an array/view of arithmetic scalars:
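For example (the string contents and the dataset name "S" are placeholders):

```cpp
// write a 1-dimensional array of strings and read it back
auto S = nda::array<std::string, 1>{"Hello", "HDF5", "world!"};
h5::write(file, "S", S);

auto S_r = nda::array<std::string, 1>();
h5::read(file, "S", S_r);
std::cout << S_r << std::endl;
```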
Output:
The dumped HDF5 dataset gives
nda allows us to write/read arbitrary arrays/views as long as the objects contained in the array have specialized h5_write and h5_read functions (see the h5 docs).
For example, an array of integer arrays can be written/read as
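For instance (the sizes and values of the inner arrays are placeholders):

```cpp
// an array containing three 1-dimensional integer arrays of different sizes
auto I = nda::array<nda::array<int, 1>, 1>{nda::array<int, 1>{1, 2}, nda::array<int, 1>{3, 4, 5},
                                           nda::array<int, 1>{6, 7, 8, 9}};

// write it to the group "I" and read it back
h5::write(file, "I", I);

auto I_r = nda::array<nda::array<int, 1>, 1>();
h5::read(file, "I", I_r);
std::cout << I_r << std::endl;
```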
Output:
Dumping the corresponding HDF5 group gives
Now, I is an HDF5 group and not a dataset, and each object of the array, i.e. each integer array, is written to its own dataset with a name corresponding to its index in the array. In this case, the datasets are named "0", "1" and "2".