Reference documentation/manual
The HDFArchive class offers a convenient interface between Python objects and HDF5 files, similar to a dictionary (or a shelve).
The module contains two classes:
- HDFArchiveGroup: operates on a subtree of the HDF5 file.
- HDFArchive: an HDFArchiveGroup at the root path, with a simple constructor.
Typically, one constructs an HDFArchive explicitly; HDFArchiveGroup objects are created during operations, e.g.:
h = HDFArchive("myfile.h5", 'r')
g = h['subgroup1']  # g is an HDFArchiveGroup.
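Once opened, the archive behaves like a dictionary for reading and writing. A minimal sketch, assuming a file opened in write mode (the file name, keys and values are illustrative):
h = HDFArchive("myfile.h5", 'w')   # open for writing, as in the MPI example further below
h['x'] = 2.5                       # store an hdf-compliant value under key 'x'
x = h['x']                         # read it back from the file
print('x' in h)                    # True
del h['x']                         # remove the entry from the archive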
Apart from the root path and the constructor, the two classes are identical (in fact HDFArchive is an HDFArchiveGroup).
Let us first document HDFArchive.
Warning
HDFArchive and HDFArchiveGroup do NOT handle parallelism. See, however, the HDFArchiveInert class below.
HDFArchive
HDFArchiveGroup
- class HDFArchiveGroup
There is no explicit constructor for the user of the class.
The HDFArchiveGroup class supports most of the operations supported by dictionaries. In the following, H is an HDFArchiveGroup; a usage sketch follows the list of operations below.
- len(H)
Return the number of items in the HDFArchiveGroup H.
- H[key]
Return the item of H with key key, retrieved from the file. Raises a KeyError if key is not in the HDFArchiveGroup.
- get_raw(key)
Return the subgroup key, without any reconstruction, ignoring the HDF5_data_scheme attribute.
- H[key] = value
Set H[key] to value.
- del H[key]
Remove H[key] from H. Raises a KeyError if key is not in the HDFArchiveGroup.
- key in H
Return True if H has the key key, else False.
- key not in H
Equivalent to not key in H.
- iter(H)
Return an iterator over the keys of the group. This is a shortcut for iterkeys().
- items()
Generator returning the (key, value) pairs in the group.
Warning
Note that in all these iterators, the objects will only be retrieved from the file and loaded into memory one by one.
- keys()
Generator returning the keys of the group.
- update(d)
Add to the archive the content of any mapping d from keys to hdf-compliant values.
- values()
Generator returning the values in the group.
- create_group(K)
Create a new subgroup named K at the root path of the group. Raises an exception if the subgroup already exists.
- is_group(K)
Return True if and only if K is a subgroup.
- is_data(K)
Return True if and only if K is a leaf.
- read_attr(AttributeName)
Return the attribute AttributeName of the root path of the group. If there is no such attribute, return None.
- root_path()
Return the root path of the group.
- apply_on_leaves(f)
For each named leaf (name, value) of the tree, call f(name, value).
f should return:
- None: no action is taken.
- an empty tuple (): the leaf is removed from the tree.
- an hdf-compliant value: the leaf is replaced by the value.
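A short sketch exercising the operations above, assuming a file opened in write mode (file name, keys and values are illustrative):
H = HDFArchive("myfile.h5", 'w')
H['x'] = 2.5                          # H[key] = value
H.create_group('subgroup1')           # new subgroup at the root path of H
print(len(H), 'x' in H)               # 2 True
print(H.is_group('subgroup1'))        # True
print(H.is_data('x'))                 # True
for key, value in H.items():          # objects are loaded from the file one by one
    print(key, value)
H.apply_on_leaves(lambda name, value: 2 * value if name == 'x' else None)  # double the leaf 'x', leave the rest untouched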
HDFArchiveInert
- class HDFArchiveInert
HDFArchive and HDFArchiveGroup do NOT handle parallelism. In general, it is good practice to write/read only on the master node: reading from all nodes on a cluster may lead to communication problems. To simplify the writing of such code, the simple HDFArchiveInert class may be useful. It is basically inert but does not fail.
- H[key]
Return H itself and never raise an exception, e.g. H['a']['b'] never raises an exception.
- H[key] = value
Does nothing.
Usage in an MPI code, e.g.:
R = HDFArchive("Results.h5", 'w') if mpi.is_master_node() else HDFArchiveInert()
a = mpi.bcast(R['a'])  # properly broadcast R['a'] from the master to the nodes.
R['b'] = X             # sets R['b'] in the file on the master only; does nothing on the other nodes.
Hdf-compliant objects
By definition, hdf-compliant objects are those which can be stored/retrieved in an HDFArchive.
In order to be hdf-compliant, a class must:
- have an HDF5_data_scheme tag properly registered;
- implement one of the two protocols described below.
HDF5 data scheme
To each hdf-compliant object, we associate a data scheme which describes how the data is stored in the hdf5 tree, i.e. the tree structure with the names of the nodes and their contents. This data scheme is stored in the attribute HDF5_data_scheme at the node corresponding to the object in the file.
For a given class Cls, the HDF5_data_scheme is Cls._hdf5_data_scheme_ if it is defined, or the name of the class Cls.__name__ otherwise.
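For instance, the tag stored at a node can be inspected with the get_raw and read_attr methods documented above (file and group names are illustrative):
ar = HDFArchive("myfile.h5", 'r')
g = ar.get_raw('subgroup1')                # raw subgroup, without reconstruction
print(g.read_attr("HDF5_data_scheme"))     # the data scheme tag, or None if absent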
The HDF5_data_scheme of a class must be registered in order for HDFArchive to properly reconstruct the object when rereading.
The class is registered using the formats module:
from h5.formats import register_class

class myclass:
    pass  # ...

register_class(myclass)
The function is
- register_class(cls[, doc = None])
- Parameters:
  - cls – the class to be registered.
  - doc – a doc directory.
Register the class for HDFArchive use. The name of the data scheme will be myclass._hdf5_data_scheme_ if it is defined, or the name of the class otherwise (see the sketch below).
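Putting the pieces together, a sketch of the naming rule; the class body here is a placeholder, and a real class must also implement one of the storage protocols described below:
from h5.formats import register_class

class myclass:
    # If defined, this string is used as the HDF5_data_scheme tag;
    # otherwise the class name 'myclass' would be used.
    _hdf5_data_scheme_ = "MyClassScheme"

register_class(myclass)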