Note
All functions of these modules are also directly accessible from the pyDive module.
pyDive.arrayOfStructs module
The arrayOfStructs module addresses a common problem when dealing with structured data: while the user prefers an array-of-structures layout, the machine prefers a structure-of-arrays layout. pyDive's method of choice is a virtual array-of-structures object. It provides array-like attributes such as shape and dtype and supports slicing, but operates on a structure-of-arrays internally.
Example:
...
treeOfArrays = {"FieldE" : {"x" : fielde_x,
                            "y" : fielde_y,
                            "z" : fielde_z},
                "FieldB" : {"x" : fieldb_x,
                            "y" : fieldb_y,
                            "z" : fieldb_z}}
fields = pyDive.arrayOfStructs(treeOfArrays)
half = fields[::2]["FieldE/x"]
# equivalent to
half = fields["FieldE/x"][::2]
# equivalent to
half = fields["FieldE"]["x"][::2]
# equivalent to
half = fields["FieldE"][::2]["x"]
# equivalent to
half = fields.FieldE.x[::2]
The example shows that fields can in fact be treated as an array-of-structures or as a structure-of-arrays, whichever is more appropriate.
The goal is to make the virtual array-of-structures object look like a real array. Therefore every method call or operation is forwarded to the individual arrays:
new_field = fields.FieldE.astype(np.int) + fields.FieldB.astype(np.float)
Here the forwarded method calls are astype and __add__.
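The forwarding mechanism can be sketched in a few lines of plain Python (a minimal illustration of the idea, not pyDive's actual implementation; the class name `ArrayOfStructs` and its flat dict are assumptions for this sketch):

```python
import numpy as np

class ArrayOfStructs(object):
    """Minimal sketch: hold a flat dict of arrays, forward indexing and
    method calls to each array individually."""
    def __init__(self, struct):
        self.struct = struct

    def __getitem__(self, key):
        if isinstance(key, str):          # component access, e.g. "x"
            return self.struct[key]
        # slice access: forward the slice to every array
        return ArrayOfStructs({k: v[key] for k, v in self.struct.items()})

    def __getattr__(self, name):
        # forward method calls, e.g. astype(...), to every array
        def forward(*args, **kwargs):
            return ArrayOfStructs({k: getattr(v, name)(*args, **kwargs)
                                   for k, v in self.struct.items()})
        return forward

fields = ArrayOfStructs({"x": np.arange(4), "y": np.arange(4) * 10})
print(fields[::2]["x"])                  # -> [0 2]
```

Because slicing and component selection both return forwarding objects, the order of `[::2]` and `["x"]` is interchangeable, which is exactly the property the example above demonstrates.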
pyDive.arrayOfStructs.arrayOfStructs(structOfArrays)
    Convert a structure-of-arrays into a virtual array-of-structures.

    Parameters: structOfArrays – tree-like dictionary of arrays.
    Raises:
        - AssertionError – if the array types do not match. Datatypes may differ.
        - AssertionError – if the shapes do not match.
    Returns: custom object representing a virtual array whose elements have the same tree-like structure as structOfArrays.
pyDive.algorithm module
pyDive.algorithm.map(f, *arrays, **kwargs)
    Applies f, on the engines, to the local arrays associated with arrays. Example:

    cluster_array = pyDive.ones(shape=[100], distaxes=0)

    cluster_array *= 2.0
    # equivalent to (a is the local numpy-array of *cluster_array*; an in-place
    # statement like "a *= 2.0" is not allowed inside a lambda)
    pyDive.map(lambda a: a.__imul__(2.0), cluster_array)

Or, as a decorator:

    @pyDive.map
    def twice(a):
        a *= 2.0

    twice(cluster_array)
    Parameters:
        - f (callable) – function to be called on the engines. Has to accept numpy-arrays and kwargs.
        - arrays – list of arrays, including pyDive.ndarrays, pyDive.h5_ndarrays or pyDive.cloned_ndarrays
        - kwargs – user-specified keyword arguments passed to f
    Raises:
        - AssertionError – if the shapes of pyDive.ndarrays and pyDive.h5_ndarrays do not match
        - AssertionError – if the distaxes attributes of pyDive.ndarrays and pyDive.h5_ndarrays do not match
    Notes:
        - If the hdf5 data exceeds the memory limit (currently 25% of the combined main memory of all cluster nodes), the data is read block-wise so that each block fits into memory.
        - map chooses the list of engines from the first element of arrays; f is called on these engines. If the first array is a pyDive.h5_ndarray, all engines are used.
        - map does not write data back to a pyDive.h5_ndarray yet.
        - map does not equalize the element distribution of pyDive.ndarrays before execution.
pyDive.algorithm.mapReduce(map_func, reduce_op, *arrays, **kwargs)
    Applies map_func, on the engines, to the local arrays associated with arrays and reduces the result in a tree-like fashion over all axes. Example:

    cluster_array = pyDive.ones(shape=[100], distaxes=0)
    # a is the local numpy-array of *cluster_array*
    s = pyDive.mapReduce(lambda a: a**2, np.add, cluster_array)
    assert s == 100
    Parameters:
        - map_func (callable) – function to be called on the engines. Has to accept numpy-arrays and kwargs.
        - reduce_op (numpy-ufunc) – reduce operation, e.g. numpy.add.
        - arrays – list of arrays, including pyDive.ndarrays, pyDive.h5_ndarrays or pyDive.cloned_ndarrays
        - kwargs – user-specified keyword arguments passed to map_func
    Raises:
        - AssertionError – if the shapes of pyDive.ndarrays and pyDive.h5_ndarrays do not match
        - AssertionError – if the distaxes attributes of pyDive.ndarrays and pyDive.h5_ndarrays do not match
    Notes:
        - If the hdf5 data exceeds the memory limit (currently 25% of the combined main memory of all cluster nodes), the data is read block-wise so that each block fits into memory.
        - mapReduce chooses the list of engines from the first element of arrays; the mapReduce is executed on these engines. If the first array is a pyDive.h5_ndarray, all engines are used.
        - mapReduce does not write data back to a pyDive.h5_ndarray yet.
        - mapReduce does not equalize the element distribution of pyDive.ndarrays before execution.
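The map-then-reduce semantics can be sketched in plain numpy, where each chunk stands in for one engine's local array (`map_reduce_sketch` and the chunk list are hypothetical; this is an illustration of the semantics, not pyDive's distributed implementation):

```python
import numpy as np

def map_reduce_sketch(map_func, reduce_op, chunks):
    # apply map_func to each "local" chunk and reduce it locally,
    # then combine the per-chunk partial results
    partials = [reduce_op.reduce(map_func(c), axis=None) for c in chunks]
    return reduce_op.reduce(np.array(partials))

chunks = [np.ones(25) for _ in range(4)]   # stand-in for 4 engines' local arrays
s = map_reduce_sketch(lambda a: a**2, np.add, chunks)
print(s)  # -> 100.0
```

This mirrors the documented example above: squaring 100 ones and summing yields 100, regardless of how the elements are split into chunks.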
pyDive.algorithm.reduce(array, op)
    Performs a tree-like reduction over all axes of array.

    Parameters:
        - array – pyDive.ndarray, pyDive.h5_ndarray or pyDive.cloned_ndarray to be reduced
        - op (numpy-ufunc) – reduce operation, e.g. numpy.add.

    If the hdf5 data exceeds the memory limit (currently 25% of the combined main memory of all cluster nodes), the data is read block-wise so that each block fits into memory.
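The "tree-like fashion" can be illustrated as pairwise merging of per-engine partial results (a sketch under the assumption that each engine has already reduced its local elements; `tree_reduce` is a hypothetical helper, not pyDive's implementation):

```python
import numpy as np

def tree_reduce(op, partials):
    # combine neighbouring partial results pairwise until one value remains;
    # this takes only log2(n) combination rounds instead of n-1 sequential steps
    while len(partials) > 1:
        merged = [op(partials[i], partials[i + 1])
                  for i in range(0, len(partials) - 1, 2)]
        if len(partials) % 2:          # odd element carries over to the next round
            merged.append(partials[-1])
        partials = merged
    return partials[0]

# stand-in for 8 engines, each holding the sum of its 25 local elements
partials = [np.add.reduce(np.ones(25)) for _ in range(8)]
print(tree_reduce(np.add, partials))  # -> 200.0
```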
pyDive.fragment module
pyDive.fragment.fragment(*arrays, **kwargs)
    Create fragments of arrays so that each fragment fits into the combined main memory of all engines when calling load(). The fragmentation is done by array slicing along the longest axis of arrays[0]. The edge size of the fragments is a power of two, except for the last fragment.

    Parameters:
        - arrays – distributed arrays (e.g. pyDive.ndarray, pyDive.h5_ndarray, ...)
        - kwargs – optional keyword arguments: memory_limit and offset.
        - memory_limit (float) – fraction of the combined main memory of all engines reserved for fragmentation. Defaults to 0.25.
        - offset (bool) – if True, the returned tuple is extended by the fragments' offset (along the distributed axis). Defaults to False.
    Raises:
        - AssertionError – if not all arrays have the same shape.
        - AssertionError – if not all arrays are distributed along the same axis.
    Returns: generator object (list) of tuples. Each tuple consists of one fragment for each array in arrays.

    Note that arrays may contain an arbitrary number of distributed arrays of any type. While the fragments' size is calculated solely from the memory consumption of arrays that store their elements on hard disk (see hdd_arraytypes), the fragmentation itself is applied to all arrays in the same way.

    Example:

    big_h5_array = pyDive.h5.open("monster.h5", "/")
    # big_h5_array.load() # crash
    for h5_array, offset in pyDive.fragment(big_h5_array, offset=True):
        a = h5_array.load() # no crash
        print "This fragment's offset is", offset, "on axis:", a.distaxis
pyDive.fragment.hdd_arraytypes = (<class 'pyDive.distribution.multiple_axes.h5_ndarray'>, None)
    List of array types that store their elements on hard disk.
pyDive.mappings module
If numba is installed, the particle shape functions are compiled, which gives an appreciable speedup.
pyDive.mappings.mesh2particles(mesh, particles_pos, shape_function=pyDive.mappings.CIC)
    Map mesh values to particles according to a particle shape function.

    Parameters:
        - mesh (array-like) – n-dimensional array. The dimension of mesh has to be greater than or equal to the number of particle position components.
        - particles_pos ((N, d)) – 'd'-dimensional tuples for 'N' particle positions. The positions can be float32 or float64 and must lie within the shape of mesh.
        - shape_function (callable, optional) – callable object returning the particle assignment value for a given parameter 'x'. Has to provide a 'support' float attribute which defines the width of the non-zero area. Defaults to cloud-in-cell.
    Returns: mapped mesh values for each particle.
    Notes:
        - The particle shape function is not evaluated outside the mesh.
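For intuition, the default cloud-in-cell (CIC) gather can be sketched in one dimension with plain numpy (`cic_mesh2particles` is a hypothetical helper for this sketch; the real shape functions are n-dimensional and may be numba-compiled):

```python
import numpy as np

def cic_mesh2particles(mesh, particles_pos):
    # 1-D cloud-in-cell gather: each particle receives a linear
    # interpolation of the two neighbouring mesh values
    mesh = np.asarray(mesh, dtype=float)
    i = np.floor(particles_pos).astype(int)   # index of the left mesh node
    w = particles_pos - i                     # fractional distance to that node
    return (1.0 - w) * mesh[i] + w * mesh[i + 1]

mesh = np.array([0.0, 10.0, 20.0, 30.0])
print(cic_mesh2particles(mesh, np.array([0.5, 2.25])))  # -> [ 5.  22.5]
```

A particle halfway between two nodes receives their average; one a quarter of the way along a cell is weighted 3:1 towards the nearer node.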
pyDive.mappings.particles2mesh(mesh, particles, particles_pos, shape_function=pyDive.mappings.CIC)
    Map particle values to the mesh according to a particle shape function. Particle values are added to the mesh.

    Parameters:
        - mesh (array-like) – n-dimensional array. The dimension of mesh has to be greater than or equal to the number of particle position components.
        - particles (array-like, 1-dimensional) – particle data. len(particles) has to equal len(particles_pos).
        - particles_pos ((N, d)) – 'd'-dimensional tuples for 'N' particle positions. The positions can be float32 or float64 and must lie within the shape of mesh.
        - shape_function (callable, optional) – callable object returning the particle assignment value for a given parameter 'x'. Has to provide a 'support' float attribute which defines the width of the non-zero area. Defaults to cloud-in-cell.
    Returns: mesh
    Notes:
        - The particle shape function is not evaluated outside the mesh.
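The inverse operation, a 1-D cloud-in-cell scatter, can be sketched the same way (`cic_particles2mesh` is a hypothetical helper for this sketch, not pyDive's n-dimensional implementation):

```python
import numpy as np

def cic_particles2mesh(mesh, particles, particles_pos):
    # 1-D cloud-in-cell scatter: each particle value is added to the two
    # neighbouring mesh nodes, weighted by proximity
    i = np.floor(particles_pos).astype(int)       # index of the left mesh node
    w = particles_pos - i                         # fractional distance to that node
    np.add.at(mesh, i, (1.0 - w) * particles)     # share for the left node
    np.add.at(mesh, i + 1, w * particles)         # share for the right node
    return mesh

mesh = np.zeros(4)
cic_particles2mesh(mesh, np.array([1.0, 1.0]), np.array([0.5, 1.5]))
print(mesh)  # -> [0.5 1.  0.5 0. ]
```

`np.add.at` is used instead of fancy-indexed `+=` so that several particles depositing onto the same node accumulate correctly.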
pyDive.picongpu module
This module holds convenience functions for those who use pyDive together with picongpu.
pyDive.picongpu.loadAllSteps(folder_path, data_path, distaxis=0)
    Python generator object looping over the hdf5 data of all timesteps found in folder_path.
    This generator doesn't read or write any data elements from hdf5 but returns dataset handles wrapped by pyDive.h5_ndarray objects.
    All datasets within data_path must have the same shape.

    Parameters:
        - folder_path (str) – path to the folder containing the hdf5 files
        - data_path (str) – relative path, starting from "/data/<timestep>/" within the hdf5 file, to the dataset or group of datasets
        - distaxis (int) – axis along which the datasets are distributed once loaded into memory.
    Returns: tuple of the timestep and a pyDive.h5_ndarray or a structure of pyDive.h5_ndarrays (pyDive.structured). Ordering is done by timestep.
    Notes:
        - If the dataset has a 'sim_unit' attribute, its value is stored in h5array.unit.
pyDive.picongpu.loadStep(step, folder_path, data_path, distaxis=0)
    Load hdf5 data from a single timestep found in folder_path.
    All datasets within data_path must have the same shape.

    Parameters:
        - step (int) – timestep
        - folder_path (str) – path to the folder containing the hdf5 files
        - data_path (str) – relative path, starting from "/data/<timestep>/" within the hdf5 file, to the dataset or group of datasets
        - distaxis (int) – axis along which the datasets are distributed once loaded into memory.
    Returns: pyDive.h5_ndarray or a structure of pyDive.h5_ndarrays (pyDive.structured).
    Notes:
        - If the dataset has a 'sim_unit' attribute, its value is stored in h5array.unit.
pyDive.picongpu.loadSteps(steps, folder_path, data_path, distaxis=0)
    Python generator object looping over the hdf5 data found in folder_path for the timesteps appearing in steps.
    This generator doesn't read or write any data elements from hdf5 but returns dataset handles wrapped by pyDive.h5_ndarray objects.
    All datasets within data_path must have the same shape.

    Parameters:
        - steps (list of int) – list of timesteps to loop over
        - folder_path (str) – path to the folder containing the hdf5 files
        - data_path (str) – relative path, starting from "/data/<timestep>/" within the hdf5 file, to the dataset or group of datasets
        - distaxis (int) – axis along which the datasets are distributed once loaded into memory.
    Returns: tuple of the timestep and a pyDive.h5_ndarray or a structure of pyDive.h5_ndarrays (pyDive.structured). Ordering is done by timestep.
    Notes:
        - If the dataset has a 'sim_unit' attribute, its value is stored in h5array.unit.
pyDive.pyDive module
Makes the most-used functions and modules directly accessible from pyDive.
Functions:
abs
absolute
add
arccos
arccosh
arcsin
arcsinh
arctan
arctan2
arctanh
bitwise_and
bitwise_not
bitwise_or
bitwise_xor
ceil
conj
conjugate
copysign
cos
cosh
deg2rad
degrees
divide
empty
empty_like
equal
exp
exp2
expm1
fabs
floor
floor_divide
fmax
fmin
fmod
frexp
greater
greater_equal
hypot
init
invert
isfinite
isinf
isnan
ldexp
left_shift
less
less_equal
log
log10
log1p
log2
logaddexp
logaddexp2
logical_and
logical_not
logical_or
logical_xor
maximum
minimum
mod
modf
multiply
ndarray
negative
nextafter
not_equal
ones
ones_like
power
rad2deg
radians
reciprocal
remainder
right_shift
rint
sign
signbit
sin
sinh
spacing
sqrt
square
structured
subtract
tan
tanh
true_divide
trunc
zeros
zeros_like
Modules:
IPParallelClient
arrays
cloned_ndarray