Note

All functions of these modules are also directly accessible from the pyDive module.

pyDive.arrayOfStructs module

The arrayOfStructs module addresses a common problem when dealing with structured data: while the user prefers an array-of-structures layout, the machine prefers a structure-of-arrays. In pyDive the method of choice is a virtual array-of-structures object. It holds array-like attributes such as shape and dtype and allows slicing, but internally it operates on a structure-of-arrays.

Example:

...
treeOfArrays = {"FieldE" :
                    {"x" : fielde_x,
                     "y" : fielde_y,
                     "z" : fielde_z},
                "FieldB" :
                    {"x" : fieldb_x,
                     "y" : fieldb_y,
                     "z" : fieldb_z}
                }

fields = pyDive.arrayOfStructs(treeOfArrays)

half = fields[::2]["FieldE/x"]
# equivalent to
half = fields["FieldE/x"][::2]
# equivalent to
half = fields["FieldE"]["x"][::2]
# equivalent to
half = fields["FieldE"][::2]["x"]

# equivalent to
half = fields.FieldE.x[::2]

The example shows that fields can in fact be treated as an array-of-structures or as a structure-of-arrays, depending on what is more appropriate.

The goal is to make the virtual array-of-structures object look like a real array. Therefore every method call or operation is forwarded to the individual arrays:

new_field = fields.FieldE.astype(np.int64) + fields.FieldB.astype(np.float64)

Here the forwarded method calls are astype and __add__.

pyDive.arrayOfStructs.arrayOfStructs(structOfArrays)

Convert a structure-of-arrays into a virtual array-of-structures.

Parameters:

structOfArrays – tree-like dictionary of arrays.

Raises:
  • AssertionError – if the types of the arrays do not match. Their datatypes may differ.
  • AssertionError – if the shapes do not match.
Returns:

Custom object representing a virtual array whose elements have the same tree-like structure as structOfArrays.

pyDive.algorithm module

pyDive.algorithm.map(f, *arrays, **kwargs)[source]

Applies f on the engines to the local arrays associated with arrays. Example:

cluster_array = pyDive.ones(shape=[100], distaxes=0)

cluster_array *= 2.0
# equivalent to
pyDive.map(lambda a: np.multiply(a, 2.0, out=a), cluster_array) # a is the local numpy-array of *cluster_array*

Or, as a decorator:

@pyDive.map
def twice(a):
    a *= 2.0

twice(cluster_array)
Parameters:
  • f (callable) – function to be called on the engines. Has to accept numpy-arrays and kwargs
  • arrays – list of arrays including pyDive.ndarrays, pyDive.h5_ndarrays or pyDive.cloned_ndarrays
  • kwargs – user-specified keyword arguments passed to f
Raises:
  • AssertionError – if the shapes of pyDive.ndarrays and pyDive.h5_ndarrays do not match
  • AssertionError – if the distaxes attributes of pyDive.ndarrays and pyDive.h5_ndarrays do not match
Notes:
  • If the hdf5 data exceeds the memory limit (currently 25% of the combined main memory of all cluster nodes) the data will be read block-wise so that a block fits into memory.
  • map chooses the list of engines from the first element of arrays. On these engines f is called. If the first array is a pyDive.h5_ndarray all engines will be used.
  • map does not yet write data back to a pyDive.h5_ndarray.
  • map does not equalize the element distribution of pyDive.ndarrays before execution.
pyDive.algorithm.mapReduce(map_func, reduce_op, *arrays, **kwargs)[source]

Applies map_func on the engines to the local arrays associated with arrays and reduces its result in a tree-like fashion over all axes. Example:

cluster_array = pyDive.ones(shape=[100], distaxes=0)

s = pyDive.mapReduce(lambda a: a**2, np.add, cluster_array) # a is the local numpy-array of *cluster_array*
assert s == 100
Parameters:
  • map_func (callable) – function to be called on the engines. Has to accept numpy-arrays and kwargs
  • reduce_op (numpy-ufunc) – reduce operation, e.g. numpy.add.
  • arrays – list of arrays including pyDive.ndarrays, pyDive.h5_ndarrays or pyDive.cloned_ndarrays
  • kwargs – user-specified keyword arguments passed to f
Raises:
  • AssertionError – if the shapes of pyDive.ndarrays and pyDive.h5_ndarrays do not match
  • AssertionError – if the distaxes attributes of pyDive.ndarrays and pyDive.h5_ndarrays do not match
Notes:
  • If the hdf5 data exceeds the memory limit (currently 25% of the combined main memory of all cluster nodes) the data will be read block-wise so that a block fits into memory.
  • mapReduce chooses the list of engines from the first element of arrays. On these engines the mapReduce will be executed. If the first array is a pyDive.h5_ndarray all engines will be used.
  • mapReduce does not yet write data back to a pyDive.h5_ndarray.
  • mapReduce does not equalize the element distribution of pyDive.ndarrays before execution.
pyDive.algorithm.reduce(array, op)[source]

Perform a tree-like reduction over all axes of array.

Parameters:
  • array – pyDive.ndarray, pyDive.h5_ndarray or pyDive.cloned_ndarray to be reduced
  • op (numpy-ufunc) – reduce operation, e.g. numpy.add.

If the hdf5 data exceeds the memory limit (currently 25% of the combined main memory of all cluster nodes) the data will be read block-wise so that a block fits into memory.
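
Example (a minimal sketch, reusing the distributed array from the map example above):

cluster_array = pyDive.ones(shape=[100], distaxes=0)

total = pyDive.reduce(cluster_array, np.add) # tree-like sum over all elements
assert total == 100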

pyDive.fragment module

pyDive.fragment.fragment(*arrays, **kwargs)[source]

Create fragments of arrays so that each fragment fits into the combined main memory of all engines when load() is called. The fragmentation is done by slicing the arrays along the longest axis of arrays[0]. The edge size of the fragments is a power of two, except for the last fragment.

Parameters:
  • arrays – distributed arrays (e.g. pyDive.ndarray, pyDive.h5_ndarray, ...)
  • kwargs – optional keyword arguments are: memory_limit and offset.
  • memory_limit (float) – fraction of the combined main memory of all engines reserved for fragmentation. Defaults to 0.25.
  • offset (bool) – If True the returned tuple is extended by the fragments’ offset (along the distributed axis). Defaults to False.
Raises:
  • AssertionError – If not all arrays have the same shape.
  • AssertionError – If not all arrays are distributed along the same axis.
Returns:

generator object (list) of tuples. Each tuple consists of one fragment for each array in arrays.

Note that arrays may contain an arbitrary number of distributed arrays of any type. While the fragments’ size is solely calculated based on the memory consumption of arrays that store their elements on hard disk (see hdd_arraytypes), the fragmentation itself is applied on all arrays in the same way.

Example:

big_h5_array = pyDive.h5.open("monster.h5", "/")
# big_h5_array.load() # crash

for h5_array, offset in pyDive.fragment(big_h5_array, offset=True):
    a = h5_array.load() # no crash
    print("This fragment's offset is", offset, "on axis:", a.distaxis)
pyDive.fragment.hdd_arraytypes = (<class 'pyDive.distribution.multiple_axes.h5_ndarray'>, None)

list of array types that store their elements on hard disk

pyDive.mappings module

If numba is installed, the particle shape functions are compiled, which gives an appreciable speedup.

class pyDive.mappings.CIC[source]

Cloud-in-Cell

class pyDive.mappings.NGP[source]

Nearest-Grid-Point

pyDive.mappings.mesh2particles(mesh, particles_pos, shape_function=<class pyDive.mappings.CIC>)[source]

Map mesh values to particles according to a particle shape function.

Parameters:
  • mesh (array-like) – n-dimensional array. The dimension of mesh has to be greater than or equal to the number of particle position components.
  • particles_pos ((N, d)) – ‘d’-dim tuples for ‘N’ particle positions. The positions can be float32 or float64 and must be within the shape of mesh.
  • shape_function (callable, optional) – Callable object returning the particle assignment value for a given parameter ‘x’. Has to provide a ‘support’ float attribute which defines the width of the non-zero area. Defaults to cloud-in-cell.
Returns:

Mapped mesh values for each particle.

Notes:
  • The particle shape function is not evaluated outside the mesh.
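
Example (a minimal sketch; the mesh values and particle positions are made up for illustration):

mesh = np.array([0.0, 1.0, 2.0, 3.0])        # 1-dimensional mesh
particles_pos = np.array([[0.5], [2.25]])    # (N, d) = (2, 1) positions within the mesh

# gather the mesh values at the particle positions using cloud-in-cell weighting
values = pyDive.mappings.mesh2particles(mesh, particles_pos, pyDive.mappings.CIC)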
pyDive.mappings.particles2mesh(mesh, particles, particles_pos, shape_function=<class pyDive.mappings.CIC>)[source]

Map particle values to mesh according to a particle shape function. Particle values are added to the mesh.

Parameters:
  • mesh (array-like) – n-dimensional array. The dimension of mesh has to be greater than or equal to the number of particle position components.
  • particles (array_like (1 dim)) – particle data. len(particles) has to be the same as len(particles_pos)
  • particles_pos ((N, d)) – ‘d’-dim tuples for ‘N’ particle positions. The positions can be float32 or float64 and must be within the shape of mesh.
  • shape_function (callable, optional) – Callable object returning the particle assignment value for a given parameter ‘x’. Has to provide a ‘support’ float attribute which defines the width of the non-zero area. Defaults to cloud-in-cell.
Returns:

mesh

Notes:
  • The particle shape function is not evaluated outside the mesh.
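
Example (a minimal sketch; positions and particle values are placeholders):

mesh = np.zeros((16, 16))                            # 2-dimensional target mesh
particles_pos = np.array([[3.2, 4.7], [8.5, 8.5]])   # (N, d) positions within the mesh
particles = np.ones(len(particles_pos))              # one value per particle

# deposit (add) the particle values onto the mesh with cloud-in-cell weighting
mesh = pyDive.mappings.particles2mesh(mesh, particles, particles_pos, pyDive.mappings.CIC)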

pyDive.picongpu module

This module holds convenient functions for those who use pyDive together with picongpu.

pyDive.picongpu.getSteps(folder_path)[source]

Returns a list of all timesteps in folder_path.
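
A minimal sketch (the folder path is a placeholder):

steps = pyDive.picongpu.getSteps("simOutput/h5")   # e.g. a list like [0, 100, 200, ...]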

pyDive.picongpu.loadAllSteps(folder_path, data_path, distaxis=0)[source]

Python generator object looping over the hdf5-data of all timesteps found in folder_path.

This generator doesn’t read or write any data elements from hdf5 but returns dataset-handles covered by pyDive.h5_ndarray objects.

All datasets within data_path must have the same shape.

Parameters:
  • folder_path (str) – Path to the folder containing the hdf5-files
  • data_path (str) – Relative path starting from “/data/<timestep>/” within hdf5-file to the dataset or group of datasets
  • distaxis (int) – axis along which the datasets are distributed once loaded into memory.
Returns:

tuple of timestep and a pyDive.h5_ndarray or a structure of pyDive.h5_ndarrays (pyDive.structured). Ordering is done by timestep.

Notes:
  • If the dataset has a ‘sim_unit’ attribute, its value is stored in h5array.unit.
pyDive.picongpu.loadStep(step, folder_path, data_path, distaxis=0)[source]

Load hdf5-data from a single timestep found in folder_path.

All datasets within data_path must have the same shape.

Parameters:
  • step (int) – timestep
  • folder_path (str) – Path to the folder containing the hdf5-files
  • data_path (str) – Relative path starting from “/data/<timestep>/” within hdf5-file to the dataset or group of datasets
  • distaxis (int) – axis along which the datasets are distributed once loaded into memory.
Returns:

pyDive.h5_ndarray or a structure of pyDive.h5_ndarrays (pyDive.structured).

Notes:
  • If the dataset has a ‘sim_unit’ attribute, its value is stored in h5array.unit.
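
A minimal sketch (the timestep, folder and dataset paths are placeholders):

fielde_x = pyDive.picongpu.loadStep(2000, "simOutput/h5", "fields/FieldE/x", distaxis=0)
a = fielde_x.load()   # read the dataset of this timestep into cluster memory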
pyDive.picongpu.loadSteps(steps, folder_path, data_path, distaxis=0)[source]

Python generator object looping over all hdf5-data found in folder_path for the timesteps listed in steps.

This generator doesn’t read or write any data elements from hdf5 but returns dataset-handles covered by pyDive.h5_ndarray objects.

All datasets within data_path must have the same shape.

Parameters:
  • steps (list of int) – list of timesteps to loop over
  • folder_path (str) – Path to the folder containing the hdf5-files
  • data_path (str) – Relative path starting from “/data/<timestep>/” within hdf5-file to the dataset or group of datasets
  • distaxis (int) – axis along which the datasets are distributed once loaded into memory.
Returns:

tuple of timestep and a pyDive.h5_ndarray or a structure of pyDive.h5_ndarrays (pyDive.structured). Ordering is done by timestep.

Notes:
  • If the dataset has a ‘sim_unit’ attribute, its value is stored in h5array.unit.
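
Example (a minimal sketch; the folder and dataset paths are placeholders):

import numpy as np
import pyDive
pyDive.init()

folder = "simOutput/h5"
steps = pyDive.picongpu.getSteps(folder)

# loop over every second timestep and reduce the field of each one
for step, h5_fielde_x in pyDive.picongpu.loadSteps(steps[::2], folder, "fields/FieldE/x"):
    fielde_x = h5_fielde_x.load()                 # read this timestep into cluster memory
    print(step, pyDive.reduce(fielde_x, np.add))  # e.g. sum of all field values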

pyDive.pyDive module

Make the most used functions and modules directly accessible from pyDive.

Functions:

abs, absolute, add, arccos, arccosh, arcsin, arcsinh, arctan, arctan2, arctanh, array, bitwise_and, bitwise_not, bitwise_or, bitwise_xor, ceil, conj, conjugate, copysign, cos, cosh, deg2rad, degrees, divide, empty, empty_like, equal, exp, exp2, expm1, fabs, floor, floor_divide, fmax, fmin, fmod, fragment, frexp, greater, greater_equal, hollow, hollow_like, hypot, init, invert, isfinite, isinf, isnan, ldexp, left_shift, less, less_equal, log, log10, log1p, log2, logaddexp, logaddexp2, logical_and, logical_not, logical_or, logical_xor, map, mapReduce, maximum, mesh2particles, minimum, mod, modf, multiply, ndarray, negative, nextafter, not_equal, ones, ones_like, particles2mesh, power, rad2deg, radians, reciprocal, reduce, remainder, right_shift, rint, sign, signbit, sin, sinh, spacing, sqrt, square, structured, subtract, tan, tanh, true_divide, trunc, zeros, zeros_like

Modules:

IPParallelClient, algorithm, arrays, cloned, cloned_ndarray, h5, mappings, picongpu