pyDive.arrays.ndarray module

Note

All of this module’s functions and classes are also directly accessable from the pyDive module.

pyDive.ndarray class

class pyDive.ndarray(shape, dtype=<type 'float'>, distaxes='all', target_offsets=None, target_ranks=None, no_allocation=False, **kwargs)

Represents a cluster-wide, multidimensional, homogeneous array of fixed-size elements. cluster-wide means that its elements are distributed across IPython.parallel-engines. The distribution is done in one or multiply dimensions along user-specified axes. The user can optionally specify which engine maps to which index range or leave the default that persuits an uniform distribution across all engines.

This ndarray - class is auto-generated out of its local counterpart: numpy.ndarray.

The implementation is based on IPython.parallel and local numpy.ndarray - arrays. Every special operation numpy.ndarray implements (“__add__”, “__le__”, ...) is also available for ndarray.

Note that array slicing is a cheap operation since no memory is copied. However this can easily lead to the situation where you end up with two arrays of the same size but of distinct element distribution. Therefore call dist_like() first before doing any manual stuff on their local arrays. However every cluster-wide array operation first equalizes the distribution of all involved arrays, so an explicit call to dist_like() is rather unlikely in most use cases.

If you try to access an attribute that is only available for the local array, the request is forwarded to an internal local copy of the whole distributed array (see: gather()). This internal copy is only created when you want to access it and is held until __setitem__ is called, i.e. the array’s content is manipulated.

__init__(shape, dtype=<type 'float'>, distaxes='all', target_offsets=None, target_ranks=None, no_allocation=False, **kwargs)

Creates an instance of ndarray. This is a low-level method of instantiating an array, it should rather be constructed using factory functions (“empty”, “zeros”, “open”, ...)

Parameters:
  • shape (ints) – shape of array
  • dtype – datatype of a single element
  • distaxes (ints) – distributed axes. Accepts a single integer too. Defaults to ‘all’ meaning each axis is distributed.
  • target_offsets (list of lists) – For each distributed axis there is a (inner) list in the outer list. The inner list contains the offsets of the local array.
  • target_ranks (ints) – linear list of engine ranks holding the local arrays. The last distributed axis is iterated over first.
  • no_allocation (bool) – if True no instance of numpy.ndarray will be created on engine. Useful for manual instantiation of the local array.
  • kwargs – additional keyword arguments are forwarded to the constructor of the local array.
copy()

Returns a hard copy of this array.

dist_like(other)

Redistributes a copy of this array (self) like other and returns the result. Checks whether redistribution is necessary and returns self if not.

Redistribution involves inter-engine communication.

Parameters:other (distributed array) – target array
Raises AssertionError:
 if the shapes of self and other don’t match.
Returns:new array with the same content as self but distributed like other. If self is already distributed like other nothing is done and self is returned.
gather()

Gathers local instances of numpy.ndarray from engines, concatenates them and returns the result.

Note

You may not call this method explicitly because if you try to access an attribute of the local array (numpy.ndarray), gather() is called implicitly before the request is forwarded to that internal gathered array. Just access attributes like you do for the local array. The internal copy is held until __setitem__ is called, e.g. a[1] = 3.0, setting a dirty flag to the local copy.

Warning

If another array overlapping this array is manipulating its data there is no chance to set the dirty flag so you have to keep in mind to call gather() explicitly in this case!

Returns:instance of numpy.ndarray

Factory functions

These are convenient functions to create a pyDive.ndarray instance.

pyDive.arrays.ndarray.array(array_like, distaxes='all')[source]

Create a pyDive.ndarray instance from an array-like object.

Parameters:
  • array_like – Any object exposing the array interface, e.g. numpy-array, python sequence, ...
  • distaxes (ints) – distributed axes. Defaults to ‘all’ meaning each axis is distributed.
pyDive.arrays.ndarray.empty(shape, dtype=<type 'float'>, distaxes='all', **kwargs)

Create a ndarray instance. This function calls its local counterpart numpy.empty on each engine.

Parameters:
  • shape (ints) – shape of array
  • dtype – datatype of a single element
  • distaxes (ints) – distributed axes
  • kwargs – keyword arguments are passed to the local function numpy.empty
pyDive.arrays.ndarray.empty_like(other, **kwargs)

Create a ndarray instance with the same shape, dtype and distribution as other. This function calls its local counterpart numpy.empty_like on each engine.

Parameters:
  • other – other array
  • kwargs – keyword arguments are passed to the local function numpy.empty_like
pyDive.arrays.ndarray.hollow(shape, dtype=<type 'float'>, distaxes='all')[source]

Create a pyDive.ndarray instance distributed across all engines without allocating a local numpy-array.

Parameters:
  • shape (ints) – shape of array
  • dtype – datatype of a single element
  • distaxes (ints) – distributed axes. Defaults to ‘all’ meaning each axis is distributed.
pyDive.arrays.ndarray.hollow_like(other)[source]

Create a pyDive.ndarray instance with the same shape, distribution and type as other without allocating a local numpy-array.

pyDive.arrays.ndarray.zeros(shape, dtype=<type 'float'>, distaxes='all', **kwargs)

Create a ndarray instance. This function calls its local counterpart numpy.zeros on each engine.

Parameters:
  • shape (ints) – shape of array
  • dtype – datatype of a single element
  • distaxes (ints) – distributed axes
  • kwargs – keyword arguments are passed to the local function numpy.zeros
pyDive.arrays.ndarray.zeros_like(other, **kwargs)

Create a ndarray instance with the same shape, dtype and distribution as other. This function calls its local counterpart numpy.zeros_like on each engine.

Parameters:
  • other – other array
  • kwargs – keyword arguments are passed to the local function numpy.zeros_like
pyDive.arrays.ndarray.ones(shape, dtype=<type 'float'>, distaxes='all', **kwargs)

Create a ndarray instance. This function calls its local counterpart numpy.ones on each engine.

Parameters:
  • shape (ints) – shape of array
  • dtype – datatype of a single element
  • distaxes (ints) – distributed axes
  • kwargs – keyword arguments are passed to the local function numpy.ones
pyDive.arrays.ndarray.ones_like(other, **kwargs)

Create a ndarray instance with the same shape, dtype and distribution as other. This function calls its local counterpart numpy.ones_like on each engine.

Parameters:
  • other – other array
  • kwargs – keyword arguments are passed to the local function numpy.ones_like

Universal functions

numpy knows the so called ufuncs (universal function). These are functions which can be applied elementwise on an array, like sin, cos, exp, sqrt, etc. All of these ufuncs from numpy are also available for pyDive.ndarray arrays, e.g.

a = pyDive.ones([100])
a = pyDive.sin(a)