digitize_dd
- mdtools.numpy_helper_functions.digitize_dd(sample, bins, right=False, expand_binnumbers=False, raise_if_less=False, raise_if_greater=False)
Return the indices of the multi-dimensional bins to which each value in the input array belongs.
This function extends numpy.digitize() to multi-dimensional data.

If values in sample are beyond the bounds of bins, 0 or len(bins) is returned as appropriate.
- Parameters:
sample (array_like) – Data to be digitized. Either a sequence of N arrays each of length D or a 2-dimensional array of shape (N, D). Each of the N arrays/rows is interpreted as a set of coordinates in D-dimensional space. If sample has fewer than 2 dimensions, np.digitize(sample, bins, right=right) is returned.
bins (array_like or sequence of array_likes) – Array of bin edges to use for each of the D dimensions. Must be either a sequence of D monotonic arrays (one for each dimension) or a single monotonic array that is used for all dimensions. If sample has fewer than 2 dimensions, bins must be a 1-dimensional, monotonic array.
right (bool, optional) – Whether the bin intervals include the right or the left bin edge. If True, the bin intervals include the right bin edge, i.e. the bin intervals are left-open and right-closed: (a, b] -> a < x <= b. If False, the bin intervals include the left bin edge, i.e. the bin intervals are left-closed and right-open: [a, b) -> a <= x < b. See numpy.digitize() for more details.
expand_binnumbers (bool, optional) – If True, the returned index array is unraveled into an array of shape (D, N) where each row gives the bin numbers of the elements of sample along the corresponding dimension. If False, the returned index array has shape (N,) and maps each element of sample to its corresponding linearized bin number (using row-major ordering). Note that the returned linearized bin indices index into an array containing two extra bins at the outer bin edges to capture values outside of the defined bin bounds. See also scipy.stats.binned_statistic_dd().
raise_if_less, raise_if_greater (bool, optional) – If True, raise a ValueError if a value of sample lies below/above the first/last bin. Otherwise, if values in sample lie below or above the bounds of bins, 0 or len(bins) is returned, respectively.
- Returns:
bin_ix (numpy.ndarray) – Array of indices. This array assigns to each element of sample an integer that represents the bin number to which this element belongs. The representation depends on the expand_binnumbers argument.
See also
numpy.digitize()
Return the indices of the 1-dimensional bins to which each value of the input array belongs
scipy.stats.binned_statistic_dd()
Compute a multi-dimensional binned statistic for a set of data
numpy.unravel_index()
Convert an array of flat indices into a tuple of coordinate arrays
Examples
>>> mdt.nph.digitize_dd(0.5, [0, 1])
1
>>> mdt.nph.digitize_dd([-1, 0, 1], [0, 1])
array([0, 1, 2])
>>> mdt.nph.digitize_dd([-1, 0, 1], [0, 1], right=True)
array([0, 0, 1])
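For input with fewer than 2 dimensions the function is documented to return the result of np.digitize, so the effect of the right argument in the example above can be checked with plain NumPy. The following is a minimal sketch that uses only numpy.digitize(); the variable names are chosen for illustration.

```python
import numpy as np

edges = np.array([0, 1])
values = np.array([-1, 0, 1])

# right=False: bins are left-closed, [a, b). A value equal to an edge
# is assigned to the bin that starts at that edge.
left_closed = np.digitize(values, edges, right=False)

# right=True: bins are right-closed, (a, b]. A value equal to an edge
# is assigned to the bin that ends at that edge.
right_closed = np.digitize(values, edges, right=True)

print(left_closed.tolist())   # [0, 1, 2]
print(right_closed.tolist())  # [0, 0, 1]
```

Index 0 marks values below the first edge and len(edges) (here 2) marks values at or beyond the last edge, depending on which side of the interval is closed.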
>>> a = np.array([[-1, 0.5],
...               [ 0, 1.5],
...               [ 1, 2.5]])
>>> bins_x = np.arange(-1, 2)
>>> bins_y = np.arange(1, 4)
>>> mdt.nph.digitize_dd(a, [bins_x, bins_y])
array([ 4,  9, 14])
>>> mdt.nph.digitize_dd(a, [bins_x, bins_y], expand_binnumbers=True)
array([[1, 2, 3],
       [0, 1, 2]])
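The relation between the linearized and the expanded bin numbers in the example above can be reproduced with plain NumPy. The sketch below is an illustration under the assumption that each dimension is digitized independently and that the per-dimension indices are then raveled in row-major order over a grid with len(edges) + 1 slots per dimension (the regular bins plus the two outer bins). It is not necessarily how mdtools implements the function.

```python
import numpy as np

a = np.array([[-1, 0.5],
              [0, 1.5],
              [1, 2.5]])
bins = [np.arange(-1, 2), np.arange(1, 4)]

# Digitize every column against its own bin edges -> shape (D, N).
per_dim = np.array(
    [np.digitize(a[:, d], edges) for d, edges in enumerate(bins)]
)

# Possible indices per dimension run from 0 to len(edges), i.e. the
# len(edges) - 1 regular bins plus one outer bin on each side.
grid_shape = tuple(len(edges) + 1 for edges in bins)

# Row-major linearization and its inverse.
linear = np.ravel_multi_index(tuple(per_dim), grid_shape)
recovered = np.array(np.unravel_index(linear, grid_shape))

print(per_dim.tolist())  # [[1, 2, 3], [0, 1, 2]]
print(linear.tolist())   # [4, 9, 14]
```

Here recovered equals per_dim, which mirrors the role of numpy.unravel_index() listed under See also.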
>>> from scipy.stats import binned_statistic_dd
>>> a = np.random.rand(4,3) * 5
>>> bins_x = np.arange(5)
>>> bins_y = np.arange(-1, 5.1, 2)
>>> bins_z = np.arange(0, 6.1, 2)
>>> bins = (bins_x, bins_y, bins_z)
>>> bin_ix = mdt.nph.digitize_dd(a, bins)
>>> ret = binned_statistic_dd(a, values=a.T, bins=bins)
>>> np.array_equal(bin_ix, ret.binnumber)
True
>>> bin_ix = mdt.nph.digitize_dd(a, bins, expand_binnumbers=True)
>>> ret = binned_statistic_dd(
...     a, values=a.T, bins=bins, expand_binnumbers=True
... )
>>> np.array_equal(bin_ix, ret.binnumber)
True
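The raise_if_less and raise_if_greater options described under Parameters can be pictured with a small 1-dimensional helper. The function digitize_checked below is hypothetical and written only for illustration. It assumes monotonically increasing bin edges and treats index 0 as "below the first bin" and index len(edges) as "above the last bin". The actual mdtools checks may differ in detail.

```python
import numpy as np


def digitize_checked(x, edges, right=False,
                     raise_if_less=False, raise_if_greater=False):
    """Hypothetical helper: np.digitize with optional bounds checks.

    Assumes `edges` is monotonically increasing.
    """
    ix = np.digitize(x, edges, right=right)
    # Index 0 means the value lies below the first bin.
    if raise_if_less and np.any(ix == 0):
        raise ValueError("At least one value lies below the first bin")
    # Index len(edges) means the value lies above the last bin.
    if raise_if_greater and np.any(ix == len(edges)):
        raise ValueError("At least one value lies above the last bin")
    return ix


edges = [0, 1]
print(digitize_checked([0.2, 0.8], edges).tolist())  # [1, 1]

try:
    digitize_checked([-1.0, 0.5], edges, raise_if_less=True)
except ValueError as err:
    print(err)  # At least one value lies below the first bin
```

Without the flags, the out-of-range value would simply be reported with index 0, as stated in the description of the function.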