I have a 3D-array consisting of several numbers within each band. Is there a function that returns the index positions where the array meets MULTIPLE conditions?
I tried the following:
index_pos = numpy.where( array[:,:,0]==10 and array[:,:,1]==15 and array[:,:,2]==30)
It returns the error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
3 Answers
You actually have a special case where it would be simpler and more efficient to do the following:
Create the data:
>>> arr
array([[[ 6, 9, 4], [ 5, 2, 1], [10, 15, 30]], [[ 9, 0, 1], [ 4, 6, 4], [ 8, 3, 9]], [[ 6, 7, 4], [ 0, 1, 6], [ 4, 0, 1]]])
The expected value:
>>> index_pos = np.where((arr[:,:,0]==10) & (arr[:,:,1]==15) & (arr[:,:,2]==30))
>>> index_pos
(array([0]), array([2]))
Use broadcasting to do this simultaneously:
>>> arr == np.array([10,15,30])
array([[[False, False, False], [False, False, False], [ True, True, True]], [[False, False, False], [False, False, False], [False, False, False]], [[False, False, False], [False, False, False], [False, False, False]]], dtype=bool)
>>> np.where( np.all(arr == np.array([10,15,30]), axis=-1) )
(array([0]), array([2]))
If the indices you want are not contiguous, you can do something like this:
ind_vals = np.array([0,2])
# values must hold one target value per entry of ind_vals
where_mask = (arr[:,:,ind_vals] == values)
Broadcast when you can.
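The snippet above leaves `values` undefined. A minimal sketch of the same idea on the sample data, assuming `values` holds one target per selected band (here 10 for band 0 and 30 for band 2, which is my choice for illustration, not part of the original answer):

```python
import numpy as np

# Same sample data as shown earlier in this answer.
arr = np.array([[[6, 9, 4], [5, 2, 1], [10, 15, 30]],
                [[9, 0, 1], [4, 6, 4], [8, 3, 9]],
                [[6, 7, 4], [0, 1, 6], [4, 0, 1]]])

ind_vals = np.array([0, 2])   # bands to test (not contiguous)
values = np.array([10, 30])   # assumed: one target value per selected band

# arr[:, :, ind_vals] has shape (3, 3, 2); values broadcasts along the last axis.
where_mask = (arr[:, :, ind_vals] == values)

# A position matches only if every selected band matches.
index_pos = np.where(np.all(where_mask, axis=-1))
print(index_pos)
```

On this data the only match is row 0, column 2, the same position found with all three bands.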
Spurred by @Jamie's comment, some interesting things to consider:
arr = np.random.randint(0,100,(5000,5000,3))
%timeit np.all(arr == np.array([10,15,30]), axis=-1)
1 loops, best of 3: 614 ms per loop
%timeit ((arr[:,:,0]==10) & (arr[:,:,1]==15) & (arr[:,:,2]==30))
1 loops, best of 3: 217 ms per loop
%timeit tmp = (arr == np.array([10,15,30])); (tmp[:,:,0] & tmp[:,:,1] & tmp[:,:,2])
1 loops, best of 3: 368 ms per loop
The question becomes: why is this?
First, examine:
%timeit (arr[:,:,0]==10)
10 loops, best of 3: 51.2 ms per loop
%timeit (arr == np.array([10,15,30]))
1 loops, best of 3: 300 ms per loop
One would expect arr == np.array([10,15,30]) to be, at worst, one third the speed of arr[:,:,0]==10 (roughly 154 ms here rather than 300 ms). Does anyone have an idea why this is not the case?
Then, when combining along the final axis, there are many ways to accomplish this:
tmp = (arr == np.array([10,15,30]))
method1 = np.all(tmp,axis=-1)
method2 = (tmp[:,:,0] & tmp[:,:,1] & tmp[:,:,2])
method3 = np.einsum('ij,ij,ij->ij',tmp[:,:,0] , tmp[:,:,1] , tmp[:,:,2])
np.allclose(method1,method2)
True
np.allclose(method1,method3)
True
%timeit np.all(tmp,axis=-1)
1 loops, best of 3: 318 ms per loop
%timeit (tmp[:,:,0] & tmp[:,:,1] & tmp[:,:,2])
10 loops, best of 3: 68.2 ms per loop
%timeit np.einsum('ij,ij,ij->ij',tmp[:,:,0] , tmp[:,:,1] , tmp[:,:,2])
10 loops, best of 3: 38 ms per loop
The einsum speedup is well documented elsewhere, but it seems odd to me that there is such a difference between np.all and consecutive &'s.
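For anyone who wants to reproduce the comparison outside IPython, here is a rough sketch using the standard `timeit` module. The array is smaller than the 5000x5000 one above so it runs quickly, the helper names (`with_all`, `with_and`, `with_einsum`) are mine, and the timings will differ by machine and NumPy version, so none are quoted here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
arr = rng.integers(0, 100, size=(500, 500, 3))  # smaller than the 5000x5000 case
tmp = (arr == np.array([10, 15, 30]))

def with_all():
    return np.all(tmp, axis=-1)

def with_and():
    return tmp[:, :, 0] & tmp[:, :, 1] & tmp[:, :, 2]

def with_einsum():
    return np.einsum('ij,ij,ij->ij', tmp[:, :, 0], tmp[:, :, 1], tmp[:, :, 2])

# All three should produce the same boolean mask.
assert np.array_equal(with_all(), with_and())
assert np.array_equal(with_all(), with_einsum())

for fn in (with_all, with_and, with_einsum):
    # Best of 3 repeats, 10 calls each, reported per call.
    best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.3f} ms per call")
```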
The and operator won't work in this case.
index_pos = numpy.where(array[:,:,0]==10 and array[:,:,1]==15 and array[:,:,2]==30)
Give this a try:
index_pos = numpy.where((array[:,:,0]==10) & (array[:,:,1]==15) & (array[:,:,2]==30))
The problem is the use of the native Python and keyword, which doesn't behave the way you'd like on arrays.
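A small sketch of what goes wrong, using two made-up 1D arrays (`a` and `b` are illustrative, not from the question): `and` needs a single True/False for its left operand, so Python asks the whole array for its truth value, and NumPy raises for arrays with more than one element.

```python
import numpy as np

a = np.array([10, 3, 10])
b = np.array([15, 15, 7])

try:
    # `and` calls bool() on (a == 10), which is ambiguous for 3 elements.
    result = (a == 10) and (b == 15)
except ValueError as exc:
    result = None
    print("ValueError:", exc)

# The element-wise form returns one boolean per position instead.
elementwise = np.logical_and(a == 10, b == 15)
print(elementwise)  # [ True False False]
```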
Instead, try using the numpy.logical_and function.
cond1 = np.logical_and(array[:,:,0]==10, array[:,:,1]==15)
cond2 = np.logical_and(cond1, array[:,:,2]==30)
index_pos = np.where(cond2)
You might even create your own version of logical_and that accepts an arbitrary number of conditions:
from functools import reduce  # reduce is no longer a builtin in Python 3
def my_logical_and(*args):
    return reduce(np.logical_and, args)
condition_locs_and_vals = [(0, 10), (1, 15), (2, 30)]
conditions = [array[:,:,x] == y for x,y in condition_locs_and_vals]
my_logical_and(*conditions)
Using bitwise-and (&) works, but only by coincidence. Bitwise-and is meant for comparing bits or bool types; using it to combine the truth values of numeric arrays is not robust (for instance, if you later need to index on locations where an entry merely evaluates to True, without first converting to a bool array). logical_and really should be used instead of & (even if it comes with a speed penalty).
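To make the difference concrete, here is a small sketch with two made-up integer arrays (`x` and `y` are illustrative only). On arrays that are not already boolean, `&` works on the integer bits, while `np.logical_and` first treats each entry as True (non-zero) or False (zero):

```python
import numpy as np

x = np.array([1, 2, 0])
y = np.array([2, 2, 5])

# Bitwise: 1 & 2 == 0, 2 & 2 == 2, 0 & 5 == 0.
bitwise = x & y
# Logical: non-zero counts as True, zero as False.
logical = np.logical_and(x, y)

print(bitwise)  # [0 2 0]
print(logical)  # [ True  True False]
```

The first entry shows the mismatch: both 1 and 2 are truthy, yet their bitwise-and is 0. When both operands are already bool arrays, as with the parenthesized `==` comparisons above, the two give the same result. The parentheses matter because `&` binds more tightly than `==`.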
Also, chaining together arbitrary lists of conditions with & can be painful both to read and type. And for re-usability of the code, so that later programmers don't have to change around a bunch of the subordinate clauses to the & operator, it might be better to store the individual conditions separately, and then use a function like the one above to combine them.
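As one possible variation on that idea, NumPy ufuncs expose a `reduce` method, so `np.logical_and.reduce` can combine a list of same-shaped conditions without a hand-written helper. A minimal sketch, assuming the 3x3x3 sample array from the first answer (named `arr` here) and the same (band, value) pairs:

```python
import numpy as np

# Sample data borrowed from the first answer.
arr = np.array([[[6, 9, 4], [5, 2, 1], [10, 15, 30]],
                [[9, 0, 1], [4, 6, 4], [8, 3, 9]],
                [[6, 7, 4], [0, 1, 6], [4, 0, 1]]])

# (band index, required value) pairs, stored as data so they are easy to change.
condition_locs_and_vals = [(0, 10), (1, 15), (2, 30)]
conditions = [arr[:, :, band] == value
              for band, value in condition_locs_and_vals]

# The list is stacked into shape (3, 3, 3) and reduced along axis 0.
mask = np.logical_and.reduce(conditions)
index_pos = np.where(mask)
print(index_pos)
```

Adding or removing a condition then only means editing the list of pairs.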