numpy get index where value is true

>>> import numpy as np
>>> ex=np.arange(30)
>>> e=np.reshape(ex,[3,10])
>>> e
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]])
>>> e>15
array([[False, False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True,  True,  True,  True]], dtype=bool)

I need to find the rows of e that contain at least one value greater than 15, that is, the rows where the mask above has a True. I could iterate with a for loop, but is there a more efficient way to do this in NumPy?

4 Answers

To get the row numbers where at least one item is larger than 15:

>>> np.where(np.any(e>15, axis=1))
(array([1, 2], dtype=int64),)
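
As a minimal sketch of the same idea, using the array e from the question: np.where returns a tuple with one array per dimension, so the row indices have to be unpacked with [0]. The names row_mask, row_idx and matching_rows are only illustrative.

```python
import numpy as np

e = np.arange(30).reshape(3, 10)

# One boolean per row: True if any element in that row exceeds 15.
row_mask = np.any(e > 15, axis=1)

# np.where returns a 1-tuple here; [0] unpacks the array of row indices.
row_idx = np.where(row_mask)[0]
print(row_idx)              # [1 2]

# The per-row mask can also select the matching rows directly.
matching_rows = e[row_mask]
print(matching_rows.shape)  # (2, 10)
```

If you only need the rows themselves rather than their numbers, indexing with the mask skips the index step entirely.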

You can use the nonzero function. It returns the indices of the nonzero elements of its input; for a boolean mask, those are the positions of the True values.

Easy Way

>>> (e > 15).nonzero()
(array([1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]), array([6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))

To see the indices more clearly, use np.transpose:

>>> np.transpose((e>15).nonzero())
[[1 6] [1 7] [1 8] [1 9] [2 0] ...

Not Bad Way

>>> np.nonzero(e > 15)
(array([1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]), array([6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))

Or, more cleanly:

>>> np.transpose(np.nonzero(e > 15))
[[1 6] [1 7] [1 8] [1 9] [2 0] ...
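
Since the question asks for rows, here is a rough sketch of how the nonzero output could be reduced to row numbers. It assumes the array e from the question; rows, cols, pairs, unique_rows and values are illustrative names.

```python
import numpy as np

e = np.arange(30).reshape(3, 10)

# nonzero returns one index array per dimension.
rows, cols = np.nonzero(e > 15)

# Pair the indices up as (row, col), one pair per matching element.
pairs = np.transpose((rows, cols))
print(pairs.shape)   # (14, 2)

# Each row appears once per matching element, so deduplicate.
unique_rows = np.unique(rows)
print(unique_rows)   # [1 2]

# The two index arrays can be used together to pull out the values.
values = e[rows, cols]
```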

A simple and clean way: use np.argwhere to group the indices by element, rather than dimension as in np.nonzero(a) (i.e., np.argwhere returns a row for each non-zero element).

>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.argwhere(a>4)
array([[5],
       [6],
       [7],
       [8],
       [9]])

np.argwhere(a) is almost the same as np.transpose(np.nonzero(a)), but it produces a result of the correct shape for a 0-d array.

Note: The output of np.argwhere is not suitable for indexing, so a[np.argwhere(a>4)] is not a reliable way to get the corresponding values in a. The NumPy documentation recommends a[(a>4).astype(bool)] or a[(a>4) != 0] over a[np.nonzero(a>4)], because they handle 0-d arrays correctly. See the documentation for more details. Since a>4 is already a boolean array, both expressions simplify to a[a>4], as can be seen in the following example.

Another example:

>>> a = np.array([5,-15,-8,-5,10])
>>> a
array([ 5, -15, -8, -5, 10])
>>> a > 4
array([ True, False, False, False, True])
>>> a[a > 4]
array([ 5, 10])
>>> a = np.add.outer(a,a)
>>> a
array([[ 10, -10,  -3,   0,  15],
       [-10, -30, -23, -20,  -5],
       [ -3, -23, -16, -13,   2],
       [  0, -20, -13, -10,   5],
       [ 15,  -5,   2,   5,  20]])
>>> a = np.argwhere(a>4)
>>> a
array([[0, 0],
       [0, 4],
       [3, 4],
       [4, 0],
       [4, 3],
       [4, 4]])
>>> for i,j in a: print(i,j)
...
0 0
0 4
3 4
4 0
4 3
4 4
>>>
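
To illustrate the note about indexing, here is a small sketch on a hypothetical 3x3 array (not from the question). It compares boolean-mask indexing, nonzero indexing, and what happens if the argwhere result is used as an index directly.

```python
import numpy as np

# Hypothetical small array, chosen only to keep the shapes easy to read.
a = np.arange(9).reshape(3, 3)
mask = a > 4

# Boolean-mask indexing and nonzero indexing select the same values.
by_mask = a[mask]
by_nonzero = a[np.nonzero(mask)]

# argwhere gives one (row, col) pair per match: shape (4, 2).
idx = np.argwhere(mask)

# Used directly, every entry of idx is treated as a row selector,
# so the result is whole rows with shape (4, 2, 3), not 4 values.
wrong = a[idx]

# Splitting the pairs into columns recovers the intended values.
by_argwhere = a[idx[:, 0], idx[:, 1]]
print(by_mask, wrong.shape)
```

On larger arrays the direct form can also raise an IndexError when a column index exceeds the number of rows, which is another reason to prefer a[mask].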

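I prefer np.flatnonzero(arr) to the nonzero() option when you only need the row indices. arr.nonzero() works, but it returns a tuple of arrays instead of a single array. Note that flatnonzero() is equivalent to np.nonzero(np.ravel(arr))[0], so it returns indices into the flattened array; to get row indices, apply it to a 1-D per-row mask such as np.any(e>15, axis=1).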
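
A short sketch of the difference, assuming the array e from the question (flat_idx, row_idx, rows and cols are illustrative names):

```python
import numpy as np

e = np.arange(30).reshape(3, 10)

# On the 2-D mask, flatnonzero gives positions in the flattened array.
flat_idx = np.flatnonzero(e > 15)

# On a per-row mask, it gives the row indices as a plain array.
row_idx = np.flatnonzero((e > 15).any(axis=1))
print(row_idx)  # [1 2]

# Flat positions can be converted back to (row, col) if needed.
rows, cols = np.unravel_index(flat_idx, e.shape)
```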

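As mentioned in the comments, the NumPy docs discourage calling np.where() with only a condition: in that form it is shorthand for np.asarray(condition).nonzero(), and the docs recommend using nonzero directly.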
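
A quick check of that equivalence, again assuming the array e from the question (cond, from_where, from_nonzero and same are illustrative names):

```python
import numpy as np

e = np.arange(30).reshape(3, 10)
cond = e > 15

# Single-argument np.where and nonzero both return a tuple of index arrays.
from_where = np.where(cond)
from_nonzero = cond.nonzero()

# Compare the tuples dimension by dimension.
same = all(np.array_equal(w, n) for w, n in zip(from_where, from_nonzero))
print(same)  # True
```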
