Replace null values ("NaN") with zeros in an array

-1

What is the most efficient way to replace NaN values with zeros in a large numeric matrix in Python using NumPy?

    
asked by Victor Villacorta 11.11.2017 at 16:51

1 answer

1

Use the Boolean mask returned by numpy.isnan to select the NaN entries, then assign the value you want to them:

>>> import numpy as np

>>> a = np.array([[1, np.nan, 2],
                  [3, 4, np.nan],
                  [np.nan, 9, 8]])

>>> a
array([[  1.,  nan,   2.],
       [  3.,   4.,  nan],
       [ nan,   9.,   8.]])

>>> a[np.isnan(a)] = 0      # assign 0 wherever the mask is True (modifies a in place)
>>> a
array([[ 1.,  0.,  2.],
       [ 3.,  4.,  0.],
       [ 0.,  9.,  8.]])
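
As a possible alternative to the mask assignment above, here is a minimal sketch using np.where and np.nan_to_num. The name `cleaned` is only illustrative, and the `nan=` keyword assumes NumPy 1.17 or later. Which variant is fastest on a large matrix depends on the data, so it is worth timing them on your own array.

```python
import numpy as np

a = np.array([[1, np.nan, 2],
              [3, 4, np.nan],
              [np.nan, 9, 8]])

# np.where builds a new array and leaves `a` untouched.
cleaned = np.where(np.isnan(a), 0.0, a)

# np.nan_to_num with copy=False overwrites `a` in place, avoiding a second
# large allocation. Note that it also replaces +inf/-inf with very large
# finite numbers, which the plain isnan mask does not do.
np.nan_to_num(a, copy=False, nan=0.0)
```

If the original array must be preserved, the np.where form is probably the safer choice; if memory is tight, the in-place forms avoid the extra copy.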

If your data comes from a list or another structure in which the NaN values are actually text strings ("NaN", "Nan", "nan", etc.), pass the dtype argument when building the array so that the strings are converted to floating-point NaN:

>>> import numpy as np

>>> l = [[1, "Nan", 2],
         [3, 4, "Nan"],
         ["Nan", 9, 8]]
>>> a = np.array(l, dtype=float)   # np.float was removed in NumPy 1.24; the builtin float is equivalent
>>> a[np.isnan(a)] = 0
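
The two steps above could be wrapped in a small function. This is only a sketch: `to_float_array` and its `fill` parameter are hypothetical names, not part of NumPy, and it assumes every string in the input is something float() can parse (such as "Nan"). Any other text, for example "N/A", would raise a ValueError during conversion.

```python
import numpy as np

def to_float_array(rows, fill=0.0):
    """Hypothetical helper: build a float array and replace NaN with `fill`."""
    # Strings such as "Nan" or "nan" are parsed into floating-point NaN here.
    arr = np.array(rows, dtype=float)
    # Overwrite every NaN entry using the Boolean mask from np.isnan.
    arr[np.isnan(arr)] = fill
    return arr

l = [[1, "Nan", 2],
     [3, 4, "Nan"],
     ["Nan", 9, 8]]
result = to_float_array(l)
```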
    
answered by 11.11.2017 / 17:33