It seems data is a pandas DataFrame, so x, y, and z are columns (pandas.core.series.Series). I guess that in the end you want to get arrays without the NaN values.
Keep in mind that:
DEPT_inv = sp.sum(sp.isnan(x))
This gives you an integer: the number of elements that are NaN in the x column. You then try to apply the mask to DEPT_inv, which is not possible because it is an integer, not an array (hence "Invalid index to scalar value").
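A minimal sketch of the difference, using a small made-up array in place of the x column (the values and the names mask, n_nan, and filtered are only for illustration):

```python
import numpy as np

# Hypothetical sample standing in for the x column.
x = np.array([1.0, np.nan, 3.0, np.nan])

mask = np.isnan(x)      # boolean array, one entry per element of x
n_nan = np.sum(mask)    # a single integer: how many entries are NaN

try:
    n_nan[~mask]        # indexing the integer with the mask fails
    error = None
except IndexError as exc:
    error = exc         # this is the "invalid index to scalar" error

filtered = x[~mask]     # indexing the array itself works
```

The mask has to be applied to x itself, not to the count computed from it.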
To get three arrays with only the non-NaN values you can simply use the dropna() method of each Series.
import numpy as np
# np.array replaces the deprecated scipy.array alias
DEPT_filtered = np.array(data['DEPT'].dropna())
NPHI_filtered = np.array(data['NPHI'].dropna())
RT_filtered = np.array(data['RT'].dropna())
With this you get three arrays with only the non-NaN values of each column. Each column is filtered independently, so the three arrays can end up with different lengths.
If you want to use a mask anyway, convert each Series to a NumPy array first and apply the mask to that array:
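import numpy as np
# np.array and np.isnan replace the deprecated scipy aliases
x = np.array(data['DEPT'])
y = np.array(data['NPHI'])
z = np.array(data['RT'])
# ~np.isnan(...) is True where the value is not NaN
DEPT_filtered = x[~np.isnan(x)]
NPHI_filtered = y[~np.isnan(y)]
RT_filtered = z[~np.isnan(z)]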
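As a side note, a boolean mask can also be applied to the Series itself, without converting first. A small sketch, assuming a made-up DEPT column (the sample values and the names dept, dept_filtered, and dept_array are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
data = pd.DataFrame({"DEPT": [100.0, np.nan, 101.0]})

dept = data["DEPT"]
dept_filtered = dept[dept.notna()]      # still a Series, original index labels kept
dept_array = dept_filtered.to_numpy()   # plain NumPy array, if that is what you need
```

The filtered Series keeps its original index labels (0 and 2 here), which helps if you later need to know which rows survived.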
You do not clearly specify what final result you want. Keep in mind that if you want to eliminate the rows of data in which all the values are null, you can apply dropna on data directly:
data.dropna(axis=0, how='all', inplace=True)
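To see what how='all' does compared with how='any', here is a small sketch on made-up data (the values are invented; only the column names come from the question). It returns new DataFrames instead of using inplace=True so that both results can be compared:

```python
import numpy as np
import pandas as pd

# Hypothetical sample with NaN in different positions.
data = pd.DataFrame({
    "DEPT": [100.0, 100.5, np.nan, 101.5],
    "NPHI": [0.25, np.nan, np.nan, 0.30],
    "RT":   [12.0, 15.0, np.nan, np.nan],
})

# how='all' drops a row only when every column in it is NaN (row 2 here).
all_dropped = data.dropna(axis=0, how="all")

# how='any' drops a row when at least one column is NaN (only row 0 survives).
any_dropped = data.dropna(axis=0, how="any")
```

If you need the three columns to stay aligned row by row, how='any' is probably the one you want, since the remaining columns then all have the same length.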