Know if a row of a dataframe has a NaN value - Pandas

Question


I am trying to learn pandas and get comfortable with it, which is proving difficult. I have a dataframe similar to this one:

If I want to know whether there is any NaN, None or NaT value anywhere in the dataframe, I apply the following code:

t = df.isnull().any().any()
print(t)

If I want to know column by column, I apply this:

r = df.isnull().any()
print(r)

If I want to know cell by cell:

a = df.isnull()
print(a)

But what if I want to know which rows have at least one missing value? That is, show me which rows contain NaN, None or NaT.

I cannot get anything coherent; every attempt has either raised an error or led back to the previous cases.

EDIT

What if I want to know which rows have more than one missing value across their columns? For example, which rows (samples) have 2 or more missing values in total.

As Abulafia has answered, to know whether a row has a missing value I apply:

df.isnull().any(axis=1)

It occurred to me to try the following, but it does not work:

df[df.isnull().any(axis=1)>1]

python python-3.x pandas

asked by NEA 10.12.2018 в 19:52


1 answer


score 1 · Accepted Answer

As you indicate in the question, the result of:

df.isnull()

is a dataframe with as many rows and columns as df, but whose values are Boolean: True in the cells where there was a NaN or equivalent, and False in the others.

To this dataframe you can apply the .any() method, which by default examines it column by column. The result is a Series (one-dimensional) with as many elements as df has columns. Its index holds the column names, and its value is True for every column that contains at least one True.
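As a minimal sketch of the two steps just described: the dataframe from the question is not shown, so the sample data and the names mask, by_column and anywhere below are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the question's original dataframe is not available.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", "y", None],
    "c": [10, 20, 30],
})

mask = df.isnull()          # Boolean dataframe with the same shape as df
by_column = mask.any()      # Series indexed by column name (default axis=0)
anywhere = by_column.any()  # single Boolean for the whole dataframe

print(mask)
print(by_column)
print(anywhere)
```

Here by_column is True for "a" and "b" and False for "c", since only the first two columns contain a missing value.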

But what you did not know is that .any() can be passed a parameter to operate by rows instead of columns:

df.isnull().any(axis=1)

In this case the result is a Series (one-dimensional) with the same index as the original df and with Boolean values that are the result of applying any() to each row.
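The row-wise result can be used directly as a filter. The sketch below uses made-up sample data and illustrative variable names (assumptions, not taken from the question). It also shows one possible way to handle the follow-up in the question's edit: any(axis=1) only yields True or False, and True > 1 is always False, so the comparison in the question selects nothing. Counting with sum(axis=1) is an alternative offered here as a suggestion.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mixing NaN, None and NaT.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": ["x", "y", None, None],
    "c": pd.to_datetime(["2018-12-10", None, "2018-12-11", "2018-12-12"]),
})

# Boolean Series with the same index as df: True where the row has any missing value.
row_has_missing = df.isnull().any(axis=1)
rows_with_missing = df[row_has_missing]

# True counts as 1 when summed, so this gives the number of missing values per row.
missing_per_row = df.isnull().sum(axis=1)
rows_with_two_or_more = df[missing_per_row >= 2]

print(rows_with_missing)
print(missing_per_row)
print(rows_with_two_or_more)
```

With this data, rows 1, 2 and 3 have at least one missing value, and rows 1 and 3 have two each.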