Naive Bayes Python Probability error


I have two datasets, AdultData and AdultTest.

These datasets contain many rows of this form:

39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Female , 2174, 0, 40, United-States, >50K

I would like to calculate the probability that a "Female" earns more than 50K. For this I did the following:

    import numpy as np
    from sklearn.naive_bayes import BernoulliNB

    # Read AdultData.csv and adultTest.csv as integers so that naive Bayes can be computed
    data2 = np.genfromtxt('AdultData.csv', delimiter=',', dtype='int', skip_footer=1)
    datatest = np.genfromtxt('adultTest.csv', delimiter=',', dtype='int', skip_footer=1)

    # Delete the last column (index 14), because it is the target
    data_new = np.delete(data2, 14, 1)
    dataTest_new = np.delete(datatest, 14, 1)
    Class = [row[14] for row in data2]

    # Fit the classifier and print the predicted class probabilities
    clf = BernoulliNB().fit(data_new, Class)
    print(clf.predict_proba(dataTest_new))
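
One thing worth checking in the loading step is what `dtype='int'` does to the text columns. The sketch below is only an illustration: the two-row `sample` is hypothetical (shortened rows in the same style as the dataset), and it assumes a reasonably recent NumPy. As far as I know, `np.genfromtxt` replaces fields it cannot convert to an integer with its default fill value for integers, which is -1, so every text column would carry the same value in every row.

```python
import io

import numpy as np

# Hypothetical two-row sample, shortened from the row format in the dataset.
sample = io.StringIO(
    "39, State-gov, 77516, Bachelors, 13, >50K\n"
    "50, Private, 83311, HS-grad, 9, <=50K\n"
)

# With dtype='int', text fields cannot be parsed; genfromtxt substitutes
# its default integer fill value for them instead of raising an error.
as_int = np.genfromtxt(sample, delimiter=",", dtype="int")
print(as_int)

# With dtype=None, genfromtxt infers a type per column and returns a
# structured array, so the text fields are kept as strings.
sample.seek(0)
as_mixed = np.genfromtxt(
    sample, delimiter=",", dtype=None, encoding="utf-8", autostrip=True
)
print(as_mixed.dtype)
print(as_mixed["f1"])
```

If the real files behave the same way, the categorical columns (including the target) would carry no information after loading, which may be related to the constant prediction. Encoding those columns as numbers before fitting would be one way to test that.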

As the result of the probability prediction, it always gives me:

[1. 0.]

I do not understand why. Even if I pass the AdultTest data, which contains different rows, the same result comes out. Why do I not get different results? And what do the 2 columns mean?
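
On the meaning of the two columns: in scikit-learn, `predict_proba` returns one column per class, in the order given by `clf.classes_`. Here is a minimal sketch on made-up toy data; the arrays `X` and `y` are hypothetical and are not taken from the Adult dataset.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Hypothetical binary features and string labels, for illustration only.
X = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
y = np.array(["<=50K", ">50K", ">50K", "<=50K"])

clf = BernoulliNB().fit(X, y)

# classes_ lists the labels in the same order as the predict_proba columns.
print(clf.classes_)

# One row per sample, one column per class; each row sums to 1.
proba = clf.predict_proba(X)
print(proba)
```

Under this reading, a row such as `[1. 0.]` would mean the model assigns all of its probability to the first entry of `clf.classes_` and none to the second.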

Could someone help me?

Greetings and thanks in advance!

asked by TomaateTip 29.08.2018 at 14:40

0 answers