I want to drop the rows of a dataframe that contain NaN, but when I call item_info.dropna(axis = 0, how='all'), which comes from the pandas.pydata.org documentation, it does not work:
item_info.dropna(axis = 0, how='all')
I use this together with:
m2 = ranking_factorization_recommender.create(subcriber_eclipse,
target='count',
user_data = subcriber_eclipse,
item_data = sf_test)
The dropna call gives the following error:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-44-02025aac0088> in <module>()
----> 1 item_info.dropna(axis = 0, how='all')
      2
      3 #item_info.fillna(0, inplace=True)
      4
      5 #print item_info

TypeError: dropna() got an unexpected keyword argument 'axis'
The table comes from a SQL query:
item_info = graphlab.SFrame.from_sql(conn,"""--- usage matrix of hashtags by eclipse
SELECT COUNT (eclipse_hashtag.eclipse_id), eclipse_hashtag.hashtag_id,eclipse_hashtag.eclipse_id FROM eclipse_hashtag
GROUP BY eclipse_hashtag.hashtag_id, eclipse_hashtag.eclipse_id
ORDER BY eclipse_hashtag.hashtag_id,eclipse_hashtag.eclipse_id ASC;
""")
item_info.rename({'eclipse_id':'item_id'})
Here is the structure:
type(item_info)
graphlab.data_structures.sframe.SFrame
And here is the full traceback of the error:
[ERROR] graphlab.toolkits._main: Toolkit error: Missing value (None) encountered in column 'item_id.1'. Use the SFrame's dropna function to drop rows with 'None' values in them.'
---------------------------------------------------------------------------
ToolkitError                              Traceback (most recent call last)
<ipython-input-8-56b88cdef560> in <module>()
      9 m2 = ranking_factorization_recommender.create(subcriber_eclipse, target='count',
     10                                               user_data = subcriber_eclipse,
---> 11                                               item_data = item_info)
     12

/home/antoine/anaconda2/lib/python2.7/site-packages/graphlab/toolkits/recommender/ranking_factorization_recommender.pyc in create(observation_data, user_id, item_id, target, user_data, item_data, num_factors, regularization, linear_regularization, side_data_factorization, ranking_regularization, unobserved_rating_value, num_sampled_negative_examples, max_iterations, sgd_step_size, random_seed, binary_target, solver, verbose, **kwargs)
    267     opts.update(kwargs)
    268
--> 269     response = _graphlab.toolkits._main.run('recsys_train', opts, verbose)
    270
    271     return RankingFactorizationRecommender(response['model'])

/home/antoine/anaconda2/lib/python2.7/site-packages/graphlab/toolkits/_main.pyc in run(toolkit_name, options, verbose, show_progress)
     87         _get_metric_tracker().track(metric_name, value=1, properties=track_props, send_sys_info=False)
     88
---> 89         raise ToolkitError(str(message))

ToolkitError: Missing value (None) encountered in column 'item_id.1'. Use the SFrame's dropna function to drop rows with 'None' values in them.
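To make clear what I mean by dropping rows with missing values, here is a minimal plain-Python sketch of the behaviour I am after. It is not the SFrame or pandas API: the helper drop_missing_rows and the sample rows are made up for illustration, and the only assumption is that how='any' and how='all' mean what the pandas documentation says they mean.

```python
# Plain-Python sketch of row dropping; NOT the SFrame or pandas API.
# drop_missing_rows and the sample rows below are hypothetical.

def drop_missing_rows(rows, how="any"):
    """Return the rows that survive dropping under the given policy.

    how="any": drop a row if at least one of its values is None.
    how="all": drop a row only if every one of its values is None.
    """
    if how not in ("any", "all"):
        raise ValueError("how must be 'any' or 'all'")
    check = any if how == "any" else all
    return [row for row in rows
            if not check(value is None for value in row.values())]


rows = [
    {"item_id": 1, "hashtag_id": 10, "count": 3},           # complete row
    {"item_id": None, "hashtag_id": 11, "count": 1},        # one missing value
    {"item_id": None, "hashtag_id": None, "count": None},   # everything missing
]

kept_any = drop_missing_rows(rows, how="any")
kept_all = drop_missing_rows(rows, how="all")
print("%d %d" % (len(kept_any), len(kept_all)))
```

With these rows, how='any' keeps only the complete row, while how='all' also keeps the row that has a missing item_id, so how='all' alone would not remove the kind of missing value the ToolkitError complains about.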