Use Python 3.5 and Pandas 0.20.3
I get an error when I try to pivot a panda dataframe which I obtained using the concat function.
I detail the process.
This is my first dataframe.
df = pd.DataFrame([
['2017-01-03 21:00:00','2017-01-03 21:00:00','2017-01-04 21:00:00','2017-01-04 21:00:00',
'2017-01-05 21:00:00','2017-01-05 21:00:00'],
['RUT','RUT','RUT','RUT','RUT','RUT'],[65.00,26.00,-47.00,-8.00,32.00,10.00],
[59.09,20.21,-53.12,-13.96,26.45,4.25]]).T df.columns=[['Fecha_Hora','Ticker_Suby','Rtdo_Bruto_x_Estrat','Rtdo_Neto_x_Estrat']] df = df.sort_values(by=['Ticker_Suby','Fecha_Hora',], ascending=True)
So, what do I get:
Then I group and add rows by TickerSuby
and FechaHora
.
a = df.groupby(by=['Ticker_Suby','Fecha_Hora',]).sum() a.head(100)
Then I use the function concat
to join the dataframe 'a'
with the dataframe 'df'
and calculate the cumulative sums of the columns 'Rtdo_Bruto_x_Estrat'
and 'Rtdo_Neto_x_Estrat'
.
df = pd.concat([a.groupby(level=[1]).cumsum(),a.groupby(level=[0]).cumsum().add_suffix('_cum')], 1) df.reset_index(drop=False)
Well, my problem is that I try to pivot this dataframe df
using the column ' Fecha_Hora'
as index but I get the following error:
--------------------------------------------------------------------------- KeyError Traceback (most recent call last) C:\Users\Angel\Anaconda44\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance) 2392 uniques : index -> 2393 """ 2394 if self.is_unique and not dropna: pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5239)() pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5085)() pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20405)() pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20359)() KeyError: 'Fecha_Hora' During handling of the above exception, another exception occurred: KeyError Traceback (most recent call last) in () ----> 1 df1 = a.pivot(index="Fecha_Hora",columns ="Ticker_Suby",values = "Rtdo_Bruto_x_Estrat") 2 df1.head() 3 C:\Users\Angel\Anaconda44\lib\site-packages\pandas\core\frame.py in pivot(self, index, columns, values) 3949 Returns 3950 ------- -> 3951 unstacked : DataFrame or Series 3952 """ 3953 from pandas.core.reshape.reshape import unstack C:\Users\Angel\Anaconda44\lib\site-packages\pandas\core\reshape\reshape.py in pivot(self, index, columns, values) 373 index = self.index 374 else: --> 375 index = self[index] 376 indexed = Series(self[values].values, 377 index=MultiIndex.from_arrays([index, self[columns]])) C:\Users\Angel\Anaconda44\lib\site-packages\pandas\core\frame.py in __getitem__(self, key) 2060 ------- 2061 q : DataFrame -> 2062 2063 Notes 2064 ----- C:\Users\Angel\Anaconda44\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key) 2067 multidimensional key (e.g., a DataFrame) then the result will be passed 2068 to :meth:'DataFrame.__getitem__'. -> 2069 2070 This method uses the top-level :func:'pandas.eval' function to 2071 evaluate the passed query. C:\Users\Angel\Anaconda44\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item) 1532 1533 multicolumn : boolean, default True -> 1534 Use \multicolumn to enhance MultiIndex columns. 1535 The default will be read from the config module. 1536 C:\Users\Angel\Anaconda44\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath) 3588 3589 if not isnull(item): -> 3590 loc = self.items.get_loc(item) 3591 else: 3592 indexer = np.arange(len(self.items))[isnull(self.items)] C:\Users\Angel\Anaconda44\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance) 2393 """ 2394 if self.is_unique and not dropna: -> 2395 return self 2396 2397 values = self.values pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5239)() pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5085)() pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20405)() pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20359)() KeyError: 'Fecha_Hora' ----------------------
I have also tried to apply something simpler like:
a.pivot_table(df,index=["Fecha_Hora",'Rtdo_Bruto_x_Estrat_cum'])
But I get:
KeyError Traceback (most recent call last) in () 1 #df1 = a.pivot(index="Fecha_Hora",columns ="Ticker_Suby",values = "Rtdo_Bruto_x_Estrat") 2 #df1.head() ----> 3 a.pivot_table(df,index=["Fecha_Hora",'Rtdo_Bruto_x_Estrat_cum']) C:\Users\Angel\Anaconda44\lib\site-packages\pandas\core\reshape\pivot.py in pivot_table(data, values, index, columns, aggfunc, fill_value, margins, dropna, margins_name) 110 for i in values: 111 if i not in data: --> 112 raise KeyError(i) 113 114 to_filter = [] KeyError: 'Rtdo_Bruto_x_Estrat_cum' ------------------------------
I have searched the net for what could be wrong but I have not found anything.
Something I have noticed may be related is the dataframe df that I get when applying concat executed in the IDE Pycharm. As you can see in the image, the date field appears as "deactivated" or "non-operative".