Transform a dictionary of DataFrames into a single DataFrame


I have a dictionary whose values are pandas DataFrames. I want to build a single DataFrame that combines all the DataFrames in the dictionary. The catch is that I also need a new column in which each row holds the dictionary key of the DataFrame that row came from.

I think it will be better understood with an example:

We have the following dictionary, which we will call diccionario:

{0:     count  eclipse_id
    7      19       10453
    56     12       11305
    48     11       11305

 1:     count  eclipse_id
    13      9       11305
    9       5       11339}

The goal is to transform it into:

count   item_id    user_id
   19     10453          0
   12     11305          0
   11     11305          0
    9     11305          1
    5     11339          1

As you can see, the value of each row in the column user_id corresponds to the key that its respective DataFrame had in the original dictionary.

    
asked by ThePassenger 06.07.2017 at 11:50

1 answer


The simplest approach is to use pandas.concat with its keys parameter. keys takes an iterable with one item per DataFrame being concatenated, and those items become the outer level of a new hierarchical index. You then only have to move that index level into a regular column and rename it to "user_id":

import pandas as pd


# Create a dictionary of DataFrames to emulate your example
datos = {0: pd.DataFrame({'count': [19, 12, 11],
                          'eclipse_id': [10453, 11305, 11305]},
                         index=[7, 56, 48]),
         1: pd.DataFrame({'count': [9, 5],
                          'eclipse_id': [11305, 11339]},
                         index=[13, 9])
         }


# The dictionary "datos" now matches the one in your question.

# Concatenate, using the dictionary keys as the outer index level
df = pd.concat(datos.values(), keys=datos.keys())
# Move the outer index level into a column (named 'level_0' by default)
df.reset_index(level=0, inplace=True)
df.rename(columns={'level_0': 'user_id'}, inplace=True)

Output:

>>> df

    user_id  count  eclipse_id
7         0     19       10453
56        0     12       11305
48        0     11       11305
13        1      9       11305
9         1      5       11339

From here you can reorder the columns, sort the rows, or reset the index as needed.
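
As a rough sketch of those follow-up steps, the variation below produces the column order shown in the question. It assumes that pandas.concat is given the dictionary directly (its keys are then used as the outer index level) and that the names parameter is used to label that level, which avoids the separate rename. The sample values are copied from the question; the column is still called eclipse_id here, since renaming it to item_id would be a separate step.

```python
import pandas as pd

# Sample data copied from the question (illustrative values only).
datos = {
    0: pd.DataFrame({'count': [19, 12, 11],
                     'eclipse_id': [10453, 11305, 11305]},
                    index=[7, 56, 48]),
    1: pd.DataFrame({'count': [9, 5],
                     'eclipse_id': [11305, 11339]},
                    index=[13, 9]),
}

# Passing the dict itself makes concat use its keys as the outer index level;
# `names` labels that level 'user_id', so no rename is needed afterwards.
df = pd.concat(datos, names=['user_id'])

df = (
    df.reset_index(level='user_id')   # outer index level -> regular column
      .reset_index(drop=True)         # discard the original row labels
)

# Put user_id last, as in the layout requested in the question.
df = df[['count', 'eclipse_id', 'user_id']]

print(df)
```

Whether to drop the original row labels (7, 56, 48, ...) depends on whether they carry meaning; if they do, skip the second reset_index call.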

    
answered by 06.07.2017 / 15:23