Python Pandas error "ValueError: can not merge DataFrame with instance of type class 'list'"

I need to add data from several other DataFrames to one DataFrame using the merge function, keeping only the rows whose index has the same date in each of them. The statement below raises the error shown in the title.

df_total = pd.merge(cotiz_diaria, [df, df1, df2, df3, df4, df5, df6, df7, df8, df9, df10, df11, df12], left_index=True, right_index=True) 
df_total

The error seems to indicate that merge does not accept a list of DataFrames.

How else could this join be done, other than merging the DataFrames one by one?

    
asked by efueyo 08.08.2018 at 20:57

1 answer

Your diagnosis is correct: merge() does not accept a list of DataFrames; it combines exactly two objects at a time. You can, however, iterate over the list and merge the frames one at a time. For example:

import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.array([['a', 1, 2]]))
df2 = pd.DataFrame(np.array([['b', 3, 4]]))
df3 = pd.DataFrame(np.array([['c', 5, 6]]))

dfs = [df1, df2, df3]

We have created a list, dfs, that contains three DataFrames. Now we can do the merge:

dfs = iter(dfs)
df_final = next(dfs)
for df_ in dfs:
    df_final = df_final.merge(df_, left_index=True, right_index=True)

print(df_final)

  0_x 1_x 2_x 0_y 1_y 2_y  0  1  2
0   a   1   2   b   3   4  c  5  6

Details:

  • With dfs = iter(dfs) we convert the list into an iterator. We need the first element on its own and then the rest, and consuming an iterator gives us that without copying the list (as slicing with dfs[1:] would).

  • With df_final = next(dfs) we initialize the final DataFrame with the first object in the list.

  • Then we iterate over the remaining elements and, with df_final = df_final.merge(df_, left_index=True, right_index=True), merge each one into the result.
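The same loop can be tried on date-indexed data, which is closer to the question. This is only a minimal sketch: the frames prices, volume and rates and their column names are hypothetical, invented for illustration. It relies on merge() using how="inner" by default, so only dates present in every frame survive.

```python
import pandas as pd

# Hypothetical data: three frames indexed by date, overlapping only partially.
prices = pd.DataFrame(
    {"close": [10.0, 10.5, 11.0]},
    index=pd.to_datetime(["2018-08-06", "2018-08-07", "2018-08-08"]),
)
volume = pd.DataFrame(
    {"volume": [100, 120]},
    index=pd.to_datetime(["2018-08-07", "2018-08-08"]),
)
rates = pd.DataFrame(
    {"rate": [0.5, 0.6]},
    index=pd.to_datetime(["2018-08-08", "2018-08-09"]),
)

frames = iter([prices, volume, rates])
merged = next(frames)  # start from the first frame
for frame in frames:
    # Default how="inner": keep only index labels found on both sides.
    merged = merged.merge(frame, left_index=True, right_index=True)

print(merged)
```

Only 2018-08-08 appears in all three frames, so merged ends up with a single row. Because the column names here are distinct, no _x or _y suffixes are added.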

The reduce() function gives the same result as the loop with fewer lines of code:

from functools import reduce

dfs = [df1, df2, df3]
df_final = reduce(lambda left, right: pd.merge(left, right, left_index=True, right_index=True), dfs)
print(df_final)
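An alternative that is not part of the answer above, but that does take a list directly, is pd.concat() with axis=1 and join="inner". The sketch below assumes each frame has a unique index and distinct column names; the frames a, b and c are hypothetical. Unlike merge(), concat() does not add suffixes, so repeated column names would stay duplicated.

```python
import pandas as pd

# Hypothetical frames with distinct column names and unique date indexes.
a = pd.DataFrame(
    {"a": [1, 2, 3]},
    index=pd.to_datetime(["2018-08-06", "2018-08-07", "2018-08-08"]),
)
b = pd.DataFrame(
    {"b": [4, 5]},
    index=pd.to_datetime(["2018-08-07", "2018-08-08"]),
)
c = pd.DataFrame(
    {"c": [6, 7]},
    index=pd.to_datetime(["2018-08-08", "2018-08-09"]),
)

# axis=1 places the frames side by side; join="inner" keeps only the
# index labels shared by all of them.
combined = pd.concat([a, b, c], axis=1, join="inner")
print(combined)
```

Whether this is preferable to the reduce() version depends on the data: concat() aligns all the frames in one call, while merge() offers suffixes and other join options for each pair.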
    
answered by 08.08.2018 / 21:42