In my Python script I have generated a dictionary with several data, which I want to add to a CSV file, only if the dictionary keys do not exist inside the file, or if they exist then they are overwritten so they do not duplicate .
How can I do this using pandas?
My script is already able to add the data to the dataFrame, but I duplicate the data.
data_dict = {'2018mbusasalesjan': [0, '2018mbusasalesjan', 2018, 'January', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], '2018mbusasalesfeb': [1, '2018mbusasalesfeb', 2018, 'February', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], '2018mbusasalesmar': [2, '2018mbusasalesmar', 2018, 'March', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], '2018mbusasalesapr': [3, '2018mbusasalesapr', 2018, 'April', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], '2018mbusasalesmay': [4, '2018mbusasalesmay', 2018, 'May', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], '2018mbusasalesjun': [5, '2018mbusasalesjun', 2018, 'June', 0, 0, 0, 0, 0, 0, 0, 1000, 0, 0, 0], '2018mbusasalesjul': [6, '2018mbusasalesjul', 2018, 'July', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], '2018mbusasalesaug': [7, '2018mbusasalesaug', 2018, 'August', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], '2018mbusasalessep': [8, '2018mbusasalessep', 2018, 'September', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], '2018mbusasalesoct': [9, '2018mbusasalesoct', 2018, 'October', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], '2018mbusasalesnov': [10, '2018mbusasalesnov', 2018, 'November', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], '2018mbusasalesdec': [11, '2018mbusasalesdec', 2018, 'December', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}
datos_Sales_Mbusa =pd.read_csv('mbusasales.csv', index_col = 0, encoding = 'utf-8')
datos_Sales_Mbusa.apply(lambda x: pd.api.types.infer_dtype(x.values))
num_datos = int(datos_Sales_Mbusa['indice'].count())
datos_Sales_Mbusa.index = range(datos_Sales_Mbusa.shape[0])
indice_archivo = list(datos_Sales_Mbusa.indice)
llaves = data_dict.keys()
nuevo_Valor = num_datos+1
valores = list(data_dict.values())
for i, valor in enumerate(valores):
nuevo_valor = int(nuevo_Valor)+i
datos_Sales_Mbusa.loc[int(nuevo_valor)]= [valor[1], valor[2], valor[3], valor[4], valor[5], valor[6], valor[7], valor[8], valor[9],
valor[10], valor[11], valor[12], valor[13],valor[14] ]
datos_Sales_Mbusa.to_csv('mbusasales.csv', encoding = 'utf-8',)
The file 'mbusasales.csv contains the following data:
,indice,year,month,brocc conv,caulif conv,br spr,Sugar Snap Peas,Yellow Sq,Green Zucch,Brocc Org,Caulif Org,Carrots Org,Corn Org,Edamame Org
0,2018mbusasalesjan,2018,January,0,0,0,0,0,0,0,0,0,0,0
1,2018mbusasalesfeb,2018,February,0,0,0,0,0,0,0,0,0,0,0
2,2018mbusasalesmar,2018,March,0,0,0,0,0,0,0,0,0,0,0
3,2018mbusasalesapr,2018,April,0,0,0,0,0,0,0,0,0,0,0
4,2018mbusasalesmay,2018,May,0,0,0,0,0,0,0,0,0,0,0
5,2018mbusasalesjun,2018,June,0,0,0,0,0,0,0,0,0,0,0
6,2018mbusasalesjul,2018,July,0,0,0,0,0,0,0,0,0,0,0
7,2018mbusasalesaug,2018,August,0,0,0,0,0,0,0,0,0,0,0
8,2018mbusasalessep,2018,August,350,0,0,0,0,0,0,0,0,0,0
9,2018mbusasalesoct,2018,October,0,0,0,0,0,0,0,0,0,0,0
10,2018mbusasalesnov,2018,November,0,0,0,0,0,0,0,0,0,0,0
11,2018mbusasalesdec,2018,December,0,0,0,0,0,0,0,0,0,0,0
12,2017mbusasalesjan,2017,January,0,0,0,0,0,0,0,0,0,0,0
13,2017mbusasalesfeb,2017,February,0,0,0,0,0,0,0,0,0,0,0
14,2017mbusasalesmar,2017,March,0,0,0,0,0,0,0,0,0,0,0
15,2017mbusasalesapr,2017,April,0,0,0,0,0,0,0,0,0,0,0
16,2017mbusasalesmay,2017,May,0,0,0,0,0,0,0,0,0,0,0
17,2017mbusasalesjun,2017,June,0,0,0,0,0,0,0,0,0,0,0
18,2017mbusasalesjul,2017,July,0,0,0,0,0,0,0,0,0,0,0
19,2017mbusasalesaug,2017,August,0,0,0,0,0,0,0,0,0,0,0
20,2017mbusasalessep,2017,August,350,0,0,0,0,0,0,0,0,0,0
21,2017mbusasalesoct,2017,October,0,0,0,0,0,0,0,0,0,0,0
22,2017mbusasalesnov,2017,November,0,0,0,0,0,0,0,0,0,0,0
The column that indicates whether the data is already in the csv file is 'index' and in the dictionary is the key. If you compare the data of the dictionary with those of the file, you can see that these indexes are already in the file, therefore they should be written.