Edit: for a solution without pandas, skip to the end.
Pandas is a library designed for working with tabular data, that is, data organized in columns, and it supports a wide variety of operations, transformations, and groupings on that data.
With this library, what you are asking for is quite simple. The step-by-step explanation takes up some space, but as you will see at the end, two lines of code do it all.
import pandas as pd  # This is the usual way to import it
# This is your input data
productos = [
{'nombre': 'Jumbo maní', 'cantidad': 30, 'categoria': 'Jet'},
{'nombre': 'Jumbo maní', 'cantidad': 50, 'categoria': 'Jet'},
{'nombre': 'Papas de pollo', 'cantidad': 15, 'categoria': 'Margarita'},
{'nombre': 'Papas de pollo', 'cantidad': 12, 'categoria': 'Margarita'},
{'nombre': 'Ducales', 'cantidad': 25, 'categoria': 'Noel'},
{'nombre': 'Ducales', 'cantidad': 50, 'categoria': 'Noel'},
{'nombre': 'Bombón', 'cantidad': 30, 'categoria': 'Noel'}
]
# Pandas works with "DataFrames", and it can build one from the data
# you give it (many formats are accepted), from a CSV file,
# by downloading it from the internet, or from other sources. In this case:
df = pd.DataFrame(productos)
Once your data has been converted into a DataFrame, we can print it to see how Pandas has organized it into a table:
>>> print(df)
   cantidad  categoria          nombre
0        30        Jet      Jumbo maní
1        50        Jet      Jumbo maní
2        15  Margarita  Papas de pollo
3        12  Margarita  Papas de pollo
4        25       Noel         Ducales
5        50       Noel         Ducales
6        30       Noel          Bombón
Now we can use Pandas operations such as .groupby() to group by one column (in this case "nombre") or by several (for example "nombre" and "categoria", so that the category information is not lost). To the result of that grouping we then apply .sum() (other aggregations are available).
>>> print(df.groupby(["nombre", "categoria"]).sum())
                          cantidad
nombre         categoria
Bombón         Noel             30
Ducales        Noel             75
Jumbo maní     Jet              80
Papas de pollo Margarita        27
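Since .sum() is only one of the available aggregations, here is a minimal sketch of the same grouping computing several of them at once with .agg(). The variable name resumen is made up for the illustration, and the grouping keys are passed as a list because recent pandas versions treat a tuple as a single key.

```python
import pandas as pd

productos = [
    {'nombre': 'Jumbo maní', 'cantidad': 30, 'categoria': 'Jet'},
    {'nombre': 'Jumbo maní', 'cantidad': 50, 'categoria': 'Jet'},
    {'nombre': 'Papas de pollo', 'cantidad': 15, 'categoria': 'Margarita'},
    {'nombre': 'Papas de pollo', 'cantidad': 12, 'categoria': 'Margarita'},
    {'nombre': 'Ducales', 'cantidad': 25, 'categoria': 'Noel'},
    {'nombre': 'Ducales', 'cantidad': 50, 'categoria': 'Noel'},
    {'nombre': 'Bombón', 'cantidad': 30, 'categoria': 'Noel'},
]
df = pd.DataFrame(productos)

# Group by both columns, then compute the total, the largest single entry
# and the number of entries of "cantidad" for each group.
# "resumen" is an illustrative name, not something from the question.
resumen = df.groupby(["nombre", "categoria"])["cantidad"].agg(["sum", "max", "count"])
print(resumen)
```

The "sum" column of resumen holds the same totals as the .sum() call above; the other two columns are just there to show that the grouping step is independent of the aggregation you apply afterwards.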
For Pandas, the result is a table whose index is no longer a number running from 0 to 6 as before, but the pair ("nombre", "categoria"). The table has one row per pair, and the "cantidad" column holds the sum you were looking for.
If you want to turn this result back into a list of dictionaries like the one in your question, you first have to undo the index of pairs that pandas created, using .reset_index(), which converts the multi-index back into the ordinary columns "nombre" and "categoria":
>>> print(df.groupby(["nombre", "categoria"]).sum().reset_index())
           nombre  categoria  cantidad
0          Bombón       Noel        30
1         Ducales       Noel        75
2      Jumbo maní        Jet        80
3  Papas de pollo  Margarita        27
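As a possible alternative to calling .reset_index() afterwards, .groupby() accepts as_index=False, which keeps the grouping keys as ordinary columns from the start. A minimal sketch, where agrupado is just an illustrative name:

```python
import pandas as pd

productos = [
    {'nombre': 'Jumbo maní', 'cantidad': 30, 'categoria': 'Jet'},
    {'nombre': 'Jumbo maní', 'cantidad': 50, 'categoria': 'Jet'},
    {'nombre': 'Papas de pollo', 'cantidad': 15, 'categoria': 'Margarita'},
    {'nombre': 'Papas de pollo', 'cantidad': 12, 'categoria': 'Margarita'},
    {'nombre': 'Ducales', 'cantidad': 25, 'categoria': 'Noel'},
    {'nombre': 'Ducales', 'cantidad': 50, 'categoria': 'Noel'},
    {'nombre': 'Bombón', 'cantidad': 30, 'categoria': 'Noel'},
]
df = pd.DataFrame(productos)

# as_index=False leaves "nombre" and "categoria" as regular columns,
# so no multi-index is created and there is nothing to reset.
agrupado = df.groupby(["nombre", "categoria"], as_index=False).sum()
print(agrupado)
```

Either form should give the same table; which one reads better is mostly a matter of taste.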
The resulting DataFrame can be converted back into a list of dictionaries with the .to_dict() method, which supports several output formats; the one we want here is "records":
>>> print(df.groupby(["nombre", "categoria"]).sum().reset_index().to_dict(orient="records"))
[{'cantidad': 30, 'categoria': 'Noel', 'nombre': 'Bombón'},
{'cantidad': 75, 'categoria': 'Noel', 'nombre': 'Ducales'},
{'cantidad': 80, 'categoria': 'Jet', 'nombre': 'Jumbo maní'},
{'cantidad': 27, 'categoria': 'Margarita', 'nombre': 'Papas de pollo'}]
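To illustrate what "different output formats" means, this small sketch compares orient="records" with orient="list" on the same grouped table. The names agrupado, registros and columnas are only for the example.

```python
import pandas as pd

productos = [
    {'nombre': 'Jumbo maní', 'cantidad': 30, 'categoria': 'Jet'},
    {'nombre': 'Jumbo maní', 'cantidad': 50, 'categoria': 'Jet'},
    {'nombre': 'Papas de pollo', 'cantidad': 15, 'categoria': 'Margarita'},
    {'nombre': 'Papas de pollo', 'cantidad': 12, 'categoria': 'Margarita'},
    {'nombre': 'Ducales', 'cantidad': 25, 'categoria': 'Noel'},
    {'nombre': 'Ducales', 'cantidad': 50, 'categoria': 'Noel'},
    {'nombre': 'Bombón', 'cantidad': 30, 'categoria': 'Noel'},
]
df = pd.DataFrame(productos)
agrupado = df.groupby(["nombre", "categoria"]).sum().reset_index()

# "records": one dictionary per row (the shape used in the question)
registros = agrupado.to_dict(orient="records")
print(registros)

# "list": one dictionary for the whole table, mapping each column to a list
columnas = agrupado.to_dict(orient="list")
print(columnas)
```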
In summary, these are the two promised lines that do it all:
df = pd.DataFrame(productos)
lista_nueva = df.groupby(["nombre", "categoria"]).sum().reset_index().to_dict(orient="records")
Without Pandas
With pure Python it is also relatively simple. The idea is to go through your input list and use each (nombre, categoria) pair as a key in a dictionary whose values are the accumulated quantities. A defaultdict(int) simplifies the loop, since missing keys start at 0.
from collections import defaultdict

# Maps each (nombre, categoria) pair to its running total
acumulador = defaultdict(int)
for producto in productos:
    acumulador[(producto["nombre"], producto["categoria"])] += producto["cantidad"]
Once we have the totals in acumulador, we can build your output list by iterating over its (key pair, value) items and creating a new result dictionary from each one.
lista_nueva = [{"nombre": nombre, "categoria": cat, "cantidad": valor}
               for (nombre, cat), valor in acumulador.items()]
The result is:
[{'cantidad': 80, 'categoria': 'Jet', 'nombre': 'Jumbo maní'},
{'cantidad': 27, 'categoria': 'Margarita', 'nombre': 'Papas de pollo'},
{'cantidad': 75, 'categoria': 'Noel', 'nombre': 'Ducales'},
{'cantidad': 30, 'categoria': 'Noel', 'nombre': 'Bombón'}]
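If you need this in more than one place, the two steps can be wrapped in a small function. This is only a sketch: the function name sumar_cantidades and its parameters claves and campo are my own invention, and it assumes every dictionary contains all the fields involved.

```python
from collections import defaultdict


def sumar_cantidades(productos, claves=("nombre", "categoria"), campo="cantidad"):
    """Sum `campo` over all dictionaries that share the same values for `claves`."""
    acumulador = defaultdict(int)
    for producto in productos:
        # Build the grouping key from the chosen fields, e.g. ("Ducales", "Noel")
        clave = tuple(producto[c] for c in claves)
        acumulador[clave] += producto[campo]
    # One dictionary per distinct key, in the order the keys were first seen
    return [{**dict(zip(claves, clave)), campo: total}
            for clave, total in acumulador.items()]


productos = [
    {'nombre': 'Jumbo maní', 'cantidad': 30, 'categoria': 'Jet'},
    {'nombre': 'Jumbo maní', 'cantidad': 50, 'categoria': 'Jet'},
    {'nombre': 'Papas de pollo', 'cantidad': 15, 'categoria': 'Margarita'},
    {'nombre': 'Papas de pollo', 'cantidad': 12, 'categoria': 'Margarita'},
    {'nombre': 'Ducales', 'cantidad': 25, 'categoria': 'Noel'},
    {'nombre': 'Ducales', 'cantidad': 50, 'categoria': 'Noel'},
    {'nombre': 'Bombón', 'cantidad': 30, 'categoria': 'Noel'},
]

lista_nueva = sumar_cantidades(productos)
por_categoria = sumar_cantidades(productos, claves=("categoria",))
print(lista_nueva)
print(por_categoria)
```

Unlike the pandas version, the output here is not sorted by key: on Python 3.7 or later it follows the order in which each pair first appears in the input.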