Suppose I have a .csv file with several columns: the first is the teacher's name, the second is the subject, and the third is the year.
It might look something like this:
Nombre profesor  Asignatura  Año
Juan             Mates       2002
Pedro            Lengua      2003
Luisa            Mates       2005
Natalia          Inglés      2002
Juan             Inglés      2008
Natalia          Física      2004
Juan             Inglés      2018
Luisa            Mates       2018
EDIT (question updated):
Is there a way, using only pandas commands, to show just those teachers who teach more than one subject? That is, something like this:
Nombre profesor  Count
Juan             2
Natalia          2
Juan appears three times, but two of those rows are for the same subject (Inglés), so he counts as 2. Luisa appears twice, but both times with the same subject (Mates), so she should not be included.
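My first idea was to use groupby():
import pandas as pd
datos = pd.read_csv('fichero.csv')
# count rows per (teacher, subject) pair and keep the pairs that occur more than once
p = datos.groupby(['Nombre_profesor', 'Asignatura']).size().loc[lambda x: x > 1]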
But if I'm not mistaken, this shows the teachers who appear more than once with the same subject. Those are exactly the ones I want to discard, so I don't know how to proceed.
Thanks!
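
For reference, here is a minimal sketch of one way the desired result might be obtained: count the distinct subjects per teacher with nunique() rather than the rows per (teacher, subject) pair. It assumes the column headers are `Nombre_profesor`, `Asignatura` and `Año` (with the underscore, as in the groupby attempt above) and that the file is comma-separated; the sample rows are inlined through `io.StringIO` only so the example runs without `fichero.csv`. The names `csv_text`, `distintas` and `resultado` are just illustrative.

```python
import io

import pandas as pd

# Sample rows from the question, inlined so the sketch is self-contained.
# Assumption: comma-separated file with these exact header names.
csv_text = """Nombre_profesor,Asignatura,Año
Juan,Mates,2002
Pedro,Lengua,2003
Luisa,Mates,2005
Natalia,Inglés,2002
Juan,Inglés,2008
Natalia,Física,2004
Juan,Inglés,2018
Luisa,Mates,2018
"""
datos = pd.read_csv(io.StringIO(csv_text))

# Number of DISTINCT subjects per teacher; repeated (teacher, subject) rows count once.
distintas = datos.groupby('Nombre_profesor')['Asignatura'].nunique()

# Keep only the teachers with more than one distinct subject.
resultado = distintas[distintas > 1].rename('Count').reset_index()

print(resultado)
```

With the sample data, this should leave only Juan and Natalia, each with a count of 2. An alternative along the same lines would be to call drop_duplicates() on the two columns first and then use size() as in the original attempt.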