pandas python data management

0

I have a Dataframe with cities

from io import StringIO
import pandas as pd

csv =StringIO(u'''\
city
bogota
manzanares
guapi
el paso
momil
valencia
beltran
Lima
''')

and I want you to look for a list of the department that corresponds to that city and add it in a column next to it and finally in another column place the country.

the result should be:

  
          city          Departamento      Pais
     0    bogota           D.C.          COLOMBIA
     1    manzanares       Caldas        COLOMBIA
     2    lima             Lima          Peru
  

The list to consult is like this:

csv =parsing_data(u'''\
Pais,Departamento,Ciudad            
COLOMBIA,ANTIOQUIA,MEDELLIN
COLOMBIA,ANTIOQUIA,ABEJORRAL
COLOMBIA,ANTIOQUIA,ABRIAQUI
COLOMBIA,ANTIOQUIA,ALEJANDRIA
COLOMBIA,ANTIOQUIA,AMAGA
COLOMBIA,ANTIOQUIA,AMALFI
COLOMBIA,ANTIOQUIA,ANDES
COLOMBIA,ANTIOQUIA,ANGELOPOLIS
COLOMBIA,ANTIOQUIA,ANGOSTURA
PERU,Lima,Lima
''')

some idea ?? Thanks!

    
asked by Yan Chirino 08.05.2018 в 22:10
source

1 answer

1

You can read both csv , each in a dataframe, use the city as index in both dataframes and then make a join of the first with the second.

In the following example the result is not very spectacular, because in the fragment of the second .csv the only city of the first csv that appears is "Lima". The ones you can not find filled with NaN . In your case I understand that you will have more complete data.

from io import StringIO
import pandas as pd
csv1 =StringIO(u'''\
city
bogota
manzanares
guapi
el paso
momil
valencia
beltran
Lima
'''.upper())

csv2 =StringIO(u'''\
Pais,Departamento,Ciudad
COLOMBIA,ANTIOQUIA,MEDELLIN
COLOMBIA,ANTIOQUIA,ABEJORRAL
COLOMBIA,ANTIOQUIA,ABRIAQUI
COLOMBIA,ANTIOQUIA,ALEJANDRIA
COLOMBIA,ANTIOQUIA,AMAGA
COLOMBIA,ANTIOQUIA,AMALFI
COLOMBIA,ANTIOQUIA,ANDES
COLOMBIA,ANTIOQUIA,ANGELOPOLIS
COLOMBIA,ANTIOQUIA,ANGOSTURA
PERU,LIMA,LIMA
''')

df1 = pd.read_csv(csv1).set_index("CITY")
df2 = pd.read_csv(csv2).set_index("Ciudad")
resultado = df1.join(df2).reset_index()

I had to capitalize the names of the cities in the first csv to match the names of the second ones. The result is:

         CITY  Pais Departamento
0      BOGOTA   NaN          NaN
1  MANZANARES   NaN          NaN
2       GUAPI   NaN          NaN
3     EL PASO   NaN          NaN
4       MOMIL   NaN          NaN
5    VALENCIA   NaN          NaN
6     BELTRAN   NaN          NaN
7        LIMA  PERU         LIMA
    
answered by 08.05.2018 / 22:42
source