How to split a .csv file by a field

I have a file that stores surveys together with their questions (as columns), for example:

opinion|date    |¿Tienes coche?|Color coche|¿Tienes móvil?|Tipo_contrato|Seguro hogar
Coches |28/08/18|    Si        |Rojo       |              |               |
Movil  |28/08/18|              |           |Si            |Fijo           |
Hogar  |28/08/18|              |           |              |               |No

I want to split the file by the type of opinion and discard the columns that do not apply. For example, if I choose Coches, it should create a file with only the questions of that survey:

opinion|date    |¿Tienes coche?|Color coche|
Coches |28/08/18|    Si        |Rojo       |

I am doing this in Python, using pandas.

How can I do it?

    
asked by Jav 29.08.2018 at 15:53

1 answer

I think I've worked out what I wanted to do. Here it is in case it helps anyone:

1 I read the raw file and filter it by the column I need; the result still contains empty columns

import pandas as pd
import numpy as np

# Read the raw file; adjust sep to the delimiter the file actually uses
leerDatos = pd.read_csv('opiniones.csv', sep=';', encoding='utf-8')
# Treat empty strings as missing values
datos = leerDatos.replace('', np.nan)
# In datos2 I choose the value to split the file by; it must match the file exactly
datos2 = datos[datos['opinion'] == 'coche']
# Save the filtered rows, for example as .csv, without the row index
datos2.to_csv('fichero_opinion_filtrado.csv', index=False)

2 I read the filtered file, which still has leftover (empty) columns, and remove them

# Re-read the filtered file written in step 1
datosFiltrados = pd.read_csv('fichero_opinion_filtrado.csv', sep=',', encoding='utf-8')
datosFiltrados = datosFiltrados.replace('', np.nan)
# Drop only the columns that are empty in every remaining row
# (how="any" would also drop a question that some respondents left blank)
datosFiltrados = datosFiltrados.dropna(axis="columns", how="all")
datosFiltrados.to_csv('fichero_final_1.csv', index=False)

This removes the empty columns, so each opinion ends up with only the columns it needs.
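
The same result can probably be reached in one pass, without writing and re-reading an intermediate file. The following is only a sketch under some assumptions: the sample data is made up to mirror the table in the question, the delimiter is assumed to be '|', and the names split_by_opinion and write_parts and the output file names are hypothetical.

```python
import io
import pandas as pd

# Hypothetical sample mirroring the table in the question ('|' assumed as delimiter).
SAMPLE = """opinion|date|¿Tienes coche?|Color coche|¿Tienes móvil?|Tipo_contrato|Seguro hogar
Coches|28/08/18|Si|Rojo|||
Movil|28/08/18|||Si|Fijo|
Hogar|28/08/18|||||No
"""


def split_by_opinion(df, key="opinion"):
    """Return {opinion value: its rows, minus the columns that are entirely empty}."""
    parts = {}
    for value, group in df.groupby(key):
        # how="all" drops a column only when every row of this group is missing it
        parts[value] = group.dropna(axis="columns", how="all")
    return parts


def write_parts(parts, prefix="opinion_"):
    """Write one CSV per opinion (hypothetical file naming), without the row index."""
    for value, part in parts.items():
        part.to_csv(f"{prefix}{value}.csv", index=False)


# read_csv already turns empty fields into NaN, so no replace('') step is needed
df = pd.read_csv(io.StringIO(SAMPLE), sep="|")
parts = split_by_opinion(df)

for value, part in parts.items():
    print(value, list(part.columns))
```

Calling write_parts(parts) would then write one file per opinion. For the real file, the io.StringIO(SAMPLE) argument would be replaced by the file name and sep by whatever delimiter the file uses.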

    
answered 29.08.2018 at 17:04