How to get names and surnames from a text string with spaces and save them in a list in Python?

0

I am processing Excel files that contain an employment data form.

I am currently using pandas, and for each field I get a list built from 2 cells (the field label and its value).

The problem I have now is with the data for the worker's children: a worker can have more than one child, and the form was filled in using runs of spaces to separate the names within the same cell, so I get the following:

['NOMBRE Y APELLIDO DE HIJOS:', 'MARIANA ROSALES                               YESENIA ROSALES']

I access the value like this:

dato_personal[1]

My question is:

Is there any way to split the text string so that it ends up as ["MARIANA", "ROSALES", "YESENIA", "ROSALES"], and that also works for 3 names and surnames?

It would be something like splitting at each space, but I cannot work out how to do it. What I found splits at a specific occurrence of a space:

posicion_de_corte = dato_personal[1].replace(" ", 'X', 2).find(" ")
primer_dato = dato_personal[1][:posicion_de_corte]
primer_dato_clean = primer_dato.strip()

But it only works well for two names.

And if I use a for loop to look for the positions of the spaces, I get something like 15 16 17 18 19 20 21 22 23, which are all consecutive blank spaces.

    
asked by Victor Alvarado 09.11.2018 at 16:32

3 answers

0

I ended up using this function:

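def separar_en_lista(dato_personal, lista_datos_personales_sin_espacios, indice=0):
    # indice was undefined in the original snippet (presumably the loop index in
    # the calling code); it is made an explicit parameter here as an assumption.
    # Splitting on " " leaves empty strings between consecutive spaces,
    # so filter(None, ...) drops them.
    datos_separados_en_lista = dato_personal[1].split(" ")
    datos_separados_en_lista_sin_espacios = list(filter(None, datos_separados_en_lista))
    lista_datos_personales_sin_espacios[indice] = [
        dato_personal[0], datos_separados_en_lista_sin_espacios]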

Where:

1) dato_personal is the list that contains the two values from my form (field and value).

2) lista_datos_personales_sin_espacios is my main list, the one I am iterating over and where the result is stored.

Example:

dato_personal
[0] => ["NOMBRES Y APELLIDOS"]
[1] => ["YESENIA MORALES                            PEDRO PEREZ"]


separar_en_lista(dato_personal, lista_de_ejemplo)

lista_de_ejemplo[indice]
[0] => "NOMBRES Y APELLIDOS"
[1] => ["YESENIA", "MORALES", "PEDRO", "PEREZ"]
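As a side note, the split-then-filter step can probably be reduced to a single call: str.split() with no arguments splits on any run of whitespace and never returns empty strings. The sketch below compares both approaches on the example value; the helper name separar_palabras is hypothetical and not part of the original answer.

```python
def separar_palabras(texto):
    """Split on any run of whitespace; no empty strings are produced."""
    return texto.split()


valor = "YESENIA MORALES                            PEDRO PEREZ"

# Approach from the answer: split on single spaces, then drop the empty pieces.
con_filtro = list(filter(None, valor.split(" ")))

# Shorter equivalent for this input: split() with no arguments.
sin_argumentos = separar_palabras(valor)

print(con_filtro == sin_argumentos)
print(sin_argumentos)
```

One difference worth knowing: split() with no arguments also treats tabs and newlines as separators, while split(" ") only splits on the space character.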
    
answered by 09.11.2018 / 19:11
4

You can try this, which splits the cell into words and then joins them back in pairs (name plus surname):

data = ['NOMBRE Y APELLIDO DE HIJOS:', 'MARIANA ROSALES                               YESENIA ROSALES']  
hijos = data[1].split()
hijos = [" ".join(x) for x in zip(hijos[::2], hijos[1::2])]
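The pairing above assumes every child has exactly one name and one surname. If that does not hold, one possible alternative is to split on runs of two or more spaces instead, assuming the words of one child are separated by a single space and different children by at least two. This is only a sketch under that assumption; separar_hijos is a hypothetical helper name.

```python
import re


def separar_hijos(celda):
    """Return one full name per child, splitting on runs of 2+ whitespace characters.

    Assumption: words inside one name are separated by a single space,
    and different children by two or more spaces.
    """
    partes = re.split(r"\s{2,}", celda.strip())
    return [parte for parte in partes if parte]


celda = 'MARIANA ROSALES                               YESENIA ROSALES'
print(separar_hijos(celda))

# A child with three words stays together, which fixed pairing would not handle.
print(separar_hijos('ANA MARIA PEREZ     LUIS PEREZ'))
```

If the individual words are needed afterwards, each full name can still be split again with split().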
    
answered by 09.11.2018 at 17:14
1

You can try the re module (regular expressions) from the standard library to split the string on a pattern.

For example:

import re
data = ['NOMBRE Y APELLIDO DE HIJOS:', 'MARIANA ROSALES YESENIA ROSALES']
apellidos = re.split(r'\W+', data[1])

re.split cuts the string at every run of one or more non-alphanumeric characters (\W+), so the resulting list contains only the pieces made of word characters.

The result is:

apellidos = ['MARIANA', 'ROSALES', 'YESENIA', 'ROSALES']
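One caveat to keep in mind: \W matches any non-word character, not only spaces, so hyphenated or apostrophised names would also be cut. The sketch below uses a made-up value (not from the original form) to show the difference from splitting on whitespace only.

```python
import re

# Hypothetical value, chosen only to illustrate the difference.
valor = "ANA-MARIA ROSALES   YESENIA O'BRIEN"

# \W+ also splits on the hyphen and the apostrophe.
por_no_alfanumericos = re.split(r"\W+", valor)

# split() with no arguments splits on whitespace only.
por_espacios = valor.split()

print(por_no_alfanumericos)
print(por_espacios)
```

re.split can also return empty strings at the ends when the value starts or ends with a separator, so stripping the value first may be a good idea.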

    
answered by 12.11.2018 at 16:22