Get data from a string in Python


Could you help me with the following problem?

I have the following string:


I'm trying to get the values "7558", "7558", "7558" and "802", "850", "1627". This is my code:

for y in str(nvaCadena):
        o_pro = nvaCadena.index('Pro:') + 4         
        f_pro = nvaCadena.index('|Nom:')            
        p = nvaCadena[o_pro: f_pro]

        o_nom = nvaCadena.index('|Nom:') + 5
        f_nom = nvaCadena.index('|Esq:')
        n = nvaCadena[o_nom:f_nom]

But I get the following:

7558 - 802
7558 - 802
7558 - 802

When I really want to get:

7558 - 802
7558 - 850
7558 - 1627

Could you help me?

asked by José McFly Carranza 25.02.2017 at 00:03

1 answer


The problem is that the index method of str always returns the lowest index at which the substring occurs in the string, and in every iteration of the for loop you are searching the same, unchanged string, so you always find the first occurrence. To make this approach work you would have to change the string you analyze in each iteration, dropping the data already found (taking into account the indexes found and the lengths of the extracted substrings). Since str is immutable, you would need an auxiliary string that you reassign as you go, preferably in a while loop; all of this can require a lot of unnecessary code. Also keep in mind that index raises a ValueError exception if the substring is not found.
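For illustration only, here is a minimal sketch of that manual approach. The function name extraer_pares and the variable resto are hypothetical, and the sample string is a shortened version of the one used in the regular expression example, with the fields after Esq omitted. It assumes every record contains Pro, Nom and Esq in that order.

```python
# Shortened sample input (assumption: same layout as the real string).
texto = ('Tip:1-Cli:337|Neg:695|Pro:7558|Nom:802|Esq:1,'
         'Tip:1-Cli:337|Neg:695|Pro:7558|Nom:850|Esq:1,'
         'Tip:1-Cli:337|Neg:695|Pro:7558|Nom:1627|Esq:1')


def extraer_pares(cadena):
    """Collect (Pro, Nom) pairs by repeatedly trimming an auxiliary string."""
    pares = []
    resto = cadena  # auxiliary string; str is immutable, so we reassign it
    while True:
        try:
            o_pro = resto.index('Pro:') + 4
            f_pro = resto.index('|Nom:', o_pro)
            o_nom = f_pro + 5
            f_nom = resto.index('|Esq:', o_nom)
        except ValueError:
            # index raises ValueError when no further record is found
            break
        pares.append((resto[o_pro:f_pro], resto[o_nom:f_nom]))
        resto = resto[f_nom:]  # drop the part that was already processed
    return pares


pares = extraer_pares(texto)
for pro, nom in pares:
    print(pro, '-', nom)
```

It works, but it shows how much bookkeeping the index-based approach needs.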

I suggest using a regular expression to parse the string, together with the findall function of the re module, for example:

import re

patron = r'(?:Pro|Nom):(\d+)'
texto = 'Tip:1-Cli:337|Neg:695|Pro:7558|Nom:802|Esq:1|Rub:None|Con:None|Emp:None|Com:1,Tip:1-Cli:337|Neg:695|Pro:7558|Nom:850|Esq:1|Rub:None|Con:None|Emp:None|Com:1,Tip:1-Cli:337|Neg:695|Pro:7558|Nom:1627|Esq:1|Rub:None|Con:None|Emp:None|Com:1'

resultado = re.findall(patron, texto)

['7558', '802', '7558', '850', '7558', '1627']

I think that with this you can easily get the result you need. Regular expressions are pretty powerful!
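As a possible next step, the flat list can be paired up to print the lines from the question. This is only a sketch: the name lineas is hypothetical, the sample string is shortened, and it assumes that every Pro value is immediately followed by its Nom value, so that the values alternate in the list.

```python
import re

# Shortened sample input (assumption: same layout as the real string).
texto = ('Tip:1-Cli:337|Neg:695|Pro:7558|Nom:802|Esq:1,'
         'Tip:1-Cli:337|Neg:695|Pro:7558|Nom:850|Esq:1,'
         'Tip:1-Cli:337|Neg:695|Pro:7558|Nom:1627|Esq:1')

resultado = re.findall(r'(?:Pro|Nom):(\d+)', texto)

# Even positions hold the Pro values, odd positions the Nom values.
lineas = [f'{pro} - {nom}'
          for pro, nom in zip(resultado[0::2], resultado[1::2])]

for linea in lineas:
    print(linea)
```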

You could also modify the pattern to capture the values of other fields, and even capture the field names so that you do not lose track of which value belongs to which field; for example, here I transform the string into something that is easier to manipulate in Python:

patron = r'(Pro|Nom|Esq):(\d+)'
resultado = re.findall(patron, texto)

[('Pro', '7558'), ('Nom', '802'), ('Esq', '1'), ('Pro', '7558'), ('Nom', '850'), ('Esq', '1'), ('Pro', '7558'), ('Nom', '1627'), ('Esq', '1')]
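If you want to keep the values grouped per record, one option is to split the string first and build a dictionary for each piece. This is a sketch under assumptions: the name registros is hypothetical, the sample string is shortened, and it assumes that records are separated by commas and that no value contains a comma.

```python
import re

# Shortened sample input (assumption: same layout as the real string).
texto = ('Tip:1-Cli:337|Neg:695|Pro:7558|Nom:802|Esq:1,'
         'Tip:1-Cli:337|Neg:695|Pro:7558|Nom:850|Esq:1,'
         'Tip:1-Cli:337|Neg:695|Pro:7558|Nom:1627|Esq:1')

patron = r'(Pro|Nom|Esq):(\d+)'

# findall returns (name, value) tuples, which dict() accepts directly.
registros = [dict(re.findall(patron, registro))
             for registro in texto.split(',')]

for registro in registros:
    print(registro['Pro'], '-', registro['Nom'])
```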

answered 25.02.2017 at 03:54