Read certain elements in a string

3

Let's say I have the string a='12345&&&4554444'. How can I read only the digits of that string without using a loop? The only way I know is to go through the string character by character:

a = '12345&&&4554444'
variable = ''
for i in a:
    if i!='&':
         variable+=i
print(variable)

Whose output would be:

   123454554444

Is there another way to do this?

Thank you!

    
asked by DDR 17.03.2018 at 18:33

2 answers

4

If the string contains only digits and "&", and you want to delete every "&", you can simply use str.replace:

a = '12345&&&4554444'
variable = a.replace("&", "")
print(variable)

There are also more general options that work regardless of which characters are mixed in with the digits, including:

  • Regular expressions:

    import re
    
    a = '12345&&&4554444'
    variable = "".join(re.findall(r'\d+', a))
    print(variable)
    

    or using re.sub:

    import re
    
    a = '12345&&&4554444'
    variable = re.sub(r"\D", "", a)
    print(variable)
    

    In this case we tell re.sub to replace every character that is not a digit ( \D ) with an empty string, that is, to delete it.

  • Use a set ( set ) of the allowed characters and filter the string with a generator expression and a condition, taking advantage of the fast hash-table lookups that sets provide:

    a = '12345&&&4554444' 
    permitidos = set('0123456789')
    variable = "".join((c for c in a if c in permitidos))
    print(variable)
    
  • str.isdigit, which returns True if all the characters in a string are digits (here it is applied to one character at a time):

    a = '12345&&&4554444'
    variable = "".join((c for c in a if c.isdigit()))
    print(variable)
    
  • str.translate:

    import string
    
    
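    class TransTable:
        def __init__(self, intab):
            # Map each allowed character's code point to the character itself
            self._transtable = {ord(c): c for c in intab}
    
        def __getitem__(self, char):
            # None for any other code point, which makes translate() drop it
            return self._transtable.get(char)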
    
    
    a = '12345&&&4554444'
    trans_table = TransTable(string.digits)
    variable = a.translate(trans_table)
    print(variable)
    

    str.translate receives a "table" indexed by Unicode ordinals (code points) that returns, for each one, the value it must be replaced with. The "table" can be any object that implements the __getitem__ method, such as a dictionary. The class TransTable is initialized with a sequence of allowed characters; each time its __getitem__ method is called, it returns the same character if it is among the allowed ones, or None otherwise (which means translate removes that character).
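
When the goal is only to delete a known set of characters (rather than keep an allowed set), a plain table is enough. The following is a minimal sketch, assuming "&" is the only character to remove; the name delete_ampersand is just illustrative. It relies on the three-argument form of str.maketrans, where every character of the third argument is mapped to None:

```python
a = '12345&&&4554444'

# str.maketrans(x, y, z): each character in z is mapped to None,
# so translate() deletes it from the result.
delete_ampersand = str.maketrans('', '', '&')
variable = a.translate(delete_ampersand)
print(variable)

# The same table written by hand as a dictionary keyed by code point.
same = a.translate({ord('&'): None})
print(same)
```

Characters that are missing from the dictionary are left unchanged, which is why only "&" disappears here.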

One observation: building a string by repeated concatenation ( cad = "foo" + "bar" ) is especially inefficient because strings are immutable, which means a new object is created each time. str.join is a better alternative, especially together with a generator, since it avoids building all those intermediate strings.

str.translate, regular expressions and the set approach all have the advantage of letting us specify very simply which characters we want to keep in the string.
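
To illustrate that last point, here is a small sketch of a reusable helper built on the set approach. The function name keep_chars and the sample strings are assumptions made for this example, not part of the original code. It also shows one difference worth knowing: str.isdigit accepts some non-ASCII digits (such as the superscript "²"), while an explicit set of allowed characters does not.

```python
import string


def keep_chars(text, allowed):
    """Return text with every character not in allowed removed."""
    allowed = set(allowed)  # set membership tests are fast
    return "".join(c for c in text if c in allowed)


a = '12345&&&4554444'
print(keep_chars(a, string.digits))                  # digits only
print(keep_chars('a1-b2-c3', string.digits + '-'))   # digits and hyphens

# str.isdigit() is also True for characters like superscript two.
mixed = '12²3'
print("".join(c for c in mixed if c.isdigit()))
print(keep_chars(mixed, string.digits))
```

Changing what is kept only requires changing the second argument, not the filtering code.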

    
answered by 17.03.2018 / 18:45
4

Yes, there are several ways. But first of all: building a string character by character, as you do with variable, by concatenating at the end with the += operator is inefficient because of the way Python handles strings.

Since strings in Python are immutable, adding something to one actually creates a new string by copying the previous string plus what you have added; the previous string is then discarded. Repeating this many times means copying the string many times. A list is used instead, because it does allow you to add items at the end (with .append() ) without copying everything each time.

Finally, the resulting list can be converted into a string with the method str.join().
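
As a small illustration of the immutability described above (a sketch with made-up variable names, not code from the question): after += the old string object is untouched, which is visible through a second name that still refers to it.

```python
s = 'abc'
alias = s      # a second name for the same string object
s += 'd'       # builds a new string and rebinds s to it
print(s)       # abcd
print(alias)   # abc: the original object was not modified

# Collect the pieces in a list and join them once at the end.
parts = []
for ch in 'abc':
    parts.append(ch)
result = "".join(parts)
print(result)  # abc
```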

Now let's see different ways to solve the problem.

Using lists

Basically your code, but changing the string variable for a list:

a = '12345&&&4554444'
variable = []
for i in a:
    if i != '&':
        variable.append(i)
variable = "".join(variable)
print(variable)

List comprehensions

You can use list comprehensions, a feature of the Python language that lets you replace some loops with a single line of code. It is not only more compact (and, in my opinion, more readable, although that is a matter of taste) but also slightly faster:

a = '12345&&&4554444'
variable = [i for i in a if i!='&']
variable = "".join(variable)
print(variable)

Functional Programming

If you come from the Lisp world or have a mathematical mind, you may be interested in the functional paradigm, which also lets you eliminate loops by replacing them with functions that receive an iterable and another function as parameters and internally apply that function to each value of the iterable.

This style does not allow anything that cannot also be done with list comprehensions, and in fact Python's creator prefers comprehensions. For that reason, in Python 3 reduce() was moved to a separate module ( functools ) instead of remaining a built-in as it was in version 2. map() and filter() are still built-ins, but they now return iterators instead of lists.

Personally I find the syntax of list comprehensions more elegant, but that is a matter of taste. This would be the functional version:

a = '12345&&&4554444'
variable = filter(lambda i: i!='&', a)
variable = "".join(variable)
print(variable)

In this case, filter() expects two parameters. The second is an iterable. The first is a function that is applied to each element of the iterable. If the function returns a true value, the element is accepted; otherwise it is rejected. What filter() returns is another iterable with the accepted elements (which I then convert into a string with "".join() ).

The first parameter passed to filter() is a lambda, which is nothing more than an ultra-simplified function whose code consists of a single expression; the result of evaluating that expression is the returned value. A lambda is written with the keyword lambda, the name of the parameter (in this case i ), a colon, and the "body" of the function, which is the expression to evaluate. No statements (such as if or for blocks) can be placed in a lambda. Just an expression.
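
To make the relationship between a lambda and an ordinary function concrete, here is a short sketch. The name is_not_ampersand is invented for this example. All three calls pass a function as the first argument to filter(); the last one passes the existing method str.isdigit directly, which keeps only digits instead of rejecting only "&".

```python
a = '12345&&&4554444'


def is_not_ampersand(i):
    # Equivalent to: lambda i: i != '&'
    return i != '&'


with_lambda = "".join(filter(lambda i: i != '&', a))
with_def = "".join(filter(is_not_ampersand, a))
with_method = "".join(filter(str.isdigit, a))

print(with_lambda)
print(with_def)
print(with_method)
```

For this input the three results are identical, because "&" is the only non-digit character in the string.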

String-specific functions

To replace substrings, extract parts that follow a given pattern, "translate" each character into a different one, and so on, Python supplies many methods in the class str and in the module re (regular expressions). I do not include examples here because the other answers already have them.
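
As one more illustration that uses only str methods, here is a hedged sketch based on str.split and str.partition. It assumes, as in the question, that "&" is the only separator present:

```python
a = '12345&&&4554444'

# split() cuts the string at every '&'; consecutive separators
# produce empty strings, which vanish when the pieces are joined.
pieces = a.split('&')
variable = "".join(pieces)
print(pieces)
print(variable)

# partition() splits only at the first '&' and returns three parts.
head, sep, tail = a.partition('&')
print(head)
print(tail.lstrip('&'))
```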

    
answered by 17.03.2018 at 18:49