I need to iterate over a column and remove the special characters and letters. Any ideas? The problem is that they are not always at the beginning of the string.
EXAMPLE:
PC 17572
546RT*5
RESULT:
17572
5465
You can use .map() on that column, passing it a function that keeps only the characters that are digits. I'm not sure whether you want the result to be another string or the final numeric value; I will assume the former.
One such function could be the following, which receives a string such as "X-223A* 14" and returns another string such as "22314".
def dejar_solo_cifras(txt):
    # Keep only the characters for which str.isdigit() is true
    return "".join(c for c in txt if c.isdigit())
This function can be applied to all elements of a column using .map(), as the following example shows:
>>> import pandas as pd
>>> df = pd.DataFrame({"datos": ["PC 17572", "546RT*5"]})
>>> df
      datos
0  PC 17572
1   546RT*5
>>> df.datos = df.datos.map(dejar_solo_cifras)
>>> df
   datos
0  17572
1   5465
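As an alternative to .map() with a custom function, the same cleanup can probably be done with pandas' vectorized string methods. The following is only a sketch under a few assumptions: the column contains only strings, a regular expression is acceptable, and the extra column name datos_num is hypothetical, added just to illustrate the case where you want the numeric value rather than a string. Note that \D and str.isdigit() may differ slightly for some non-ASCII characters.

```python
import pandas as pd

df = pd.DataFrame({"datos": ["PC 17572", "546RT*5"]})

# \D matches any character that is not a digit; each match is replaced
# with the empty string, so only the digits remain.
solo_cifras = df["datos"].str.replace(r"\D", "", regex=True)

# Optional step (assumption: you want numbers, not strings).
# With errors="coerce", values that cannot be parsed become NaN.
df["datos_num"] = pd.to_numeric(solo_cifras, errors="coerce")

# Overwrite the original column with the cleaned strings.
df["datos"] = solo_cifras

print(df)
```

If the column may contain missing values, the .str accessor leaves them as missing instead of raising an error, whereas dejar_solo_cifras would fail when it receives a non-string value.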