Remove accents from a string in Python 3.6

2

I am trying to remove the accents from a string that I get after decrypting a file. Searching on Google, I found that unicodedata.normalize('NFD', string) is used to remove accents, but when I use it the accents are not removed. This is the code I have:

import unicodedata
import gnupg
path = 'path to the encrypted file'
gpg = gnupg.GPG(gpghome='~/.gnupg')
data = gpg.decrypt_file(open(path, 'rb'))
data = unicodedata.normalize('NFD', str(data))

When I print the data variable I get the following:

print(data)
>>> Roman González

The encrypted file is a JSON that contains the following:

{
    "name": "Roman González"
}
    
asked by Roman González 26.03.2018 at 20:23

2 answers

0

After investigating a bit more, I found the solution: encode the string with raw_unicode_escape and then decode it as utf-8.

data.encode('raw_unicode_escape').decode('utf-8')
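
A minimal sketch of why this can work, assuming the decrypted bytes were UTF-8 but had been decoded as Latin-1 somewhere along the way (that is an assumption, the question does not show the raw bytes). The garbled string is rebuilt by hand here instead of coming from gpg:

```python
# Assumption: the text was UTF-8 bytes wrongly decoded as Latin-1.
original = "Roman Gonz\u00e1lez"
garbled = original.encode("utf-8").decode("latin-1")  # 'Roman GonzÃ¡lez'

# raw_unicode_escape maps every character below U+0100 back to the byte
# with the same value, which recovers the original UTF-8 bytes.
repaired = garbled.encode("raw_unicode_escape").decode("utf-8")

print(repaired == original)  # True
```

Under the same assumption, encode('latin-1') would give the same result; raw_unicode_escape only differs for characters above U+00FF, which it writes as \uXXXX escapes instead of raising an error. Note that this repairs the encoding but does not remove the accents.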


    
answered 26.03.2018 at 21:15
0

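NFD normalization does not delete the accent; it splits the accented letter into a base letter followed by a combining accent. When you print the string, the terminal puts those two Unicode characters back together, so it still looks like an accented a (á). If you check, you will see that there are two characters, so you only have to keep the first one: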

def normalize(c):
    return unicodedata.normalize("NFD",c)[0]

data = ''.join(normalize(c) for c in str(data))
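
To check the "two characters" claim yourself, a small sketch like the following can help. The variable names are only illustrative:

```python
import unicodedata

composed = "\u00e1"  # 'á' as a single code point
decomposed = unicodedata.normalize("NFD", composed)

print(len(composed))    # 1
print(len(decomposed))  # 2

# The first character is the plain letter, the second is the accent.
print([unicodedata.name(c) for c in decomposed])
# ['LATIN SMALL LETTER A', 'COMBINING ACUTE ACCENT']
```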

Another possible solution is to normalize the whole string and then drop the leftover combining characters by ignoring everything that is not ASCII:

data = unicodedata.normalize("NFD", str(data))
data = data.encode("utf8").decode("ascii","ignore")
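
Keep in mind that ignoring non-ASCII characters also drops letters that have no decomposition, such as ß or ø. A variant that only removes the combining marks could look like the sketch below; strip_accents is a hypothetical helper name, not something from the standard library:

```python
import unicodedata

def strip_accents(text):
    """Return text without combining marks (accents, tildes, diaeresis)."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for base characters and a
    # non-zero class for combining marks, so only the marks are skipped.
    return "".join(c for c in decomposed if not unicodedata.combining(c))

print(strip_accents("Roman Gonz\u00e1lez"))  # Roman Gonzalez
```

With this version, strip_accents("Stra\u00dfe") keeps the ß, while the encode/decode approach above would silently remove it.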
    
answered 03.07.2018 at 11:58