Problem when comparing two unicode variables in Python 2.7

I have a problem comparing two unicode variables in Python 2.7. My program opens a file containing characters (specifically Japanese characters) and compares them with a string of the same kind entered by the user.

For some reason the program seems to skip the comparison, even though I have already checked that both variables are of the same type.

Here is the portion of the code I'm working with.

    # -*- coding: utf-8 -*-
    na = []
    n = open("jap.txt", "r")
    for linea in n:
            na.append(linea)
    n.close()
    ingreso = unicode(raw_input("Respuesta: "), "utf-8")
    if (ingreso == na[1]):
            print "Correcto"
    else:
            print "Incorrecto"
    
asked by Kazeazul 30.09.2017 at 02:53

1 answer

First, I'm going to assume that your jap.txt file uses UTF-8 as encoding.

The first problem is that you do not remove the line breaks when reading your txt file. They have to be removed before you can compare a line with the input. A line from your file will look like "ず\n" or "ず\r\n", so if your input is "ず", the comparison will always be False. To fix this, it is enough to use the strip method.
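
As a small illustration of the point above, here is a minimal sketch. The values of `linea` and `ingreso` are made up for the example and stand in for a line read from the file and the decoded user input:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Hypothetical values: a line as it comes out of the file, and the user's input
linea = u"ず\n"
ingreso = u"ず"

# The trailing line break makes the two strings different
print(linea == ingreso)          # False

# strip() removes leading and trailing whitespace, including \n and \r\n
print(linea.strip() == ingreso)  # True
```

If leading or trailing spaces in the file are meaningful, `rstrip(u"\r\n")` removes only the line break.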

On the other hand, you should read the file with the appropriate encoding, so that each element of your list of lines is a unicode object decoded from UTF-8 and can be compared with your input. The simplest way is to use the codecs module from the standard library.
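
To see why the decoding matters: in Python 2, a file opened with plain `open()` yields byte strings (`str`), while `unicode(raw_input(...), "utf-8")` yields a `unicode` object. When the byte string contains non-ASCII bytes, `==` treats the two as unequal and emits a `UnicodeWarning`. The sketch below, with a made-up sample character, shows the difference between the raw bytes and the decoded text:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

texto = u"ず"                   # decoded text (a unicode object in Python 2)
crudo = texto.encode("utf-8")   # the raw bytes, as stored in a UTF-8 file

print(len(texto))   # 1: one character
print(len(crudo))   # 3: three bytes in UTF-8

# Only after decoding do the bytes compare equal to the text
print(crudo.decode("utf-8") == texto)  # True
```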

The code would look something like this:

    # -*- coding: utf-8 -*-

    import codecs

    # Decode the file as UTF-8 and strip the line breaks from every line
    with codecs.open("jap.txt", "r", encoding='utf-8') as f:
        na = [linea.strip() for linea in f]

    # Assumes the terminal also sends the input as UTF-8
    ingreso = unicode(raw_input("Respuesta: "), 'utf-8')
    if ingreso == na[1]:
        print "Correcto"
    else:
        print "Incorrecto"
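
As a possible variant, `io.open` from the standard library also accepts an `encoding` argument and behaves the same way in Python 2.7 and Python 3. The sketch below is only an illustration: the helper name `leer_lineas`, the temporary file, and its contents are all hypothetical, and a fixed string stands in for the user's input so the example runs on its own:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

import io
import os
import tempfile


def leer_lineas(ruta, encoding="utf-8"):
    """Return the file's lines as decoded text, without surrounding whitespace."""
    with io.open(ruta, "r", encoding=encoding) as f:
        return [linea.strip() for linea in f]


# Build a small throwaway file so the sketch is self-contained
fd, ruta = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(ruta, "w", encoding="utf-8") as f:
    f.write(u"あ\nず\nん\n")

na = leer_lineas(ruta)
os.remove(ruta)

ingreso = u"ず"  # stands in for the decoded user input
print("Correcto" if ingreso == na[1] else "Incorrecto")
```

If your terminal does not use UTF-8, decoding the result of `raw_input` with `sys.stdin.encoding` instead of a hard-coded "utf-8" may be more robust. That depends on your environment, so treat it as an assumption to check.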

[GIF: example of a real run, with the content of jap.txt shown on the left. The input is always compared to the second line of the file, as in your code.]

    
answered by 30.09.2017 at 09:51