Problem with the "encoding": Characters correctly stored in MySQL appear "rare" in Java

9

Dear, I have a small problem with a Java application that I am writing.

It is an application that connects via JDBC to MySQL. When saving the records, the characters are saved correctly (if I write "Mexico" in a JText box and send it to the database, it is saved as "Mexico"); but when reading them back using the ResultSet.getString() method the "special" characters (accents and "ñ") appear wrong ("Mexico" reads "Mexico").

I think it's something that has to do with the "encoding" of the characters, but I do not know specifically what it is .. The MySQL database uses the encoding utf8_spanish_ci , and the function Charset.defaultCharset() returns UTF-8 .

My specific question is, then: How to get MySQL read strings that contain special characters (that were correctly stored) to be displayed correctly in the Java application?

Update (partial solution):

After looking for a while, I found this question and its answer that helped me. Specifically, what it says is that when you open the connection you have to specify the set of characters that will be used; in my case:

DriverManager.getConnection(
           "jdbc:mysql://" + host + "/" + dbName 
           + "?useUnicode=true&characterEncoding=utf8_spanish_ci", 
    user, pass);

However, it only solves partially the problem:

When reading data from fields VARCHAR the special characters are read correctly. However, when reading fields JSON , values that have special characters are still shown "rare".

Update (final):

The problem has to do with the encoding that MySQL uses to save the JSON data; The answer below illustrates the procedure I followed to solve it.

    
asked by Barranka 30.04.2016 в 18:02
source

1 answer

6

After scratching the problem for a while, I found this reference in the MySQL user manual:

  

MySQL handles strings used in JSON context using the utf8mb4 character set and utf8mb4_bin collation. Strings in other character set are converted to utf8mb4 as necessary.

So, regardless of the character encoding I use, MySQL automatically converts the JSON string to utf8mb4 ... which is no major problem when saving, but when reading back: (

My solution (and I suspect it is not the best), was the following: Write, within the query, the conversion to the required coding. Something like this:

strSQL = "select convert(cast(a.json_data as char) using 'utf8') as json_data "
       + "from mi_tabla "
       + "where id = ?";
try(PreparedStatement ps = conn.prepareStatement(strSQL)) {
    /*
       Mi código para leer la tabla
     */
}

With this "adjustment", the data is read perfectly (with all the correct accented characters).

I guess there may be a simpler way to solve this problem, but so far this solution has helped me.

    
answered by 02.05.2016 / 20:02
source