The first thing to clarify is that a Java byte is signed and its range of values is [-128, 127], so the largest positive number that fits in one byte is 127. This is because the eighth bit (the leftmost one) is used to indicate the sign.
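A minimal sketch of that range, in case it helps to see it. The class name ByteRangeDemo and the helper toUnsigned are made up for this illustration; the mask with 0xFF is the usual way to look at a signed byte as a value between 0 and 255.

```java
public class ByteRangeDemo {
    // Reinterprets a signed Java byte as an unsigned value in [0, 255].
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE); // -128
        System.out.println(Byte.MAX_VALUE); // 127

        // 195 does not fit in a signed byte, so the cast wraps around.
        byte b = (byte) 195;
        System.out.println(b);             // -61
        System.out.println(toUnsigned(b)); // 195
    }
}
```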
Because of this, a single byte cannot represent more than that number of characters. For this reason Java uses UTF-16 for its characters, each char taking 2 bytes (Java 9 introduced an optimization in the internal representation of String that avoids using 2 bytes per character when it is not necessary). When such a character is written out as a sequence of bytes, a small algorithm is applied to the character code. The byte values in the example that follows are the ones produced by the UTF-8 encoding.
An example with the á:
- Its code is 225 in decimal (greater than 127, so it has to be split across bytes), and 0000-0000-1110-0001 in binary (using 2 bytes).
- The first 5 bits (left to right) are discarded because they are not needed, leaving 000-1110-0001.
- The first 5 of the remaining bits are placed on the x's of 110x-xxxx. This gives the first byte, 1100-0011, which is 195 in decimal.
- The last 6 bits are placed on the x's of 10xx-xxxx. This gives the second byte, 1010-0001, which is 161 in decimal.
So the representation of the á would be [195 161].
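The steps above can be sketched in code as follows. This is only an illustration of the two-byte case of UTF-8 (code points from 0x80 to 0x7FF); the class Utf8TwoByteDemo and the method encodeTwoBytes are hypothetical names, and the result is cross-checked against String.getBytes with StandardCharsets.UTF_8. The á is written as \u00E1 so the example does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

public class Utf8TwoByteDemo {
    // Encodes a code point in the range 0x80..0x7FF as two UTF-8 bytes,
    // returned as unsigned values (0..255). Hypothetical helper for this example.
    static int[] encodeTwoBytes(int codePoint) {
        if (codePoint < 0x80 || codePoint > 0x7FF) {
            throw new IllegalArgumentException("not a two-byte UTF-8 code point");
        }
        int first = 0b1100_0000 | (codePoint >> 6);    // 110x-xxxx gets the upper 5 bits
        int second = 0b1000_0000 | (codePoint & 0x3F); // 10xx-xxxx gets the lower 6 bits
        return new int[] { first, second };
    }

    public static void main(String[] args) {
        int[] manual = encodeTwoBytes('\u00E1'); // á, code 225
        System.out.println(manual[0] + " " + manual[1]); // 195 161

        // Cross-check against the JDK's own UTF-8 encoder.
        byte[] jdk = "\u00E1".getBytes(StandardCharsets.UTF_8);
        System.out.println((jdk[0] & 0xFF) + " " + (jdk[1] & 0xFF)); // 195 161
    }
}
```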
This is why, when you read a byte with the System.in.read() method, you only get the first byte of the representation of the á, which is 195.
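You can reproduce this without typing anything. The sketch below assumes the input arrives encoded as UTF-8 and simulates System.in with a ByteArrayInputStream; FirstByteDemo and readFirstByte are names invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class FirstByteDemo {
    // Encodes the text as UTF-8 and reads a single byte from it,
    // the same way a single call to System.in.read() would.
    static int readFirstByte(String text) {
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        try {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Only the first of the two bytes of á comes back.
        System.out.println(readFirstByte("\u00E1")); // 195
    }
}
```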
If what you want is to read characters, you can use a Scanner, as follows.
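import java.util.Scanner;

// Scanner decodes the incoming bytes into characters for you
try (Scanner scanner = new Scanner(System.in)) {
    String cadena = scanner.next(); // next whitespace-delimited token
    System.out.println(cadena);
} catch (Exception e) {
    e.printStackTrace();
}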
Notice that Scanner has no method to read just one character; it reads whole tokens or lines.
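One common way to get a single character out of a Scanner is to read a token and take its first char. A small sketch, assuming the token is not empty; ScannerCharDemo and firstChar are hypothetical names, and the Scanner is built from a String here only so the example runs without keyboard input.

```java
import java.util.Scanner;

public class ScannerCharDemo {
    // Reads the next token and returns its first character.
    static char firstChar(Scanner scanner) {
        return scanner.next().charAt(0);
    }

    public static void main(String[] args) {
        // With real input you would pass new Scanner(System.in) instead.
        try (Scanner scanner = new Scanner("\u00E1rbol")) {
            char c = firstChar(scanner);
            System.out.println((int) c); // 225
        }
    }
}
```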
The InputStreamReader.read() method, on the other hand, does work for you: it decodes the incoming bytes into a character and returns it as an int, which has a capacity of 4 bytes, more than the 2 bytes of the character you enter.
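As an illustration, the sketch below feeds the two bytes [195 161] to an InputStreamReader and gets back 225, the code of the á. It assumes the bytes are UTF-8 and passes that charset explicitly; ReaderDemo and readFirstChar are names made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReaderDemo {
    // Decodes the given UTF-8 bytes and returns the first character as an int
    // (or -1 if there is nothing to read).
    static int readFirstChar(byte[] utf8) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            return reader.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = { (byte) 195, (byte) 161 }; // the two bytes of á
        System.out.println(readFirstChar(bytes));  // 225
    }
}
```

With real input you would wrap System.in instead of the ByteArrayInputStream, using the charset your console actually sends.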