Assign a floating point literal without a suffix to a float


Given that in C/C++ floating-point literals without a suffix are of type double by default, assigning such a literal to a float performs an implicit conversion from double to float.

float n = 3.14;

if(n == 3.14f)
    puts("Igual");

This does not print the message. In contrast, if we add the suffix f to 3.14 in the declaration of n to avoid the conversion, it is printed. The question is: does a loss of precision occur when the conversion takes place?
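The literal types that the question relies on can be checked at compile time. The following is a minimal sketch assuming a C11 compiler (for `_Generic`); the macro `TYPE_NAME` and the function `print_literal_types` are made-up names for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: maps an expression to the name of its type (C11). */
#define TYPE_NAME(x) _Generic((x), \
    float: "float",                \
    double: "double",              \
    long double: "long double",    \
    default: "other")

/* Prints the type of each kind of floating-point literal. */
void print_literal_types(void) {
    printf("3.14  is a %s\n", TYPE_NAME(3.14));   /* no suffix */
    printf("3.14f is a %s\n", TYPE_NAME(3.14f));  /* f suffix  */
    printf("3.14L is a %s\n", TYPE_NAME(3.14L));  /* L suffix  */
}
```

Calling `print_literal_types()` from `main` should report the unsuffixed literal as a double, which is why the initialization in the question involves a conversion.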

asked by cheroky 25.10.2016 at 04:59

3 answers




Does loss of precision occur when the conversion takes place?

Yes, a loss of precision occurs.

Narrowing.

The floating-point types are:

  • float: single precision. Usually a 32-bit type that follows the IEEE 754 32-bit format.
  • double: double precision. Usually a 64-bit type that follows the IEEE 754 64-bit format.
  • long double: extended precision. Often an 80-bit type on 32-bit and 64-bit x86 architectures. It does not necessarily follow IEEE 754.
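Since these widths are only typical, the actual sizes and precisions on a given platform can be queried from `<float.h>`. This is a small sketch; the function name `print_float_types` is invented for the example.

```c
#include <float.h>
#include <stdio.h>

/* Prints the size and precision of each floating-point type on this platform.
   *_MANT_DIG is the number of significand digits in base FLT_RADIX,
   *_DIG the number of decimal digits that survive a round trip. */
void print_float_types(void) {
    printf("radix: %d\n", FLT_RADIX);
    printf("float:       %zu bytes, %d significand digits, %d decimal digits\n",
           sizeof(float), FLT_MANT_DIG, FLT_DIG);
    printf("double:      %zu bytes, %d significand digits, %d decimal digits\n",
           sizeof(double), DBL_MANT_DIG, DBL_DIG);
    printf("long double: %zu bytes, %d significand digits, %d decimal digits\n",
           sizeof(long double), LDBL_MANT_DIG, LDBL_DIG);
}
```

The output depends on the compiler and architecture, so no particular values are assumed here.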

Each time you go from a more precise type to a less precise one, a narrowing occurs, and every time a narrowing occurs, data can be lost.

Type conversion.

According to the C standard, in the section "Real floating types" (translation and emphasis mine):

  • When a float is promoted to double or long double, or when a double is promoted to long double, its value does not change.
  • When a double is demoted to float, a long double is demoted to double or float, or a value represented in greater precision and range than that required by its semantic type is explicitly converted to that semantic type: if the value being converted can be represented exactly in the new type, it is unchanged. If the value being converted is within the range of representable values but cannot be represented exactly, the result is either the nearest higher or the nearest lower representable value; which one is chosen is implementation-defined. If the value being converted is outside the range of representable values, the behavior is undefined.
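The two rules can be tried out with a pair of round-trip checks. This is only a sketch: the function names are invented, and the sample values in the note after the code assume a binary floating-point format such as IEEE 754.

```c
#include <stdbool.h>

/* True if converting d to float and back yields the same double,
   i.e. the demotion did not change the value (not meaningful for NaN). */
bool survives_demotion(double d) {
    volatile float f = (float)d;  /* volatile: make sure it is really stored as a float */
    return (double)f == d;
}

/* True if promoting f to double and converting back yields the same float. */
bool survives_promotion(float f) {
    double d = f;                 /* promotion: the value does not change */
    return (float)d == f;
}
```

With a binary format, `survives_demotion(0.5)` and `survives_demotion(3.25)` hold because both values are sums of powers of two, whereas `survives_demotion(0.1)` does not, since 0.1 has no exact binary representation and the float keeps fewer digits than the double.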

    In your case, assigning the value 3.14 to a float is a narrowing, but since the value 3.14 is representable by float, the value will not change.

    What's wrong?

    If the value has not changed, why does your code fail?

    float n = 3.14;
    if(n == 3.14f)
       puts("Igual"); // not printed!

    Because I lied: the value 3.14 is NOT exactly representable by float. Some numbers cannot be represented exactly in binary floating point; this is due to the properties of each base (see note 1).

    The value 3.14 is not exactly representable in binary, and with double precision its value is approximately 3.14000000000000012434497875802. When you store it in a float you lose some precision. How much exactly? That will depend on your system.

    So you will be comparing a number close to the narrowed double value 3.14000000000000012434497875802 against a number close to 3.14f, and in many cases these will not be the same number. For example, the literal 3.14 as a float is approximately 3.1400001049041748046875, so your comparison would be, more or less:

    if(3.1400000000000001243449 == 3.1400001049041748046875) // the double has been truncated

    which evidently is not an equality.
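The stored values can be inspected by printing them with more digits than the types can hold. The sketch below assumes nothing beyond standard C; `show_comparison` is an invented name, and the digits printed depend on the platform's floating-point format.

```c
#include <stdio.h>

/* Prints the values stored for 3.14 as float and as double,
   followed by the result of comparing n against each kind of literal. */
void show_comparison(void) {
    float n = 3.14;  /* double literal, implicitly converted to float */

    printf("n (as double): %.25f\n", (double)n);
    printf("3.14         : %.25f\n", 3.14);
    printf("n == 3.14    : %d\n", n == 3.14);   /* n is promoted to double first */
    printf("n == 3.14f   : %d\n", n == 3.14f);  /* may depend on how the platform evaluates floats */
}
```

On IEEE 754 platforms the comparison against the double literal is expected to yield 0, because promoting n back to double does not restore the digits lost in the narrowing.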

    It's terrible! What can I do?

    As eferion says, you should avoid comparing floating-point numbers for equality. Because of rounding errors, you should compare them for near-equality instead; a function like this could help:

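    #include <float.h>    // FLT_EPSILON, DBL_EPSILON
    #include <math.h>     // fabsf, fabs
    #include <stdbool.h>

    // C has no function overloading, so each type gets its own name.
    bool casi_iguales_float(float izquierda, float derecha) {
        return fabsf(izquierda - derecha) <= FLT_EPSILON;
    }
    bool casi_iguales_double(double izquierda, double derecha) {
        return fabs(izquierda - derecha) <= DBL_EPSILON;
    }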

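    The values FLT_EPSILON and DBL_EPSILON are the difference between 1 and the next value representable by float and double respectively; in other words, they are the spacing between adjacent representable values near 1 (not the smallest representable value of the type). If the difference between izquierda and derecha is less than or equal to this value, both values are considered almost equal. Because the spacing between representable values grows with magnitude, this fixed tolerance is only suitable for operands whose magnitude is around 1.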

    1 For example, 1/3 in base 10 is a purely repeating fraction with value 0.333333... while in base 12 it is exactly 0.4. In decimal the value 1/10 is exactly 0.1, but in binary it is a repeating fraction with value 0.00011001100110011...
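The expansions in this note can be reproduced with integer long division in the target base. This is an illustrative sketch; `fraction_digits` is an invented helper, and it assumes num < den and small operands so that the multiplication does not overflow.

```c
#include <stddef.h>

/* Writes up to `count` fractional digits of num/den (with num < den)
   in the given base (2..36) into out as a NUL-terminated string.
   Stops early if the expansion terminates. out must hold count + 1 chars. */
void fraction_digits(unsigned num, unsigned den, unsigned base,
                     size_t count, char *out) {
    static const char symbols[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    size_t i = 0;
    while (i < count && num != 0) {
        num *= base;                    /* shift one digit to the left */
        out[i++] = symbols[num / den];  /* the integer part is the next digit */
        num %= den;                     /* keep the remaining fraction */
    }
    out[i] = '\0';
}
```

With a 17-digit limit, `fraction_digits(1, 10, 2, 17, buf)` produces the digits `00011001100110011` and never reaches a zero remainder, whereas `fraction_digits(1, 3, 12, 17, buf)` stops after the single digit `4`.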

    answered by 25.10.2016 at 10:13

    Floating-point numbers should never be compared using the equality operator.

    The reason is that these numbers have limited precision, and the digits beyond that precision are practically random.

    The correct way is to make the comparison allowing for a certain margin of error:

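    // requires #include <math.h>
    float v1, v2;  // assumed to already hold the values to compare
    if (fabsf(v1 - v2) < 1e-4f) {
      // the numbers are considered equal
    } else {
      // the numbers are considered different
    }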

    You can adjust the tolerance to suit your needs. The example is purely illustrative.


    answered by 25.10.2016 at 07:13

    Here is a very illustrative example of what happens, and it is really worth explaining.

    You have two uninitialized memory areas like this:

    Area A:

    | qwerytuiopasdfhgjklzxcvbnm1234567890 |

    Area B:

    | qwerytuiopasdfhgjklzxcvbnm1234567890 |

    Then you put a float in A:

    | 3.14           | // Note how float is smaller.

    And a double in B:

    | 3.14                                 |

    So when comparing A with B, you end up comparing:

    | 3.14                                 |

    against:

    | 3.14           gjklzxcvbnm1234567890 |

    These are different. I hope you found this instructive.
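The two areas from the diagrams can also be inspected as raw bits. The sketch below assumes a 32-bit float and a 64-bit double (checked at compile time, C11); `float_bits` and `double_bits` are invented names. Note that when a float is compared with a double, C first converts the float to double, so the extra low-order bits of the widened value are zeros rather than leftover memory contents; it is those bits that differ from the double's own low-order bits.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "assumes a 32-bit float");
_Static_assert(sizeof(double) == sizeof(uint64_t), "assumes a 64-bit double");

/* Returns the raw bit pattern of a float. */
uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);  /* well-defined way to reinterpret the bytes */
    return u;
}

/* Returns the raw bit pattern of a double. */
uint64_t double_bits(double d) {
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}
```

Printing `double_bits((double)3.14f)` next to `double_bits(3.14)` in hexadecimal (for instance with the `PRIx64` format macro from `<inttypes.h>`) shows where the two values diverge.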

    answered by 25.10.2016 at 07:43