Declare array inside loop vs empty it on each loop

4

I wonder whether in C/C++ it is better to declare a new array on every iteration of a loop:

while(true) {
  char array[255] = "";

  // Do something with the array ...
}

Or to empty it on every iteration:

char array[255] = "";

while(true) {
  for(short i = 0; i < sizeof(array); i++) {
    array[i] = 0;
  }

  // Do something with the array...
}

By "better" I mean both in terms of performance (speed) and good practice.

    
asked by Hewbot 09.04.2016 at 19:36

4 answers

5

As far as good practice is concerned, it is best to declare the variable in the smallest, most restricted scope you can (in this case inside the loop, if it is not going to be used anywhere else). This makes the code easier to maintain (e.g., you do not have to scroll up 100 lines to see what type the variable x has or how it was initialized), and in some cases it lets the compiler apply additional optimizations.

From the performance point of view, it depends to a large extent on the compiler (and the language) you use, and you would have to run tests to see which option is better. Modern compilers take care of optimizing the generated code, so the result should be very similar (if not identical) in both cases.

... That was the theory; now for the practice. I wrote three test cases (perhaps not in the most scientific way), each with a loop that repeats 10 million times, and ran each one 20 times to see the results. This is the code:

Case 1: Declaration inside the loop

#include <stdio.h>

int main() {
    int x = 0;
    for (x = 0; x < 10000000; x++) {
        char array[255] = "";
        array[0] = (char) (65 + (x%23));
        //printf("%s\n", array);
    }
    return 0;
}

Case 2: Declaration outside the loop, emptied with a for loop

#include <stdio.h>

int main() {
    int x = 0;
    char array[255];
    for (x = 0; x < 10000000; x++) {
        int y;
        for (y = 0; y < 255; y++) { 
            array[y] = 0;
        }
        array[0] = (char) (65 + (x%23));
        //printf("%s\n", array);
    }
    return 0;
}

Case 3: Declaration outside the loop, emptied with memset

#include <stdio.h>
#include <string.h>

int main() {
    char array[255];
    int x = 0;
    for (x = 0; x < 10000000; x++) {
        memset(array,0,255);
        array[0] = (char) (65 + (x%23));
        //printf("%s\n", array);
    }
    return 0;
}

And the results were (drum roll):

  • Case 1 - 0.1587 seconds
  • Case 2 - 6.1470 seconds
  • Case 3 - 0.1413 seconds

Which is a bit frustrating, because it knocks down part of the theory I laid out above. The theory holds for the first and third cases but not for the second (surely I did something wrong ... or I have the worst compiler in the world, which is also quite possible).

    
answered 11.04.2016 at 13:30
1

Good practice says that the scope of a variable should be as small as possible. The aim of this recommendation is to avoid reusing the same variable for different purposes.

In terms of performance, it seems natural to conclude that forcing the program to reserve and initialize X bytes of memory on every iteration of a loop will be more expensive than reserving them once and only initializing them on each iteration.

That said, performance is often a controversial issue. In the example you propose, can the time the system spends reserving 255 bytes on each iteration be considered significant? The final answer depends on the requirements of the algorithm. If, for example, this function takes half a second and only runs when the user presses a button, that time can safely be considered negligible ... if, on the other hand, time is critical and the rest of the algorithm is heavily optimized, things change.

My recommendation is to worry first about writing readable code and, once it is finished, to decide whether it needs optimizing or whether it already meets the specifications.

By the way, to initialize memory you can use memset instead of creating your own loop.

    
answered 11.04.2016 at 10:02
1

I assume the complexity is linear in both cases, and that reserving and initializing the memory costs the same, or about the same, on each iteration; but it is hard to evaluate, since the compiler can make its own decisions when optimizing the code.

It seems that you are facing a case of premature optimization:

  

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies.

If you consider that part of the code critical, my advice is to test both alternatives under different compiler optimization settings. We will never be smarter than the compiler ... even code that we have written inefficiently can be translated into something faster behind the scenes. I bet that even in the worst case your 255-element array will not be a bottleneck in your program, and that the compiler will do a good job of optimizing that code.

Edited:

On the other hand, depending on how you use array, you can skip the initialization step:

while(true) {
    char array[255];

    // Do something with the array ...
}

If you do not give the array an initial value, it will not be initialized to 0, so there is one step less to perform on each iteration (you do not pay for what you do not use; this is a C++ principle). This is useful if the first thing you do with array is write to it, so that its previous contents are irrelevant ... it could even lead the compiler to reuse the space of array on each iteration instead of creating it again.

    
answered 11.04.2016 at 09:02
0

Keep in mind that creating an array local to a function is trivial for compilers in most cases, since it is a simple adjustment of the size of the stack.

On many architectures, adjusting the size of the stack takes very few machine instructions and does not access memory. The simple act of 'cleaning' an array with array[0] = 0 can involve that dreaded memory access (it depends on how the array is used inside the loop, that is, on whether the most recent accesses have left array[0] in the processor's cache lines).

The curious figure obtained when cleaning the array with memset() is possibly due to the compiler. gcc, for example, can implement memset() with machine code generated inline, without calling any real function. This, along with its other optimizations, would account for such unexpected speed.

Compare that with the array[0]=0 case: there we do force the compiler to generate code; however well they optimize, compilers still do not know how to write programs by themselves ;-)

I admit that the speed of memset() surprised me.

    
answered 08.11.2016 at 20:13