DATE () vs DATE_FORMAT () in MySQL

5

The doubt came to me by answering a question here on Stackoverflow.

The aim was to obtain only the date (without the time) of a column DATETIME , to compare it using BETWEEN .

Obtaining that date can be done:

  • Using DATE : SELECT DATE(col_fecha) fecha;
  • Using DATE_FORMAT : SELECT DATE_FORMAT(col_fecha,'%Y-%m-%d') fecha;

Are there differences in the use of one or another function, for example in the face of performance ?

The MySQL documentation does not say anything about it, hence my doubt.

    
asked by A. Cedano 25.10.2017 в 12:49
source

2 answers

8

Checking the MySQL source code in the conversion functions of DATETIME a DATE and vice versa you can see that converting a field DATETIME to a field DATE is simple, just change a property of the class MYSQL_TIME and the work is practically done:

Conversion example of TIME to DATETIME :

void time_to_datetime(THD *thd, const MYSQL_TIME *ltime, MYSQL_TIME *ltime2)
{
  thd->variables.time_zone->gmt_sec_to_TIME(ltime2,
    static_cast<my_time_t>(thd->query_start()));
  ltime2->hour= ltime2->minute= ltime2->second= ltime2->second_part= 0;
  ltime2->time_type= MYSQL_TIMESTAMP_DATE;
  mix_date_and_time(ltime2, ltime);
}

Sample data mix DATETIME and TIME :

void mix_date_and_time(MYSQL_TIME *ldate, const MYSQL_TIME *ltime)
{
  DBUG_ASSERT(ldate->time_type == MYSQL_TIMESTAMP_DATE ||
              ldate->time_type == MYSQL_TIMESTAMP_DATETIME);

  if (!ltime->neg && ltime->hour < 24)
  {
    /*
      Simple case: TIME is within normal 24 hours internal.
      Mix DATE part of ltime2 and TIME part of ltime together.
    */
    ldate->hour= ltime->hour;
    ldate->minute= ltime->minute;
    ldate->second= ltime->second;
    ldate->second_part= ltime->second_part;
  }
  else
  {
    /* Complex case: TIME is negative or outside of 24 hours internal. */
    longlong seconds;
    long days, useconds;
    int sign= ltime->neg ? 1 : -1;
    ldate->neg= calc_time_diff(ldate, ltime, sign, &seconds, &useconds);
    DBUG_ASSERT(!ldate->neg);

    /*
      We pass current date to mix_date_and_time. If we want to use
      this function with arbitrary dates, this code will need
      to cover cases when ltime is negative and "ldate < -ltime".
    */
    DBUG_ASSERT(ldate->year > 0);

    days= (long) (seconds / SECONDS_IN_24H);
    calc_time_from_sec(ldate, seconds % SECONDS_IN_24H, useconds);
    get_date_from_daynr(days, &ldate->year, &ldate->month, &ldate->day);
  }
  ldate->time_type= MYSQL_TIMESTAMP_DATETIME;
}

This last step is a little more complex because in case of having negative hours or outside the 24 hour range, the date is recalculated.

Link to the date and time format analysis function ( parse_date_time_format ) .

Empirical evidence:

Empirical tests can be done to verify that the hypothesis is correct ( see test online ):

CREATE TABLE IF NOT EXISTS pruebas (
  id int(11) NOT NULL AUTO_INCREMENT,
  fecha datetime NOT NULL,
  PRIMARY KEY (id)
);

INSERT INTO pruebas (fecha) VALUES
('2017-10-25 00:00:00'),
('2017-10-26 11:33:47'),
...
('2017-10-25 00:00:00'),
('2017-10-26 11:33:47');

SELECT COUNT(*) fecha
FROM pruebas p
GROUP BY p.fecha;

SELECT COUNT(*) DATE
FROM pruebas p
GROUP BY DATE(p.fecha);

SELECT COUNT(*) DATE_FORMAT
FROM pruebas p
GROUP BY DATE_FORMAT(p.fecha, "%Y-%m-%d");

Results:

                Mínimo    Media     Máximo
Normal         0.715742  0,966951  1.126992
DATE()         0.721084  0,958496  1.117721
DATE_FORMAT()  1.979319  2,385409  2.758417

It can be seen that there is hardly any appreciable difference between obtaining normal data and those treated with the function DATE() (even in the sample of 5 results it seems faster to use DATE ), however that there is a more noticeable difference when using the DATE_FORMAT() function.

In summary: After the overhead of analyzing the structure of the query, resolution flow, etc., finally a date data conversion is much faster than any call to a function that has as an additional overhead the calculation of the text format that must be returned.

    
answered by 25.10.2017 / 13:04
source
4

I guess that Date() : SELECT DATE(col_fecha) fecha; is faster than using DATE_FORMAT , that is.

When you use Date() , you are not passing arguments, the format has it defined, instead, Date_format() has to evaluate the format of the date to each record, for what you want or not, it will be slower , and it is something that would be noticed at large scales of data.

The advantage of Date_format() compared to Date() is that you can define your format, Date as such does not allow you to select it.

I clarify that this is a own deduction , I can be wrong.

    
answered by 25.10.2017 в 12:52