Remove duplicates based on their date

2

I have a table called noarribados, it has the following structure, the fact is that I have several repeated records (varies the date and the status_bboo) I am interested in having only one (with the most recent_bboo status) I know that this is a function of the most recent date, how do I eliminate duplicates? we can say that the base has 8 million records (and use indices to make the process faster) the date format is year, month, day with hours, minutes and seconds

This is the file that is loaded in the aforementioned database

    
asked by sanlegas 19.04.2018 в 05:26
source

2 answers

2

To get the latest or most recent record, according to date, for each record, you can select them by grouping your MAX(fecha) per record.

SELECT tu_identificador, MAX(fecha) AS ultima_fecha FROM tabla GROUP BY tu_identificador;

Then, to make the DELETE you require, you store in memory the list of identifiers + last_date obtained, and remove from the table all those that do not match that pair. Something like:

DELETE FROM tabla WHERE CONCAT(tu_identificador, ';', fecha) NOT IN ('id1;2018-04-19 23:30:10', 'id2;2018-04-12 15:14:11', 'id3;2018-04-10 03:15:45'); -- y etc.

Doing this directly in Mysql, it could be something like that (although I'm not sure if it turns out ... especially because it can become recursive, and end up eliminating all XDDD):

DELETE FROM tabla WHERE (tu_identificador, fecha) NOT IN (SELECT tu_identificador, MAX(fecha) FROM tabla GROUP BY tu_identificador);
    
answered by 19.04.2018 в 14:14
1

Try using DISTINCT to only bring unique records, in the following table you can see how I have several records but some repeated by dates

+------------+-------------+
| fecha      | descripcion |
+------------+-------------+
| 2018-04-18 | hola        |
| 2018-04-18 | holaa       |
| 2018-04-17 | holaa       |
| 2018-04-17 | holaas      |
| 2018-04-16 | holaas      |
+------------+-------------+

One of the ways and I will use examples for it is like this:

SELECT DISTINCT fecha FROM ventas;

That the following returns to me:

+------------+
| fecha      |
+------------+
| 2018-04-18 |
| 2018-04-17 |
| 2018-04-16 |
+------------+

If for example now with the example that I put you I want to also appear the column description of each of the three previous records my query should be like this

SELECT DISTINCT(fecha), descripcion FROM ventas GROUP BY fecha;

What I get as a result:

+------------+-------------+
| fecha      | descripcion |
+------------+-------------+
| 2018-04-16 | holaas      |
| 2018-04-17 | holaa       |
| 2018-04-18 | hola        |
+------------+-------------+
    
answered by 19.04.2018 в 05:43