The fact that the table LICENCIAS
is not related to any other is problematic and involves a design error. What does an orphan table do there?
Putting in a query with JOIN
a table like that makes the system go crazy, because it does not know how to combine the data, rightly throws 104830657 records. Therefore, you must seriously decide what to do with the LICENCIAS
table. To whom are licenses granted to drivers? ... then you must link it to the table CONDUCTORES
, you can not leave it loose and want to use it in the JOIN in that way.
Since you have completely explained your data model, that there is a table TRAMITES
, which was precisely the piece that was missing in the puzzle, you can write the query like this:
SELECT
COUNT(co.sexo_id) as TOTAL,
se.SEXO as SEXO,
lic.fecha_captura as DIA -- no hace falta CAST si es DATETIME
FROM CONDUCTORES co
JOIN TRAMITES tr ON co.conductor_id=tr.conductor_id
JOIN LICENCIAS lic ON tr.licencia_id=lic.licencia_id
JOIN CAT_SEXO se on se.SEXO_ID=co.sexo_id
GROUP BY se.SEXO, DIA
ORDER BY DIA DESC, TOTAL ASC;
Let's see the complete design:
Correct design
CREATE TABLE IF NOT EXISTS cat_sexo
(
sexo_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
sexo CHAR(1),
CONSTRAINT cat_sexo_PKA01 UNIQUE (sexo)
)ENGINE=INNODB;
INSERT INTO cat_sexo (sexo)
VALUES
('M'),
('F');
CREATE TABLE IF NOT EXISTS conductores
(
conductor_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
conductor_nom VARCHAR(50),
sexo_id INT,
FOREIGN KEY (sexo_id) REFERENCES cat_sexo(sexo_id)
ON UPDATE CASCADE ON DELETE CASCADE
)ENGINE=INNODB;
INSERT INTO conductores (conductor_nom,sexo_id)
VALUES
('Pedro',1),
('Juana',2),
('María', 2),
('Santiago', 1),
('Laura',2);
CREATE TABLE IF NOT EXISTS licencias
(
licencia_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
fecha_captura DATETIME
)ENGINE=INNODB;
INSERT INTO licencias (fecha_captura)
VALUES
('2018-08-21'),
('2018-08-22'),
('2018-08-20'),
('2018-08-21'),
('2018-08-22'),
('2018-08-23'),
('2018-08-18'),
('2018-08-19'),
('2018-08-20'),
('2018-08-21'),
('2018-08-19'),
('2018-08-22');
CREATE TABLE IF NOT EXISTS tramites
(
tramite_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
conductor_id INT,
licencia_id INT,
fecha_captura DATETIME,
FOREIGN KEY (conductor_id) REFERENCES conductores(conductor_id)
ON UPDATE CASCADE ON DELETE CASCADE,
FOREIGN KEY (licencia_id) REFERENCES licencias(licencia_id)
ON UPDATE CASCADE ON DELETE CASCADE
)ENGINE=INNODB;
INSERT INTO tramites (conductor_id, licencia_id)
VALUES
(1,1),
(1,2),
(2,3),
(2,4),
(2,5),
(2,6),
(3,7),
(3,8),
(3,9),
(3,10),
(4,11),
(4,12);
Now we execute the modified query slightly. Given that the table with the most related data is CONDUCTORES
we put it first, so it is easier to write the JOIN
and it is clearer:
SELECT
COUNT(co.sexo_id) as TOTAL,
se.SEXO as SEXO,
lic.fecha_captura as DIA -- no hace falta CAST si es DATETIME
FROM CONDUCTORES co
JOIN TRAMITES tr ON co.conductor_id=tr.conductor_id
JOIN LICENCIAS lic ON tr.licencia_id=lic.licencia_id
JOIN CAT_SEXO se on se.SEXO_ID=co.sexo_id
GROUP BY se.SEXO, DIA
ORDER BY DIA DESC, TOTAL ASC;
Result (correct) with the hypothetical data entered above:
TOTAL SEXO DIA
----------------------------------------------
1 M 23.08.2018 00:00:00
1 F 22.08.2018 00:00:00
2 M 22.08.2018 00:00:00
1 F 21.08.2018 00:00:00
2 M 21.08.2018 00:00:00
2 F 20.08.2018 00:00:00
1 M 19.08.2018 00:00:00
1 F 19.08.2018 00:00:00
1 M 18.08.2018 00:00:00
Incorrect design
To understand the reason for the error, let's now look at the same design as above, but with a table LICENCIAS2
orphan . You will see how the fully dislocated data returns to you because in the JOIN
a table participates that the system does not know how to handle it.
CREATE TABLE IF NOT EXISTS licencias2
(
licencia_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
fecha_captura DATETIME
)ENGINE=INNODB;
INSERT INTO licencias2 (fecha_captura)
VALUES
('2018-08-21'),
('2018-08-22'),
('2018-08-20'),
('2018-08-21'),
('2018-08-22'),
('2018-08-23'),
('2018-08-18'),
('2018-08-19'),
('2018-08-20'),
('2018-08-21'),
('2018-08-19'),
('2018-08-22');
Let's write the same query as before, using that orphan table:
SELECT COUNT(co.sexo_id) as TOTAL, se.SEXO as SEXO, CAST( fecha_captura AS DATE ) as DIA
FROM LICENCIAS2 lic
join CAT_SEXO se on se.SEXO_ID=sexo_id
join CONDUCTORES co on co.sexo_id=se.SEXO_ID
GROUP BY se.SEXO, CAST( fecha_captura AS DATE ) order by DIA desc, TOTAL asc
We will immediately have some totally dislocated results:
TOTAL SEXO DIA
----------------------------------------------
4 M 23.08.2018 00:00:00
6 F 23.08.2018 00:00:00
12 M 22.08.2018 00:00:00
18 F 22.08.2018 00:00:00
12 M 21.08.2018 00:00:00
18 F 21.08.2018 00:00:00
8 M 20.08.2018 00:00:00
12 F 20.08.2018 00:00:00
8 M 19.08.2018 00:00:00
12 F 19.08.2018 00:00:00
4 M 18.08.2018 00:00:00
6 F 18.08.2018 00:00:00
Test data used for this response
You can see here a DEMONSTRATION of all the code used and testing .