Improve the response time of a LEFT JOIN in Mysql

2

(I correct the graph The employee_location table only has reference to the location)

I have the following tables that allow me to obtain the location of an employee

The Employee_information table may or may not have a reference value for the location code. So I should also look for it in the Location table. The problem is with the table employee_location . In this is the employee who is not referenced in employee_information , with one exception may appear multiple times in different cities. However, the report that I must submit ONLY must show one city per employee (at the request of the user).

So far I have this:

select coalesce(ei.name_value, l.name)
from employee e 
INNER JOIN status s ON s.employee_id = e.employee_id 
             AND s.date between '2017-01-01' and '2017-10-01' -- Fechas a modo de ejemplo
Left join employee_information ei ON e.employee_id = ei.employee_id
Left join location l on l.location_id = ei.value

Left join (
          select e1.employee_id, min(l1.location) as location
          from employee e1 
          left join employee_location el1 on e.employee_id = el1.employee_id
          left join location l1 on el1.location_id = l1.location_id
          group by employee_id
) as temp_emp ON temp_emp.employee_id = e.employee_id

How can I improve this subquery ?. I do not want to look for the information of ALL the employees present in employee_location since it is quite heavy. It is possible to use the employees of the table with alias and instead of e1, since these have already been delimited with the dates in the status table. Or someone may recognize another, more optimal way of doing what I need.

So far I have not been able to find a reference that allows me to solve this concern.

This is only part of the final data that the report should contain.

Just to clarify. Another team of the company developed this model (and it is sometimes difficult to obtain clarifications from them), I only consult it.

I hope it was clear with the explanation of the problem and someone can give me a hand.

Thank you very much.

    
asked by db_ 20.11.2017 в 21:35
source

2 answers

0

With the last adjustments to your question, it is obvious that this join in the main part of the query is over:

Left join location l on l.location_id = ei.value

The join a location you only need it within the derived table.

Additionally, within the derived table, you do not need to consult again with employee , nor do you need LEFT JOIN s.

The query would be as follows:

select coalesce(ei.name_value, el.location_name)
from employee e 
INNER JOIN status s ON s.employee_id = e.employee_id 
             AND s.date between '2017-01-01' and '2017-10-01' -- Fechas a modo de ejemplo
Left join employee_information ei ON e.employee_id = ei.employee_id
left join (
  select el.employee_id, min(l.Name) as location_name
    from employee_location el
    join location l on l.location_id = el.location_id
   group by el.employee_id
) el ON el.employee_id = e.employee_id
    
answered by 20.11.2017 в 21:52
0

For me, I would completely omit the name_value column from the employee_information table. If employee_information has the "value" column full and consistent, then you use it to join with location . Evidently there has to be a index in employee_information.value . Forget about name_value . It is inconsistent, de-normalized and messes up.

I also see that employee_information.field has value Location . If this field can have other values than Location then you should not do a join with the entire table. Only with tuples that have value Location . This also suggests that you should do a index in field .

You say that an employee's location may not be in employee_information and that in which case it may be in employee_location . You do not say if both information are mutually exclusive. Can a user be in both tables?

Anyway, I would get the tuples employee_id, location_id in a subquery and then do the joins.

I also imagine that you not only want the name of the location, since it would not be useful if you do not associate it with the name of the employee. I will then allow myself to change the SELECT fields.

SELECT e.name, l.name
FROM employee e 
INNER JOIN status s ON s.employee_id = e.employee_id AND s.date between '2017-01-01' and '2017-10-01'
LEFT JOIN (
  SELECT employee_id, MIN(location_id) as location_id FROM employee_location el1 GROUP BY employee_id
  UNION ALL
  SELECT employee_id, value FROM employee_information WHERE field='Location'
) location_employee ON location_employee.employee_id = e.employee_id

LEFT JOIN location l ON l.location_id=location_employee.location_id

Note that I am using UNION ALL because I assume that the same employee is not both employee_information and employee_location .

A index could possibly help on employee_location.location_id or a composite index on this same table in employee_id, location_id .

I assume that the primary key of employee is employee_id , the primary key of location is location_id and that there are indexes both in employee_location.employee_id and employee_information.employee_id

    
answered by 22.11.2017 в 13:23