LEFT OUTER JOIN query with 4 tables

3

I have 4 tables and Wallpapers, Downloads, Favorites, Votes that are defined as well. Of wallpapers I just need the id.

Table Downloads

+----+---------+---------+
| id | wall_id | user_id |
+----+---------+---------+
|  1 |       1 |       1 |
|  2 |      35 |    NULL |
|  3 |      35 |    NULL |
+----+---------+---------+

Favorites Tables

+----+---------+---------+
| id | user_id | wall_id |
+----+---------+---------+
|  1 |      12 |      10 |
|  2 |      12 |       2 |
+----+---------+---------+

Votes Table

+----+---------+---------+---------+
| id | user_id | wall_id | type    |
+----+---------+---------+---------+
|  1 |      12 |       1 | dislike |
|  2 |      12 |      39 | like    |
|  3 |       1 |       2 | like    |
|  4 |       2 |       2 | like    |
|  5 |       3 |       2 | like    |
|  6 |       5 |       2 | dislike |
|  7 |      12 |      10 | like    |
|  8 |      12 |       2 | like    |
+----+---------+---------+---------+

Basically what I need is a query that gives me back the amount of Downloads, Favorites, Likes and Dislikes of a wallpaper. Try nesting with several LEFT OUTER JOIN like this:

SELECT w.id,COUNT(d.id) AS Downloads,
COUNT(f.id) AS Favorites,
SUM(IF(v.type = 'like',1,0)) AS Likes,
SUM(IF(v.type = 'dislike',1,0)) AS Dislikes 
FROM wallpapers AS w 
LEFT OUTER JOIN downloads AS d ON w.id = d.wall_id 
LEFT OUTER JOIN favorites AS f ON w.id = f.wall_id 
LEFT OUTER JOIN votes AS v ON w.id = v.wall_id 
WHERE w.id = 2
GROUP BY w.id;

That returns the result of the form that I need but with the wrong calculations that is to say, values are duplicated and other strange things happen.

Results Table

+----+-----------+-----------+-------+----------+
| id | Downloads | Favorites | Likes | Dislikes |
+----+-----------+-----------+-------+----------+
|  2 |        10 |        10 |     8 |        2 |
+----+-----------+-----------+-------+----------+

I understand that it is because the LEFT OUTER JOIN can not be nested like this because the 3 tables must have records that correspond to the id of the first table.

Any Solution?

    
asked by Edwin V 23.03.2017 в 01:45
source

3 answers

3

The problem is that some of the values are being counted multiple times. This is because you are doing the COUNT of the id's without checking that those id's have not been counted before. The same error will occur with the SUM , so it could be the case that the same votes are added several times.

When doing a LEFT OUTER JOIN , all the records in the tables on the left are kept and combined with the records in the tables on the right (or with NULL if there are none). The problem is that by keeping the values on the left, they are duplicating (or multiplying) some of them because they are "added back" for each record on the right.

To see this better we are going to remove the COUNT , SUM and GROUP BY of your SELECT, which leaves us the following sentence:

SELECT w.id,
       d.id AS Downloads,
       f.id AS Favorites,
       v.type AS Likes,
       v.type AS Dislikes 
FROM   wallpapers AS w 
       LEFT OUTER JOIN downloads AS d ON w.id = d.wall_id 
       LEFT OUTER JOIN favorites AS f ON w.id = f.wall_id 
       LEFT OUTER JOIN votes AS v ON w.id = v.wall_id 
WHERE  w.id = 2

that when executed with the data provided in the question, it returns the following:

id | Downloads | Favorites | Likes    | Dislikes
------------------------------------------------
2  | NULL      | 2         | like     | like
2  | NULL      | 2         | like     | like
2  | NULL      | 2         | like     | like
2  | NULL      | 2         | dislike  | dislike
2  | NULL      | 2         | like     | like

As you see, first you select w.id which is 2 as indicated in the WHERE , there is no download so Downloads is NULL, yes you find a favorite with the id 2 ... and now it is When the problem starts: 5 votes are found, for which each row will have the same left part combined with each of the votes. That would not be a problem if it were not because now we find that the id 2 has been selected 5 times (one for each vote).

A quick solution would be to add a DISTINCT to COUNT to avoid that problem:

SELECT w.id,
       COUNT(DISTINCT d.id) AS Downloads,
       COUNT(DISTINCT f.id) AS Favorites,
       SUM(IF(v.type = 'like',1,0)) AS Likes,
       SUM(IF(v.type = 'dislike',1,0)) AS Dislikes 
FROM   wallpapers AS w 
       LEFT OUTER JOIN downloads AS d ON w.id = d.wall_id 
       LEFT OUTER JOIN favorites AS f ON w.id = f.wall_id 
       LEFT OUTER JOIN votes AS v ON w.id = v.wall_id 
WHERE  w.id = 2
GROUP BY w.id;

That seems to work with the naked eye, but it still leaves you with a problem with SUM (which can not be seen first with the data put in the question). Imagine that instead of having a single favorite for the Wallpaper with id 2 (as there is now), there would be two. In that case, all votes will be combined with those two producing 10 rows. And the SUM would be adding the same vote twice.

To solve this, my recommendation would be that instead of using SUM you would use COUNT with DISTINCT as for the other values (this is what you are doing anyway, because you are really simulating a COUNT with a SUM ). And that you will separate the "Likes" and "Dislikes" with different JOINs.

With the changes I say, the SQL statement would be like this:

SELECT w.id, 
       COUNT(DISTINCT d.id) AS Downloads,
       COUNT(DISTINCT f.id) AS Favorites,
       COUNT(DISTINCT v1.id) AS Likes,
       COUNT(DISTINCT v2.id) AS Dislikes 
FROM   wallpapers AS w 
       LEFT OUTER JOIN downloads AS d ON w.id = d.wall_id 
       LEFT OUTER JOIN favorites AS f ON w.id = f.wall_id 
       LEFT OUTER JOIN votes AS v1 ON w.id = v1.wall_id AND v1.type = 'like'
       LEFT OUTER JOIN votes AS v2 ON w.id = v2.wall_id AND v2.type = 'dislike'
WHERE  w.id = 2
GROUP BY w.id

That already returns the correct values always because although there are duplicate ids, only the different ids are being counted.

    
answered by 23.03.2017 / 05:13
source
2

I have given you a fairly long answer based on trying to adjust it as much as possible to the code you presented in the question. Now I'm going to put a simple SELECT that is valid and with which you get the same result, but still I would recommend more the other answer .

This is the query:

SELECT w.id,
       (SELECT COUNT(d.id) FROM downloads d WHERE d.wall_id = w.id) as downloads,
       (SELECT COUNT(f.id) FROM favorites f WHERE f.wall_id = w.id) as favorites,
       (SELECT SUM(IF(v.type = 'like',1,0)) FROM votes v WHERE v.wall_id = w.id) as likes,
       (SELECT SUM(IF(v.type = 'dislike',1,0)) FROM votes v WHERE v.wall_id = w.id) as dis
FROM   wallpapers w
WHERE  w.id = 2
GROUP BY w.id

As you can see, the query itself is shorter and simpler ... but it contains dependent subqueries (something you can see if you do EXPLAIN ), which will end up slowing down because they depend on one of the values ( w.id ) of the main query and must be executed with each row of the same.

    
answered by 23.03.2017 в 05:31
2

There is a third additional response to the two that Álvaro provided that is to use derived tables:

SELECT w.id,
       d.Downloads,
       f.Favorites,
       v.Likes,
       v.Dislikes 
FROM wallpapers AS w 
LEFT JOIN (SELECT wall_id,
                  count(*) as Downloads 
           FROM downloads 
           GROUP BY wall_id) AS d ON w.id = d.wall_id 
LEFT JOIN (SELECT wall_id,
                  count(*) as Favorites 
           FROM favorites 
           GROUP BY wall_id) AS f ON w.id = f.wall_id 
LEFT JOIN (SELECT wall_id,
                  SUM(IF(type = 'like',1,0)) AS Likes,
                  SUM(IF(type = 'dislike',1,0)) AS Dislikes 
           FROM votes  
           GROUP BY wall_id) AS v ON w.id = v.wall_id 
WHERE w.id = 2
GROUP BY w.id;
    
answered by 23.03.2017 в 13:39