The explanation is that you are using different conditions. In the separate queries you use the criterion: año='2009' and mes='05'
. But in the integral query the condition of JOIN
uses another criterion: t1.id_cod = t2.id_cod
, while only the records of the table t1
are restricted with the condition t1.año = '2009' and t1.mes = '05'
.
If in the table t2
there are records that share the value of id_cod
with the records in the table t1
that comply with the condition t1.año='2009' and t1.mes='05'
are going to be included in the result regardless of the value of t2.año
and t2.mes
.
Now, if you change the condition of JOIN
to: t1.año = t2.año and t1.mes = t2.mes
leaving the same condition of WHERE
you will get the number of records expected, but it is likely that the value of the column id_cod
will not match .
If you add the column id_cod
to the conditions of JOIN
: t1.id_cod = t2.id_cod and t1.año = t2.año and t1.mes = t2.mes
it is likely that the number of records is less than expected if there are records in t1
that have no correspondence in t2
.
When analyzing an SQL query it is important to remember that the order in which it is executed is not the same one in which it is written. In the case of the query that generates more records than expected, the following occurs:
It starts with the FROM
clause and the intersection of the tables joined by the operator INNER JOIN
is generated, generating an intermediate result that includes the records of both tables that meet condition t1.id_cod = t2.id_cod
. If in t2
there is more than one record with the same value of id_cod
as a record of t1
or vice versa, all records are included.
For example if in t1
there is a record with id_cod = 1
and in t2
there are two records with id_cod = 1
then the intermediate result will have two records with t1.id_cod=1
and t2.id_cod=1
; the record of t1
would be repeated for each record of t2
that has the same value of id_cod
.
Once you have that intermediate result, it is filtered according to the conditions of clause WHERE
. In this case, the records that do not meet the condition are deleted: t1.año = '2009' and t1.mes = '05'
.
This is where the root of the discrepancy is, since the filtering condition does not restrict the records in the t2
table. For each record in t1
that has año='2009'
and mes='05'
, all the t2
records that have the same value id_cod
will be included regardless of the value that have año
and mes
in t2
.
Finally the result is projected according to the clause SELECT
. In this case *
that indicates to include in the final projection all the columns of the result, normally in the order defined in the structure of the tables and following the order of appearance of the tables in the clause FROM
. In this case, first all the columns in the table t1
followed by all the columns in the table t2
.