Query with data grouped by time, whether it exists or not

2

Currently I have to make a report based on a query, the query that is made should return all stoppages that have occurred in a range of determined dates every five minutes, if there is a record that meets that condition for that interval should return 1, and if there is no record, you should return 0. To explain it a little easier I put an image of the example table: (* Do not let upload image so I put link below) image of the example table What I need is that after the query I return 0 for the interval 00:00 - 00:05, 00:05 - 00:10, ...., and that for 06:05 - 06:10 returned 1 and so on this logic all the intervals between certain dates.

To achieve this I have found some help:

select variable,
   convert(varchar(8), tiempo, 1)+' '
  +convert(varchar(2), datepart(hh, tiempo))+':'
  +convert(varchar(2), datepart(mi, tiempo)/5),
  count(*) 
  from datos group by 
  variable,
  convert(varchar(8), tiempo, 1)+' '
  +convert(varchar(2), datepart(hh, tiempo))+':'
  +convert(varchar(2), datepart(mi, tiempo)/5)

The problem is that this code if it shows me the intervals in which there are records but not in those that do not exist. I'm working against an MSSQL Server 2012 Greetings.

    
asked by U. Busto 07.08.2017 в 12:28
source

2 answers

2

This is a fairly recurrent problem, that of generating records where there are none. In your case, you would need to set the time periods in advance to determine if there are readings in them or not. As you mentioned that you work with a maximum of 24 hours that would be 288 (26 * 60/5) periods, it is feasible to use the solution that I propose, and that is based on generating a dynamic table of time intervals from a date initial. With this table we can do a LEFT JOIN with datos and where there are no records we will count as 0 the amount.

Let's see:

DECLARE @Datos TABLE (
    Variable VARCHAR(15),
    Tiempo   DATETIME
)

DECLARE @Intervalos TABLE (
    Nro     INT,
    Desde   DATETIME,
    Hasta   DATETIME
)

DECLARE @PrimerIntervalo DATETIME

INSERT INTO @Datos (Variable, Tiempo)
VALUES  ('TAG_1', '20170807 10:00:01 am'),
    ('TAG_1', '20170807 10:04:01 am'),
    ('TAG_1', '20170807 10:11:01 am'),
    ('TAG_1', '20170807 10:12:01 am'),
    ('TAG_1', '20170807 10:25:01 am')

SELECT  @PrimerIntervalo    = '20170807 09:30:00 am'

-- CREAMOS UNA TABLA DE INTERVALOS DE 24 HORAS 
INSERT INTO @Intervalos(Nro)
SELECT  (T1.NRO-1)*100 + (T2.NRO-1)*10 + T3.NRO
    FROM ( SELECT 1 AS 'NRO' UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 10 ) T1 
    CROSS JOIN ( SELECT 1 AS 'NRO' UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 10 ) T2
    CROSS JOIN ( SELECT 1 AS 'NRO' UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 10 ) T3
    WHERE (T1.NRO-1)*100 + (T2.NRO-1)*10 + T3.NRO <= (24 * 60) / 5
    ORDER BY T1.NRO,T2.NRO,T3.NRO

UPDATE  @Intervalos
    SET Desde = DATEADD(Mi, (Nro - 1) * 5, @PrimerIntervalo),
            Hasta = DATEADD(Mi, Nro * 5, @PrimerIntervalo)

-- CONSULTA FINAL
SELECT  V.Variable,
    CONVERT(CHAR(5), I.Desde, 108)  AS 'Desde',
    CONVERT(CHAR(5), I.Hasta, 108)  AS 'Hasta',
    ISNULL(COUNT(D.Tiempo),0)       AS 'Cantidad'
    FROM (SELECT DISTINCT Variable AS Variable
            FROM @Datos
        ) V
    LEFT JOIN @Intervalos I
        ON 1 = 1
    LEFT JOIN @Datos D
        ON D.Tiempo >= I.Desde
        AND D.Tiempo < I.Hasta
        AND D.Variable = V.Variable
        AND D.Tiempo BETWEEN @PrimerIntervalo AND DATEADD(HH, 24, @PrimerIntervalo)
    GROUP BY V.Variable,
        I.Nro, 
        I.Desde,
        I.Hasta
    ORDER BY Nro

The first part does not have much mystery, we created a variable type table, (it could be a temporary one too) for an example of what would be your table datos and a new one that we will call Intervalos . Then we set the value of a variable @PrimerIntervalo that will represent from where we will make the query about datos . As an example SELECT @PrimerIntervalo = '20170807 09:30:00 am' our query will begin on 7/8 at 9.30.

Then we complete Intervalos with one of the possible methods to generate sequences or intervals that we can use in SQL Server, there are better ways in terms of performance but the one I show you is quite simple to understand, they are about CROSS JOINS of three tables with 10 records each which will allow us to generate a single table with 1000 records (10 x 10 x 10), numbered from 1 to 1000, of which we will only use 288 to generate the intervals. I suggest this excellent article on how to generate sequences in SQL server in case you want to deepen the subject.

To end this stage, we simply generate the fields From / To adding 5 minutes to the date of @PrimerIntervalo by Nro of Intervalos , which would leave us something like this:

Nro     Desde                       Hasta  
======= =========================== =========================
1,00    07/08/2017 09:30:00 a.m.    07/08/2017 09:35:00 a.m.
2,00    07/08/2017 09:35:00 a.m.    07/08/2017 09:40:00 a.m.
3,00    07/08/2017 09:40:00 a.m.    07/08/2017 09:45:00 a.m.
....

This already gives us a guideline of what we are going to end up doing: "crossing" these periods / intervals with datos if in one we do not have records we will count it as 0.

And finally this query:

-- CONSULTA FINAL
SELECT  V.Variable,
    CONVERT(CHAR(5), I.Desde, 108)  AS 'Desde',
    CONVERT(CHAR(5), I.Hasta, 108)  AS 'Hasta',
    ISNULL(COUNT(D.Tiempo),0)       AS 'Cantidad'
    FROM (SELECT DISTINCT Variable AS Variable
            FROM @Datos
            WHERE Tiempo BETWEEN @PrimerIntervalo AND DATEADD(HH, 24, @PrimerIntervalo)
        ) V
    LEFT JOIN @Intervalos I
        ON 1 = 1
    LEFT JOIN @Datos D
        ON D.Tiempo >= I.Desde
        AND D.Tiempo < I.Hasta
        AND D.Variable = V.Variable
        AND D.Tiempo BETWEEN @PrimerIntervalo AND DATEADD(HH, 24, @PrimerIntervalo)
    GROUP BY V.Variable,
        I.Nro, 
        I.Desde,
        I.Hasta
    ORDER BY Nro

That verifies each different Variable that you have in the 24 hours requested and for each of the 288 time intervals, the output is something like this:

Variable  Desde Hasta   Cantidad
========= ===== =====   ========
TAG_1     09:45 09:50   0,00
TAG_1     09:50 09:55   0,00
TAG_1     09:55 10:00   0,00
TAG_1     10:00 10:05   2,00
TAG_1     10:05 10:10   0,00
TAG_1     10:10 10:15   2,00
TAG_1     10:15 10:20   0,00
TAG_1     10:20 10:25   0,00
TAG_1     10:25 10:30   1,00
TAG_1     10:30 10:35   0,00
...
    
answered by 07.08.2017 / 16:44
source
0

If I have understood correctly, you should paint all the ranges of 5 minutes of a day (according to what you said, it would be 288 lines) and that depending on whether there is a value in the data table, I will paint a 0 or a 1 truth?

If so, I believe that it can not be done through only a consultation. Since you can not paint values that do not exist in the table, you should have a for or some control structure to iterate over those 288 ranges of 5 minutes that you have a day and go consulting in each round if there is a value in the table to know whether to paint a 1 or a 0

To do that you should use procedures , the part bad that I see this, is that you would have to launch 288 queries (one for each range) and that could overload the BD a little.

    
answered by 07.08.2017 в 12:45