Problem with Twitter's AnomalyDetection package: data frame with 0 columns and 0 rows


Good morning, everyone. I've been trying to use Twitter's anomaly detection package for R, available here. My problem is that when I try to reproduce Twitter's examples with the dataset I am working with (available here), I get a very unexpected result. This is the script I am using:

library(readr)
t <- as.data.frame(read_csv("D:/Descargas/turkey_elec.csv", 
          col_names = FALSE))

library(AnomalyDetection)
AnomalyDetectionVec(t, max_anoms=0.02, period=365, direction='both', only_last=FALSE, plot=TRUE)

and it returns the following result:

$anoms
data frame with 0 columns and 0 rows

$plot
NULL

I have searched online and found other people with the same problem, but so far no answers. Has this happened to anyone else? Can you help me? Is there any other package or code for univariate series in R?

Many thanks to everyone who takes the time to read my problem.

Have a good day.

    
asked by Santiago Marín Agudelo 18.01.2018 at 22:47

1 answer


The problem, which is not really a problem, is that AnomalyDetectionVec() does not find any anomaly in your example data, so you get that message. The function returns a list of two elements: one is plot, which is the graph, and the other is anoms, which is a data.frame with the set of anomalous values. When doing:

res <- AnomalyDetectionVec(t, max_anoms=0.02, period=365, direction='both', only_last=FALSE, plot=TRUE)

res$anoms

You get, correctly:

> data frame with 0 columns and 0 rows

That is, anoms is empty (and plot is NULL) because no anomalies have been detected. To test the function, you can force an anomaly, for example:

t$Valor[417] <- t$Valor[417]*3 # Triple an arbitrary value
res <- AnomalyDetectionVec(t, max_anoms=0.02, period=365, direction='both', only_last=FALSE, plot=TRUE)
res$anoms
res$plot

The result:

 index    anoms
1   417 48653.12
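
Since anoms can come back empty, as in the first call, a small guard along these lines can make that case explicit before the result is used. This is only a sketch: res below is a hand-built stand-in that mirrors the output shown in the question, not a real call to AnomalyDetectionVec(), and has_anomalies() is a hypothetical helper name, not part of the package.

```r
# Stand-in for the list shown in the question: an empty data frame and no plot.
# (Built by hand for illustration; not produced by AnomalyDetectionVec().)
res <- list(anoms = data.frame(), plot = NULL)

# Hypothetical helper: TRUE only when at least one anomaly row came back.
has_anomalies <- function(res) {
  is.data.frame(res$anoms) && nrow(res$anoms) > 0
}

if (has_anomalies(res)) {
  print(res$anoms)
} else {
  message("No anomalies detected; consider revisiting period or max_anoms.")
}
```

With a result like the one above for index 417, the same check would take the first branch and print the data frame.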

And the plot:

Now, as you may well ask, why does it not detect several obvious anomalies? You have to think about the nature of the data: this is electricity consumption with daily measurements, so we know that consumption will tend to grow over time and that demand will be seasonal within the year. In general, seasonality in electricity or gas consumption is measured per quarter, following the seasons of the year, since demand clearly varies with temperature. In your example you are doing an annual analysis: period=365. It would be better to do it per quarter: period=round(365/4). The other point is to set the long-term period, in our case the year: longterm_period=365

AnomalyDetectionVec(t, 
                    max_anoms=0.02, 
                    period=round(365/4), 
                    longterm_period=365,
                    direction='both', 
                    plot=TRUE)

And now we do see some more plausible anomalies in the plot.
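
As for other code for univariate series, one simple base R option is a rolling-median filter with a MAD-based cutoff (a Hampel-style check). To be clear, this is not the algorithm used by AnomalyDetection, just a rough alternative sketch. The series is synthetic (it is not turkey_elec.csv), and the 31-day window and the cutoff of 6 are assumptions chosen for illustration.

```r
set.seed(42)

# Synthetic daily series (hypothetical stand-in for the real data):
# three years with an upward trend, a yearly cycle, and Gaussian noise.
n   <- 3 * 365
day <- seq_len(n)
x   <- 1000 + 0.5 * day + 100 * sin(2 * pi * day / 365) + rnorm(n, sd = 10)

# Force an anomaly as in the answer: triple an arbitrary value.
x[417] <- x[417] * 3

# Local level from a running median over a 31-day window (assumed width).
# The window is much shorter than the yearly cycle, so the level follows
# both the trend and the seasonality.
level <- as.numeric(runmed(x, k = 31))
resid <- x - level

# Score each residual in robust standard deviations (MAD-based) and flag
# anything beyond the assumed cutoff of 6.
score   <- abs(resid - median(resid)) / mad(resid)
flagged <- which(score > 6)
anoms   <- data.frame(index = flagged, anoms = x[flagged])
print(anoms)
```

Because the level comes from a median, the spike itself barely moves it, which is why the forced value at index 417 stands out so clearly in the residuals. On real data the window and cutoff would need tuning.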

    
answered by 19.01.2018 / 05:11