How do I select data from a CSV where date and time share one column, to compute a diurnal cycle?

I am very new to this! I'm taking a data analysis class in which we are asked to compute the diurnal cycle of a time series that records radiation every minute. My CSV file has the date and time together in a single column. How do I separate them and then select only the time of interest?

I would appreciate a quick reply.

Good morning.

Example:

Data = pd.read_csv("6001_Pira.csv", index_col=0)

Fecha_Hora                 Radiacion  Temperatura  Calidad

2016-11-09 21:24:00        103       299.87        1
2016-11-09 21:25:00        105       299.85        1
2016-11-09 21:26:00        108       299.85        1
2016-11-09 21:27:00        110       299.85        1
2016-11-09 21:28:00        114       299.85        1
2016-11-09 21:29:00        117       299.80        1
2016-11-09 21:30:00        120       299.78        1
2016-11-09 21:31:00        121       299.76        1
2016-11-09 21:32:00        121       299.75        1

Since the measurements are per minute, I need to know, for example, what the radiation is every day at 21:28:00. But the date and time are in a single column, so how can I filter the rows that belong to that specific time?

    
asked by Claudia P. 25.11.2016 at 18:27

1 answer


First of all, a warning: I am not a pandas expert, and there may well be more appropriate ways of doing this.

By default, read_csv infers numeric types for the numeric columns, but a date column such as Fecha_Hora is read as plain strings (dtype object) unless you force the conversion when reading the file. While the column holds strings rather than datetime values, the simplest approach is to search for the substring you want within the Fecha_Hora column. For this you can use .str.contains(substring):

import pandas as pd

Data = pd.read_csv("6001_Pira.csv")
Resultados = Data[Data['Fecha_Hora'].str.contains('21:28:00')]

If you do force the conversion when reading the file, because you need real datetime values (for example, to compare dates properly), the approach above no longer works, since the .str accessor is only available for string data.
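To see which types pandas actually assigns, here is a minimal sketch that reads a few rows like the ones in this thread from an in-memory string. The io.StringIO buffer and the names muestra and datos are assumptions made for illustration, standing in for the real file. It checks that Fecha_Hora arrives as text and shows that pd.to_datetime can convert it after loading:

```python
import io
import pandas as pd

# Hypothetical in-memory sample standing in for 6001_Pira.csv
muestra = io.StringIO(
    "Fecha_Hora,Radiacion,Temperatura,Calidad\n"
    "2016-11-09 21:27:00,110,299.85,1\n"
    "2016-11-09 21:28:00,114,299.85,1\n"
    "2016-11-10 21:28:00,114,299.85,1\n"
)
datos = pd.read_csv(muestra)

# Numeric columns are inferred; the date column is still text at this point
es_texto_antes = pd.api.types.is_string_dtype(datos["Fecha_Hora"])
es_entero = pd.api.types.is_integer_dtype(datos["Radiacion"])

# Convert after loading instead of at read time
datos["Fecha_Hora"] = pd.to_datetime(datos["Fecha_Hora"])
es_fecha_despues = pd.api.types.is_datetime64_any_dtype(datos["Fecha_Hora"])

print(es_texto_antes, es_entero, es_fecha_despues)
```

Once the column has been converted this way, the .dt accessor becomes available on it.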

If the Fecha_Hora column is of type datetime, we can use .dt.time to extract just the time of day (hours, minutes and seconds, without the date) from each row and filter by comparing it with the desired time:

import pandas as pd
from datetime import datetime

# parse_dates converts Fecha_Hora to datetime while reading;
# no dtype entry is needed (or accepted) for date columns
parse_dates = ['Fecha_Hora']
Data = pd.read_csv("6001_Pira.csv", parse_dates=parse_dates)

# Target time of day, without a date component
diana = datetime.strptime('21:28:00', '%H:%M:%S').time()
Resultados = Data[Data['Fecha_Hora'].dt.time == diana]

Applied to a CSV like the following (as an example):

ID,Fecha_Hora,Radiaccion,Temperatura,Calidad
01,2016-11-09 21:24:00,103,299.87,1
02,2016-11-09 21:25:00,105,299.85,1
03,2016-11-09 21:26:00,108,299.85,1
04,2016-11-09 21:27:00,110,299.85,1
05,2016-11-09 21:28:00,114,299.85,1
06,2016-11-09 21:29:00,117,299.80,1
07,2016-11-09 21:30:00,120,299.78,1
08,2016-11-09 21:31:00,121,299.76,1
09,2016-11-09 21:32:00,121,299.75,1
10,2016-11-10 06:28:00,114,299.85,1
11,2016-11-10 20:28:00,114,299.85,1
12,2016-11-10 21:20:00,114,299.85,1
13,2016-11-10 21:28:00,114,299.85,1
14,2016-11-10 21:28:01,114,299.85,1

Both snippets produce the same output when we print Resultados:

    ID           Fecha_Hora  Radiaccion  Temperatura  Calidad
4    5  2016-11-09 21:28:00         114       299.85        1
12  13  2016-11-10 21:28:00         114       299.85        1

I think that is what you are looking for.
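Since the stated goal is a diurnal cycle, the same .dt.time idea can be extended to every time of day at once rather than one time at a time. The following is only a sketch under assumptions: it takes "diurnal cycle" to mean the mean radiation at each time of day across all days, it uses a small in-memory sample with made-up values instead of the real file, and the names muestra, datos and ciclo are illustrative.

```python
import io
import pandas as pd

# Hypothetical sample with made-up values: two days, two minutes each
muestra = io.StringIO(
    "Fecha_Hora,Radiacion\n"
    "2016-11-09 21:27:00,110\n"
    "2016-11-09 21:28:00,114\n"
    "2016-11-10 21:27:00,112\n"
    "2016-11-10 21:28:00,118\n"
)
datos = pd.read_csv(muestra, parse_dates=["Fecha_Hora"])

# Group rows by time of day (the date is discarded) and average across days
ciclo = datos.groupby(datos["Fecha_Hora"].dt.time)["Radiacion"].mean()
print(ciclo)
```

The result, ciclo, has one entry per distinct time of day, so with complete per-minute data it would hold up to 1440 values.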

    
answered by 25.11.2016 / 21:34