How to read a CSV file in Python and extract the maximum and minimum value per minute?

1

I used this code to extract the data from a CSV file, filter it, and organize it by minute, but I would like help creating a loop that refreshes the information every minute and determines the maximum and minimum values.

import pandas as pd
import datetime

datos = pd.read_csv('C:/Users/TECNOLOGIA/datos.csv', names=['LocalTime', 'Message', 'MarketTime', 'Symbol', 'Type', 'Price', 'Size', 'Source','Condition','Tick','Mmid','SubMarketId','Date'], usecols=['Type','MarketTime','Price'],index_col='Type')

df = datos.loc['Type=0']
# The following code strips the letters from MarketTime and Price

df2 = pd.DataFrame()
df2['MarketTime'] = df['MarketTime'].str.extract(r'((?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d)', expand=False)
df2['Price'] = df['Price'].str.extract(r'(\d+(?:\.\d+)?)', expand=False).astype(float)

# The following code groups the data by minute

df2['MarketTime']=pd.DatetimeIndex(df2['MarketTime'])
df2.set_index(keys='MarketTime', inplace=True)
inicio=datetime.time(11,18)
fin=datetime.time(11,19)
print(df2[['Price']].between_time(inicio,fin))

What I have not been able to do is turn this into a loop that determines the highest and lowest value for each minute in the file. The file is updated constantly, so it is important that the code picks up the new information.

The code generates this output:

    
asked by Manuel Rincon 29.01.2018 at 06:17

2 answers

1

It can be done in several ways. A quick way is to create an infinite loop with while True and nest all the code inside it. To run once per minute, put a time.sleep(60) call at the end of the loop body.

import pandas as pd
import datetime
import time

while True:
   datos = pd.read_csv('C:/Users/TECNOLOGIA/datos.csv', names=['LocalTime', 'Message', 'MarketTime', 'Symbol', 'Type', 'Price', 'Size', 'Source','Condition','Tick','Mmid','SubMarketId','Date'], usecols=['Type','MarketTime','Price'],index_col='Type')
   # Include the rest of the code...
   time.sleep(60)  # Wait 60 seconds
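
To make the loop above concrete, here is a minimal sketch of one way to fill in the elided part. It is not the asker's actual code: the helper names minute_extremes and watch are hypothetical, and it assumes each CSV field carries its name as a prefix (for example Type=0, MarketTime=11:18:01.100, Price=10.5), which is what the regular expressions in the question suggest. It rereads the whole file on every pass, which is simple but may be slow for a very large file.

```python
import time

import pandas as pd

# Column names taken from the question.
COLUMNS = ['LocalTime', 'Message', 'MarketTime', 'Symbol', 'Type', 'Price',
           'Size', 'Source', 'Condition', 'Tick', 'Mmid', 'SubMarketId', 'Date']


def minute_extremes(source):
    """Read the whole file and return the max and min Price for each minute."""
    datos = pd.read_csv(source, names=COLUMNS,
                        usecols=['Type', 'MarketTime', 'Price'])
    # Assumption: the Type field literally reads 'Type=0' for the rows of interest.
    datos = datos[datos['Type'] == 'Type=0']
    # Keep only HH:MM so that every row of the same minute shares one key.
    minuto = datos['MarketTime'].str.extract(r'((?:[01]\d|2[0-3]):[0-5]\d)',
                                             expand=False)
    # Strip the text prefix from Price and convert it to a number.
    precio = datos['Price'].str.extract(r'(\d+(?:\.\d+)?)',
                                        expand=False).astype(float)
    return precio.groupby(minuto).agg(['max', 'min'])


def watch(source, interval=60, iterations=None):
    """Print the table every `interval` seconds; iterations=None loops forever."""
    done = 0
    while iterations is None or done < iterations:
        print(minute_extremes(source))
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval)
```

Calling watch('C:/Users/TECNOLOGIA/datos.csv') would then print the refreshed per-minute table once a minute until the program is interrupted.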
    
answered by 29.01.2018 at 15:05
0

First, truncate the MarketTime values to minute precision, as follows:

df['MarketTime'] = df['MarketTime'].values.astype('<M8[m]')

Then, using the max and min functions, you can find the corresponding values grouped by MarketTime, something like this:

df.groupby('MarketTime')['Price'].agg(['max', 'min'])

This returns a DataFrame indexed by MarketTime, with max and min as columns.
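
As a small worked example of the truncate-then-group idea, the sketch below uses made-up sample data and assumes MarketTime is already a datetime column and Price is numeric. It uses dt.floor('min') for the truncation, which is swapped in here as an alternative to the astype('<M8[m]') cast; the variable name per_minute is only illustrative.

```python
import pandas as pd

# Hypothetical sample data: two ticks in 11:18 and two in 11:19.
df = pd.DataFrame({
    'MarketTime': pd.to_datetime(['2018-01-29 11:18:05', '2018-01-29 11:18:47',
                                  '2018-01-29 11:19:10', '2018-01-29 11:19:30']),
    'Price': [10.5, 12.0, 9.25, 11.0],
})

# Truncate each timestamp to the start of its minute.
df['MarketTime'] = df['MarketTime'].dt.floor('min')

# One row per minute, holding the highest and lowest Price seen in it.
per_minute = df.groupby('MarketTime')['Price'].agg(['max', 'min'])
print(per_minute)
```

With this sample, the 11:18 row holds max 12.0 and min 10.5, and the 11:19 row holds max 11.0 and min 9.25.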

    
answered by 29.01.2018 at 17:03