Hi, I have this df. I am trying to obtain average values assigned to each hour of measured data with a frequency of 5 minutes. This is how I charge the df from MySQL:
import mysql.connector
from mysql.connector import errorcode
import pandas as pd
def data_base_connection(user,password,host,database):
config = {
'user': user,
'password': password,
'host': host,
'database': database,
'raise_on_warnings': True,
}
try:
cnx = mysql.connector.connect(**config)
return cnx
except mysql.connector.Error as err:
if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
print("Something is wrong with your user name or password")
elif err.errno == errorcode.ER_BAD_DB_ERROR:
print("Database does not exist")
else:
print(err)
cnx=data_base_connection("root","cigefi","127.0.0.1","a_siquirres")
# Create cursor
cursor=cnx.cursor()
query1=("select fh,tc10, tc20, tc30, hr10, hr20, hr30, prec10, prec20, prec30, vs10, vu10, dir10, desU10, vmax10, vs20, vu20, dir20, desU20, vmax20, vs30, vu30, dir30, desU30, vmax30, tsuelo, vw, paus from siqoriginal")
# Excute Query
cursor.execute(query1)
rows = cursor.fetchall()
df = pd.DataFrame([[ij for ij in i] for i in rows]) # Se cargan los rows pandas, de la consulta que se hizo.
df.rename(columns={0: 'fh', 1: 'tc10', 2:'tc20', 3:'tc30', 4: 'hr10', 5: 'hr20', 6: 'hr30', 7:'prec10', 8: 'prec20', 9: 'prec30', 10: 'vs10', 11: 'vu10', 12: 'dir10', 13: 'desU10', 14: 'vmax10', 15: 'vs20', 16: 'vu20', 17: 'dir20', 18: 'desU20', 19: 'vmax20', 20: 'vs30', 21: 'vu30', 22: 'dir30', 23: 'desU30', 24: 'vmax30', 25: 'tsuelo', 26: 'vw', 27: 'paus'}, inplace=True)
df['fh'] = pd.to_datetime(df['fh'], infer_datetime_format=True) # Convertir datetime.
df
Then I try to use Pandas' gruopby to make the hourly averages:
medias = df.groupby([df.index.year,
df.index.month,
df.index.hour])['tc10'].mean()
medias.index.names = ["Año", "Mes", "Hora"]
However, I miss this error:
'RangeIndex' object has no attribute 'year'.
I would like to know if there is a problem with the frequency of the data, and if there is any way in which you can assign a time average to each hour. Like a for, or some module of numpy or pandas.
Thank you very much in advance.