Working with Python 3.5 and Pandas
I am developing an application that performs a series of calculations and processing steps on a Pandas dataframe. I fill that dataframe with the result of a SELECT that I run against my SQL Server database through the pyodbc library:
import pyodbc
import pandas as pd

# Raw string so the backslash in the server name is taken literally
conn = pyodbc.connect(r'DRIVER={SQL Server};SERVER=GENIL\Luna;DATABASE=Central;UID=sa;PWD=1')
sql = "SELECT IdActivo, NombreActivo, tickeractivo FROM Activos"
# Load the full result set into a dataframe
df = pd.read_sql(sql, conn)
print(df.head(10))
This SELECT returns more than 15 million records, so it takes almost 10 minutes to receive the result and store it in the Pandas dataframe for further processing.
My question is whether it is possible to cache the result of this SELECT, so that loading the data into any dataframe that needs it during the process is much faster.
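To make the question concrete, this is roughly the kind of caching I have in mind: a minimal sketch, not a tested solution. It assumes the table changes rarely enough that a local copy is acceptable. The helper name read_sql_cached and the cache file name are hypothetical, and an in-memory sqlite3 database stands in for my real pyodbc connection so the snippet runs anywhere.

```python
import os
import sqlite3
import tempfile

import pandas as pd


def read_sql_cached(sql, conn, cache_path):
    """Return the query result, reusing a local pickle file if one exists.

    Hypothetical helper: the name and the pickle format are my own choices.
    """
    if os.path.exists(cache_path):
        # Cache hit: load from local disk instead of querying the database.
        return pd.read_pickle(cache_path)
    # Cache miss: run the query once and save the result for later calls.
    df = pd.read_sql(sql, conn)
    df.to_pickle(cache_path)
    return df


# Stand-in for the real SQL Server connection (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Activos (IdActivo INTEGER, NombreActivo TEXT, tickeractivo TEXT)"
)
conn.execute("INSERT INTO Activos VALUES (1, 'Example asset', 'EXA')")
conn.commit()

sql = "SELECT IdActivo, NombreActivo, tickeractivo FROM Activos"
# Fresh temporary directory, so no stale cache file exists on the first call.
cache_path = os.path.join(tempfile.mkdtemp(), "activos_cache.pkl")

first = read_sql_cached(sql, conn, cache_path)   # queries the database
second = read_sql_cached(sql, conn, cache_path)  # loads the pickle file
```

With something like this, only the first call would pay the 10 minutes. The cache file would have to be deleted by hand whenever the table changes, since the sketch never checks whether the data is stale.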
Any suggestions on how I can achieve this?
Thank you very much
Angel