Performing operations on DataFrame columns in a loop

2

I want to perform calculations on each of the three columns of an array, values_array.

def calculateAllEMA(self,values_array):
    df = pd.DataFrame(values_array, columns=['BTC', 'ETH', 'DASH'])
    for i,column in enumerate(df[column]):
        ema=[]
        for i in range(0, len(column)-24):
            EMA_yesterday = column.iloc[1+i:22+i].mean()
            k = float(2)/(22+1)
            ema.append(column.iloc[23 + i]*k+EMA_yesterday*(1-k))
        mean_exp[i] = ema[-1]
    return mean_exp

But I get this error:

    for i,column in enumerate(df[column]):
UnboundLocalError: local variable 'column' referenced before assignment

And I do not see where it is referenced ...

Here is values_array :

[(3554.05, 299.44, 198.51), (3554.05, 299.46, 198.51),
(3554.05, 299.55, 198.54), (3554.05, 299.55, 198.54),
(3554.05, 299.55, 198.54), (3554.05, 299.55, 198.51),
(3554.05, 299.44, 198.51), (3553.8, 299.64, 198.49),
(3553.8, 299.65, 198.49), (3553.8, 299.65, 198.49),
(3553.8, 299.65, 198.49), (3553.8, 299.65, 198.49),
(3553.8, 299.64, 198.49), (3553.8, 299.65, 198.49),
(3553.8, 299.65, 198.49), (3553.8, 299.65, 198.49), 
(3553.8, 299.65, 198.49), (3553.8, 299.64, 198.49), 
(3553.91, 299.55, 198.54), (3553.8, 299.64, 198.49), 
(3553.8, 299.65, 198.49), (3553.8, 299.69, 198.49),
(3553.8, 299.65, 198.49), (3553.8, 299.65, 198.49)]
    
asked by ThePassenger 11.08.2017 at 18:01

3 answers

1

The problem is that you are using a variable that has not yet been assigned:

for i,column in enumerate(df[column]):

In the expression enumerate(df[column]), column has not been defined yet, because the expression is evaluated before the loop assigns anything to it.
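A minimal sketch of why Python reports this, using plain Python only. The names `broken`, `fixed` and the dictionary `data` are made up for illustration and stand in for the DataFrame. Because `column` is the target of the `for` statement, Python treats it as local to the whole function, and the expression after `in` runs before the loop has assigned it a value.

```python
data = {"BTC": [1.0, 2.0], "ETH": [3.0, 4.0]}  # hypothetical stand-in for the DataFrame

def broken():
    # 'column' is assigned by the for statement, so it is a local name,
    # but data[column] is evaluated first, while 'column' is still unbound.
    for i, column in enumerate(data[column]):
        pass

def fixed():
    # Iterate over the keys themselves instead of indexing with the loop variable.
    return [(i, column) for i, column in enumerate(data)]

try:
    broken()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)
print(fixed())
```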

You could try something like this:

import pandas as pd

def calculateAllEMA(values_array):
    df = pd.DataFrame(values_array, columns=['BTC', 'ETH', 'DASH'])
    column_by_search = "BTC"  # must match one of the names in 'columns'
    # here 'column' is each value of that column, one per row
    for i, column in enumerate(df[column_by_search]):
        pass  # loop body goes here
    
answered by 11.08.2017 / 18:44
1

It seems to me that what you need, if you want to iterate each column, is something like this:

import pandas as pd

def calculateAllEMA(self, values_array):
    columns = ['BTC', 'ETH', 'DASH']
    df = pd.DataFrame(values_array, columns=columns)
    mean_exp = {}  # assumed to be a dict; it is not defined in the question
    for i, column in enumerate(columns):
        ema = []
        # needs at least 25 rows, otherwise this range is empty
        for j in range(0, len(df[column]) - 24):
            EMA_yesterday = df[column].iloc[1+j:22+j].mean()
            k = float(2)/(22+1)
            ema.append(df[column].iloc[23 + j]*k + EMA_yesterday*(1-k))
        if ema:
            mean_exp[i] = ema[-1]
    return mean_exp

I'm not sure if df[column] is correct since I do not know much about the library, but it makes sense.

Note that the inner loop uses j so that it does not overwrite the i of your outer loop.
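A small sketch of that point in plain Python (the function names `reuse_i` and `use_j` are hypothetical): when both loops use `i`, the inner loop overwrites the outer index, so every result ends up stored under the last inner value instead of the column position.

```python
def reuse_i(n_columns, n_rows):
    results = {}
    for i in range(n_columns):
        for i in range(n_rows):  # overwrites the outer i
            pass
        results[i] = "done"      # i is now the last inner value
    return results

def use_j(n_columns, n_rows):
    results = {}
    for i in range(n_columns):
        for j in range(n_rows):  # separate name, outer i is untouched
            pass
        results[i] = "done"
    return results

print(reuse_i(3, 5))  # {4: 'done'}
print(use_j(3, 5))    # {0: 'done', 1: 'done', 2: 'done'}
```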

    
answered by 11.08.2017 at 18:50
1

Pandas has its own iterators. If you want to iterate over the columns and get the position of each column at the same time, you can do it like this:

import pandas as pd

df = pd.DataFrame(data = values, columns = ['BTC', 'ETH', 'DASH'])

for idx, columna in enumerate(df):
    print(idx, columna)

Output:

0 BTC
1 ETH
2 DASH
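Note that iterating over a DataFrame this way yields only the column labels, which are strings. As a sketch, using the first three rows of the question's data, `df.items()` yields each label together with its Series, which is what you need in order to call `.iloc` or `.mean()` on the column. The names `nombre`, `serie`, `labels` and `first_values` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [(3554.05, 299.44, 198.51), (3554.05, 299.46, 198.51), (3554.05, 299.55, 198.54)],
    columns=["BTC", "ETH", "DASH"],
)

# Plain iteration gives the labels only.
labels = [columna for columna in df]

# items() gives (label, Series) pairs, so the data is available too.
first_values = {}
for idx, (nombre, serie) in enumerate(df.items()):
    first_values[nombre] = serie.iloc[0]
    print(idx, nombre, type(serie).__name__)
```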

If you want to go through rows:

for fila in df.iterrows():
    print(fila)

Output:

(0, BTC     3554.05
ETH      299.44
DASH     198.51
Name: 0, dtype: float64)
(1, BTC     3554.05
ETH      299.46
DASH     198.51
Name: 1, dtype: float64)
(2, BTC     3554.05
ETH      299.55
DASH     198.54
Name: 2, dtype: float64)

...

So your function would look like this:

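def calculateAllEMA(self, values_array):
    df = pd.DataFrame(values_array, columns=['BTC', 'ETH', 'DASH'])
    mean_exp = {}  # assumed to be a dict; it is not defined in the question
    # df.items() yields (name, Series) pairs, so 'column' supports .iloc
    for idx, (name, column) in enumerate(df.items()):
        ema = []
        # needs at least 25 rows, otherwise this range is empty
        for i in range(0, len(column) - 24):
            EMA_yesterday = column.iloc[1+i:22+i].mean()
            k = float(2)/(22+1)
            ema.append(column.iloc[23 + i]*k + EMA_yesterday*(1-k))
        if ema:
            mean_exp[idx] = ema[-1]
    return mean_exp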
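If the explicit loops turn out to be slow, the same arithmetic can probably be expressed with pandas rolling windows instead. This is only a sketch under assumptions: it swaps the loops for `rolling` and `shift`, the helper names `last_ema_loop` and `last_ema_rolling` are hypothetical, and the prices are random made-up data with more than 25 rows so that the loop version produces a value to compare against.

```python
import numpy as np
import pandas as pd

def last_ema_loop(column):
    # Same arithmetic as the question's loop, for a single Series.
    k = 2 / (22 + 1)
    ema = []
    for i in range(0, len(column) - 24):
        ema_yesterday = column.iloc[1 + i:22 + i].mean()
        ema.append(column.iloc[23 + i] * k + ema_yesterday * (1 - k))
    return ema[-1]

def last_ema_rolling(df):
    # rolling(21).mean() at row t averages rows t-20..t; shifting by 2
    # aligns the window 1+i..21+i with row 23+i, as in the loop.
    k = 2 / (22 + 1)
    previous = df.rolling(21).mean().shift(2)
    smoothed = df * k + previous * (1 - k)
    # The loop stops one row before the end, so its last value is row -2.
    return smoothed.iloc[-2]

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(100, 1, size=(40, 3)),
                  columns=["BTC", "ETH", "DASH"])  # made-up prices

vectorised = last_ema_rolling(df)
looped = pd.Series({name: last_ema_loop(col) for name, col in df.items()})
print(vectorised)
print(looped)
```

Both versions should agree up to floating-point rounding, and the rolling version returns one value per column as a Series indexed by column name.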

I hope it helps.

    
answered by 18.12.2018 at 15:35