I'm trying to compute differences between rows across two columns. For this I use a function that subtracts row 2 from row 1 in each column, squares the differences, sums them, and takes the square root, which removes the sign (in effect, the Euclidean distance between the two rows). It then does the same for row 2 and row 3, and so on until the end of the data. Next, the function repeats the calculation for a separation of 2 rows (row 1 minus row 3), then 3 rows, and so on up to a separation of rows-1. The results are saved in a DataFrame. The data I am working with is arranged like this:
df1
Out[44]:
   TRACK_ID  POSITION_X  POSITION_Y  POSITION_T
0         0           1           1       35.36
1         0           2           2       35.52
2         0           3           3       35.68
3         0           4           4       35.84
4         0           1           1       35.36
5         0           4           3       34.88
6         0           2           3       34.40
7         0           6           4       33.92
8         0           4           2       33.44
The function seems to work. The problem is that, for some separations between the subtracted rows, the result is an exact copy of the previous column. Example:
rad
Out[28]:
          0         1         2         3         4         5         6  \
0  1.414214  2.828427  4.242641  0.000000  3.605551  2.236068  2.236068
1  1.414214  2.828427  1.414214  2.236068  1.000000  4.472136  4.472136
2  1.414214  2.828427  1.000000  1.000000  3.162278  1.414214  1.414214
3  4.242641  1.000000  2.236068  2.000000  2.000000       NaN       NaN
4  3.605551  2.236068  5.830952  3.162278       NaN       NaN       NaN
5  2.000000  2.236068  1.000000       NaN       NaN       NaN       NaN
6  4.123106  2.236068       NaN       NaN       NaN       NaN       NaN
7  2.828427       NaN       NaN       NaN       NaN       NaN       NaN
8       NaN       NaN       NaN       NaN       NaN       NaN       NaN

          7   8
0  3.162278 NaN
1       NaN NaN
2       NaN NaN
3       NaN NaN
4       NaN NaN
5       NaN NaN
6       NaN NaN
7       NaN NaN
8       NaN NaN
Columns 5 and 6 are identical.
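For comparison, here is a minimal sketch of the result I would expect, with the separations written directly as the integers 1 to rows-1 instead of being derived from a time vector. The helper name `lagged_distances` and the hard-coded `sample` frame are only for illustration (the coordinates are copied from df1 above), and the columns here are labelled by separation, so, if I read the output correctly, column 6 of rad corresponds to separation 7 in this sketch.

```python
import numpy as np
import pandas as pd

# Sample coordinates copied from df1 above (illustrative stand-in for the real data).
sample = pd.DataFrame({
    "POSITION_X": [1, 2, 3, 4, 1, 4, 2, 6, 4],
    "POSITION_Y": [1, 2, 3, 4, 1, 3, 3, 4, 2],
})

def lagged_distances(frame, coords=("POSITION_X", "POSITION_Y")):
    """Hypothetical helper: distance between each row and the row `lag` rows below it."""
    xy = frame[list(coords)]
    columns = {}
    for lag in range(1, len(xy)):        # integer separations 1 .. rows-1
        diffs = xy - xy.shift(-lag)      # row i minus row i+lag; NaN past the end
        # skipna=False keeps NaN where there is no partner row
        columns[lag] = np.sqrt((diffs ** 2).sum(axis=1, skipna=False))
    return pd.DataFrame(columns)

expected = lagged_distances(sample)
print(expected.round(6))
```

With integer separations, the column for a separation of 7 rows holds 5.830952 and 2.000000 in its first two rows, rather than a copy of the separation-6 column (2.236068, 4.472136, 1.414214).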
This is my complete code:
import numpy as np
import pandas as pd

df1 = df[['TRACK_ID','POSITION_X','POSITION_Y','POSITION_T']].copy()
#Parameter input
N = df1.groupby('TRACK_ID').size()
max_time = N*(0.160)
frames = max_time/N
t_step=frames.item()
data = pd.DataFrame({'N':N,'max_time':max_time,'frames':frames})
print(data)
t=np.linspace(0.160, max_time.item(), N)
# function to compute the differences
def radial(df1, coords=['POSITION_X', 'POSITION_Y']):
    tau = t.copy()
    shifts = np.divide(tau,t_step).astype(float) # array used to build the differences between row values
    print(shifts)
    radials = list()
    for i, shift in enumerate(shifts):
        diffs = np.array(df1[coords] - df1[coords].shift(-shift))
        sqdist = np.square(diffs).sum(axis=1)
        r = np.sqrt(sqdist)
        radials.append(r)
    radial_disp = pd.DataFrame({'radials':radials})
    return radials
radial_d = radial(df1, coords=['POSITION_X', 'POSITION_Y'])
radd = pd.DataFrame.from_records(radial_d) #horizontal
rad = radd.transpose() #vertical
I have already modified some parts of the function. I noticed that my variable shifts, which sets the separation between the subtracted rows, contained repeated values when it was cast to int. I tried to fix this by casting it to float instead, but the result is still the same. Why is the calculation repeated for the same column? Thanks for reading my post.
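A quick check that may help narrow this down: the sketch below rebuilds the time vector under the assumption of 9 rows and a 0.160 step (the names `n_rows`, `ratio`, `truncated` and `rounded` are mine, not from the code above) and compares truncating the ratio to an integer against rounding it to the nearest integer. It is only a diagnostic under those assumptions, not a confirmed explanation.

```python
import numpy as np

# Assumed values: 9 rows and a 0.160 time step, as in the sample data.
n_rows = 9
t_step = 0.160
t = np.linspace(t_step, n_rows * t_step, n_rows)

ratio = t / t_step                     # mathematically 1, 2, ..., 9
truncated = ratio.astype(int)          # drops the fractional part
rounded = np.rint(ratio).astype(int)   # rounds to the nearest integer

# Side-by-side view: first column truncated, second column rounded.
print(np.column_stack([truncated, rounded]))
print("max distance from an integer:", np.abs(ratio - rounded).max())
```

If any ratio lands slightly below a whole number (for example 6.999... instead of 7), the two columns will disagree on that row, and the truncated value will repeat the previous separation.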