I have the following pandas DataFrame in Python:
import numpy as np
import pandas as pd

month = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,1,2,3,4]
active = [1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1]
data1 = [1709.1,3869.7,4230.4,4656.9,48566.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,93738.2,189293.2,194412.6,206585.8]
df = pd.DataFrame({
    'month': month,
    'active': active,
    'd1': data1,
    'calculate': 0,  # placeholder, to be filled in below
})
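I need to fill the column 'calculate' with the running total of d1/3 while active is 1, restarting each time month goes back to 1, and with 0 while active is 0. The expected result is: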
    month  active        d1  calculate
0       1       1    1709.1     569.70
1       2       1    3869.7    1859.60
2       3       1    4230.4    3269.73
3       4       1    4656.9    4822.03
4       5       0   48566.0       0.00
5       6       0       0.0       0.00
6       7       0       0.0       0.00
7       8       0       0.0       0.00
8       9       0       0.0       0.00
9      10       0       0.0       0.00
10     11       0       0.0       0.00
11     12       0       0.0       0.00
12     13       0       0.0       0.00
13     14       0       0.0       0.00
14     15       0       0.0       0.00
15     16       0       0.0       0.00
16     17       0       0.0       0.00
17     18       0       0.0       0.00
18     19       0       0.0       0.00
19     20       0       0.0       0.00
20      1       1   93738.2   31246.07
21      2       1  189293.2   94343.80
22      3       1  194412.6  159148.00
23      4       1  206585.8  228009.93
I'm doing it in the following way:
df['calculate'] = np.where(
    df.month > 1,
    np.where(
        df.active,
        (df.d1/3).cumsum(),
        0,
    ),
    (df['d1']/3)
)
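but the result is not what I expected. Rows 0 to 20 match, but rows 21 to 23 keep accumulating from the earlier rows instead of restarting at month 1: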
    month  active        d1      calculate
0       1       1    1709.1     569.700000
1       2       1    3869.7    1859.600000
2       3       1    4230.4    3269.733333
3       4       1    4656.9    4822.033333
4       5       0   48566.0       0.000000
5       6       0       0.0       0.000000
6       7       0       0.0       0.000000
7       8       0       0.0       0.000000
8       9       0       0.0       0.000000
9      10       0       0.0       0.000000
10     11       0       0.0       0.000000
11     12       0       0.0       0.000000
12     13       0       0.0       0.000000
13     14       0       0.0       0.000000
14     15       0       0.0       0.000000
15     16       0       0.0       0.000000
16     17       0       0.0       0.000000
17     18       0       0.0       0.000000
18     19       0       0.0       0.000000
19     20       0       0.0       0.000000
20      1       1   93738.2   31246.066667
21      2       1  189293.2  115354.500000
22      3       1  194412.6  180158.700000
23      4       1  206585.8  249020.633333
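A quick check of where the extra amount in rows 21 to 23 might come from. np.where evaluates all of its arguments before choosing between them, so (df.d1/3).cumsum() runs over the whole column, including the 48566.0 in the inactive row 4. The sketch below sums d1/3 over rows 0 to 4 and compares that with the gap between the actual and expected values in row 21. The names first_rows and carry are only for illustration.

```python
import pandas as pd

# d1 values of rows 0 to 4, copied from the DataFrame above
first_rows = pd.Series([1709.1, 3869.7, 4230.4, 4656.9, 48566.0])

# Amount an unrestarted cumsum() would carry into the second block
carry = (first_rows / 3).sum()

# Gap between the actual and expected value in row 21
gap = 115354.5 - 94343.8

print(round(carry, 2), round(gap, 2))
```

If both numbers agree, the second block is simply offset by everything accumulated before it.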
I'm not sure I have explained my problem clearly. Thanks to anyone who can help.
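One possible approach, as a minimal sketch rather than a definitive answer: build a block id that increases each time month equals 1, and take the cumulative sum within each block so the total restarts. This assumes every new block begins on a row where month == 1 and that active does not switch back to 1 later in the same block. The names block and third are hypothetical helpers, not part of the original code.

```python
import numpy as np
import pandas as pd

month = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,1,2,3,4]
active = [1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1]
data1 = [1709.1,3869.7,4230.4,4656.9,48566.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
         93738.2,189293.2,194412.6,206585.8]
df = pd.DataFrame({'month': month, 'active': active, 'd1': data1})

# Assumption: a new block starts on every row where month == 1
block = (df['month'] == 1).cumsum()

# d1/3, with inactive rows set to 0 so they never enter the running total
third = (df['d1'] / 3).where(df['active'] == 1, 0)

# Running total per block; inactive rows are forced back to 0 afterwards
df['calculate'] = third.groupby(block).cumsum().where(df['active'] == 1, 0)

print(df.round(2))
```

Grouping by block replaces the month > 1 test from the nested np.where: the first row of each block is just the first term of its own cumulative sum.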