I'm creating a table with pandas where the first two columns are created with numpy arrays:
age = np.random.randint(20,85,size=400)
possible_genders = ['male','female']
gender = [np.random.choice(possible_genders) for i in range(400)]
the table itself is:
df = pd.DataFrame({'age': age, 'gender': gender})
Then I want to create a third column that is another array whose values are based on values age and gender in the following way:
for index, row in df.iterrows():
if gender == 'male' and age > 45:
print(1)
elif gender == 'male' and age < 45:
print(2)
elif gender == 'female' and age > 55:
print(3)
else:
print(4)
At the moment I have only put the prints to test if it discriminates each case well in relation to the original table but the output generated by the prints is a column of 400 cuatros: For what is this? How can I get this discrimination effectively?