Improve speed when creating dynamic columns in a DataFrame

I am creating a DataFrame with the following code:

import numpy as np
import pandas as pd
from time import time

start_time = time()

columns = 60

Data = pd.DataFrame(np.random.randint(low=0, high=10, size=(700000, 3)), columns=['a', 'b', 'c'])
Data['f'] = (Data.index % 60) + 1
Data['column_-1'] = 100
for i in range(columns):
    Data['column_' + str(i)] = np.where(  # Condition 1: first row of each block of 60
        Data['f'] == 1,
        1000 + i,
        np.where(  # Condition 2: column index is below f
            i < Data['f'],
            0,
            np.where(  # Condition 3: compare a and b
                Data['a'] > Data['b'],
                Data['column_-1'] * Data['c'],
                Data['column_-1']
            )
        )
    )

elapsed_time = time() - start_time
print("Elapsed time: %.10f seconds." % elapsed_time)

Elapsed time: 1.0710000992 seconds.

Is there a better way to generate these columns dynamically that would also make the script faster? Thanks.
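One possible direction, offered only as a sketch and not as a measured improvement: the three conditions depend on i only through `1000 + i` and `i < f`, so all 60 columns can in principle be computed in a single NumPy broadcasting step and attached with one `pd.concat`, instead of 60 separate column assignments. The function name `add_dynamic_columns` and its parameter `n_columns` are hypothetical names introduced here for illustration. Whether this is actually faster depends on the machine and the pandas/NumPy versions, so it should be timed against the loop above.

```python
import numpy as np
import pandas as pd


def add_dynamic_columns(data, n_columns=60):
    """Build column_0 .. column_{n_columns-1} in one broadcast (hypothetical helper)."""
    idx = np.arange(n_columns)                 # shape (n_columns,)
    f = data['f'].to_numpy()[:, None]          # shape (n_rows, 1)
    base = data['column_-1'].to_numpy()

    # Condition 3 does not depend on i, so compute it once per row.
    fallback = np.where(
        data['a'].to_numpy() > data['b'].to_numpy(),
        base * data['c'].to_numpy(),
        base,
    )[:, None]                                 # shape (n_rows, 1)

    # Conditions 1 and 2 broadcast to shape (n_rows, n_columns).
    values = np.where(
        f == 1,
        1000 + idx,
        np.where(idx < f, 0, fallback),
    )

    new_cols = pd.DataFrame(
        values,
        index=data.index,
        columns=['column_' + str(i) for i in idx],
    )
    # A single concat avoids inserting 60 columns one at a time.
    return pd.concat([data, new_cols], axis=1)


# Small usage example with the same layout as in the question.
Data = pd.DataFrame(
    np.random.randint(low=0, high=10, size=(600, 3)),
    columns=['a', 'b', 'c'],
)
Data['f'] = (Data.index % 60) + 1
Data['column_-1'] = 100
Data = add_dynamic_columns(Data, 60)
```

Note that the intermediate array holds n_rows x n_columns integers at once, so for 700000 rows and 60 columns it needs a few hundred megabytes of memory.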

python python-3.x numpy pandas

asked by Yeison H. Arias 08.10.2018 в 20:33
