Create a new column where each row takes its value from the column indicated by another column

3

I have the following information in a DataFrame in Python:

##    x1   x2   x3   x4   x5   colum
##0  206  214  021  122  554     2
##1  226  234  123  456  789     4
##2  245  253  558  855  123     5
##3  265  272  000  111  222     4
##4  283  291  214  589  996     1

and I need to generate a new column whose value in each row is taken from the column indicated by the colum column, as follows:

##    x1   x2   x3   x4   x5   colum   newColum
##0  206  214  021  122  554     2       214
##1  226  234  123  456  789     4       456
##2  245  253  558  855  123     5       123
##3  265  272  000  111  222     4       111
##4  283  291  214  589  996     1       283

I am not sure I have explained my request clearly. Thanks in advance to anyone who can help.

    
asked by Yeison H. Arias 12.09.2018 at 12:36

1 answer

1

You can use pandas.DataFrame.apply on the rows (axis=1), building the name of the target column with simple string formatting:

import io
import pandas as pd

data = io.StringIO("""\
##    x1   x2   x3   x4   x5   colum
##0  206  214  021  122  554       2
##1  226  234  123  456  789       4
##2  245  253  558  855  123       5
##3  265  272  000  111  222       4
##4  283  291  214  589  996       1
""")

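# Raw string so that \s is not treated as an (invalid) string escape
df = pd.read_table(data, sep=r"\s+", engine="python", index_col=0)
# For each row, build the column name ("x" + value of colum) and read that cell
df["newColum"] = df.apply(lambda row: row[f'x{row["colum"]}'], axis=1)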
>>> df

      x1   x2   x3   x4   x5  colum  newColum
##                                           
##0  206  214   21  122  554      2       214
##1  226  234  123  456  789      4       456
##2  245  253  558  855  123      5       123
##3  265  272    0  111  222      4       111
##4  283  291  214  589  996      1       283

If the colum column may contain values that do not correspond to the name of any column, you can use pandas.Series.get and assign NaN, for example, in those cases:

import numpy as np

lambda row: row.get(f'x{row["colum"]}', default=np.nan)
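
To show the fallback in context, here is a minimal self-contained sketch. The small DataFrame is a cut-down version of the one in the question, and the value 7 in colum is made up for illustration so that one row has no matching column; the helper name pick is also just an illustrative choice.

```python
import numpy as np
import pandas as pd

# Reduced version of the question's data; the 7 is a made-up value
# that does not match any column name ("x7" does not exist).
df = pd.DataFrame(
    {
        "x1": [206, 226, 245],
        "x2": [214, 234, 253],
        "x3": [21, 123, 558],
        "colum": [2, 7, 1],
    }
)


def pick(row):
    # Build the column name and fall back to NaN when it is missing
    return row.get(f'x{row["colum"]}', default=np.nan)


df["newColum"] = df.apply(pick, axis=1)
print(df)
```

Because NaN is a float, the resulting newColum column ends up with a float dtype when any row falls back to the default.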

If you are using a Python version earlier than 3.6, you can use str.format ( "x{}".format(row["colum"]) ) instead of a formatted string literal ( f'x{row["colum"]}' ).
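
As a sketch of the str.format variant, assuming a small two-row DataFrame made up for illustration (same column naming as in the question):

```python
import pandas as pd

# Small illustrative DataFrame, not the full data from the question
df = pd.DataFrame({"x1": [206, 226], "x2": [214, 234], "colum": [2, 1]})

# Same row-wise lookup, using str.format instead of an f-string
df["newColum"] = df.apply(
    lambda row: row["x{}".format(row["colum"])], axis=1
)
print(df)
```

The lookup is identical to the f-string version; only the way the column name is built changes.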

    
answered by 12.09.2018 / 16:48