concatenate 2 array with condition in python

Question

concatenate 2 array with condition in python

Navigation

#1 by (0 votes)

2

I have 2 array, of 2 columns each. Array A has the form:

pid                track

0        77vFvcdWZi0ZqiDOoHU3YI 
1        0e9hR1vTrzlUvFH5PgA9rY 
2        00z4wF0iJsp6GwDkQxkGs6 
3        0CyRloqqjpeKEAd3cO6J6z 
4        0c1gHntWjKD7QShC8s99sq
...      .....

while array B is:

     diccionario                  id
0UaMYEvWZi0ZqiDOoHU3YI            34
0e9hR1vTrzlUvFH5PgA9rY             3
00z4wF0iJsp6GwDkQxkGs6             1 
0CyRloqqjpeKEAd3cO6J6z             2
0c1gHntWjKD7QShC8s99sq            16
...      .....

I need an array C, where the id of that array is the dictionary id of array B that corresponds to the track of array A, that is, in the form:

pid                 id
 0                  22 
 1                   3
 2                   1
 3                   2
 4                  16
...                .....

But I do not know how to do it in python

python array

asked by Skiel 24.05.2018 в 08:05

source

1 answer

How to make a servlet wait to see if it has uploaded the file before redirecting Call method every so often ionic wordpress

score 0 · Accepted Answer

If you can use the pandas library, always advisable to deal with structured data in tables as is your case, the solution is:

Load each of your tables into a pandas dataframe

Use the "track" column in table 1 as an index, and the "dictionary" column in table 2.

Make a join() of both tables. Pandas is in charge of "making fit" the rows with the same index.

As a consequence of the previous step, "NaN" values (which are a kind of null values) may appear in elements that have an index in one of the tables but not in the other. Optionally you can change those "NaN" for any other value (zero, or a string of dashes, etc.)

The result of the join will use as index the hashes that served to make the mixture. If you do not want them you can make a reset_index() to the result

You can also optionally set the column pid as result index.

The following fragment does steps 1 and 2. In this case, the tables "read" them from strings, since I find it more comfortable to do so simply by copying and pasting the example of your question. In your case, you probably read it from a file. Pandas can read csv and excel and many other formats.

Edit : By the way, I've messed up the rows in array B so that it looks better than .join() does not just concatenate "horizontally", but actually "searches" in B the elements based on the indexes.

import io
import pandas as pd

datos1 = """\
pid                track
0        77vFvcdWZi0ZqiDOoHU3YI 
1        0e9hR1vTrzlUvFH5PgA9rY 
2        00z4wF0iJsp6GwDkQxkGs6 
3        0CyRloqqjpeKEAd3cO6J6z 
4        0c1gHntWjKD7QShC8s99sq"""

datos2 = """\
     diccionario                  id
0CyRloqqjpeKEAd3cO6J6z             2
00z4wF0iJsp6GwDkQxkGs6             1 
0c1gHntWjKD7QShC8s99sq            16
0e9hR1vTrzlUvFH5PgA9rY             3
0UaMYEvWZi0ZqiDOoHU3YI            34"""

a = pd.read_table(io.StringIO(datos1), sep=r'\s+').set_index("track")
b = pd.read_table(io.StringIO(datos2), sep=r'\s+').set_index("diccionario")

The following fragment does steps 3 and 4 (fill in the unknown values with a "-"):

c = a.join(b).fillna("-")
print(c)

                        pid  id
track                          
77vFvcdWZi0ZqiDOoHU3YI    0   -
0e9hR1vTrzlUvFH5PgA9rY    1   3
00z4wF0iJsp6GwDkQxkGs6    2   1
0CyRloqqjpeKEAd3cO6J6z    3   2
0c1gHntWjKD7QShC8s99sq    4  16

The following code would do steps 5 and 6, that is, remove the "track" index and put pid as the new index, giving the desired result:

c = c.reset_index(drop=True).set_index("pid")
print(c)