You don't need to reinvent the wheel: NumPy already has a function for this, numpy.unique. It returns the sorted unique elements of an array, and if you pass return_index=True it also returns a second array with the index of each of those elements in the original array. When an element appears more than once, the index returned is that of its first occurrence.
Apply it to the column in question (the second column of your matrix) and then use that index array to slice the rows of the matrix.
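As a minimal sketch of that idea, using the matrix from your question (the names valores, primeros and primeras_filas are only illustrative):

```python
import numpy as np

matriz = np.array([[1581, 1243, 2459, 1260],
                   [1581, 1243, 2459, 1259],
                   [1581, 1243, 2459, 1257],
                   [1581, 1244, 2459, 1260],
                   [1581, 2018, 2459, 2032]])

# Unique values of the second column and the index of their first occurrence
valores, primeros = np.unique(matriz[:, 1], return_index=True)
print(primeros)        # [0 3 4]

# Slice the matrix with those indexes: one row per unique value
primeras_filas = matriz[primeros]
print(primeras_filas)
```

For the value 1243 this keeps the row ending in 1260, because that is the first one found.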
In your case there is a catch: in the final matrix you want the last row of each group with a repeated value in that column, not the first one. One option is to reorder the matrix so that the row you want comes first in each group, something like this:
[[1581 1243 2459 1257]
[1581 1243 2459 1259]
[1581 1243 2459 1260]
[1581 1244 2459 1260]
[1581 2018 2459 2032]]
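One possible way to get that ordering is sketched here. It is an assumption on my part: it supposes that the row you want in each group is the one with the smallest value in the fourth column, which happens to reproduce the matrix above. np.lexsort sorts by the last key first, so the second column is the primary key and the fourth column breaks ties.

```python
import numpy as np

matriz = np.array([[1581, 1243, 2459, 1260],
                   [1581, 1243, 2459, 1259],
                   [1581, 1243, 2459, 1257],
                   [1581, 1244, 2459, 1260],
                   [1581, 2018, 2459, 2032]])

# Assumed ordering: by second column, then by fourth column (ascending)
orden = np.lexsort((matriz[:, 3], matriz[:, 1]))
ordenada = matriz[orden]

# With the wanted row first in each group, return_index alone is enough
_, indices = np.unique(ordenada[:, 1], return_index=True)
resultado = ordenada[indices]
print(resultado)
```

If your real criterion for "the row to keep" is different, the sort keys would have to change accordingly.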
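However, with the matrix as you have it, we can use another parameter of numpy.unique, return_counts, which gives the number of times each element appears. Since the rows of each group are adjacent, the index of the first occurrence plus the count minus one is the index of the last row of the group, so we can recompute the indexes: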
import numpy as np
matriz = np.array([[1581, 1243, 2459, 1260],
                   [1581, 1243, 2459, 1259],
                   [1581, 1243, 2459, 1257],
                   [1581, 1244, 2459, 1260],
                   [1581, 2018, 2459, 2032]])
# indices: first occurrence of each unique value; counts: how many times it appears
_, indices, counts = np.unique(matriz[:, 1], return_index=True, return_counts=True)
# first index + count - 1 = index of the last row of each (adjacent) group
filtro = indices + counts - 1
matriz = matriz[filtro, :]
print(matriz)
Output:
[[1581 1243 2459 1257]
[1581 1244 2459 1260]
[1581 2018 2459 2032]]
Warning: for this code to work properly, your matrix must already be ordered like the one in your example. That is, the rows with the same value in the second column must be adjacent, and the last row of each group must be the one you want in your final matrix.
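If the rows with the same value are not adjacent, one alternative (a sketch, not the approach used above) is to run numpy.unique on the reversed column: the first occurrence in the reversed column is the last occurrence in the original one. The small matrix below is invented just to show the non-adjacent case.

```python
import numpy as np

# Made-up data: the values 10 and 20 in the second column are not adjacent
matriz = np.array([[1, 10, 0, 100],
                   [1, 20, 0, 200],
                   [1, 10, 0, 300],
                   [1, 30, 0, 400],
                   [1, 20, 0, 500]])

columna = matriz[:, 1]
# First occurrence in the reversed column...
_, indices_rev = np.unique(columna[::-1], return_index=True)
# ...mapped back to a position in the original column (last occurrence)
ultimos = len(columna) - 1 - indices_rev
resultado = matriz[ultimos]
print(resultado)
```

The rows come out ordered by the unique value (10, 20, 30), not by their position in the original matrix; applying np.sort to ultimos before slicing would keep the original order.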