I want to build an array, a dictionary or a DataFrame (the form doesn't matter) that lists, for each group, the ids of the subscribers belonging to that group.
The ids are in the index of a DataFrame, side_subscriber.index, which prints as:
Int64Index([160, 161, 296, 306, 365, 386, 471], dtype='int64', name=u'subscriber_id')
The group labels are in a numpy.ndarray called indexResultat:
[1 1 0 0 1 1 1]
I tried the following, but I don't know how to initialize the structure that holds the ids grouped by group:
kernelGroup = []
i = 0
for idx in indexResultat:
    print "idx : ",idx
    i = i+1
    print kernelGroup
    for kernel in kernelGroup:
        print "kernel : ",kernel
        if idx == kernel:
            print "we have the group ",kernel
            print kernel
            # add the id
            kernelGroup = kernelGroup[kernel].append(side_subscriber.index[idx])
            break
    # we don't have the group yet
    print "we don't have the group", idx
    #kernelGroup = kernelGroup.append(kernelGroup,[idx,side_subscriber.index[idx]])
    kernelGroup = kernelGroup.append([idx,side_subscriber.index[i]])
print kernelGroup
And I get:
idx : 1
[]
we don't have the group 1
idx : 1
None
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-64-a0add6c15d78> in <module>()
      5     i = i+1
      6     print kernelGroup
----> 7     for kernel in kernelGroup:
      8         print "kernel : ",kernel
      9         if idx == kernel:

TypeError: 'NoneType' object is not iterable
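The None printed on the second pass is consistent with how list.append behaves: it modifies the list in place and returns None, so assigning its result back to the variable replaces the list with None. A minimal sketch of that behavior, with illustrative names (kernel_group, returned, kept) that are not part of the code above:

```python
kernel_group = []

# list.append mutates the list in place and returns None,
# so storing its return value does not give back the list.
returned = kernel_group.append([1, 160])

# Calling append without reassigning keeps the list usable.
kept = []
kept.append([1, 160])
```

Here returned is None while kernel_group and kept both hold the appended pair, which is why the loop fails as soon as kernelGroup is overwritten with the return value.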
The output I expect is:
{0: [296, 306], 1: [160, 161, 365, 386, 471]}
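Since the ids already live in a pandas index, one possible way to obtain such a dictionary is a groupby on the labels. This is only a sketch, assuming indexResultat holds exactly one label per id, in the same order as side_subscriber.index; the sample values are copied from above and the names ids, labels, grouped and result are illustrative:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; pd.Index stands in for
# side_subscriber.index and the array for indexResultat.
ids = pd.Index([160, 161, 296, 306, 365, 386, 471], name='subscriber_id')
labels = np.array([1, 1, 0, 0, 1, 1, 1])

# Group the ids by their label, then collect each group into a list.
grouped = pd.Series(ids).groupby(labels).apply(list)

# Convert the resulting Series (label -> list of ids) to a plain dict.
result = grouped.to_dict()
```

The intermediate grouped Series may be enough on its own if a pandas object is preferred over a dictionary.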
I know that this function does more or less what I want to do:
import numpy as np

def cluster_points(X, mu):
    clusters = {}
    for x in X:
        # key of the entry in mu that is closest to x
        bestmukey = min([(i[0], np.linalg.norm(x-mu[i[0]])) \
                    for i in enumerate(mu)], key=lambda t:t[1])[0]
        try:
            clusters[bestmukey].append(x)
        except KeyError:
            clusters[bestmukey] = [x]
    return clusters
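The same append-or-create pattern could be applied directly to the labels, without any distance computation. The sketch below is not a tested solution for the real data: group_ids is a hypothetical helper, and plain lists stand in for side_subscriber.index and indexResultat, assuming both have the same length and order:

```python
def group_ids(ids, labels):
    """Map each label to the list of ids carrying that label."""
    groups = {}
    # zip pairs each id with the label at the same position
    for subscriber_id, label in zip(ids, labels):
        try:
            groups[label].append(subscriber_id)
        except KeyError:
            # first id seen for this label: start a new list
            groups[label] = [subscriber_id]
    return groups

ids = [160, 161, 296, 306, 365, 386, 471]   # stands in for side_subscriber.index
labels = [1, 1, 0, 0, 1, 1, 1]              # stands in for indexResultat
result = group_ids(ids, labels)
```

Pairing ids and labels with zip avoids the manual counter i and the lookups by position used in my attempt.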