I have a list that pulls in tweets on every refresh, so it keeps producing new data along with the dates of the tweets. I also have a word counter. The expected result is to count the words and assign those counts to each date in a list of dates that keeps being updated. This is my list of dates:
# Remove duplicate dates while keeping their original order
FechasListC = []
for i in FechasList:
    if i not in FechasListC:
        FechasListC.append(i)
print(FechasListC)
[u'Jul07', u'Jul08', u'Jul10', u'Jul11', u'Jul12', u'Jul13', u'Jul14',
 u'Jul19', u'Jul23', u'Jul25']
And this is my counter:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Word counts for the tweets dated Jul07
cv = CountVectorizer()
count_matrix = cv.fit_transform(Jul07.text)
jul07count = pd.DataFrame(cv.get_feature_names(), columns=["word"])
jul07count["count"] = count_matrix.sum(axis=0).tolist()[0]
jul07count = jul07count.sort_values("count", ascending=False).reset_index(drop=True)
jul07count.set_index("word", inplace=True)

# The same steps repeated for the tweets dated Jul08
cv = CountVectorizer()
count_matrix = cv.fit_transform(Jul08.text)
jul08count = pd.DataFrame(cv.get_feature_names(), columns=["word"])
jul08count["count"] = count_matrix.sum(axis=0).tolist()[0]
jul08count = jul08count.sort_values("count", ascending=False).reset_index(drop=True)
jul08count.set_index("word", inplace=True)
The idea is to create those counters (like "jul08count") automatically for every date in the list, instead of writing each one by hand.
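To make the goal more concrete, here is a rough sketch of what I imagine: one counter per date, stored in a dictionary keyed by the date instead of in separately named variables. It assumes all the tweets live in a single DataFrame with `date` and `text` columns; the names `tweets`, `count_words` and `counts_by_date`, as well as the sample data, are made up for illustration and are not part of my real code.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer


def count_words(texts):
    """Return word counts for `texts`, indexed by word, most frequent first."""
    cv = CountVectorizer()
    count_matrix = cv.fit_transform(texts)
    # Newer scikit-learn versions renamed get_feature_names()
    # to get_feature_names_out(); support both.
    if hasattr(cv, "get_feature_names_out"):
        words = list(cv.get_feature_names_out())
    else:
        words = list(cv.get_feature_names())
    # Sum each column of the document-term matrix to get totals per word
    totals = np.asarray(count_matrix.sum(axis=0)).ravel()
    counts = pd.DataFrame({"word": words, "count": totals})
    return counts.sort_values("count", ascending=False).set_index("word")


# Hypothetical sample data standing in for the real tweets
tweets = pd.DataFrame({
    "date": ["Jul07", "Jul07", "Jul08"],
    "text": ["good morning world", "good night", "hello world hello"],
})

# One word-count table per date, keyed by the date string
counts_by_date = {}
for date, group in tweets.groupby("date"):
    counts_by_date[date] = count_words(group["text"])

print(counts_by_date["Jul08"])
```

With something like this, `counts_by_date["Jul07"]` would play the role of `jul07count`, and dates that show up on a later refresh would get their own entry the next time the loop runs.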
Thanks a lot!