To begin with, you download the messages as "tweets.json" but then load them as "mensajes_twitter.json". Be careful how you word the question. Also, if you want to count retweets, one minute is a very short window in which to capture that information.
R reads the .json file as a list. Each list is made up of a series of elements ("boxes") that can hold logical values, vectors, or even further lists. Run str(lista_mensajes_twitter[[1]]) and you will see how the first tweet in the list is structured.
That said, to count the length of the first tweet you need to know which element holds the text:
str(lista_mensajes_twitter[[1]], max.level = 1)
List of 30
$ created_at : chr "Thu May 25 14:44:02 +0000 2017"
$ id : num 8.68e+17
$ id_str : chr "867753126899666944"
$ text : chr "RT @HoopsOverHoes_: Bro in the first pic I thought....nvm https://t.co/BVsMV3HTWQ"
$ source : chr "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>"
$ truncated : logi FALSE
$ in_reply_to_status_id : NULL
$ in_reply_to_status_id_str: NULL
$ in_reply_to_user_id : NULL
$ in_reply_to_user_id_str : NULL
$ in_reply_to_screen_name : NULL
$ user :List of 38
$ geo : NULL
$ coordinates : NULL
$ place : NULL
$ contributors : NULL
$ retweeted_status :List of 29
$ quoted_status_id : num 8.67e+17
$ quoted_status_id_str : chr "867076974337982465"
$ quoted_status :List of 27
$ is_quote_status : logi TRUE
$ retweet_count : num 0
$ favorite_count : num 0
$ entities :List of 4
$ favorited : logi FALSE
$ retweeted : logi FALSE
$ possibly_sensitive : logi FALSE
$ filter_level : chr "low"
$ lang : chr "en"
$ timestamp_ms : chr "1495723442183"
Knowing that it is in the element $text (position 4 in the list), we can compute the length:
nchar(lista_mensajes_twitter[[1]][['text']])
81
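The point about accessing an element by name or by position can be checked on a small hand-made list. This is only a sketch: toy_tweet and its values are invented for illustration and merely mimic the first four fields of the structure shown above.

```r
# Hypothetical miniature tweet with the same first four fields as above
toy_tweet <- list(
  created_at = "Thu May 25 14:44:02 +0000 2017",
  id         = 1,
  id_str     = "1",
  text       = "hello world"
)

# Find the position of the "text" element
text_pos <- which(names(toy_tweet) == "text")

# Access by name and access by position return the same value
by_name     <- toy_tweet[["text"]]
by_position <- toy_tweet[[text_pos]]

len_by_name     <- nchar(by_name)
len_by_position <- nchar(by_position)
```

Accessing by name is usually the safer choice, since the position of a field may not be the same in every tweet.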
To apply it to every tweet you can use a loop, one of the apply functions, or the functions of the purrr package (very powerful for handling lists):
library(purrr)
len_tweets <- lista_mensajes_twitter %>% map("text") %>% map_int(nchar)
head(len_tweets)
[1] 81 140 140 134 140 127
The result is a vector with the number of characters per tweet.
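For comparison, the loop and apply alternatives mentioned above might look roughly like this in base R. It is a minimal sketch: toy_tweets is a hypothetical stand-in for lista_mensajes_twitter, and it assumes every element has a text field.

```r
# Hypothetical stand-in for lista_mensajes_twitter
toy_tweets <- list(
  list(text = "first tweet"),
  list(text = "a second, longer tweet"),
  list(text = "third")
)

# Option 1: explicit loop, filling a preallocated integer vector
len_loop <- integer(length(toy_tweets))
for (i in seq_along(toy_tweets)) {
  len_loop[i] <- nchar(toy_tweets[[i]][["text"]])
}

# Option 2: vapply, which checks that each result is a single integer
len_vapply <- vapply(toy_tweets, function(tw) nchar(tw[["text"]]), integer(1))
```

Both options give the same integer vector. vapply fails loudly if a tweet has no text, which can be useful when the captured stream contains entries that are not regular tweets.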
To extract the number of followers and number of retweets:
fllw <- lista_mensajes_twitter %>% map("user") %>% map_dbl("followers_count")
rt <- lista_mensajes_twitter %>% map_dbl("retweet_count")
The number of replies is harder to determine, since for each tweet you have to look for its id among the in_reply_to_status_id values of the other tweets.
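One possible way to do that count is sketched below. It assumes the replies were captured in the same list, and it matches on the id_str and in_reply_to_status_id_str fields instead of the numeric ones, because the numeric id shown above (8.68e+17) is stored as a double and may not keep all of its digits. The names toy_tweets, reply_to and n_replies are hypothetical.

```r
# Hypothetical tweets: "2" and "3" reply to "1"; "4" replies to "2"
toy_tweets <- list(
  list(id_str = "1", in_reply_to_status_id_str = NULL),
  list(id_str = "2", in_reply_to_status_id_str = "1"),
  list(id_str = "3", in_reply_to_status_id_str = "1"),
  list(id_str = "4", in_reply_to_status_id_str = "2")
)

ids <- vapply(toy_tweets, function(tw) tw[["id_str"]], character(1))

# NULL (not a reply) becomes NA so the result is a plain character vector
reply_to <- vapply(
  toy_tweets,
  function(tw) {
    target <- tw[["in_reply_to_status_id_str"]]
    if (is.null(target)) NA_character_ else target
  },
  character(1)
)

# For each tweet, count how many captured tweets point at its id
n_replies <- vapply(
  ids,
  function(id) sum(reply_to == id, na.rm = TRUE),
  integer(1)
)
```

The result is a vector named by tweet id. Replies posted outside the capture window are not in the list and therefore cannot be counted this way.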
The correlation is then simply computed with cor between the vectors.
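As a rough illustration of that last step, here is a sketch with made-up numbers. The vectors len_toy, fllw_toy and rt_toy are hypothetical and only stand in for the tweet lengths, follower counts and retweet counts. Note that cor needs plain numeric vectors of equal length, so a list has to be flattened first (for example with unlist).

```r
# Hypothetical values, one entry per tweet, in the same order
len_toy  <- c(81, 140, 140, 134, 140, 127)
fllw_toy <- c(120, 3500, 15, 980, 42, 7600)
rt_toy   <- c(0, 12, 0, 3, 0, 25)

# Pearson correlation (the default method)
cor_len_rt  <- cor(len_toy, rt_toy)
cor_fllw_rt <- cor(fllw_toy, rt_toy)

# Spearman rank correlation, less sensitive to a few very large counts
cor_fllw_rt_rank <- cor(fllw_toy, rt_toy, method = "spearman")
```

If every retweet count is 0, as can easily happen with a one-minute capture, the standard deviation is zero and cor returns NA with a warning.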