Read a table from a URL with R

2

I am having problems with the read.table function. I want to read a table from a URL and save it in R as a data frame. The URL is: link

I wrote this code:

library(RCurl)

a <- getURL('https://datanalytics.com/uploads/datos_treemap.txt')
b = read.table(a, sep="\t ", header = TRUE, nrows=3)

download.file("https://datanalytics.com/uploads/datos_treemap.txt","/mnt/M/Ana Rubio/R/datos_treemap.txt",method = c("wget"))

But I cannot get the data saved as a data frame, and I get the following error:

[1] "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n<html><head>\n<title>302 Found</title>\n</head><body>\n<h1>Found</h1>\n<p>The document has moved <a href=\"https://datanalytics.com/uploads/datos_treemap.txt\">here</a>.</p>\n<hr>\n<address>Apache/2.4.27 (Ubuntu) Server at datanalytics.com Port 80</address>\n</body></html>\n"

I have also tried downloading the file as a txt and saving it to my computer, but the resulting txt file has the whole table in a single row. The code that I used is:

download.file("https://datanalytics.com/uploads/datos_treemap.txt","/mnt/M/Ana Rubio/R/datos_treemap.txt",method = c("wget"))

Does anyone know what mistakes I'm making? Thanks in advance.

    
asked by Ana 09.11.2017 at 15:33

2 answers

2

Ana:

The problem appears on some systems with HTTPS connections. In download.file, use the "curl" method instead of "wget" to download the dataset. It would be something like this:

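# Download over HTTPS with the "curl" method instead of "wget"
download.file("https://datanalytics.com/uploads/datos_treemap.txt", "datos_treemap.txt", method = "curl")
df <- read.table("datos_treemap.txt", header = TRUE)
class(df)  # [1] "data.frame"
View(df)   # open the data frame in the data viewer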
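
A variation on the same idea, as a minimal sketch: download to a temporary file and read it from there, so nothing depends on a fixed path. The helper name read_remote_table and its method argument are hypothetical and not part of any package. Whether "curl" is available depends on your system, so the default here is "auto".

```r
# Hypothetical helper (not from any package): download a delimited text
# file to a temporary location, then parse it with read.table.
read_remote_table <- function(url, method = "auto", ...) {
  tmp <- tempfile(fileext = ".txt")
  on.exit(unlink(tmp), add = TRUE)  # remove the temporary file on exit
  status <- download.file(url, destfile = tmp, method = method, quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  read.table(tmp, ...)              # extra arguments go to read.table
}

# Usage (assumes the URL from the question is still reachable):
# df <- read_remote_table("https://datanalytics.com/uploads/datos_treemap.txt",
#                         method = "curl", header = TRUE)
```

Extra arguments such as header or sep are passed straight through to read.table.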

    
answered by 11.11.2017 / 18:38
0

Ana, you do not need to download the file with download.file, since read.table can receive and process the URL directly. From the help for read.table:

file can also be a complete URL. (For the supported URL schemes, see the 'URLs' section of the help for url.)

For example:

b = read.table("https://datanalytics.com/uploads/datos_treemap.txt", sep="\t", header = TRUE, nrows=3)

I also changed the sep argument: yours had an extra space after the tab ("\t "), and sep accepts only a single character.
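
Two related details can be checked without any network access. This is only a small sketch with made-up data (the sample string is invented for illustration and is not the real file). It assumes you already hold the file contents in a string, as getURL returns them: in that case the string goes in the text argument rather than file, and a two-character sep is rejected.

```r
# Made-up tab-separated content standing in for a downloaded string
txt <- "name\tvalue\nA\t1\nB\t2\nC\t3\n"

# Contents held in a string go in `text`, not `file`
b <- read.table(text = txt, sep = "\t", header = TRUE, nrows = 3)
str(b)

# A separator longer than one character ("\t " is tab + space) is an error;
# tryCatch captures the condition instead of stopping the script
bad <- tryCatch(
  read.table(text = txt, sep = "\t ", header = TRUE),
  error = function(e) e
)
inherits(bad, "error")
```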

    
answered by 09.11.2017 at 15:47