How do I find a tag and print its content in Ruby?


I need to write a program that reads web pages and, along the way, collects the link tags (`<a href ...>`) and copies what they contain. It is written in Ruby:

  require 'net/http'
  pagina = Net::HTTP.get(ARGV[0],ARGV[1])
  puts "...Leyendo la pagina....\n#{pagina}\n=========\n"

  ahref = "<a href"
  cierra ="</a>"
  startahref = pagina.gsub(ahref)
  cierraahref = pagina.gsub(cierra)
  img = "<img"
  cierra2 = ">"
  startimg =pagina.gsub(img)
  cierraimg = pagina.gsub(cierra2)

  puts startahref

  puts "<a href...</a> cantidad : "
  ahref2 = startahref..cierraahref
  puts pagina[ahref2]

  puts "<img ...> cantidad :"
  img2 = startimg..cierraimg
  puts pagina[img2]
    
asked by Felipe Olaya Ospina 05.11.2017 at 20:56

1 answer


I would not recommend parsing HTML, XML, or similar markup by hand (unless it is for educational purposes); instead, you should use a library built for that task, for example the Nokogiri gem.

For example, to obtain links using Nokogiri, you could do the following:

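  require 'nokogiri'
  require 'net/http'

  # ARGV[0] is the host (e.g. example.com) and ARGV[1] is the path (e.g. /)
  pagina  = Net::HTTP.get(ARGV[0], ARGV[1])

  # Select every <a> element that has an href attribute and keep only its value
  enlaces = Nokogiri::HTML(pagina).xpath('//a[@href]').map { |link| link['href'] }

  puts enlaces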

This gives you an array of the links (i.e. the value of the href attribute of each a tag), not the content of the tag itself.

    
answered by 06.11.2017 / 00:00