Python HTTPConnectionPool Failed to establish a new connection: [Errno 11004] getaddrinfo failed


I was trying to follow a scraping tutorial on grupoothis, and the following error arises after a couple of queries.

The error in question:

Traceback (most recent call last):
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\connection.py", line 141, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\util\connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\connectionpool.py", line 346, in _make_request
    self._validate_conn(conn)
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\connectionpool.py", line 850, in _validate_conn
    conn.connect()
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\connection.py", line 284, in connect
    conn = self._new_conn()
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\connection.py", line 150, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x000001C9E10469E8>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\adapters.py", line 440, in send
    timeout=timeout
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\urllib3\util\retry.py", line 388, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='en.wikipedia.orghttps', port=443): Max retries exceeded with url: //calcpad.blog/the-calcpad-language (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x000001C9E10469E8>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Sebastián\Desktop\Machine learning\python\scraping_programming_lenguaje.py", line 116, in <module>
    main()
  File "C:\Users\Sebastián\Desktop\Machine learning\python\scraping_programming_lenguaje.py", line 106, in main
    e = edges()
  File "C:\Users\Sebastián\Desktop\Machine learning\python\scraping_programming_lenguaje.py", line 38, in edges
    aux = requests.get(URL, timeout=6)
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 72, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\Sebastián\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\adapters.py", line 508, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='en.wikipedia.orghttps', port=443): Max retries exceeded with url: //calcpad.blog/the-calcpad-language (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x000001C9E10469E8>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed',))

Here is the code:

import time

import requests
from bs4 import BeautifulSoup


def edges():
    # langsWikipedia() is not shown here; it is iterated as (lang, URL) pairs.
    print("Entering edges()")
    time.sleep(0.8)    # pause 0.8 seconds

    res = []
    for lang, URL in langsWikipedia():
        html = requests.get(URL, timeout=6).text
        soup = BeautifulSoup(html, 'html.parser')
        table = soup.find('table', class_='infobox')
        if table:
            tr = table.find_all('tr')
            for i in range(len(tr)):
                # The row after the 'Influenced' header holds the linked languages.
                if tr[i].th and tr[i].th.text == 'Influenced':
                    for a in tr[i + 1].td.find_all('a', recursive=False):
                        res.append((lang, a.text))
                    break

    print("Leaving edges()")
    time.sleep(0.8)    # pause 0.8 seconds
    return list(set(res))    # drop duplicate (lang, name) pairs

I think the error occurs at the line html = requests.get(URL, timeout=6).text, either because I am making too many requests or because of some misconfiguration of my ports. I do not remember changing them, and I have already tried disabling the antivirus and the firewall.

    
asked by Sebatian rojas cortez 17.06.2018 at 20:30

1 answer


The problem was that one of the URLs returned by the query was malformed, so I worked around it by skipping that URL with an if. After that there was a second problem, which I solved by increasing the timeout from 6 to 12.
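A minimal sketch of that kind of check, using only the standard library. The host in the traceback, 'en.wikipedia.orghttps', suggests that the Wikipedia base URL was prepended to a link that was already absolute, but that is an inference from the error message, not something confirmed by the code above. The names BASE, build_url and is_wikipedia_url are hypothetical and not part of the original script.

```python
from urllib.parse import urljoin, urlparse

# Assumed base URL, inferred from the host shown in the traceback.
BASE = "https://en.wikipedia.org"


def build_url(href, base=BASE):
    """Resolve a scraped href against the base instead of concatenating strings."""
    return urljoin(base, href)


def is_wikipedia_url(url, allowed_host="en.wikipedia.org"):
    """Return True only for http(s) URLs whose host is exactly the expected one."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and parts.hostname == allowed_host


# Plain concatenation of the base and an absolute link gives the bad host
# that getaddrinfo could not resolve.
bad = BASE + "https://calcpad.blog/the-calcpad-language"
print(urlparse(bad).hostname)
print(is_wikipedia_url(bad))

# urljoin leaves absolute links alone and only prefixes relative ones.
good = build_url("/wiki/Python_(programming_language)")
print(good)
print(is_wikipedia_url(good))
```

Inside the loop of edges(), a line such as `if not is_wikipedia_url(URL): continue` placed before the requests.get call would skip the bad entry, and the call itself would become requests.get(URL, timeout=12).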

    
answered by 17.06.2018 at 23:45