I have a Python 'SimpleHTTPServer' server running on my machine. It works as it should, but there is a problem on the client side.
A problem shows up when the client tries to read a directory that has an accented character in its name.
I had not run into this before because I have always used English on my machines. I recently did a fresh install based on Arch Linux, and the installer set the system language according to the time zone, which is when this problem appeared.
A typical folder name in English would be: Videos
But in Spanish (Latin American locale) it would be: Vídeos
This is an example of what the client receives with:
WebServerResponse = urllib.urlopen(Url).read()
Output:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><html>
<body>
<h2>Directory listing for /home/user/</h2>
<hr>
<ul>
<li><a href="Descargas/">Descargas/</a>
<li><a href="Documentos/">Documentos/</a>
<li><a href="Escritorio/">Escritorio/</a>
<li><a href="Im%C3%A1genes/">Imágenes/</a>
<li><a href="Matrix.txt">Matrix.txt</a>
<li><a href="M%C3%BAsica/">Música/</a>
<li><a href="Plantillas/">Plantillas/</a>
<li><a href="P%C3%BAblico/">Público/</a>
<li><a href="V%C3%ADdeos/">Vídeos/</a>
</ul>
<hr>
</body>
As you can see, the folder Vídeos has the value V%C3%ADdeos in its href.
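For reference, that value looks like the folder name percent-encoded as UTF-8: %C3%AD would be the two-byte UTF-8 sequence for í. A minimal sketch of the decoding, assuming the server really does encode names as UTF-8 (the listing above suggests so, but that is an assumption); the import fallback is only there so the snippet also runs on Python 3:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

try:
    from urllib import unquote                              # Python 2: returns a byte string
except ImportError:
    from urllib.parse import unquote_to_bytes as unquote    # Python 3 counterpart that returns bytes

encoded = 'V%C3%ADdeos'
raw = unquote(encoded)        # UTF-8 bytes: the two bytes 0xC3 0xAD stand for one character
name = raw.decode('utf-8')    # unicode text "Vídeos"

print(len(raw), len(name))    # 7 bytes, but only 6 characters
```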
If I run URLReturn = urllib.urlopen(RemoteDevice).read() and then call urllib.unquote(URLReturn), each character gets its correct value. The problem is that if I then split the result into pieces with split('\n'), it appears to be re-encoded, but this time the characters are replaced with others.
For example:
'href="V\xc3\xaddeos/">V\xc3\xaddeos/</a>'
Locale: es_PR.UTF-8
What should I do to change this behavior?
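One thing that may be worth checking first: in Python 2, read() and urllib.unquote() return byte strings, and displaying a list (which is what split returns) shows the repr() of each element, where non-ASCII bytes appear as \x escapes. The sketch below uses a shortened, hard-coded stand-in for the unquoted response instead of a real request, and the variable names are only illustrative. It suggests that split leaves the bytes untouched and that decoding an element as UTF-8 gives back the accented text:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Hard-coded stand-in for urllib.unquote(urllib.urlopen(Url).read()): a byte string.
response = (b'<li><a href="M\xc3\xbasica/">M\xc3\xbasica/</a>\n'
            b'<li><a href="V\xc3\xaddeos/">V\xc3\xaddeos/</a>')

lines = response.split(b'\n')

# Joining the pieces back gives exactly the original bytes: nothing was re-encoded.
assert b'\n'.join(lines) == response

# Printing the list shows repr() of each item, hence the \xc3\xad escapes.
print(lines)

# Decoding a single item turns the UTF-8 bytes into unicode text.
text = lines[1].decode('utf-8')
print(text)
```

If that holds for the real response too, the \xc3\xad sequences would only be a display form of the same bytes rather than a second encoding step.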
Edit: This is the server part
def HTTPServerStart(Secure=False):
    # Generate Certificate
    # sudo openssl req -new -x509 -keyout /etc/ssl/certs/LocalHTTPSSever.pem -out /etc/ssl/certs/LocalHTTPSSever.pem -days 365 -nodes
    # https://letsencrypt.org/
    import socket  # needed for the socket.error handler below
    if Secure:
        import BaseHTTPServer, ssl
        ServerType = "HTTPS"
        print "Local Secure HTTP Server Enabled"
        print "You May Need To Add A Certificate Exception In Your Browser To Access The Server"
    else:
        import SocketServer
        ServerType = "HTTP"
    import SimpleHTTPServer, os
    Port = 8000
    if not Secure:
        Handler = SimpleHTTPServer.SimpleHTTPRequestHandler
        Handler.extensions_map.update({'.webapp': 'application/x-web-app-manifest+json'})
    # Serve the user's home directory
    ServerPath = os.environ['HOME']
    os.chdir(ServerPath)
    try:
        if Secure:
            try:
                httpd = BaseHTTPServer.HTTPServer(("", Port), SimpleHTTPServer.SimpleHTTPRequestHandler)
                httpd.socket = ssl.wrap_socket(httpd.socket, certfile='/etc/ssl/certs/LocalHTTPSSever.pem', server_side=True)
            except ssl.SSLError:
                print "Error With SSL Certificate. Maybe You Will Need To Generate Another One"
                return False  # httpd is not usable at this point
        else:
            httpd = SocketServer.TCPServer(("", Port), Handler)
    except socket.error:
        print "%s Server Already Running On Selected Port %s" % (ServerType, Port)
        return False
    # Address() is a helper defined elsewhere (not shown here)
    print "Serving %s Server On Address %s://%s:%s/" % (ServerType, ServerType.lower(), Address()['NetworkIP'], Port)
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        return "Exit"
Edit 2: System information
uname -a
Linux User 4.10.8-1-ARCH #1 SMP PREEMPT Fri Mar 31 16:50:19 CEST 2017 x86_64 GNU/Linux
cat /etc/arch-release
Antergos Linux release 17.4 (ISO-Rolling)
cat /etc/os-release
NAME="Antergos Linux"
VERSION="17.4-ISO-Rolling"
ID="antergos"
ID_LIKE="arch"
PRETTY_NAME="Antergos Linux"
env
LC_COLLATE=en_PR.UTF-8
LANG=es_PR.UTF-8
GDMSESSION=xfce
TERM=xterm-256color
SHELL=/bin/bash
During installation, Puerto Rico was selected as both the country and the time zone.
The desktop environment is: XFCE
Edit 3:
HTTP Client
import urllib
WebServerResponse = urllib.urlopen(RemoteFile).read()
So far everything works more or less as expected; this is the HTML output I showed above.
If I then use WebServerResponse.split()
the result contains 'href="V%C3%ADdeos/">V\xc3\xaddeos/</a>'
But if I run urllib.unquote(WebServerResponse).split()
the result contains 'href="V\xc3\xaddeos/">V\xc3\xaddeos/</a>'
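If the goal is to get the entry names out of the listing, one possible approach is to pull out the href values, percent-decode each one, and decode the result as UTF-8 so the names become unicode text. This is only a sketch under assumptions: list_entries is a hypothetical helper, the regular expression assumes the simple markup shown in the listing above, and the sample response is hard-coded instead of fetched with urllib.urlopen:

```python
# -*- coding: utf-8 -*-
import re

try:
    from urllib import unquote                              # Python 2: returns a byte string
except ImportError:
    from urllib.parse import unquote_to_bytes as unquote    # Python 3 counterpart that returns bytes

# Matches the value of each href attribute; assumes double-quoted attributes.
HREF = re.compile(r'href="([^"]*)"')

def list_entries(html):
    """Return the entry names of a directory listing as unicode text.

    `html` is the page as a native str, which is what
    urllib.urlopen(...).read() returns on Python 2.
    """
    return [unquote(href).decode('utf-8') for href in HREF.findall(html)]

# Hard-coded sample with the same shape as the listing above.
sample = ('<li><a href="Descargas/">Descargas/</a>\n'
          '<li><a href="V%C3%ADdeos/">Vídeos/</a>')
entries = list_entries(sample)
```

Working from the href rather than the link text keeps the input pure ASCII until the final decode, so the names come out as unicode strings rather than raw bytes.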