What encoding when using the "Add by file path" API?
Posted: March 8th, 2012, 4:19 am
by Usenet
I have some issues when trying to use the "Add by file path" api.
The documentation states:
Code:
api?mode=addlocalfile&name=full/local/path/to/file.ext
Adding the path "C:\home\users\Örn\the.nzb", encoded as UTF-8 and then urlencoded,
Code:
api?mode=addlocalfile&name=C%3A%5Chome%5Cusers%5C%C3%96rn%5Cthe.nzb
Fails with a "no file exists" error.
However, if I use the urlencoded Unicode code point for Ö (%D6, which is also its Latin-1 byte), it works.
Code:
api?mode=addlocalfile&name=C%3A%5Chome%5Cusers%5C%D6rn%5Cthe.nzb
Is this the intended behavior?
Any Python snippet is appreciated.
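For reference, the two request strings above differ only in which byte encoding is applied to Ö before percent-encoding. A minimal sketch that reproduces both forms, assuming Python 3's urllib.parse (the thread itself uses Python 2's urllib, where the path would be encoded to bytes first); the variable names are illustrative only:

```python
from urllib.parse import quote_plus

# The path from the post, written as a Python string literal.
path = "C:\\home\\users\\Örn\\the.nzb"

# Ö becomes two bytes in UTF-8 (0xC3 0x96) but one byte in Latin-1 (0xD6).
utf8_form = quote_plus(path, encoding="utf-8")
latin1_form = quote_plus(path, encoding="latin-1")

print(utf8_form)    # the form that failed with "no file exists"
print(latin1_form)  # the form that worked
```

Whether the server accepts one form or the other depends on how it decodes the bytes, which this snippet does not show.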
Re: What encoding when using the "Add by file path" API?
Posted: March 8th, 2012, 6:13 am
by shypike
The calling program is supposed to add headers specifying the encoding.
What does your Python snippet look like?
Re: What encoding when using the "Add by file path" API?
Posted: March 8th, 2012, 6:21 am
by Usenet
shypike wrote: The calling program is supposed to add headers specifying the encoding.
Aha. I use this:
Code:
import urllib2

def _sabResponse(self, url):
    # Fetch the API url and reduce the reply to 'ok' or the raw response text.
    try:
        req = urllib2.Request(url)
        response = urllib2.urlopen(req)
    except Exception:
        responseMessage = "unable to load url: " + url
    else:
        log = response.read()
        response.close()
        # Any reply containing "ok" is treated as success.
        if "ok" in log:
            responseMessage = 'ok'
        else:
            responseMessage = log
    return responseMessage
Re: What encoding when using the "Add by file path" API?
Posted: March 8th, 2012, 6:31 am
by shypike
If you send UTF-8 to urllib2, it won't know that it is UTF-8 (it might consider it to be Latin-1).
When you send it Unicode, it will know and convert that to UTF-8 and send the proper headers along.
Re: What encoding when using the "Add by file path" API?
Posted: March 8th, 2012, 7:42 am
by Usenet
In this case I build the url by
Code:
url = self.baseurl + "mode=addlocalfile&name=" + urllib.quote_plus(local_file_name.encode('utf-8'))
since urllib.quote_plus doesn't like unicode characters.
How would I url encode the url before sending it to urllib2?
BTW, thanks for the help!
Re: What encoding when using the "Add by file path" API?
Posted: March 8th, 2012, 10:55 am
by shypike
The problem is that quote_plus doesn't understand UTF-8 very well.
You should encode in Latin-1; it maps closely enough to Unicode to work in this case,
at least when Latin-1 covers your needs.
When you get byte string values from system calls, assume that they are Latin-1.
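A minimal sketch of the Latin-1 approach suggested here, assuming Python 3's urllib.parse rather than the Python 2 urllib used earlier in the thread. The function name build_addlocalfile_url and the base URL in the usage line are hypothetical; only the "mode=addlocalfile&name=" query string comes from the thread. Since Latin-1 only helps when it covers the characters in the path, the sketch raises an error instead of silently mangling anything outside that range:

```python
from urllib.parse import quote_plus


def build_addlocalfile_url(base_url, local_path):
    """Append an addlocalfile query whose path is percent-encoded as Latin-1.

    base_url is assumed to already end in '?' or '&', as self.baseurl
    appears to in the thread.
    """
    try:
        # errors="strict" makes characters outside Latin-1 raise instead of
        # being replaced or dropped.
        encoded = quote_plus(local_path, encoding="latin-1", errors="strict")
    except UnicodeEncodeError as exc:
        raise ValueError(
            "path has characters outside Latin-1: %r" % local_path
        ) from exc
    return base_url + "mode=addlocalfile&name=" + encoded


# Hypothetical base URL, for illustration only.
print(build_addlocalfile_url("http://localhost:8080/api?",
                             "C:\\home\\users\\Örn\\the.nzb"))
```

In the Python 2 code from the thread, the analogous change would be to pass local_file_name.encode('latin-1') to urllib.quote_plus instead of the UTF-8 bytes.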
Re: What encoding when using the "Add by file path" API?
Posted: March 8th, 2012, 11:38 am
by Usenet
Re: What encoding when using the "Add by file path" API?
Posted: March 8th, 2012, 1:08 pm
by shypike
Not exactly.
The Python function will only accept 8-bit byte strings, with the implicit
assumption that they are Latin-1.
Re: What encoding when using the "Add by file path" API?
Posted: March 8th, 2012, 3:11 pm
by Usenet
Aha, thanks, *sigh* I really hate these different encodings. I'll dig further.
Re: What encoding when using the "Add by file path" API?
Posted: March 8th, 2012, 4:32 pm
by Usenet
After reading a bit more, it seems the standard for a GET request is somewhat vague. The encoding specified in the urllib2 request is only for the response.
Anyway, encoding to latin-1 does the trick.
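One way to see why the choice matters: percent-decoding a query value only yields bytes, and the receiver still has to pick a character encoding for them. The sketch below assumes, purely for illustration, a receiver that decodes those bytes as Latin-1; that the server in this thread does so is an inference from the behaviour reported above, not something confirmed here. It uses Python 3's urllib.parse, and decode_as_latin1 is a hypothetical helper:

```python
from urllib.parse import unquote_to_bytes

original = "C:\\home\\users\\Örn\\the.nzb"

# The two query values from the first post.
utf8_query = "C%3A%5Chome%5Cusers%5C%C3%96rn%5Cthe.nzb"
latin1_query = "C%3A%5Chome%5Cusers%5C%D6rn%5Cthe.nzb"


def decode_as_latin1(query_value):
    # Percent-decoding gives raw bytes; the assumed receiver reads them as Latin-1.
    return unquote_to_bytes(query_value).decode("latin-1")


print(decode_as_latin1(latin1_query) == original)  # True
print(decode_as_latin1(utf8_query) == original)    # False
```

Under that assumption the UTF-8 bytes 0xC3 0x96 come out as two separate characters, so the resulting path names a file that does not exist.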