Sabnzbd cannot handle non ASCII characters on ASCII file

Posted: June 12th, 2024, 4:36 am
by abu3safeer
Some indexers, and even ngPost, may produce NZB files that are nominally ASCII but contain special characters or other Unicode letters. Sabnzbd then rejects such a file with a "not well-formed" error message.

I fixed this with a Python script, but if a Python script can do it, then Sabnzbd could do it better: auto-detect the file encoding and try to parse the file as UTF-8.

This is the script I use to fix those "not well-formed" nzb files.

Code:

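import pathlib

# 'ansi' is a codec alias that exists only on Windows (the system code page).
# On other systems, set an explicit code page here, e.g. 'cp1252' or 'cp1256'.
SOURCE_ENCODING = 'ansi'

# Collect every .nzb file below the current directory.
fileslist = [item for item in pathlib.Path().rglob('*.nzb') if item.is_file()]
print(fileslist)

for file in fileslist:
    try:
        raw = file.read_bytes()
        try:
            # Already valid UTF-8 (plain ASCII included): leave the file alone,
            # otherwise re-encoding would garble its non-ASCII characters.
            raw.decode('utf8')
            continue
        except UnicodeDecodeError:
            pass
        text = raw.decode(SOURCE_ENCODING)
        file.write_bytes(text.encode('utf8'))
    except (OSError, UnicodeError, LookupError) as exc:
        # Report the failure instead of silently ignoring it.
        print(f'Skipped {file}: {exc}')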

Re: Sabnzbd cannot handle non ASCII characters on ASCII file

Posted: June 12th, 2024, 5:06 am
by safihre
Could you share such an NZB with me at [email protected]?

How do you upload the file to Sab?

Re: Sabnzbd cannot handle non ASCII characters on ASCII file

Posted: June 12th, 2024, 6:31 am
by abu3safeer
I have sent an email with the needed nzbs.

Re: Sabnzbd cannot handle non ASCII characters on ASCII file

Posted: June 15th, 2024, 2:27 pm
by safihre
The problem is that the file contains a byte sequence that is not valid UTF-8, so the parser simply can't handle it.
If we opened files in ANSI mode instead, other files that do have valid UTF-8 encoding would fail to process.
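
For anyone following along, both failure modes can be reproduced with a few lines of Python. This is only a sketch: the sample bytes are made up rather than taken from the NZBs in question, the helper name is_well_formed is hypothetical, and cp1252 is assumed as a stand-in for "ANSI" (the real code page depends on the system that wrote the file).

```python
import xml.etree.ElementTree as ET

# Hypothetical NZB-like snippet: 0xE9 is 'é' in cp1252, but a lone 0xE9
# followed by '"' is not a valid UTF-8 sequence.
ansi_bytes = b'<nzb><file subject="caf\xe9"/></nzb>'
utf8_bytes = '<nzb><file subject="café"/></nzb>'.encode('utf-8')


def is_well_formed(data: bytes) -> bool:
    """Return True if the XML parser accepts the raw bytes."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# With no encoding declaration the parser assumes UTF-8, so the
# cp1252 bytes are rejected while the UTF-8 bytes are accepted.
ansi_ok = is_well_formed(ansi_bytes)
utf8_ok = is_well_formed(utf8_bytes)

# Decoding everything as cp1252 never fails here, but it silently
# mangles text that was valid UTF-8: 'é' comes out as 'Ã©'.
mangled = utf8_bytes.decode('cp1252')
```

The second half is the reason a blanket "open as ANSI" does not work: the decode succeeds, so nothing signals that the subject line is now wrong.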

Re: Sabnzbd cannot handle non ASCII characters on ASCII file

Posted: June 17th, 2024, 8:03 am
by abu3safeer
I meant something like this:
Sabnzbd tries to open a file; if that throws a "not well-formed" exception, it opens the file as ASCII, saves it as UTF-8, and then tries to process it normally. This should solve 99% of the issues, since posting tools like ngPost only post in ASCII and UTF-8.
Of course, if the file really is malformed, it will still throw the exception, and this time the error is real, not an encoding issue.

I haven't seen any encoding other than ASCII and UTF-8 in NZB files so far, even in NZBs generated by a variety of indexers.
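
The fallback described in this post could look roughly like the sketch below. It is not Sabnzbd code: the function name parse_nzb_bytes and the FALLBACK_ENCODING constant are hypothetical, cp1252 is only an assumed stand-in for the "ASCII"/ANSI encoding, and it works on bytes already read into memory.

```python
import xml.etree.ElementTree as ET

# Assumption: cp1252 stands in for the "ANSI" code page of the posting system.
FALLBACK_ENCODING = 'cp1252'


def parse_nzb_bytes(data: bytes) -> ET.Element:
    """Parse the bytes as-is; retry once with the fallback encoding."""
    try:
        return ET.fromstring(data)
    except ET.ParseError:
        try:
            data.decode('utf-8')
        except UnicodeDecodeError:
            pass  # not valid UTF-8, so one retry is worthwhile
        else:
            raise  # valid UTF-8: the XML itself is malformed
        # Bytes undefined in the fallback code page become U+FFFD.
        text = data.decode(FALLBACK_ENCODING, errors='replace')
        return ET.fromstring(text)


# Usage with a made-up snippet containing a cp1252 'é' (0xE9).
root = parse_nzb_bytes(b'<nzb><file subject="caf\xe9"/></nzb>')
subject = root[0].get('subject')
```

Files that are valid UTF-8 never reach the fallback, and a file that is truly malformed still raises the original ParseError. The weak point is the guessed code page: a wrong guess parses fine but yields wrong characters.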

Re: Sabnzbd cannot handle non ASCII characters on ASCII file

Posted: June 17th, 2024, 8:19 am
by safihre
At that point in the code we deal with file pointers, which makes such a trick quite hard.
And we haven't really experienced any problems until your example NZBs, so it doesn't seem to be a widespread problem?
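
As a generic illustration of the file-pointer constraint (not Sabnzbd's actual code), a retry needs the data to be readable a second time, and a stream that has already been consumed cannot always be rewound. One common workaround is to read everything into memory first, at the cost of holding the whole file in RAM. The helper name buffer_stream below is hypothetical.

```python
import io
from typing import BinaryIO


def buffer_stream(stream: BinaryIO) -> io.BytesIO:
    """Read the whole stream into memory so it can be parsed more than once."""
    return io.BytesIO(stream.read())


# Usage: the source stream here is itself a BytesIO, purely for illustration.
buffered = buffer_stream(io.BytesIO(b'<nzb/>'))
first_attempt = buffered.read()
buffered.seek(0)  # rewind before a second parsing attempt
second_attempt = buffered.read()
```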

Re: Sabnzbd cannot handle non ASCII characters on ASCII file

Posted: June 18th, 2024, 9:28 am
by abu3safeer
I see. It is not a widespread problem, since it only affects non-Latin letters. I think I'll change the encoding manually until Sabnzbd can handle it somehow.
Thanks for your time.