Fairly Minor Bug with non English Accents

vdown
Release Testers
Posts: 38
Joined: January 30th, 2008, 8:01 am

Post by vdown »

Downloaded a couple of things in the last 2 weeks, one had an umlaut in the rar file name and the other an acute accent.

The file names in the Verbose SAB window aren't displayed correctly; other characters are substituted for the accented ones.

After completion SAB is unable to delete the RARs.

Not a massive problem, just slightly annoying, and I assume not fixable unless SAB supports Unicode.

Sorry if this has already been listed as a bug.
shypike
Administrator
Posts: 19774
Joined: January 18th, 2008, 12:49 pm

Post by shypike »

I have successfully tested in the past with accented chars, so your report surprises me.
Can you email the "offending" NZB file to [email protected] ?

Also, what Operating System do you use?

Post by vdown »

Email sent. I'm on Vista Ultimate with UK language settings.

Post by shypike »

If you got the NZB file from newzbin.com, then they have replaced the special character in "Fr÷ken". Newzbin always returns "clean" 7-bit ASCII names.
The name embedded in the yEnc data contains the somewhat dubious byte value 0xF7, which does not have a fixed character representation.
Python supports Unicode all right, but names in yEnc-encoded articles are not Unicode, just 8-bit characters in an unspecified codepage.
Conversion of these characters is guesswork, depending on the codepage of the poster and of the downloader (which may not match).
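
To make the guesswork concrete, here is a minimal sketch that decodes the same raw name under a few codepages. The list of codepages and the helper name `decode_candidates` are assumptions chosen for illustration; nothing in yEnc says which codepage applies, and this is not how SABnzbd handles it.

```python
# Illustrative only: decode the same raw name under several codepages.
# The codepage list is an assumption; yEnc does not say which one applies.
CANDIDATE_CODEPAGES = ("cp1252", "latin-1", "cp850")


def decode_candidates(raw_name, codepages=CANDIDATE_CODEPAGES):
    """Map each codepage to the text it produces for the given raw bytes."""
    return {cp: raw_name.decode(cp, errors="replace") for cp in codepages}


if __name__ == "__main__":
    # Byte 0xF6 is one possible encoding of the accented character.
    for codepage, text in decode_candidates(b"Fr\xf6ken").items():
        print(codepage, text)
```

With the byte 0xF6, cp1252 and latin-1 give "Fröken" while cp850 gives "Fr÷ken", so the result depends entirely on which codepage the reader happens to pick.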

The fact that you get an error message indicates that there must be some confusion inside SABnzbd about the correct filename. I will look into it further, but it will take some time.

Thanks for your report.

Post by shypike »

BTW there are some more things wrong with the NZB file.
The accented character in the article subject is different from the character in the yEnc encoding.
Also, the character ö in the subject is not proper Unicode. Again, interpreting 8-bit characters is guesswork, and browsers don't really know what to do with them.

Perhaps SABnzbd could deal with this a bit more elegantly.

Post by vdown »

I got one for sure from Newzbin, the other may have been from Binsearch.

Thanks for looking into it. Like I said before, it's something I can live with.

Post by shypike »

I located the problem: it's caused by the "unrar.exe" program that is used for unpacking.
SABnzbd captures the console output of this program and analyses which RAR files have actually been used and can therefore be deleted.
The problem is that unrar shows some non-ASCII characters in filenames incorrectly. This is the reason SABnzbd cannot delete the files.
Actually, the blame is not entirely with unrar; Windows itself has a problem too.
The filename in question is 'Fr\xF6ken', which is displayed in a command window as 'Fröken'.
If you copy this name from the console (Edit->Mark) and use that string to find the file, it won't work.
It seems that Windows itself has some problem selecting a proper codepage to map non-ASCII chars.
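
A rough sketch of why a mis-mapped name blocks the cleanup. The "Extracting from" line format, the sample output, and both function names are assumptions made for illustration (real unrar output may differ), and the ÷ in place of ö is just one example of a mis-mapped character.

```python
import re

# Hypothetical captured console output; the accented character has been
# mis-mapped (here to U+00F7) on its way through the console.
SAMPLE_OUTPUT = (
    "Extracting from Fr\u00f7ken.part1.rar\n"
    "Extracting from Fr\u00f7ken.part2.rar\n"
    "All OK\n"
)


def archives_named_in_output(console_text):
    """Return the archive names found on 'Extracting from ...' lines."""
    return re.findall(r"^Extracting from (.+)$", console_text, flags=re.MULTILINE)


def deletable(console_text, files_on_disk):
    """Keep only the names from the console that match a real file exactly."""
    on_disk = set(files_on_disk)
    return [name for name in archives_named_in_output(console_text) if name in on_disk]
```

If the files on disk are really called "Fröken.part1.rar" and "Fröken.part2.rar", `deletable` returns an empty list: the names read from the console differ by one character, so nothing matches and nothing is deleted.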

OK, this is a fairly technical story. The bottom line is that this problem is very difficult to solve without redesigning the way unpacking works, so that may take a while.
sander
Release Testers
Posts: 9070
Joined: January 22nd, 2008, 2:22 pm

Post by sander »

As a test, I'm downloading the Fröken-post on my Ubuntu Linux system. I got the NZB from binsearch.

If that works, the solution is easy: switch to Ubuntu Linux!  :D

...

And here's the result: everything works without problems with "Fröken" on SABnzbdplus on Ubuntu. So: switch to Ubuntu! ;-)
Last edited by sander on February 4th, 2008, 1:35 pm, edited 1 time in total.

Post by shypike »

Found a solution: remapping the characters that come from unrar seems to work.
I just don't know yet whether this will have side effects.
It's touch and go with these 8-bit characters. Unicode was invented for a reason; sadly, yEnc does not support it.
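
For readers wondering what such a remapping could look like, here is a minimal sketch. It is not the actual SABnzbd fix: the function name `remap_console_name` and the two codepages (cp850 for the console, cp1252 for file names) are assumptions that would differ per system.

```python
def remap_console_name(raw, console_codepage="cp850", filesystem_codepage="cp1252"):
    """Re-encode a raw name from the console codepage to the filesystem codepage.

    Both codepage defaults are assumptions for a Western European Windows setup.
    """
    text = raw.decode(console_codepage)
    # Characters that do not exist in the target codepage become "?".
    return text.encode(filesystem_codepage, errors="replace")
```

Under these assumptions, `remap_console_name(b"Fr\x94ken")` returns `b"Fr\xf6ken"`. One possible side effect is visible in the code itself: any character that has no equivalent in the target codepage is replaced by "?", so the remapped name would still not match the file.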