SABnzbd does not handle UTF-8 characters in filenames.
I live in Denmark, where we have letters like æ, ø and å. I just tried to download "Lærkevej", which causes three problems:
1) The file name displayed in the browser (Plush) is L_rkevej.
2) The file name generated in the file system is Lrkevej.
3) The par2 file references a file called "Lærkevej" (using one of the UTF-8 encodings for æ), which causes par2 to fail.
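To make the three symptoms concrete, here is a small Python sketch of how one name can end up in these forms. It is only an illustration: the helper names are made up, and it is an assumption that the display path replaces each non-ASCII character with an underscore while the file system path drops it. The thread does not show the code that actually does this.

```python
NAME = "Lærkevej"


def replace_non_ascii(name: str, filler: str = "_") -> str:
    """Replace every non-ASCII character with a filler (hypothetical helper)."""
    return "".join(ch if ord(ch) < 128 else filler for ch in name)


def drop_non_ascii(name: str) -> str:
    """Silently drop every non-ASCII character (hypothetical helper)."""
    return name.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(replace_non_ascii(NAME))   # form seen in the browser
    print(drop_non_ascii(NAME))      # form seen in the file system
    print(NAME.encode("utf-8"))      # æ stored as two bytes
    print(NAME.encode("latin-1"))    # æ stored as one byte
```

The last two lines show why the name inside the par2 file matters: the same letter æ is a different byte sequence depending on the encoding the uploader's tool used.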
Special national characters in filenames are not handled
Re: Special national characters in filenames are not handled
Are you adding something via newzbin? Their API converts characters not allowed in HTTP headers into underscores, so it is probably broken before it gets to SABnzbd.
As for the par2, that might be another issue, however I don't have an answer for that myself.
Re: Special national characters in filenames are not handled
There are two par2 programs in the world.
One uses 8bit ASCII and the other UTF-8 and they are mutually incompatible.
More so when the creating and receiving platform differ.
Please email the NZB file to bugs at sabnzbd.org
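The incompatibility between the two encodings can be shown in a few lines of Python. This is a sketch of the general problem only, not of either par2 program; `decode_as` is a made-up helper, and Latin-1 stands in for the unspecified 8-bit encoding.

```python
def decode_as(raw: bytes, encoding: str) -> str:
    """Decode raw filename bytes, reporting failure instead of raising."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return "<undecodable as %s>" % encoding


utf8_bytes = "Lærkevej".encode("utf-8")
latin1_bytes = "Lærkevej".encode("latin-1")

if __name__ == "__main__":
    # A UTF-8 name read by a tool that expects an 8-bit encoding
    # decodes without error but yields the wrong characters.
    print(decode_as(utf8_bytes, "latin-1"))
    # An 8-bit name read by a tool that expects UTF-8 fails outright.
    print(decode_as(latin1_bytes, "utf-8"))
```

Either way, the name the tool looks for on disk no longer matches the name of the downloaded file.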
Re: Special national characters in filenames are not handled
Yes, the NZB was downloaded from newzbin, but the real issue is buried inside the base .par2 file.
I found this suggested workaround on the SourceForge page for par2 (link: http://sourceforge.net/projects/parchiv ... ic/1843329):
------ snip --------
"Unicode issue" as in: "par2 clients don't support the optional unicode parts of the par2 specs and stupidly write 8-bit characters to the ascii (7-bit) 'name of the file' field of the File Description packet"?
Fixing that file set is easy. for ex:
par2 r La\ Bête.par2 La*.rar
The resulting rar files might or might not have an "ê" in the filename, depending on the fact if you and the creator of the par2 set use the same character encoding...
If clients had adhered to the specs and hadn't stored non-ascii characters in what should have been an ascii-only field, the clients would probably have had no other choice but to implement the optional unicode stuff to store non-ascii filenames years ago...
----------- snip --------
I tried the workaround and it does indeed work. Unfortunately, it requires a change in the way libnzbd invokes par2. Or could I do that with the optional par2 parameters?
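As a rough idea of what the change would look like, the quoted command can be built programmatically by listing the candidate files explicitly after the .par2 file. This is a sketch under assumptions: `build_repair_command` is a hypothetical helper, not how SABnzbd invokes par2, and whether the optional par2 parameters can achieve the same thing is not answered here.

```python
import glob
import os


def build_repair_command(directory, par2_name, pattern, par2_exe="par2"):
    """Build 'par2 r <set>.par2 <files...>' with the extra files to scan.

    The extra files let par2 match files by content even when their
    names differ from the names stored inside the par2 set.
    """
    # glob.escape keeps special characters in the directory name literal.
    search = os.path.join(glob.escape(directory), pattern)
    candidates = sorted(glob.glob(search))
    return [par2_exe, "r", os.path.join(directory, par2_name)] + candidates


if __name__ == "__main__":
    cmd = build_repair_command(".", "La Bête.par2", "La*.rar")
    print(cmd)
    # To actually run it (requires par2 to be installed):
    #   import subprocess; subprocess.run(cmd)
```

Passing the command as a list avoids shell quoting, so the space and the "ê" in the file name need no backslash escaping.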
Re: Special national characters in filenames are not handled
This will only be covered in the pending 0.5.0 release.
The binary distributions will be shipped with both variants of par2
and the right par2 program will be picked, based on the encoding of the par2-files.
It works OK in the latest code we have now (after I removed some remaining glitches).
Unfortunately this feature will not work for Linux, because we
only distribute a source package and rely on the installed par2.
I have to look at the suggested work-around, but at first glance
I'm not convinced of its universal usability.
Thanks for reporting this.
You might be interested in signing up for the Release Test program.
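Picking a par2 program based on the encoding of the par2 files could be approached with a heuristic like the one below. It is a guess at the idea, not the 0.5.0 code: the function name and the return labels are invented for illustration.

```python
def guess_par2_variant(name_bytes: bytes) -> str:
    """Guess which par2 variant wrote a stored file name.

    Returns "either" for pure ASCII, "utf8" when the bytes are valid
    UTF-8, and "classic" (8-bit, probably Latin-1) otherwise.
    """
    if all(b < 0x80 for b in name_bytes):
        return "either"          # plain ASCII works with both variants
    try:
        name_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return "classic"         # not valid UTF-8, so assume 8-bit
    return "utf8"


if __name__ == "__main__":
    print(guess_par2_variant(b"plain.rar"))
    print(guess_par2_variant("Lærkevej".encode("utf-8")))
    print(guess_par2_variant("Lærkevej".encode("latin-1")))
```

This can only ever be an educated guess: some 8-bit names happen to be valid UTF-8 as well, and the check says nothing about which 8-bit encoding was used.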
Re: Special national characters in filenames are not handled
In regards to par2 on linux, you may want to look at this site http://www.chuchusoft.com/par2_tbb/index.html
I've had this running on my system for 1.5 days now, and it seems to work great.
Re: Special national characters in filenames are not handled
We distribute chuchusoft's par2 with the Windows and OSX binaries.
Chuchusoft is partially responsible for the current mess.
The "classic" par2 uses 8bit ASCII with unspecified encoding (usually Latin-1).
Chuchusoft decided to use UTF-8 instead without any attempt to
identify the new format and without bothering with backward compatibility.
This is why we need to look inside par2 files to make an educated guess
about which par2 version to use. This also means we do not have a
proper solution for Linux where we depend on the par2 that happens to be installed.
For OSX we have another problem. When an upload has been created with
classic par2 on Windows, there's no working solution on OSX.
Both par2 variants will ask OSX about files in Latin-1 format, that OSX doesn't know.
If only chuchusoft's par2 had the intelligence to translate Latin-1 to UTF-8,
the issue would be resolved.
We would gladly modify chuchusoft's code, if compiling it wasn't a form of black magic.
Having said that, I'll look into the possibilities of the "work-around".
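For anyone curious what "looking inside par2 files" involves, here is a minimal Python sketch that pulls the raw file names out of the File Description packets and shows the Latin-1 to UTF-8 translation mentioned above. The packet layout (64-byte header, name starting 56 bytes into the body, null padded) follows the PAR2 specification as I understand it; the function names are made up, and this is not SABnzbd's or chuchusoft's code.

```python
import struct

MAGIC = b"PAR2\x00PKT"               # starts every packet
FILEDESC = b"PAR 2.0\x00FileDesc"    # 16-byte type of File Description packets
HEADER_LEN = 64                      # magic 8 + length 8 + hash 16 + set id 16 + type 16
NAME_OFFSET = 56                     # file id 16 + two hashes 32 + file length 8


def file_names(data: bytes):
    """Return the raw (undecoded) names stored in File Description packets."""
    names = []
    pos = data.find(MAGIC)
    while pos != -1 and pos + HEADER_LEN <= len(data):
        (length,) = struct.unpack_from("<Q", data, pos + 8)
        if length < HEADER_LEN or pos + length > len(data):
            pos = data.find(MAGIC, pos + 1)   # damaged packet, resynchronise
            continue
        if data[pos + 48:pos + 64] == FILEDESC:
            body = data[pos + HEADER_LEN:pos + length]
            names.append(body[NAME_OFFSET:].rstrip(b"\x00"))
        pos = data.find(MAGIC, pos + length)
    return names


def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode a name, assuming (not knowing) that it is Latin-1."""
    return raw.decode("latin-1").encode("utf-8")


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as handle:
        for raw_name in file_names(handle.read()):
            print(raw_name)
```

The sketch does not verify packet hashes, so it is suitable for inspecting names but not for judging whether a par2 file is intact.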
Re: Special national characters in filenames are not handled
shypike wrote: We distribute chuchusoft's par2 with the Windows and OSX binaries.
Chuchusoft is partially responsible for the current mess.
Chuchusoft par2 has been removed from OSX binaries since 0.4.8 (I think)...
Re: Special national characters in filenames are not handled
I stand corrected.