Page 1 of 3
060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 4:20 am
by sander
Strange/annoying behaviour in 060b3 (but probably not b3 related):
It looks one faulty (=unreachable) newsprovider is spoiling the other connections, resulting in no download at first, then a fast download, and then a slow download. And strange info in
http://localhost:8080/sabnzbd/connections/
Facts:
- multiple providers configured
- one server (reader.ipv6.xsnews.nl) is unreachable
- as said: SAB does not download at all, then switches between fast and slow downloads
- on stdout, SAB is declaring other servers unreachable (which is not true; tested by hand)
-
http://localhost:8080/sabnzbd/connections/ is showing strange info too: Per server I see "Connections: Timed out" and then a list of connections for that server. Strange, isn't it?
Hypothesis:
- one faulty server is spoiling other servers
- SAB is mistakenly reporting other servers as bad, causing no (or slow) downloads
- SAB is focussing too much on a bad server.
Re: 060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 4:26 am
by shypike
This is already under investigation.
It seems that a server failing to connect at all holds up the downloader.
Servers that are just late at answering article queries are not an issue.
It also seems Unix specific, Windows is not hindered by this.
Re: 060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 4:38 am
by sander
shypike wrote:
This is already under investigation.
It seems that a server failing to connect at all holds up the downloader.
Servers that are just late at answering article queries are not an issue.
It also seems Unix specific, Windows is not hindered by this.
So a known bug. Is there a list of known bugs in the release notes?
Is it 060 related? I can vaguely remember the spoiling-problem from 0.5.x too.
I'm indeed on Linux.
BTW: I tried lowering the Timeout to 5 seconds (trying to make detection of unreachable servers better), but the WebGUI automatically puts all settings below 30 seconds back to 30. Why is the lower limit set to 30 seconds? I would say a normal, healthy server on a healty connection responds within a second, so a 5 second timeout would be enough?
Re: 060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 7:58 am
by shypike
Known bug is a big word.
It's only since the beginning of the week that another user
presented a reproducible scenario that made me suspicious.
It could be that it's an old error (as in present in older versions too).
The timeout in the server setup has nothing to do with this.
That timeout states how long you want wait for a NNTP level answer from the server.
Usenet providers frown upon short timeouts (most tell you to use 60sec as a minimum).
The problem you describe is the timeout for making the initial socket
connection to the server. This has nothing to do with the other timeout.
Re: 060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 9:16 am
by jcfp
shypike wrote:
It could be that it's an old error (as in present in older versions too).
Sure is, see:
http://forums.sabnzbd.org/index.php?topic=510.15
http://forums.sabnzbd.org/index.php?topic=1179.0
Re: 060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 9:37 am
by shypike
Both are different problems.
Re: 060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 11:24 am
by sander
shypike wrote:
Both are different problems.
In what way?
http://forums.sabnzbd.org/index.php?topic=1179.0 looks the same or at least similar to me. But ... you're the guru, shypike.
Re: 060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 12:25 pm
by shypike
Can you do a test for me?
In sabnzbd/newswrapper.py, line 170 reads:
Code: Select all
if sabnzbd.WIN32 and not block:
Thread(target=con, args=(self.sock, self.host, self.port, sslenabled, self)).start()
Can you change this to the following and see if that fixes things?
Code: Select all
if not block:
Thread(target=con, args=(self.sock, self.host, self.port, sslenabled, self)).start()
My initial testing shows that it does make quite a difference.
Re: 060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 12:50 pm
by jcfp
Configured a fake server, example.org/119/2 connections/enabled/optional. With the original code, download is completely blocked while attempting to connect to the fake server. With the suggested change, speed is maxed instantly, the status page on the web interface shows a timeout message for the fake server after a minute or two. Pausing the download half way, restoring the original code, and restarting the application marks the return of the zero download eternal wait state. Certainly looks like you got it.
Re: 060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 1:02 pm
by shypike
Thanks for the feedback.
This is very old code, mostly from the 0.2.5 era.
The Windows and Unix parts are different because Windows lacks non-blocking socket access.
So the programmer assumed that an in-line connect call is OK for Unix.
However, the connect will still block the downloader during the time it takes to make the connection,
despite having non-blocking calls.
If it takes long to make the connection (until the 10 sec timeout), the downloader is effectively stopped.
It looks like we just have to use the "Windows" method always,
which is to fire up a thread to make the connection asynchronously.
Somehow we have overlooked this (in retrospect) obvious problem.
I do remember testing with non-existing local addresses, those timeout quicker.
Also, a server that makes the connection, but only times out on NNTP traffic, is not a problem.
Re: 060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 1:20 pm
by sander
shypike wrote:
Can you do a test for me?
In sabnzbd/newswrapper.py, line 170 reads:
Code: Select all
if sabnzbd.WIN32 and not block:
Thread(target=con, args=(self.sock, self.host, self.port, sslenabled, self)).start()
Can you change this to the following and see if that fixes things?
Code: Select all
if not block:
Thread(target=con, args=(self.sock, self.host, self.port, sslenabled, self)).start()
My initial testing shows that it does make quite a difference.
Bingo! Problem gone!
You're a guru indeed! Impressive.
Re: 060b3: one faulty server spoiling the others?
Posted: March 17th, 2011, 5:09 pm
by shypike
sander wrote:
You're a guru indeed! Impressive.
More a little sad, that I didn't catch this before
Re: 060b3: one faulty server spoiling the others?
Posted: March 18th, 2011, 3:55 am
by sander
Ouch ... with the patched newswrapper.py I get the messages below. (EDIT: this happens with both recent and old posts)
After activating the original, unpatched newswrapper.py, I don't get the error messages.
So: is the patched newswrapper.py/SAB doing something on a non-existing pipe/socket?
Code: Select all
2011-03-18 09:27:11,030::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:27:11,033::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:27:11,033::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:27:11,036::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:27:11,038::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:27:11,041::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:27:11,041::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:27:11,042::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:27:11,043::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:27:11,044::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:33:12,311::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:33:12,312::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:33:12,314::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:33:12,354::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:38:08,467::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 09:38:08,497::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:38:08,498::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:38:08,500::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:38:08,502::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:38:08,504::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:38:08,506::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:38:08,506::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:38:08,507::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:38:08,510::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 09:38:08,511::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
Re: 060b3: one faulty server spoiling the others?
Posted: March 18th, 2011, 5:21 am
by shypike
I didn't say I was finished testing, this was a first check.
Thanks for the info, I did not see this one yet.
Any specific scenario?
Does it happen immediately or after some time?
Re: 060b3: one faulty server spoiling the others?
Posted: March 18th, 2011, 6:00 am
by sander
shypike wrote:
I didn't say I was finished testing, this was a first check.
I know. Was my post rude?
shypike wrote:
Any specific scenario?
Does it happen immediately or after some time?
Strange: I would expect it to happen with a non-functional server enabled, but it even happens with two functional server enabled (and three other servers not enabled)
Code: Select all
11-03-18 11:56:58,923::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:56:58,929::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:56:58,934::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:56:58,939::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:56:58,944::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:56:58,952::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:56:58,972::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:56:58,976::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:56:58,981::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:56:58,982::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:56:58,982::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:56:58,983::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:56:58,983::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:56:59,007::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 11:56:59,022::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 11:56:59,024::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 11:56:59,029::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 11:56:59,375::INFO::[downloader:536] Connecting [email protected]:119 finished
2011-03-18 11:56:59,382::INFO::[downloader:536] Connecting [email protected]:119 finished
2011-03-18 11:57:01,064::INFO::[downloader:536] Connecting [email protected]:119 finished
2011-03-18 11:57:01,164::INFO::[downloader:536] Connecting [email protected]:119 finished
2011-03-18 11:57:01,165::INFO::[downloader:536] Connecting [email protected]:119 finished
2011-03-18 11:57:03,993::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:57:03,994::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:57:03,996::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:57:03,996::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:57:03,998::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:57:03,998::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:57:03,999::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:57:03,999::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:57:04,043::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 11:57:04,044::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 11:57:04,044::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 11:57:04,045::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 11:57:08,998::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:57:09,000::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:57:09,001::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:57:09,002::INFO::[downloader:347] [email protected]:119: Initiating connection
2011-03-18 11:57:09,004::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:57:09,005::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:57:09,005::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:57:09,006::INFO::[downloader:607] Thread [email protected]:119: server closed connection
2011-03-18 11:57:09,066::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 11:57:09,067::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 11:57:09,067::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119
2011-03-18 11:57:09,067::INFO::[newswrapper:214] Failed to connect: (110, '[Errno 9] Bad file descriptor') [email protected]:119