Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 14th, 2011, 1:12 am
by Effenberg0x0
Hi, I'm not sure whether what I'm about to describe is actually a problem, new normal behavior in the latest SABnzbd beta, or something that has always been like this, so excuse me. I have never seen this happen before.
1) I only noticed because the system e-mailed root to say CPU and RAM were exhausted. I ran top and saw that sabnzbd and par2 were hogging 100% of CPU and RAM. In Plush, my history showed about 120 jobs with the spinner icon (!?) at the same time. Quite frankly, I'm not sure, but I always thought it processed one completed download at a time?
The oldest one was stuck at repairing. Par2 isn't too harsh on my system; normally it repairs a large download in less than 3 minutes. After 15 minutes of waiting in vain, I killed par2. After that, sabnzbd started behaving unpredictably. It couldn't use par2 and rar to process the newer completed downloads that were spinning in the Plush history. Those jobs kept spinning even though there was no disk access to incomplete/DownloadName/*.
New downloads continued normally and were piled into the history, but the history was stuck for good. When I restarted the service, it seemed no files were corrupted, nor had I lost anything. But no post-processing was attempted at all. Also, when I ran par2 by hand on the job that supposedly caused all the trouble, I repaired it easily in less than 1 minute. All the right blocks were there, no real issue.
So, I can't understand what really happened here.
2) It happened again later. This time, the oldest job was stuck on "Fetching: Fetching 2 blocks" for 2 hours while 50 other jobs accumulated on top of it. Memory and CPU were again entirely eaten by the sabnzbd process. I had never noticed this fetching message. Is it actually trying to download extra par blocks? If so, this is either a new feature or something I never noticed sab was capable of.
Hope I explained it well enough (not a native English speaker).
Regards,
Effenberg
Re: Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 14th, 2011, 1:40 am
by Effenberg0x0
OK, so I Googled a little and found out SABnzbd actually does try to fetch more blocks when a repair fails. This is news to me. I don't know why it never happened/worked here. I have another post on this same forum asking whether there was a way to automatically download more par2 files when needed... I'm lame.
Looking further showed me that, in both occurrences I mentioned, the issue is related to this fetching process. For some reason it gets stuck between the par2 repair and fetching more blocks while many more jobs pile up in the post-processing queue. For some reason, it then starts to eat CPU and memory to the point where the system becomes unusable. I believe that although Plush says these jobs are being post-processed, they actually are not.
Regards,
Effenberg
Re: Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 14th, 2011, 1:47 am
by rascalli
There is a setting in sabnzbd+ to stop downloading while files are being post-processed... maybe that is an option?
And indeed, when there are files missing, sabnzbd+ tries to get more blocks.
Re: Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 14th, 2011, 2:05 am
by Effenberg0x0
rascalli wrote:
There is a setting in sabnzbd+ to stop downloading while files are being post-processed... maybe that is an option?
Yes, it has always been unset, so sabnzbd+ could download during post-process if it wanted to. Checking it seems unproductive (idle bandwidth, cpu, etc). All the server is supposed to do is download stuff. I'll try to go without it anyway.
But whether or not it stalls the system, man, I swear to God I have never seen the fetching message. Until now I was manually searching for and downloading extra par2 files. Look:
http://forums.sabnzbd.org/index.php?topic=6165.0
History would just say "Repair Failed". I can't explain why it was off somehow. It seems there's no setup option to disable this behavior, right?
Re: Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 14th, 2011, 5:15 am
by shypike
A hanging par2 is a possibility. It's only software, so not 100% error free.
Of course it can also indicate hardware problems or just a lack of available memory.
SABnzbd has always downloaded the minimum of par2 files.
When the repair fails, it will download more par2 files and try again.
This behavior is always on, except when you choose not to verify and unpack at all.
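
The repair-then-fetch cycle described above can be pictured roughly like this. It is only a sketch of the idea, not SABnzbd's actual code: `repair_with_retries`, the `try_repair` callback and the block counts are made-up names and numbers for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def repair_with_retries(
    try_repair: Callable[[int], bool],
    extra_volumes: Iterable[int],
) -> Tuple[bool, List[int]]:
    """Attempt a repair, fetching one more par2 volume after each failure.

    try_repair(blocks) is a hypothetical callback that returns True when a
    repair succeeds with `blocks` recovery blocks on disk.
    extra_volumes lists the block count of each par2 file not yet fetched.
    """
    available = 0  # assumption: no recovery blocks downloaded at first
    fetched: List[int] = []
    if try_repair(available):
        return True, fetched
    for volume_blocks in extra_volumes:
        # Repair failed: fetch the next volume and try again.
        available += volume_blocks
        fetched.append(volume_blocks)
        if try_repair(available):
            return True, fetched
    return False, fetched  # every volume used and still not repairable


# Example: the job needs 2 recovery blocks; volumes hold 1, 2 and 4 blocks.
needs = 2
ok, fetched = repair_with_retries(lambda have: have >= needs, [1, 2, 4])
print(ok, fetched)  # True [1, 2]
```

A job only ends up as failed once every extra volume has been tried.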
Of course, SABnzbd is not error-free.
However, when par2 hangs, this is not caused by SABnzbd itself.
Since post-processing is sequential, the PP queue will be stuck when par2 hangs.
There's no sense in implementing a time out because repair times
are unpredictable and potentially extremely long.
My Atom-based download box occasionally needs half a day to do a repair.
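
To illustrate why one hung par2 holds up everything behind it, here is a small simulation of a single sequential worker. It is a toy model, not SABnzbd code: `run_sequentially`, the job names and the durations are invented, and `math.inf` stands in for a repair that never returns.

```python
import math
from typing import List, Tuple


def run_sequentially(
    jobs: List[Tuple[str, float]], budget: float
) -> Tuple[List[str], List[str]]:
    """Simulate one worker handling jobs in order within `budget` seconds.

    Each job is (name, duration in seconds); math.inf models a hung par2.
    Returns (finished, still_waiting).
    """
    finished: List[str] = []
    elapsed = 0.0
    for index, (name, duration) in enumerate(jobs):
        if elapsed + duration > budget:
            # The worker is still busy with this job when time runs out,
            # so it and everything queued behind it stays unfinished.
            return finished, [n for n, _ in jobs[index:]]
        elapsed += duration
        finished.append(name)
    return finished, []


jobs = [("job1", 60), ("job2", math.inf), ("job3", 45), ("job4", 30)]
done, waiting = run_sequentially(jobs, budget=7200)
print(done)     # ['job1']
print(waiting)  # ['job2', 'job3', 'job4']
```

Even after two hours, the short jobs queued behind the hung one have not started.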
Re: Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 18th, 2011, 8:07 am
by FCrane
Maybe it would help to have an "Abort" or "Cancel" button for each post processing item. Like the "Delete" button that appears when we move the mouse over a download item. This way we could manually cancel hanging jobs and Sabnzbd+ could continue with the others.
Regards!
Re: Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 24th, 2011, 10:31 am
by Effenberg0x0
It's a good idea. I've also been thinking about an option to set "max par2 processing time" / "max unrar processing time" in the config.
We know our systems. We know what is a reasonable time and what is not. And, obviously, the average lumberjack joe would have the option to not set the option at all.
Ideally, I would go geekier: "Seconds / MB" instead of a fixed time (considering you could be downloading 1MB or 1TB).
Now that I think of it again, I remembered this can be tested by imposing the restrictions on par2/unrar in the system itself. I'll try it soon.
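
Something like the following is what I have in mind for the "Seconds / MB" idea, done outside of SABnzbd with a small wrapper. It is a rough sketch: `scaled_timeout`, `run_with_limit`, `seconds_per_mb` and `floor` are names I made up, none of them is an existing SABnzbd setting, and the child process here is just a placeholder sleep instead of a real par2/unrar call.

```python
import subprocess
import sys
from typing import List, Optional


def scaled_timeout(size_mb: float, seconds_per_mb: float, floor: float = 60.0) -> float:
    """Time limit that grows with job size but never drops below `floor`."""
    return max(floor, size_mb * seconds_per_mb)


def run_with_limit(
    cmd: List[str], size_mb: float, seconds_per_mb: float, floor: float = 60.0
) -> Optional[int]:
    """Run cmd; return its exit code, or None if it exceeded the limit."""
    limit = scaled_timeout(size_mb, seconds_per_mb, floor)
    try:
        # subprocess.run kills the child itself when the timeout expires.
        done = subprocess.run(cmd, timeout=limit)
    except subprocess.TimeoutExpired:
        return None
    return done.returncode


# Placeholder child that sleeps 5 s, allowed only 0.5 s here.
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_with_limit(slow, size_mb=1, seconds_per_mb=0.1, floor=0.5))  # None
```

The floor keeps tiny downloads from getting an unrealistically short limit. Whether a single seconds-per-MB figure is safe on every system is exactly the open question.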
Regards,
Re: Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 24th, 2011, 11:09 am
by shypike
Effenberg0x0 wrote:
We know our systems. We know what is a reasonable time and what is not.
You think so?
I have seen ranges from 1 minute to 12 hours on the same system.
The time is very hard to predict with only the given size and the gaps.
I don't trust the average user to fill in a number that's sufficiently safe.
Re: Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 24th, 2011, 2:59 pm
by Effenberg0x0
I've never seen anything like this with an average of 5TB/day downloaded and processed. I can't imagine what kind of operation you're running there.
Out of curiosity, what size of download can take 12 hours on unrar + par2 r processes?
Honestly, I am pretty sure that if I established something like MaxTimeToUnrar=10minutes and MaxTimeToPar2Repair=30minutes I would be playing it very safe for 90% of downloaded content. The sabnzbd+ server here is an AMD Phenom II X6 3.5GHz (6 cores) with 16GB of 1666 MHz dual-channel RAM, dedicated exclusively to running SABnzbd on Linux. So I can't really complain about a slow system.
Re: Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 24th, 2011, 3:07 pm
by shypike
Try a NAS with a paltry CPU and not quite enough memory
BTW: if par2 and unrar hang often, I would question the quality of the hardware.
I have very seldom seen a hanging par2 or unrar.
The manual cancel might be a good idea though,
but killing hanging processes is not a sure thing on all platforms.
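
For reference, a manual cancel would boil down to something like the sketch below: ask the process to stop, then force it after a grace period. This is an illustration only, not SABnzbd code; `stop_process` and `grace` are made-up names, and the child is a placeholder sleep rather than a real par2.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, grace: float = 5.0) -> int:
    """Ask a child process to exit, then force it if it does not comply."""
    proc.terminate()  # SIGTERM on POSIX, TerminateProcess on Windows
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; same as terminate() on Windows
        return proc.wait()


child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
stop_process(child, grace=2.0)
print(child.poll() is not None)  # True
```

Even this is not guaranteed: on Linux a process blocked in uninterruptible I/O does not react to SIGKILL until the I/O returns, so the final wait() can itself block.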
Re: Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 24th, 2011, 3:27 pm
by Effenberg0x0
Ah ok, I was thinking large (server, power desktop) and not the other way around (NAS)... Now I see, you are absolutely right. Running sabnzbd on a Nintendo DS, a coffee machine or something must create such results. I was about to ask you if you own a Usenet Host LOL
You know what, I think unrar is not so much the issue. Par2, IMHO, gets confused with multicore, and Linux distros are simply not as prepared for multicore as they could be.
I tried the multicore Par lib, but with the same results. It's weird. When you try to investigate what it is doing, you get puzzled. Things like the top command will tell you the machine is out of CPU time and RAM, up to the point where the kernel panics or an emergency REISUB is the only way out. I thought, well, if I look at the CPU temperature log daemon I should see a good variation from no load to kernel-panic load. Dude, not even one °C of difference, on a system with a very poor cooler...
It is still very unclear to me. There are some articles here and there about the problems some apps may face with multicore CPUs.
I'll try something outside of the sabnzbd process. I know killing par2 without passing a signal back to sabnzbd may be a problem. If I find out something interesting I'll post it here.
Re: Weird issue: Post-processing accumulates many simultaneous jobs. Kills cpu/ram.
Posted: March 24th, 2011, 3:42 pm
by shypike
Actually, you are right about multi-core par2 on Linux platforms.
I shouldn't rely only on my own experience with Windows and OSX (where it's been very reliable for me).
We have no control over the par2 used on Linux platforms.