[Zaphod-Users] Queue
Kai Germaschewski
kai.germaschewski at unh.edu
Mon Nov 14 23:53:12 EST 2005
On Mon, 14 Nov 2005, Fekete, Balazs M. wrote:
> Lately, I am disappointed about zaphod. I submitted two jobs around
> noon, but neither of them got executed so far. I actually sent one of
> the jobs to our "junk yard" (a small cluster of five old desktop PCs
> with eight 500-800 MHz CPUs), which started immediately and finished in
> 4 hours.
I'm afraid zaphod being this busy is going to be the norm. At the moment,
however, there are usually a lot of Ethernet-only nodes available, so you
may want to recompile your code without Myrinet; that should give you a
much faster turnaround.
> I wonder if there is any tool to map the processor use at any time. I
> know, http://zaphod.sr.unh.edu/ganglia/index.php but I am not sure how
> to interpret those graphics. The gaps in the graphs are particularly
> disturbing.
Yeah, I'm not sure what's wrong with ganglia at this time. "showstate" is
another nice command which shows you the current load.
At this time, however, as you noted, zaphod is dead, and I cannot get it
back to life even with our remote power-cycling feature. I've no idea why
it died or what's going on, but this crash seems different from the other
recent ones in that it still responded to pings; it just didn't do much else.
--Kai