[Zaphod-Users] nodes m187 and m182 are in state down
Kai Germaschewski
kai.germaschewski at unh.edu
Wed May 23 14:05:05 EDT 2007
On Wed, 23 May 2007, Saeid Jalali wrote:
> You can see from the following status of the job number 4123 that the nodes m187 and m182 are in state down.
> I wonder why, within a day, almost all the nodes are shutting down one by one!
Well, you are right that this is definitely an undesirable situation.
Nodes do crash occasionally, but so far this has been a rare event,
whereas it seems to occur rather frequently with your jobs.
Do you have any idea whether your jobs do something unusual?
One possibility I can think of is that they run out of available
memory, so that the node swaps itself to death.
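
One rough way to check the memory hypothesis on a node while a job is
running is sketched below. It assumes a Linux node where /proc/meminfo is
readable. The function names and the two thresholds are made up for
illustration and are not part of any cluster tooling.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in kB."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def swap_used_fraction(info):
    """Fraction of swap in use (0.0 if the node has no swap configured)."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - info.get("SwapFree", 0)) / total


def looks_memory_starved(info, swap_threshold=0.5, free_threshold=0.05):
    """Heuristic: heavy swap use AND very little free or reclaimable RAM.

    Both thresholds are arbitrary assumptions; tune them for your nodes.
    """
    mem_total = info.get("MemTotal", 0)
    if mem_total == 0:
        return False
    # Free memory plus buffers and page cache, as a rough "still usable" figure.
    reclaimable = (info.get("MemFree", 0)
                   + info.get("Buffers", 0)
                   + info.get("Cached", 0))
    return (swap_used_fraction(info) >= swap_threshold
            and reclaimable / mem_total <= free_threshold)


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            node_info = parse_meminfo(f.read())
    except OSError as exc:
        print("could not read /proc/meminfo:", exc)
    else:
        print("swap used: %.0f%%" % (100 * swap_used_fraction(node_info)))
        print("memory starved:", looks_memory_starved(node_info))
```

If swap use keeps climbing while the job runs, that would point toward
the job needing more memory than the node has.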
--Kai