[Zaphod-Users] nodes m187 and m182 are in state down

Kai Germaschewski kai.germaschewski at unh.edu
Wed May 23 14:05:05 EDT 2007


On Wed, 23 May 2007, Saeid Jalali wrote:

> You can see from the following status of job number 4123 that nodes m187 and m182 are in state down.
>   I wonder why, within a day, almost all the nodes shut down one by one!

Well, you are right that this is definitely an undesirable situation.

While a node does crash occasionally, this has so far been a rare 
event, whereas it seems to occur rather frequently with your jobs. 
Do you have any idea whether your jobs do something unusual? 
One possibility I can think of is that they are running out of 
available memory, so that the node swaps itself to death.
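
One way to test the memory hypothesis would be to watch swap usage on a 
node while a job is running. The following is only a rough sketch, not 
an established tool on this cluster: it assumes a Linux node that 
exposes /proc/meminfo, and the function names and the 50% threshold are 
hypothetical choices made for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only "Name: <number> [unit]" lines; skip anything malformed.
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_used_fraction(info):
    """Return the fraction of swap in use, or 0.0 if no swap is configured."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - info.get("SwapFree", 0)) / total


def looks_memory_starved(info, threshold=0.5):
    """Heuristic (assumed threshold): flag heavy swap use as a warning sign."""
    return swap_used_fraction(info) > threshold


if __name__ == "__main__":
    # Reads the live values; only works on a Linux system with /proc mounted.
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print("swap used: %.0f%%" % (100 * swap_used_fraction(info)))
    print("memory starved?", looks_memory_starved(info))
```

If swap usage climbs steadily while the job runs, that would support the 
idea that the job needs more memory than the node has.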

--Kai


