[premise-users] premise update

Robert Anderson rea at sr.unh.edu
Wed May 18 21:34:09 EDT 2022


The good news is that we have been able to get the Premise Lustre
functional again.

Two bad drives have been automatically replaced with the only two hot
spare drives in the enclosure.  
We have no cold spare drives to add to the enclosure, so a few should
be purchased to best protect data integrity.
The "bad" drives have not been removed, and they are causing delays
while booting Lustre.

The other bad news is that while checking those disks a cable
management part of the storage enclosure failed in such a way that we
can no longer get one of the four storage drawers to fully close.  The
system appears to function with the drawer 3 inches out from its fully
closed position, but we know the cabling is being pinched and we will
need to contact Seagate support for a possible replacement.

We plan to discuss our options in the morning and determine a plan to
move forward.  Very likely it will involve: removing the "dead" disks,
ordering replacement drives for hot & cold spares, and contacting
Seagate for a quote on fixing the internal cable management (and one
fan).  Depending on how long it takes to replace the cable management
part(s), we will have to decide what portion of Premise to bring online.

If we bring everything online, we will need future downtime to replace
the Lustre cable management.
If we leave Lustre offline until it is fully repaired, half of the
cluster's users will not have home areas and the majority of Anaconda
software will be unavailable.  Additional problems may be discovered when
running without all of the normal Premise storage systems online.

That's the latest news.  It's very frustrating to have fought through
all of the software issues only to have a hardware cabling snag create
a roadblock.


-- 
Robert Anderson <rea at sr.unh.edu>
UNH RCC