[Trillian-users] trillian queue and runs
Maciolek, Mark
Mark.Maciolek at unh.edu
Thu Mar 15 08:52:20 EDT 2018
Kate,
The alps process has once again failed to work properly. I have restarted the alps process. I recommend that any user with jobs in ‘H’ status remove those jobs and resubmit them.
Job id      Name              User               Time Use  S  Queue
44538.sdb   yout.bat          samark             0         H  workq
44545.sdb   perp_D_2.pbs      kgklein            0         H  workq
44546.sdb   PBS_job_script_   salme              0         H  workq
44547.sdb   PBS_job_script_   kvonkrusenstiern   0         H  workq
44548.sdb   PBS_job_script_   kvonkrusenstiern   0         H  workq
44551.sdb   PBS_job_script_   kvonkrusenstiern   0         H  workq
44553.sdb   st.pbs            pai                0         H  workq
44556.sdb   ql.pbs            pai                0         H  workq
44558.sdb   PBS_job_script_   salme              0         H  workq
44559.sdb   L20cm             fs1036             0         H  workq
44560.sdb   PBS_job_script_   kvonkrusenstiern   0         H  workq
44561.sdb   shockTest         mgorby             0         H  workq
If that does not succeed, I will need to reboot trillian.
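For anyone with several held jobs, the removal step can be sketched roughly as follows. This is a minimal Python sketch, not a site-provided tool: it assumes the job listing has the six whitespace-separated columns shown above (job id, name, user, time used, state, queue), and `held_jobs` and `removal_commands` are hypothetical helper names. It only prints `qdel` commands for review and does not run them.

```python
# Sample rows copied from the listing in this message. In practice the text
# would come from the scheduler's job listing (assumption: six
# whitespace-separated columns: job id, name, user, time used, state, queue).
LISTING = """\
44538.sdb yout.bat samark 0 H workq
44545.sdb perp_D_2.pbs kgklein 0 H workq
44547.sdb PBS_job_script_ kvonkrusenstiern 0 H workq
44553.sdb st.pbs pai 0 H workq
44556.sdb ql.pbs pai 0 H workq
"""


def held_jobs(listing, user):
    """Return the job ids in state 'H' that belong to `user`."""
    ids = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip header rows, blank lines, anything unexpected
        job_id, _name, owner, _time_used, state, _queue = fields
        if owner == user and state == "H":
            ids.append(job_id)
    return ids


def removal_commands(job_ids):
    """Build one 'qdel <job id>' command string per job id."""
    return [f"qdel {job_id}" for job_id in job_ids]


if __name__ == "__main__":
    # "pai" is just one of the user names from the listing above.
    for command in removal_commands(held_jobs(LISTING, "pai")):
        print(command)
```

With the sample rows above this prints `qdel 44553.sdb` and `qdel 44556.sdb`. Resubmitting is still a manual `qsub` of your original job script, since the listing does not record where that script lives.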
Mark
--Mark Maciolek
Network Administrator
Morse Hall Rm 338
http://www.unh.edu/research/support-units/research-computing-center
From: Kate von Krusenstiern [mailto:kvonkrus at gmail.com]
Sent: Thursday, March 15, 2018 8:44 AM
To: Maciolek, Mark <Mark.Maciolek at unh.edu>
Subject: trillian queue and runs
Hi Mark,
I apologize if I'm just hammering on a trillian issue you already know about, but I wanted to give you guys a heads-up about a problem with the queue on trillian.
The queue doesn't seem to be registering finished runs, and so it isn't starting the runs waiting in the queue. When I use apstat to check occupied nodes, my run (purposely capped at 96 hours) is shown as having been running for 115 hours. When I checked the output of this run, it did in fact stop computing at 96 hours.
When I use qstat to check the active batch jobs, my job is not listed as running. qstat shows a total of 5 jobs running on 65 nodes, which differs from the 11 jobs using all the nodes shown in apstat.
I know things have been busy with back-to-back nor'easters and spring break. I appreciate all you guys do to keep this supercomputer running. Hopefully this issue is something that can be resolved easily.
Thanks,
Kate von Krusenstiern
--
Kate von Krusenstiern
Center of Coastal and Ocean Mapping - University of New Hampshire
Graduate Research Assistant