From: Joseph Jensen [mailto:jbj1@wildcats.unh.edu]
Sent: Monday, May 22, 2017 11:43 AM
To: Maciolek, Mark <Mark.Maciolek@unh.edu>
Subject: Trillian is on hold again

 

Hi Mark 

I am not sure if you are already aware, but Trillian is on hold, and all the programs that are running have been going for longer than 3 days.

 

Joseph B. Jensen

 

 

Hi,

 

On Sunday we hit the max open files limit again:

 

2017-05-21 09:29:50: [8057] ------------------------------------------ resvconfirm msg

2017-05-21 09:29:50: [8057] type confirm uid 33040 gid 1000 apid 0 pagg 0 resId 0 numCmds 1

2017-05-21 09:29:50: [8057] File new reservation resId 38 pagg 0 flags 0x200

2017-05-21 09:29:50: [8057] Confirmed apid 125869 resId 38 pagg 0 flags 0x200 nids: 199

2017-05-21 09:29:50: [8057] openSocket:665: socket: Too many open files

2017-05-21 09:29:50: [8057] main:1683: parseXml error: ret 'TCP socket open failed' (timeout 0)

 

 

I have increased the limit to 10000 and will watch it, but if jobs don’t start in the next 30 minutes I will restart the ap scheduler.
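
For watching the limit, here is a minimal sketch of how the open file count could be compared against the limit. It assumes a Linux node where /proc/<pid>/fd and /proc/<pid>/limits are readable (usually as root or the process owner). The function names are hypothetical and not part of any scheduler tooling.

```python
import os
import re


def parse_max_open_files(limits_text):
    """Return the soft 'Max open files' limit from the text of
    /proc/<pid>/limits, or None if it is unlimited or not listed."""
    for line in limits_text.splitlines():
        match = re.match(r"Max open files\s+(\S+)\s+(\S+)", line)
        if match:
            soft = match.group(1)
            return None if soft == "unlimited" else int(soft)
    return None


def open_fd_count(pid):
    """Count the file descriptors currently open in the process."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_usage(pid):
    """Return (open descriptors, soft limit) for the given PID."""
    with open(f"/proc/{pid}/limits") as handle:
        limit = parse_max_open_files(handle.read())
    return open_fd_count(pid), limit
```

Calling fd_usage(pid) with the daemon's PID every few minutes would show whether the count is climbing toward 10000. The [8057] in the log prefix looks like a PID, but that is an assumption.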

 

Mark