| Author |
Topic  |
|
|
petty
BIAC Staff
    
USA
453 Posts |
Posted - Dec 05 2011 : 2:23:32 PM
|
The cluster will potentially be down this Friday afternoon (12/9) due to the expansion of Munin.
We will find out this week if service engineers from BluArc will be expanding/updating Munin. If they are able to make it this Friday, then systems will need to be shut down for a few hours.
I will reply when we have a finalized plan. Currently we are waiting to hear back, but I just wanted to give notice that the cluster nodes may be off Friday afternoon.
Thanks, -Chris |
|
|
|
dvsmith
Advanced Member
    
USA
218 Posts |
Posted - Dec 10 2011 : 3:06:04 PM
|
Hey Chris,
Are some of the nodes still down? I was getting ready to restart a set of jobs, but it looks like something is weird with some of the nodes. My test jobs just sit in the queue indefinitely.
If these nodes are hung, will it affect my ability to work on other nodes without having to deal with random problems later? Put differently, if I start submitting a bunch of jobs, will I just wind up with a bunch of jobs that get stuck on these nodes? I'm happy waiting, if it lessens later headaches...
Thanks, David
[smith@hugin neglect_mvpa]$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
3049142 0.25091 nodetest.s smith        qw    12/10/2011 13:17:52                                    1
3049150 0.25043 nodetest.s smith        qw    12/10/2011 13:18:41                                    1
3049155 0.25013 nodetest.s smith        qw    12/10/2011 13:19:11                                    1
3049156 0.25007 nodetest.s smith        qw    12/10/2011 13:19:17                                    1
3049157 0.25000 nodetest.s smith        qw    12/10/2011 13:19:24                                    1
|
 |
|
|
petty
BIAC Staff
    
USA
453 Posts |
Posted - Dec 10 2011 : 8:13:55 PM
|
Two nodes are still down and three others are interactive only.
Jobs can't go to the down nodes |
Edited by - petty on Dec 11 2011 1:33:07 PM |
 |
|