| Author |
Topic  |
|
|
petty
BIAC Staff
    
USA
453 Posts |
Posted - Nov 10 2011 : 2:16:29 PM
|
Duke OIT is performing power maintenance in the data center on our row this coming week.
As a result, the cluster nodes will be OFF two nights: Wed 11/16 and Thu 11/17.
They will be shut down by 5:30pm (originally announced as 10pm) and will be brought back up each day when maintenance is finished.
In other words, they won't be off for 2 days straight, but they will be down two consecutive nights.
Any jobs still running will be killed when the nodes are turned off, and the nodes will be inaccessible during the downtime.
|
Edited by - petty on Nov 16 2011 10:28:16 AM |
|
|
dvsmith
Advanced Member
    
USA
218 Posts |
Posted - Nov 11 2011 : 12:01:34 PM
|
Hey Chris,
I assume this will also kill daemon processes that are submitting jobs, but can you confirm whether that is the case? I can either pause my daemons or let the power cutoff kill them and then resume after the maintenance, but it would definitely be easier to just pause them each night (if that's feasible).
Thanks! David |
 |
|
|
petty
BIAC Staff
    
USA
453 Posts |
Posted - Nov 11 2011 : 3:14:28 PM
|
If they are on the head node then they won't die.
If all goes as planned the head node will stay on.
|
 |
|
|
dvsmith
Advanced Member
    
USA
218 Posts |
Posted - Nov 18 2011 : 09:43:32 AM
|
| Are all of the nodes back up now? I tried submitting to each node, but half of my jobs are stuck in the queue. |
 |
|
|
syam.gadde
BIAC Staff
    
USA
421 Posts |
Posted - Nov 18 2011 : 09:48:42 AM
|
| Due to some weirdness, the half of the cluster that was affected by the power outage is still down for the time being. |
 |
|
|
dvsmith
Advanced Member
    
USA
218 Posts |
Posted - Nov 18 2011 : 09:52:24 AM
|
| Ah ok... If I go ahead and restart my jobs, will the grid just avoid those nodes or will half my jobs get stuck in the queue (and/or fail)? |
 |
|
|
petty
BIAC Staff
    
USA
453 Posts |
Posted - Nov 18 2011 : 09:54:32 AM
|
Right ... I have to go physically push the buttons on the down nodes due to a "feature (bug?)" with the remote power management.
If you run "qhost", the nodes that are up won't have any blanks. |
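
For anyone who wants to script that check, here is a rough sketch in Python. It assumes the "blanks" show up as a "-" placeholder in the LOAD column of the qhost output, and that the header row starts with HOSTNAME and includes a LOAD column. Both are assumptions about the qhost layout on this cluster, not something confirmed in this thread. The function names are made up for illustration.

```python
import subprocess


def hosts_reporting_load(qhost_output):
    """Return hostnames whose LOAD column holds a value instead of '-'.

    Assumes a header row beginning with HOSTNAME that contains a LOAD column,
    and that hosts which are down show '-' in that column.
    """
    load_col = None
    up = []
    for line in qhost_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "HOSTNAME":
            # Locate the LOAD column from the header rather than hard-coding it.
            load_col = fields.index("LOAD") if "LOAD" in fields else None
            continue
        # Skip rows before the header, the 'global' pseudo-host, and short
        # rows such as the dashed separator line.
        if load_col is None or fields[0] == "global" or len(fields) <= load_col:
            continue
        if fields[load_col] != "-":
            up.append(fields[0])
    return up


def live_hosts():
    """Run qhost (must be on PATH) and return the hosts that report a load."""
    result = subprocess.run(["qhost"], capture_output=True, text=True, check=True)
    return hosts_reporting_load(result.stdout)
```

Calling `live_hosts()` on the head node should list the nodes that are safe to submit to, assuming the output looks as described above. If the columns differ on this cluster, only the header lookup needs adjusting.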
 |
|
|
dvsmith
Advanced Member
    
USA
218 Posts |
Posted - Nov 18 2011 : 09:59:26 AM
|
| OK, thanks. Sorry for the inconvenience. |
 |
|