Duke-UNC Brain Imaging and Analysis Center
 Scheduled Downtime - 11/16-17

T O P I C    R E V I E W
petty Posted - Nov 10 2011 : 2:16:29 PM
Duke OIT is performing power maintenance in the data center on our row this coming week.

As a result, the cluster nodes will be OFF two nights:
Wed 11/16, Thu 11/17

They will be shut down by 5:30pm (changed from 10pm) and will be brought back up each day when maintenance is finished.

In other words, they won't be off for 2 days straight, but they will be down two consecutive nights.

As a result, any running jobs will be killed when the nodes are turned off, and the nodes will be inaccessible during the downtime.


7   L A T E S T    R E P L I E S    (Newest First)
dvsmith Posted - Nov 18 2011 : 09:59:26 AM
OK, thanks. Sorry for the inconvenience.
petty Posted - Nov 18 2011 : 09:54:32 AM
Right ... I have to go physically push the buttons on the down nodes due to a "feature" (bug?) in the remote power management.

If you run "qhost", the nodes that are up won't have any blanks.
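
For anyone who wants to script that check, here is a rough sketch in Python. It assumes the usual Grid Engine "qhost" table layout, where a host that is not reporting shows "-" in the LOAD column; the sample text and node names are made up for illustration and are not output from this cluster. On the cluster itself you would feed it the real output of "qhost" instead of SAMPLE.

```python
def hosts_up(qhost_output):
    """Return hostnames whose LOAD column holds a value rather than '-'.

    Assumes the column order HOSTNAME ARCH NCPU LOAD ... (an assumption
    about the qhost layout, not something confirmed in this thread).
    """
    up = []
    for line in qhost_output.splitlines():
        fields = line.split()
        # Skip blank lines, the dashed separator, the header row,
        # and the 'global' pseudo-host.
        if len(fields) < 4 or fields[0] in ("HOSTNAME", "global"):
            continue
        if fields[3] != "-":
            up.append(fields[0])
    return up


# Hypothetical sample output; node names and values are invented.
SAMPLE = """\
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
node1                   lx24-amd64      8  0.42   15.7G    1.2G    2.0G     0.0
node2                   lx24-amd64      8     -   15.7G       -    2.0G       -
"""

print(hosts_up(SAMPLE))
```

With the sample above, only node1 is reported as up, since node2 has "-" where its load would be.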
dvsmith Posted - Nov 18 2011 : 09:52:24 AM
Ah ok... If I go ahead and restart my jobs, will the grid just avoid those nodes or will half my jobs get stuck in the queue (and/or fail)?
syam.gadde Posted - Nov 18 2011 : 09:48:42 AM
Due to some weirdness, the half of the cluster that was affected by the power outage is still down for the time being.
dvsmith Posted - Nov 18 2011 : 09:43:32 AM
Are all of the nodes back up now? I tried submitting to each node, but half of my jobs are stuck in the queue.
petty Posted - Nov 11 2011 : 3:14:28 PM
If they are on the head node then they won't die.

If all goes as planned the head node will stay on.



dvsmith Posted - Nov 11 2011 : 12:01:34 PM
Hey Chris,

I assume this will also kill daemon processes that are submitting jobs, but can you confirm whether that is the case? I can either pause or let the power cut off kill my daemons and then resume after the maintenance, but it would definitely be easier to just pause them each night (if that's feasible).

Thanks!
David
