Duke-UNC Brain Imaging and Analysis Center
 Major Cluster Update - 20110708

T O P I C    R E V I E W
petty Posted - Jun 10 2011 : 3:06:10 PM
We are transitioning the cluster nodes to a newer operating system in the near future.

I've already converted one of the interactive nodes ( node54 ) for general testing purposes.


The install has all of the old packages ( with updated versions ) as well as some new packages available.

The list of currently installed packages is on the wiki ( the relevant section is SL6 Build ):
http://wiki.biac.duke.edu/biac:cluster:packages

Please log in to node54 to test some of your analyses. Note that the default fsl and freesurfer packages are now the current releases, so if you're using an older version you'll have to reference it directly. The wiki list links to instructions for using different versions.

Please post here if there are issues found.

To get an interactive job on node54 use the following command:
qrsh -l hostname=node54 bash -li


I am also going to convert a couple of the batch nodes and put them in a special queue, but I will reply here when that is ready.


Thanks,
-Chris
13   L A T E S T    R E P L I E S    (Newest First)
dvsmith Posted - Jul 21 2011 : 10:54:10 PM
ah ok... guess I need to cut down on my failed processes...
petty Posted - Jul 21 2011 : 10:26:27 PM
Those look like core dump files, which are created when a process dies.

They're written to the current working directory when that happens.
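
For anyone who wants to find these files after a failed run, here is a minimal sketch rather than anything from the cluster documentation. It assumes a bash job script, `list_core_files` is a hypothetical helper name, and the `core.*` pattern is simply the file naming described in this thread, so review the matches before deleting anything.

```shell
# Optional, near the top of a job script: setting the core file size limit
# to 0 usually stops this shell and its child processes from writing core files.
#   ulimit -c 0

# List core dump files sitting directly in a directory (default: current dir).
# Assumption: core files are named core.<something>, as reported in this thread.
list_core_files() {
    find "${1:-.}" -maxdepth 1 -type f -name 'core.*'
}

# Example (not run here):
#   list_core_files /path/to/your/analysis/dir
```

Listing first and removing by hand is the cautious choice, since a data file could happen to match the same pattern.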
dvsmith Posted - Jul 21 2011 : 9:12:28 PM
Just a quick question: What are the core.* files that seem to pop up using the new OS? I don't remember seeing files like this with the old OS.

See http://www.duke.edu/~dvs3/core_files.png for an example of what I'm talking about.

Thanks,
David
petty Posted - Jul 08 2011 : 1:36:02 PM
I'm starting to transition over to the new operating system.

All the interactive nodes have been converted except one, which I will leave on centos temporarily in case anyone needs it.
To get to the centos node directly:
qrsh -l hostname=node60 bash -li


I am also creating a queue with a couple of the remaining centos machines ( also temporary ).

To be sure to use one of the centos machines, use centos.q:
qsub -v EXPERIMENT=YourExp.01 -q centos.q yourscript.sh



petty Posted - Jun 28 2011 : 3:50:55 PM
Just as an update: a couple of kinks have been worked out, and I will likely start transitioning all the nodes to the new install next week.

petty Posted - Jun 22 2011 : 4:10:43 PM
Great.

Also, those are just warnings related to the new "display" being made with Xvfb ... nothing to worry about.
ark19 Posted - Jun 22 2011 : 2:40:20 PM
Great! Everything seems to work now. FYI, I get this new message in my output file:

SELinux: Disabled on system, not enabling in X server
(EE) config/hal: NewInputDeviceRequest failed (2)
(EE) config/hal: NewInputDeviceRequest failed (2)
(EE) config/hal: NewInputDeviceRequest failed (2)
(EE) config/hal: NewInputDeviceRequest failed (2)
(EE) config/hal: NewInputDeviceRequest failed (2)
(EE) config/hal: NewInputDeviceRequest failed (2)

Seems related to the graphics device, but not a problem.

Thanks,
Annchen
petty Posted - Jun 22 2011 : 11:08:57 AM
So the SL6 build was missing Xvfb ( the X virtual framebuffer ), which I know you guys are sending "graphics" to. I've installed it; can you test your script again?

I ran a simple test to call Xvfb, but my test doesn't fail if it's not found.

-Chris
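
A test that does fail when Xvfb is missing could look like the sketch below. This is only an illustration, not the test that was actually run: `require_cmd` is a hypothetical helper name, and it relies on the shell builtin `command -v` to check whether a program is on the PATH.

```shell
# Return non-zero (with a message on stderr) if a program is not on the PATH.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "ERROR: $1 not found in PATH" >&2
        return 1
    fi
}

# Example use at the top of a job script (not run here):
#   require_cmd Xvfb || exit 1
```

Stopping the job at the start gives a clearer message than the SPM display error that otherwise shows up halfway through a pipeline.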
ark19 Posted - Jun 18 2011 : 08:59:20 AM
First I get:

{Warning: Unable to open display.}
> In spm at 648
In spm at 581
In spm at 666
In spm_figure at 625
In spm_figure at 175
In spm_check_registration at 24


Then shortly thereafter:

{??? Error using ==> set
Width and height must be > 0

Error in ==> spm_orthviews>bbox at 829
set(st.vols{i}.ax{1}.ax,'Units','pixels', ...

Error in ==> spm_orthviews at 225
bbox;

Error in ==> spm_check_registration at 44
handle(ij) = spm_orthviews('Image', images(ij),...
}


which I believe happens while SPM is trying to print a .ps file.

Thanks!
petty Posted - Jun 18 2011 : 06:29:56 AM
Could you please post your error?
ark19 Posted - Jun 17 2011 : 4:48:09 PM
Hi Chris,

Today I ran a test of the DNS.01 (Hariri lab) image processing pipeline on the OStest.q as you directed in your post, and I need to report that it did not run successfully the way it usually does.

Most of the pipeline ran normally, but about half way through it ran into difficulty, seemingly related to the fact that it was searching for a display (somewhere in an SPM function). I'm not sure if this is actually the problem or if it is indeed related to the new OS, but seeing as how the same job runs perfectly on the current cluster and not on the test, I thought I'd post here as a start.

Please let me know if this is going to be a problem for us or if you'd like more details about the issue. I'm new to our lab's processing pipeline, so I don't really have any good ideas as to what could be going on here!

Thanks,
Annchen Knodt
petty Posted - Jun 10 2011 : 3:45:59 PM
One more thing ...

If you see this warning, don't worry:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle
attack)!
It is also possible that the RSA host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
73:c0:eb:c9:51:23:eb:6b:24:f0:ec:f9:02:aa:dd:75.
Please contact your system administrator.
Add correct host key in /home/USER/.ssh/known_hosts to get rid of
this message.
Offending key in /home/USER/.ssh/known_hosts:20
RSA host key for node54 has changed and you have requested strict
checking.
Host key verification failed.


You'll need to remove the entries for node51, node52, and node54 from your known_hosts file.

Just delete the lines in ~/.ssh/known_hosts that reference the 3 testing nodes. When the full update is made, I'll remove everyone's known_hosts file, and then you'll just need to respond yes to this:


The authenticity of host 'node54 (10.136.53.54)' can't be established.
RSA key fingerprint is 73:c0:eb:c9:51:23:eb:6b:24:f0:ec:f9:02:aa:dd:75.
Are you sure you want to continue connecting (yes/no)?
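
One way to do the cleanup described above without editing the file by hand is `ssh-keygen -R node54`, which removes a host's keys from known_hosts. The sketch below does the same for several hosts at once with awk. It is only an illustration: `remove_known_hosts` is a hypothetical helper name, it handles plain (unhashed) entries only, and it writes a `.bak` copy before changing anything.

```shell
# Remove known_hosts entries for the given hosts (unhashed entries only).
# Usage: remove_known_hosts FILE HOST [HOST...]
remove_known_hosts() {
    local file=$1
    shift
    local hosts="$*"
    local tmp
    tmp=$(mktemp) || return 1
    cp "$file" "$file.bak" || return 1      # keep a backup of the original
    awk -v hosts="$hosts" '
        BEGIN {
            n = split(hosts, h, " ")
            for (i = 1; i <= n; i++) drop[h[i]] = 1
        }
        {
            # The first field is a comma-separated list of host names/addresses.
            m = split($1, names, ",")
            for (j = 1; j <= m; j++) if (names[j] in drop) next
            print
        }
    ' "$file" > "$tmp" && cp "$tmp" "$file"
    rm -f "$tmp"
}

# Example (not run here):
#   remove_known_hosts ~/.ssh/known_hosts node51 node52 node54
```

After the old entries are gone, the next connection shows the "authenticity of host" prompt quoted above instead of the warning.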


petty Posted - Jun 10 2011 : 3:36:38 PM
As a follow up ... I've also created a small queue for testing batch submissions.

Please send anything you would like to test to the OStest.q using the "-q" flag. Otherwise the job will go to the users.q

qsub -v EXPERIMENT=YourExp.01 -q OStest.q yourscript.sh


Please keep in mind this is a small queue for testing only; there are currently only 16 slots.

I plan to leave this open for at least a week to address potential issues.

Again, please respond to this thread with issues/questions/concerns.

Thanks,
-Chris

