Duke-UNC Brain Imaging and Analysis Center
 Major Cluster Update - 20110708

petty
BIAC Staff

USA
453 Posts

Posted - Jun 10 2011 :  3:06:10 PM
We are transitioning the cluster nodes to a newer operating system in the near future.

I've already converted one of the interactive nodes ( node54 ) for general testing purposes.


The install has all of the old packages ( with updated versions ) as well as some new packages available.

The list of currently installed packages is on the wiki ( the relevant section is SL6 Build ):
http://wiki.biac.duke.edu/biac:cluster:packages

Please log in to node54 to test some of your analyses. Please note that the default fsl and freesurfer packages are now the current releases. Therefore, if you're using an older version, you'll have to reference it directly. The wiki package list links to instructions for using different versions.

Please post here if you find any issues.

To get an interactive job on node54 use the following command:
qrsh -l hostname=node54 bash -li


I am also going to convert a couple of the batch nodes and put them in a special queue; I will post again when that is ready.


Thanks,
-Chris

Edited by - petty on Jul 08 2011 4:46:54 PM

petty
BIAC Staff

USA
453 Posts

Posted - Jun 10 2011 :  3:36:38 PM
As a follow up ... I've also created a small queue for testing batch submissions.

Please send anything you would like to test to OStest.q using the "-q" flag. Otherwise the job will go to users.q.

qsub -v EXPERIMENT=YourExp.01 -q OStest.q yourscript.sh
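
To confirm that a test job really landed on one of the converted nodes, the submitted script can log where it ran. This is only a minimal sketch: report_node is a made-up helper name ( not something installed on the cluster ), and it assumes the node has /etc/redhat-release, which Red Hat derivatives generally provide.

```shell
#!/bin/bash
# Print the node name and OS release string for the machine running this job.
# report_node is a hypothetical helper for illustration only.
report_node() {
    echo "host: $(uname -n)"
    if [ -r /etc/redhat-release ]; then
        # Assumption: the release file exists on Red Hat style systems.
        echo "os: $(cat /etc/redhat-release)"
    else
        echo "os: unknown (no /etc/redhat-release)"
    fi
}
```

Calling report_node at the top of yourscript.sh puts both lines in the job's output file, so a run on OStest.q can be told apart from one that fell through to users.q.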


Please keep in mind this is a small queue for testing only; there are currently only 16 slots.

I plan to leave this open for at least a week to address potential issues.

Again, please respond to this thread with issues/questions/concerns.

Thanks,
-Chris


petty
BIAC Staff

USA
453 Posts

Posted - Jun 10 2011 :  3:45:59 PM
One more thing ...

If you see this warning, don't worry:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle
attack)!
It is also possible that the RSA host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
73:c0:eb:c9:51:23:eb:6b:24:f0:ec:f9:02:aa:dd:75.
Please contact your system administrator.
Add correct host key in /home/USER/.ssh/known_hosts to get rid of
this message.
Offending key in /home/USER/.ssh/known_hosts:20
RSA host key for node54 has changed and you have requested strict
checking.
Host key verification failed.


You'll need to remove the entries for node51, node52, and node54 from your known_hosts file.

Just delete the lines in ~/.ssh/known_hosts that reference the 3 testing nodes. When the full update is made, I'll remove everyone's known_hosts file; then you'll just need to respond yes to this:


The authenticity of host 'node54 (10.136.53.54)' can't be established.
RSA key fingerprint is 73:c0:eb:c9:51:23:eb:6b:24:f0:ec:f9:02:aa:dd:75.
Are you sure you want to continue connecting (yes/no)?
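
The known_hosts cleanup described above can also be scripted. This is a rough sketch, not an official tool: forget_hosts is a hypothetical helper, and it only matches plain ( unhashed ) host names in the first field of each line. If your known_hosts entries are hashed, the standard `ssh-keygen -R node54` handles those instead.

```shell
#!/bin/bash
# Remove known_hosts lines that mention any of the given hosts.
# Usage: forget_hosts FILE HOST [HOST...]
# forget_hosts is a hypothetical helper; hashed entries are NOT matched.
forget_hosts() {
    local file=$1; shift
    # Keep a backup copy next to the original before rewriting it.
    cp -p "$file" "$file.bak" || return 1
    local hosts
    hosts=$(IFS=,; echo "$*")    # join the host arguments with commas
    awk -v hosts="$hosts" '
        BEGIN {
            n = split(hosts, h, ",")
            for (i = 1; i <= n; i++) drop[h[i]] = 1
        }
        {
            # The first field is a comma-separated list of names/addresses.
            m = split($1, names, ",")
            for (j = 1; j <= m; j++) if (names[j] in drop) next
            print
        }
    ' "$file.bak" > "$file"
}
```

For the three testing nodes that would be: forget_hosts ~/.ssh/known_hosts node51 node52 node54. The untouched original stays in known_hosts.bak in case something goes wrong.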



Edited by - petty on Jun 13 2011 11:16:31 AM

ark19
Junior Member

27 Posts

Posted - Jun 17 2011 :  4:48:09 PM
Hi Chris,

Today I ran a test of the DNS.01 (Hariri lab) image processing pipeline on OStest.q as you directed in your post, and need to report that it did not run successfully as it usually does.

Most of the pipeline ran normally, but about halfway through it ran into difficulty, seemingly because it was searching for a display (somewhere in an SPM function). I'm not sure if this is actually the problem or if it is indeed related to the new OS, but since the same job runs perfectly on the current cluster and not on the test queue, I thought I'd post here as a start.

Please let me know if this is going to be a problem for us or if you'd like more details about the issue. I'm new to our lab's processing pipeline, so I don't really have any good ideas as to what could be going on here!

Thanks,
Annchen Knodt

petty
BIAC Staff

USA
453 Posts

Posted - Jun 18 2011 :  06:29:56 AM
Could you please post your error?

ark19
Junior Member

27 Posts

Posted - Jun 18 2011 :  08:59:20 AM
First I get:

{Warning: Unable to open display.}
> In spm at 648
In spm at 581
In spm at 666
In spm_figure at 625
In spm_figure at 175
In spm_check_registration at 24


Then shortly thereafter:

{??? Error using ==> set
Width and height must be > 0

Error in ==> spm_orthviews>bbox at 829
set(st.vols{i}.ax{1}.ax,'Units','pixels', ...

Error in ==> spm_orthviews at 225
bbox;

Error in ==> spm_check_registration at 44
handle(ij) = spm_orthviews('Image', images(ij),...
}


which I believe happens while SPM is trying to print a .ps file.

Thanks!

petty
BIAC Staff

USA
453 Posts

Posted - Jun 22 2011 :  11:08:57 AM
So the SL6 build was missing Xvfb, the X virtual framebuffer, which I know you guys are sending "graphics" to. I've installed it; can you test your script again?

I ran a simple test that calls Xvfb, but my test doesn't fail if it's not found.
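
For anyone wondering how a batch job ends up with a "display" at all, here is a minimal sketch of running a step under Xvfb and failing up front when Xvfb is missing, rather than hitting a display error halfway through the pipeline. The helper name with_virtual_display, the XVFB_BIN variable, display number 99, and the 1024x768x24 screen size are all assumptions for illustration; your lab's scripts may set this up differently.

```shell
#!/bin/bash
# Run a command under a temporary virtual X display provided by Xvfb.
# XVFB_BIN and with_virtual_display are hypothetical names for this sketch.
XVFB_BIN=${XVFB_BIN:-Xvfb}

# Usage: with_virtual_display DISPLAY_NUMBER COMMAND [ARGS...]
with_virtual_display() {
    local display=":$1"; shift
    # Fail loudly if Xvfb is not installed on this node.
    if ! command -v "$XVFB_BIN" >/dev/null 2>&1; then
        echo "error: $XVFB_BIN not found on $(uname -n)" >&2
        return 127
    fi
    # Start the virtual framebuffer in the background (screen size is arbitrary).
    "$XVFB_BIN" "$display" -screen 0 1024x768x24 >/dev/null 2>&1 &
    local xvfb_pid=$!
    sleep 2    # crude wait for the server to come up
    # Run the command with DISPLAY pointing at the virtual server.
    DISPLAY="$display" "$@"
    local status=$?
    # Shut the virtual display down again and report the command's status.
    kill "$xvfb_pid" 2>/dev/null
    wait "$xvfb_pid" 2>/dev/null
    return "$status"
}
```

Inside a submitted script, the graphics-producing step would be wrapped like: with_virtual_display 99 your_graphics_step ( a placeholder for whatever command needs the display ). The fixed display number can collide if two jobs share a node, so treat it as a starting point only.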

-Chris

ark19
Junior Member

27 Posts

Posted - Jun 22 2011 :  2:40:20 PM
Great! Everything seems to work now. FYI, I get this new message in my output file:

SELinux: Disabled on system, not enabling in X server
(EE) config/hal: NewInputDeviceRequest failed (2)
(EE) config/hal: NewInputDeviceRequest failed (2)
(EE) config/hal: NewInputDeviceRequest failed (2)
(EE) config/hal: NewInputDeviceRequest failed (2)
(EE) config/hal: NewInputDeviceRequest failed (2)
(EE) config/hal: NewInputDeviceRequest failed (2)

Seems related to the graphics device, but not a problem.

Thanks,
Annchen

petty
BIAC Staff

USA
453 Posts

Posted - Jun 22 2011 :  4:10:43 PM
Great.

Also, those are just warnings related to the new "display" being made with Xvfb ... nothing to worry about.

petty
BIAC Staff

USA
453 Posts

Posted - Jun 28 2011 :  3:50:55 PM
Just as an update: a couple of kinks have been worked out, and I will likely start transitioning all the nodes to the new install next week.


petty
BIAC Staff

USA
453 Posts

Posted - Jul 08 2011 :  1:36:02 PM
I'm starting to transition over to the new operating system.

All the interactive nodes have been converted except one, which I will leave on CentOS temporarily in case anyone needs it.
To get to the CentOS node directly:
qrsh -l hostname=node60 bash -li


I am also creating a queue with a couple of the remaining CentOS machines ( also temporary ).

To be sure to use one of the CentOS machines, use centos.q:
qsub -v EXPERIMENT=YourExp.01 -q centos.q yourscript.sh




dvsmith
Advanced Member

USA
218 Posts

Posted - Jul 21 2011 :  9:12:28 PM
Just a quick question: What are the core.* files that seem to pop up using the new OS? I don't remember seeing files like this with the old OS.

See http://www.duke.edu/~dvs3/core_files.png for an example of what I'm talking about.

Thanks,
David

petty
BIAC Staff

USA
453 Posts

Posted - Jul 21 2011 :  10:26:27 PM
Those look like core dump files, which are created when a process dies.

They're written to the process's current working directory when that happens.
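
If the core files are just clutter, here is a small sketch for finding them and for switching them off for a single command. Both helper names are made up for illustration, and it assumes the files follow the core.* naming mentioned above and that the shell is allowed to lower its own core size limit ( which is normally the case ).

```shell
#!/bin/bash
# List core dump files (named core.*) directly inside a directory.
# Usage: list_core_files [DIR]   (defaults to the current directory)
list_core_files() {
    find "${1:-.}" -maxdepth 1 -type f -name 'core.*'
}

# Run a command with core dumps disabled for that command only.
# The subshell keeps the lowered limit from leaking into the caller.
without_core_dumps() {
    ( ulimit -c 0; "$@" )
}
```

For example, list_core_files . shows what has piled up in the working directory, and wrapping a step as without_core_dumps your_command keeps a crash in that step from writing a new core file. The dumps can still be useful for debugging, so disabling them is a trade-off.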

dvsmith
Advanced Member

USA
218 Posts

Posted - Jul 21 2011 :  10:54:10 PM
ah ok... guess I need to cut down on my failed processes...