| Author |
Topic  |
|
|
autevsky
Starting Member
7 Posts |
Posted - Jan 13 2012 : 1:51:04 PM
|
Hi,
When running randomise_parallel, I received a path error in my output: ** ERROR (nifti_image_read): failed to find header file for '/mnt/BIAC/munin.dhe.duke.edu/Huettel/Imagene.02/Analysis/TaskData/groupICA_AU/gica_n4_artifact.gica/DR_output_unpairedt_n4/dr_stage3b_ic0024_tfce_corrp_tstat1'
When I looked for that file, it didn't exist (which explains the error). When I talked to DVS, he thought it might be a problem with the dr_stage3b_ic0024.defragment file.
Any help would be appreciated!
Thanks, Amanda |
|
|
syam.gadde
BIAC Staff
    
USA
421 Posts |
Posted - Jan 13 2012 : 2:06:36 PM
|
| This may be related to the path problems you've been having. The defragment process runs at the end of all of the parallelized randomise fragments -- you could try running that defragment script on its own to see if it succeeds. |
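A minimal sketch of running the defragment step by hand, so its exit status and any error output can be inspected separately from the parallel fragments. The helper name run_defragment and the OUTDIR variable are hypothetical, and it is an assumption that the .defragment script sits in the same output directory named in the error message above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a randomise_parallel defragment script by itself
# and report how it exited. Passing the file to bash means the script does
# not need its execute bit set.
run_defragment() {
    local script="$1"
    if [ ! -r "$script" ]; then
        echo "cannot read defragment script: $script" >&2
        return 2
    fi
    local status=0
    bash "$script" || status=$?
    echo "defragment exited with status $status"
    return "$status"
}

# Example call (OUTDIR is an assumed variable pointing at the randomise
# output directory):
# run_defragment "$OUTDIR/dr_stage3b_ic0024.defragment"
```

A non-zero status, or an error printed by the script itself, would point at the defragment step rather than at the individual fragments.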
 |
|
|
dvsmith
Advanced Member
    
USA
218 Posts |
Posted - Jan 13 2012 : 2:17:04 PM
|
Hey Syam,
Alternatively, could it be related to the chmod calls? I know Chris and I have changed some of the other FSL scripts to eliminate those lines, but I'm not sure how those particular sorts of changes interact with fsl_sub.
e.g., changing chmod +x ${DIRNAME}/${BASENAME}.defragment to bash ${DIRNAME}/${BASENAME}.defragment
We may have actually replaced the fsl_sub lines with bash calls to the script, but I think that would kill the parallel aspect of the processing...
Thanks! David
|
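For context on the chmod-to-bash substitution described above, the following sketch illustrates why the swap works at all: a script passed to bash as an argument only needs to be readable, not executable. The file name example.defragment is a hypothetical stand-in created in a temporary directory; nothing here touches fsl_sub or the real FSL scripts.

```shell
#!/usr/bin/env bash
# Illustration only: a script without its execute bit can still be run
# by handing it to bash as an argument.
demo_dir=$(mktemp -d)
script="$demo_dir/example.defragment"   # hypothetical stand-in file
printf 'echo "defragment ran"\n' > "$script"
chmod -x "$script"                       # make sure it is not executable

# Direct execution needs the execute bit, so this attempt fails:
if "$script" 2>/dev/null; then direct=ok; else direct=failed; fi

# Passing the file to bash only needs read permission:
via_bash=$(bash "$script")

echo "direct: $direct"
echo "via bash: $via_bash"
rm -rf "$demo_dir"
```

This only shows that the bash form runs the script in the current shell session. It says nothing about how a job submitted through fsl_sub would behave.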
 |
|
|
syam.gadde
BIAC Staff
    
USA
421 Posts |
Posted - Jan 13 2012 : 2:20:33 PM
|
| Well, I have run randomise_parallel successfully several times in the last couple months, so unless there have been changes to those bash/chmod/fsl_sub calls in the meanwhile, they are probably not at fault. |
 |
|
|
petty
BIAC Staff
    
USA
453 Posts |
Posted - Jan 13 2012 : 2:28:38 PM
|
I've also run randomise_parallel successfully.
Every single one of your failed jobs is reporting that the log directory can't be accessed: 01/13/2012 12:03:58 [2392:17016]: error: can't open output file "/mnt/BIAC/munin.dhe.duke.edu/Huettel/Imagene.02/Analysis/TaskData/groupICA_AU/gica_n4_artifact.gica/DR_output_unpairedt_n4/dr_stage3b_ic0024_logs/": No such file or directory
We've also attempted to access this directory numerous times without success ... CIFS just hangs. So I think your culprit here is access patterns.
|
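A rough way to test whether a directory on a network mount responds, without letting a hung mount block the calling shell, is to run the listing in the background and stop waiting after a time limit. The sketch below assumes bash; the helper name probe_dir and the 10-second default are hypothetical choices, not something taken from the cluster setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: probe a directory in the background so that a hung
# network mount cannot block the calling shell.
# Returns 0 if the directory can be listed, 1 if the listing fails,
# and 2 if there is no answer within the time limit.
probe_dir() {
    local dir="$1" limit="${2:-10}" waited=0
    ls "$dir" >/dev/null 2>&1 &
    local pid=$!
    # Poll once per second while the background ls is still running.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo "no response from $dir after ${limit}s" >&2
            return 2
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # The listing finished; collect its exit status.
    if wait "$pid"; then
        return 0
    else
        echo "cannot list $dir" >&2
        return 1
    fi
}

# Example call (path taken from the log message above):
# probe_dir /mnt/BIAC/munin.dhe.duke.edu/Huettel/Imagene.02/Analysis/TaskData 10
```

If the mount is truly hung, the background ls may stay stuck after the helper returns, but the terminal itself remains usable.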
 |
|
|
dvsmith
Advanced Member
    
USA
218 Posts |
Posted - Jan 13 2012 : 2:31:18 PM
|
Access patterns -- meaning we're overloading Imagene.02 right now?
|
 |
|
|
petty
BIAC Staff
    
USA
453 Posts |
Posted - Jan 13 2012 : 2:37:32 PM
|
| Yes, given that trying to access anything inside of TaskData, whether from the cluster or from my local machine, causes my terminal to hang indefinitely. |
 |
|
|
dvsmith
Advanced Member
    
USA
218 Posts |
Posted - Jan 13 2012 : 2:39:56 PM
|
| OK, fail... I think others are working in different directories, so I guess it's our whole experiment. |
 |
|
|
petty
BIAC Staff
    
USA
453 Posts |
Posted - Jan 13 2012 : 2:53:28 PM
|
| yeah, you guys need a vacation. |
 |
|
|
petty
BIAC Staff
    
USA
453 Posts |
Posted - Jan 13 2012 : 2:57:16 PM
|
| Also, you might be able to get randomise to run (not the parallel version) on this particular job. |
 |
|
|
dvsmith
Advanced Member
    
USA
218 Posts |
Posted - Jan 13 2012 : 3:07:22 PM
|
Yeah, I'll take one after I graduate. ;-) We're trying the straight randomise right now -- and John and I can adjust our access pattern for our other stuff. I assume longer jobs are better than short jobs since we'll have fewer jobs going up and down? |
 |
|