mullette-gillman
Junior Member
USA
40 Posts
Posted - Nov 05 2008 : 3:13:21 PM
Hi there,

Over the last week I have seen many FSL 4.1 jobs that fail once but perform perfectly fine if I just rerun them. This happens for both prestats and first-level analyses, and the failure rate can be as high as 20% for a given set of jobs (40+). Again, I just delete the failed output and rerun the script, and it works fine; I do not believe this is a path error. One other person has told me they noticed the same thing happening over the last week.

For the first-level jobs that fail, the FSL log reports a memory issue. For prestats jobs, the log gives me one or more of the following errors:

/usr/local/fsl-4.1.0-centos4_64/bin/slicetimer -i prefiltered_func_data_mcf --out=prefiltered_func_data_st -r 2.0 --odd
++ WARNING: nifti_read_buffer(prefiltered_func_data_mcf.nii.gz): data bytes needed = 557056 data bytes input = 480347 number missing = 76709 (set to 0)

The log files generated and emailed to me show the jobs concluding properly but using a maximum of only about 500 MB of RAM, while correctly completed jobs use between 1 and 1.3 GB. Jobs have failed on at least nodes 3 and 7, so I don't think the error is node-specific. The data is stored on Goldman, btw.

Again, if I just resubmit the job it works fine.

Any thoughts?

Thanks, O'Dhaniel
dvsmith
Advanced Member
USA
218 Posts
Posted - Nov 05 2008 : 4:10:35 PM
does it always fail during slice-timing correction (i.e., the slicetimer program)? can you post other error messages? surely this is not the only one you're getting.
maybe updating fsl will help; they've released one or two patches since putting out 4.1.0 back in august.
mullette-gillman
Junior Member
USA
40 Posts
Posted - Nov 05 2008 : 4:18:09 PM
I have already deleted the other failed jobs (so that I could rerun them and keep my naming format). All of the prestats jobs that I looked into (the majority) failed at the same stage.
All 7 of the 48 failures in this latest batch succeeded on the first rerun of their scripts.
dvsmith
Advanced Member
USA
218 Posts
Posted - Nov 05 2008 : 8:57:13 PM
well, this is pretty weird, and it's hard to draw any conclusions without other error messages. maybe someone else can respond.
i can tell you that whenever i've had failed jobs in the past, there was never any evidence of them even starting: the cluster hadn't mounted the experiment correctly, so attempt #1 looked like a path error and attempt #2 worked.
if i were you, i would script things to run twice, with the second loop rerunning a job only if it failed the first time.
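That rerun-on-failure idea can be sketched as a small wrapper. This is only an illustration: `cmd` and `output_gz` are hypothetical placeholders for your own job script and the .nii.gz file that step is expected to produce, and checking that the output decompresses cleanly is just one possible success test (the truncated-file symptom matches the nifti_read_buffer warnings above).

```python
import gzip
import subprocess


def run_with_retry(cmd, output_gz, max_attempts=2):
    """Run a job command; rerun it if its gzipped output is missing or truncated.

    cmd and output_gz are placeholders for your own job script and the
    .nii.gz file that step is expected to write.
    """
    for attempt in range(1, max_attempts + 1):
        subprocess.run(cmd, shell=isinstance(cmd, str))
        try:
            # A truncated .nii.gz (as in the nifti_read_buffer warnings)
            # fails to decompress all the way to the gzip end-of-stream
            # marker, raising EOFError; a missing file raises OSError.
            with gzip.open(output_gz, "rb") as f:
                while f.read(1 << 20):
                    pass
            return True
        except (OSError, EOFError):
            print(f"attempt {attempt}: {output_gz} missing or truncated")
    return False
```

FEAT also records success in its own report logs, so you could equally key the retry off those instead of the output file itself.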
mullette-gillman
Junior Member
USA
40 Posts
Posted - Nov 06 2008 : 10:47:27 AM
Does anyone else have any ideas?
mullette-gillman
Junior Member
USA
40 Posts
Posted - Nov 06 2008 : 2:38:01 PM
In the latest batch I had one failure in prestats, and this is the error message I received:

/usr/local/fsl-4.1.0-centos4_64/bin/fslmaths prefiltered_func_data_st -Tmean mean_func
++ WARNING: nifti_read_buffer(prefiltered_func_data_st.nii.gz): data bytes needed = 557056 data bytes input = 93254 number missing = 463802 (set to 0)
dvsmith
Advanced Member
USA
218 Posts
Posted - Nov 06 2008 : 3:47:42 PM
since no one else seems to have any ideas, try reporting it to the fsl forums after running fslerrorreport in the failed directory. i suspect they'll just blame it on memory or a corrupt data file.
rl100
Starting Member
6 Posts
Posted - Jul 15 2012 : 1:50:36 PM
I've also been getting this type of error on a small subset of jobs (~8-10 out of ~600), and on different types of files in the stats folders; two examples are below.
The failed jobs were all relatively clustered in terms of when they were submitted, so I wonder if it was a memory issue at a specific time?

++ WARNING: nifti_read_buffer(stats/res4d.nii.gz): data bytes needed = 606208 data bytes input = 457970 number missing = 148238 (set to 0)
++ WARNING: nifti_read_buffer(stats/corrections.nii.gz): data bytes needed = 606208 data bytes input = 323569 number missing = 282639 (set to 0)
petty
BIAC Staff
USA
453 Posts
Posted - Jul 15 2012 : 2:44:06 PM
++ WARNING: nifti_read_buffer(stats/res4d.nii.gz): data bytes needed = 606208 data bytes input = 457970 number missing = 148238 (set to 0)

That's telling you that 148238 bytes are missing from the data file. According to the header, it should contain the number listed in "data bytes needed", but only "data bytes input" bytes are actually present.

You need to go back to the step where this file was written to determine the write error, and likely rerun that step to get complete data.
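The arithmetic in that warning can be reproduced directly from the file: the NIfTI-1 header records the image dimensions and bits per voxel, which together give "data bytes needed", while the decompressed file size past the data offset gives "data bytes input". A minimal sketch (my own, not an FSL tool; assumes a little-endian, single-file NIfTI-1 .nii.gz):

```python
import gzip
import struct


def check_nifti_gz(path):
    """Compare the data bytes a .nii.gz header promises with what is present.

    Returns (bytes_needed, bytes_input), mirroring nifti_read_buffer's
    warning. Assumes a little-endian, single-file NIfTI-1 image. Note that
    reading a truncated gzip stream may itself raise EOFError.
    """
    with gzip.open(path, "rb") as f:
        raw = f.read()
    hdr = raw[:348]                                  # NIfTI-1 header is 348 bytes
    dim = struct.unpack_from("<8h", hdr, 40)         # dim[0] = number of dimensions
    bitpix = struct.unpack_from("<h", hdr, 72)[0]    # bits per voxel
    vox_offset = int(struct.unpack_from("<f", hdr, 108)[0])  # start of image data
    nvox = 1
    for d in dim[1:1 + dim[0]]:
        nvox *= d
    needed = nvox * bitpix // 8
    got = len(raw) - vox_offset
    return needed, got
```

Run against a file like the res4d.nii.gz above, this would report a `needed` value larger than `got`, confirming the file was cut short when it was written rather than corrupted on read.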