clithero
Junior Member
37 Posts
Posted - May 04 2010 : 1:10:37 PM
Hey all,
I'm running a bunch of featquery jobs and I'm seeing a lot of randomly dropped jobs. They have the following traits:
- no error in the log files
- the job dies before the log file can be moved out of my home directory
- the job runs fine when I resubmit it (unless it drops again)
I can tell my script to rerun them, but thought I would mention this. Is anyone else having this issue today?
Thanks, John

francis.favorini
Forum Admin
USA
618 Posts
Posted - May 04 2010 : 3:26:58 PM
There seemed to be some network flakiness starting last evening and ending this morning. Might be the cause of your issue.
IT Director, Brain Imaging and Analysis Center

Elizabeth.Selgrade
Starting Member
USA
1 Posts
Posted - May 12 2010 : 12:13:12 PM
Hi everyone,
I'm having the same issue that John had. Some jobs run fine, while some nearly identical jobs get dropped. Any idea of what's up?
Thanks, Liz
ESS

clithero
Junior Member
37 Posts
Posted - May 28 2010 : 2:05:37 PM
Hello all,
I have been submitting FSL jobs and I'm seeing jobs from all levels (first, second, or third) randomly fail to complete: no output files are generated on Munin, not even the fsf template. Most jobs go through as normal. Sometimes it takes 2-3 submissions to get a job to run. Since the jobs don't finish, the log files stay in my home directory on Einstein. They all contain only the following (I'm warned they might be binary when I open them):
ESC[HESC[J
Any thoughts?
Thanks, John
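For reference, ESC[H and ESC[J are the ANSI "cursor home" and "erase display" control sequences, which is what a terminal `clear` typically emits, so a log holding only those bytes contains no real output. Below is a minimal Python sketch for flagging such logs in bulk. It assumes a dropped job's log contains nothing but escape sequences and whitespace. The function names and the `*.out` glob are hypothetical and not part of any cluster tooling mentioned in this thread.

```python
import re
from pathlib import Path

# Matches ANSI CSI sequences such as ESC[H (cursor home) and ESC[J (erase display).
ANSI_CSI = re.compile(rb"\x1b\[[0-9;?]*[A-Za-z]")


def is_empty_log(data: bytes) -> bool:
    """Return True if the log holds only escape sequences and whitespace."""
    return ANSI_CSI.sub(b"", data).strip() == b""


def find_empty_logs(log_dir, pattern="*.out"):
    """Return log files in log_dir with no real output.

    The "*.out" pattern is an assumption; adjust it to the actual log names.
    """
    return sorted(
        p for p in Path(log_dir).glob(pattern)
        if p.is_file() and is_empty_log(p.read_bytes())
    )
```

Calling `find_empty_logs` on the home directory would list candidate jobs to resubmit. A zero-byte log is also reported, since it carries no output either.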

petty
BIAC Staff
USA
453 Posts
Posted - May 28 2010 : 2:20:11 PM
The same thing was happening to Lars this morning. There were some Windows characters in his script (not sure how they got there), and his output files looked the same.
I ran dos2unix on his script and it re-ran without any issues.
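To check a script before submitting it, something like the following Python sketch may help. It assumes the "Windows characters" are CRLF line endings, which is the usual case dos2unix fixes but is not confirmed in this thread. It is a stand-in for that one conversion, not a reimplementation of dos2unix, and both function names are hypothetical.

```python
from pathlib import Path


def has_crlf(path) -> bool:
    """Return True if the file contains any Windows (CRLF) line endings."""
    return b"\r\n" in Path(path).read_bytes()


def to_unix_line_endings(path) -> bool:
    """Rewrite CRLF as LF in place; return True if the file was changed."""
    p = Path(path)
    data = p.read_bytes()
    fixed = data.replace(b"\r\n", b"\n")
    if fixed == data:
        return False  # already clean, leave the file untouched
    p.write_bytes(fixed)
    return True
```

Working on raw bytes avoids Python's own newline translation, so the check sees exactly what the shell on the node would see.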

djp16
Starting Member
9 Posts
Posted - Jun 02 2010 : 1:25:59 PM
I had a couple myself. I noticed they occurred on nodes 19 and 24, if it matters. I ran the same scripts with success on later submission.

petty
BIAC Staff
USA
453 Posts
Posted - Jun 02 2010 : 1:55:34 PM
Mount manager was frozen on both of those nodes, thanks.

djp16
Starting Member
9 Posts
Posted - Jun 04 2010 : 5:26:34 PM
Node 4 is having problems (June 4).