Duke-UNC Brain Imaging and Analysis Center
 dropped jobs


TOPIC REVIEW
clithero Posted - May 04 2010 : 1:10:37 PM
Hey all,
I'm running a bunch of featquery jobs and I'm seeing a lot of randomly dropped jobs. They have the following traits:
- no error in the log files
- the job dies before the log file can be moved out of the home directory
- they run fine when I resubmit (unless they drop again).
I can tell my script to rerun them, but thought I would mention this.
Anyone else having this issue today?
Thanks,
John
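
A minimal sketch of the "tell my script to rerun them" idea from the post above. It is not the poster's actual script: `run_with_retries`, `max_attempts`, and `wait_seconds` are hypothetical names, and the `job` callable stands in for whatever submits one job and then checks whether its output appeared.

```python
import time
from typing import Callable


def run_with_retries(job: Callable[[], bool],
                     max_attempts: int = 3,
                     wait_seconds: float = 0.0) -> int:
    """Call job() until it returns True.

    Returns the number of attempts used. Raises RuntimeError if every
    attempt fails. `job` is a hypothetical callable that submits one job
    and returns True only if the expected output was produced.
    """
    for attempt in range(1, max_attempts + 1):
        if job():
            return attempt
        # Pause before resubmitting, but not after the final failure.
        if attempt < max_attempts:
            time.sleep(wait_seconds)
    raise RuntimeError(f"job still failing after {max_attempts} attempts")
```

Capping the number of attempts matters here: if the failure turns out to be a bad node or a broken script rather than a transient drop, an uncapped loop would resubmit forever.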
7 LATEST REPLIES (Newest First)
djp16 Posted - Jun 04 2010 : 5:26:34 PM
node 4 is having problems - June 4
petty Posted - Jun 02 2010 : 1:55:34 PM
mount manager was frozen on both of those nodes, thanks.
djp16 Posted - Jun 02 2010 : 1:25:59 PM
I had a couple myself. I noticed they occurred on nodes 19 and 24, if it matters. I ran the same scripts with success on later submission.
petty Posted - May 28 2010 : 2:20:11 PM
same thing was happening to Lars this morning ... there were some Windows characters in his script (not sure how they got there) ... and his output files looked the same.

i ran dos2unix on his script and it re-ran without any issues.
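
For anyone who wants to check a script before submitting it, here is a rough sketch in Python. It assumes the "Windows characters" were CRLF line endings, which the post does not actually confirm, although those are what dos2unix converts by default. The function names are made up for illustration.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """Return True if the bytes contain a Windows-style CRLF line ending."""
    return b"\r\n" in data


def to_unix(data: bytes) -> bytes:
    """Convert CRLF line endings to LF (assumed to be what was needed here)."""
    return data.replace(b"\r\n", b"\n")


def fix_script(path: str) -> bool:
    """Rewrite the file in place if it has CRLF endings.

    Returns True if the file was changed, False if it was already clean.
    """
    script = Path(path)
    raw = script.read_bytes()
    if not has_crlf(raw):
        return False
    script.write_bytes(to_unix(raw))
    return True
```

Reading and writing bytes rather than text avoids any newline translation by Python itself, so the check sees exactly what is on disk.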
clithero Posted - May 28 2010 : 2:05:37 PM
Hello all,

I have been submitting FSL jobs, and jobs from all levels (first, second, or third) are randomly failing to complete: no output files are generated on Munin, not even the fsf template. Most jobs go through as normal. Sometimes it takes 2-3 submissions to get a job to run. Since the jobs don't finish, the log files stay in my home directory on Einstein. They all contain only the following (I'm warned they might be binary when I open them):

ESC[HESC[J

Any thoughts?

Thanks,
John
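
The log contents quoted above, `ESC[HESC[J`, are the ANSI escape sequences for "cursor home" and "erase display", i.e. what a terminal `clear` typically emits. A rough sketch for spotting leftover logs that contain nothing else is below. The function names and the `pattern` default are hypothetical, and treating such a log as a dropped job is an assumption based only on this thread.

```python
from pathlib import Path
from typing import List

# ESC[H moves the cursor home, ESC[J erases from the cursor to end of screen.
CLEAR_SCREEN = b"\x1b[H\x1b[J"


def looks_dropped(log_bytes: bytes) -> bool:
    """True if the log holds only clear-screen sequences and whitespace."""
    if CLEAR_SCREEN not in log_bytes:
        return False
    remainder = log_bytes.replace(CLEAR_SCREEN, b"").strip()
    return remainder == b""


def find_dropped_logs(log_dir: str, pattern: str = "*") -> List[Path]:
    """List files under log_dir matching pattern that look like dropped-job logs."""
    return sorted(
        p for p in Path(log_dir).glob(pattern)
        if p.is_file() and looks_dropped(p.read_bytes())
    )
```

The resulting list could be fed to a resubmission step, or simply counted per day to see whether the drops cluster around a particular time.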
Elizabeth.Selgrade Posted - May 12 2010 : 12:13:12 PM
Hi everyone,

I'm having the same issue that John had. Some jobs run fine, while some nearly identical jobs get dropped. Any idea of what's up?

Thanks,
Liz
francis.favorini Posted - May 04 2010 : 3:26:58 PM
There seemed to be some network flakiness starting last evening and ending this morning. Might be the cause of your issue.
