| T O P I C R E V I E W |
| lh115 |
Posted - Dec 23 2009 : 8:26:10 PM Hi,
I'm having trouble accessing qinteract. I had been logged into qinteract since earlier in the day. My windows froze, so I closed them out. When I type 'qinteract' at the head node I get: "Your "qrsh" request could not be scheduled, try again later."
When I type qstatall, it lists my user name as still running two jobs at node4 even though I exited out of those jobs (so I would expect errors at the waiting queue.)
I don't know if anyone else is having trouble.
Thanks, Lars |
| 15 L A T E S T R E P L I E S (Newest First) |
| dvsmith |
Posted - Apr 14 2011 : 10:17:07 AM thanks, chris! |
| petty |
Posted - Apr 14 2011 : 09:43:46 AM back to normal now. |
| dvsmith |
Posted - Apr 13 2011 : 10:51:23 PM I've been having trouble accessing the interactive node(s) since about 4pm. Is it down again?
[smith@hugin ~]$ qinteract
Your "qrsh" request could not be scheduled, try again later.
|
| clithero |
Posted - Jan 19 2010 : 11:38:54 AM I think the daemon idea sounds very useful. I don't know what other programs besides eventstats are problematic, but I imagine you can see what eats up a lot of the space. Thanks. |
| petty |
Posted - Jan 18 2010 : 4:15:05 PM There is a way to restrict based on CPU usage and time on the queue. When this was tried before, however, everything failed.
I suppose I could write a daemon that just kills any process based on keywords in a list, or just not put certain programs on there. Clearly, hoping for common courtesy isn't cutting it.
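For anyone curious what such a keyword-based daemon might look like, here is a minimal sketch in Python. It is not what is running on the node. The blocklist, the polling interval, and all function names are assumptions made up for illustration; the only programs named in this thread are eventstats and flirt.

```python
import os
import signal
import subprocess
import time

# Hypothetical blocklist; eventstats and flirt are the programs named in this thread.
BLOCKED_KEYWORDS = ["eventstats", "flirt"]


def find_offenders(processes, keywords):
    """Return PIDs whose command line contains any blocked keyword.

    processes: iterable of (pid, command_line) tuples.
    """
    offenders = []
    for pid, command in processes:
        lowered = command.lower()
        if any(keyword.lower() in lowered for keyword in keywords):
            offenders.append(pid)
    return offenders


def list_processes():
    """List (pid, command_line) for every process, using ps."""
    output = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    processes = []
    for line in output.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            processes.append((int(parts[0]), parts[1]))
    return processes


def run_daemon(keywords=BLOCKED_KEYWORDS, interval_seconds=60):
    """Poll forever, sending SIGTERM to matching processes."""
    while True:
        for pid in find_offenders(list_processes(), keywords):
            if pid == os.getpid():
                continue  # never signal the daemon itself
            try:
                os.kill(pid, signal.SIGTERM)
            except (ProcessLookupError, PermissionError):
                pass  # process already gone, or not ours to kill
        time.sleep(interval_seconds)
```

Calling run_daemon() would start the polling loop. Note that plain substring matching is blunt: it would also hit something like an editor with a file named eventstats.txt open, so a real version would probably want to match on the executable name only.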
|
| clithero |
Posted - Jan 18 2010 : 4:04:49 PM Hi Chris, Great, thanks very much for fixing that.
Is there any way to have something automated that will kick users off for running such jobs? There obviously is not an effective incentive structure in place to prevent this.
Thanks again for restarting the node! |
| gunes |
Posted - Jan 18 2010 : 4:01:40 PM Thank you, Chris. You are awesome! :) Now I can finish my fsl analysis.
|
| petty |
Posted - Jan 18 2010 : 3:56:24 PM I went to the data center to reset the machine and look through the logs, so it's currently back up.
Someone was running eventstats on the interactive node again, which caused the freeze. Please only use this node for processes that are not memory intensive (i.e., visualizing results, configuring other jobs, etc.).
If it's a job that can be submitted to the grid-engine, then please set that up. This node's environment is not set up for a bunch of users to log in and just run analyses at will. The other nodes are restricted to take only 4 jobs at a time (one for each processor) because there is adequate memory to go around. |
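As a rough illustration of sending work to the grid engine instead of running it on the interactive node, here is a small Python sketch that assembles a qsub command line. It assumes a standard Grid Engine qsub is on the path; the function name, its parameters, and the example script name are hypothetical, and this cluster may have its own preferred submission wrappers.

```python
import shlex


def build_qsub_command(job_name, command, memory_limit=None):
    """Build an argument list for Grid Engine's qsub.

    job_name, command, and memory_limit are illustrative parameters,
    not names used by this cluster.
    """
    # -N names the job, -cwd runs it in the current directory,
    # -b y treats the command as a binary rather than a job script.
    args = ["qsub", "-N", job_name, "-cwd", "-b", "y"]
    if memory_limit is not None:
        # h_vmem is a standard Grid Engine resource; whether this
        # cluster enforces it is an assumption.
        args += ["-l", f"h_vmem={memory_limit}"]
    args += shlex.split(command)
    return args
```

The resulting list could be passed to subprocess.run, for example with build_qsub_command("myjob", "./my_analysis.sh subject01", "4G"), where my_analysis.sh is a made-up script name.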
| diaz |
Posted - Jan 18 2010 : 11:01:23 AM Looks like it's still hung - I'm getting the same error "Your 'qrsh' request could not be scheduled, try again later."
|
| clithero |
Posted - Jan 17 2010 : 2:56:19 PM I think my first attempt was at about 11 or so. |
| petty |
Posted - Jan 17 2010 : 2:52:23 PM The machine is hung again. When did this start happening to you? |
| clithero |
Posted - Jan 17 2010 : 12:45:07 PM I've been getting this message again while trying to start "qinteract":
Your "qrsh" request could not be scheduled, try again later.
I know that sometimes waiting for a while helps, but I've tried from several different terminals for the past few hours with no success.
Thanks, John |
| gunes |
Posted - Dec 29 2009 : 12:20:33 PM Great! Thank you. |
| petty |
Posted - Dec 29 2009 : 12:19:13 PM OK folks, node4 is back up after a trip over to the data center for a reset.
There was an eventstats process and a flirt process that used all available memory on the node.
The logs indicated that the machine killed the jobs, but then it did not recover cleanly.
-chris |
| gunes |
Posted - Dec 28 2009 : 1:13:52 PM It has been like this since Lars posted on Wednesday. We will wait for you or Syam to come back to work. Happy Holidays! :) |