| TOPIC REVIEW |
| vinod |
Posted - Jul 30 2007 : 2:44:56 PM I have seen Goldman's free disk space go down from 2 TB to <400 GB in a span of about 45 days. I am sure someone tracks these things, but I thought I would highlight it anyway. |
| 15 LATEST REPLIES (Newest First) |
| francis.favorini |
Posted - Aug 09 2007 : 6:30:57 PM Connections to Goldman from Golgi are working again.
-Francis
|
| vinod |
Posted - Aug 09 2007 : 5:47:55 PM Goldman is back up but I can't connect to it through golgi. cifslogin basically hangs when I try to connect to goldman. |
| francis.favorini |
Posted - Aug 06 2007 : 4:20:13 PM It is back up. See here.
-Francis
|
| tankersley |
Posted - Aug 06 2007 : 3:53:09 PM Goldman seems to have crashed. Any idea when it might be back up?
Thanks,
Dharol |
| francis.favorini |
Posted - Aug 06 2007 : 2:25:44 PM There was a disk read error on Golgi. This may have caused Sharity to get confused. In order to reset your Sharity connection, it is best to cifslogout and then cifslogin. Some information is cached for up to 15 minutes, and if Sharity is confused, it may assume the cache is valid when it isn't.
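For anyone who wants to check the mount before starting a long job, here is a minimal sketch in Python (standard library only). The function name mount_status is made up for illustration and is not part of Sharity. Note that "ok" only means the listing call succeeded; it cannot tell whether a cached listing is still current.

```python
import os

def mount_status(path):
    """Return a short status string for a mounted directory."""
    try:
        entries = os.listdir(path)
    except PermissionError:
        # Matches the "permission denied" symptom reported in this thread.
        return "permission denied"
    except FileNotFoundError:
        return "missing"
    except OSError as exc:
        # Any other I/O problem (for example, a stale or broken mount).
        return f"error: {exc.strerror}"
    return "ok" if entries else "empty"

if __name__ == "__main__":
    # Path taken from the error output quoted elsewhere in this thread.
    print(mount_status("/usr/local/sharity/var/mount/Goldman.Data"))
```

If this reports anything other than "ok", a cifslogout followed by a cifslogin is the first thing to try.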
-Francis
|
| vinod |
Posted - Aug 06 2007 : 10:02:26 AM Does anyone have a better idea of what caused this problem last Wednesday? Or is it something random that could pop up again in the future? |
| Bethany |
Posted - Aug 02 2007 : 12:54:59 PM Hmm, but I cifslogin-ed and it didn't fix the problem. (It also didn't ask for my password, which I usually take to mean that I was already logged in.)
It did eventually stop giving me error messages, but it also didn't run any batch files. I eventually concluded that the reason nothing happened when I ran the batch file is that the contents of the batch file had been emptied. (The file was still there but contained no data.) Since the system wouldn't let me change the file -- whether I edited it with nedit or WordPad, the file remained empty after saving -- I eventually gave up and went to bed.
What's weird is that the batch file in question was definitely not empty when I started -- goldman was still running fine when I saved it last. |
| vinod |
Posted - Aug 02 2007 : 11:30:18 AM Thanks Jimmy. But I think there are still things that it does not explain. I am posting here in the hope that it will help the diagnosis if people are interested.
1. I did anticipate that being the issue, and I tried cifslogin to goldman again. For a while it did not work, possibly signifying that goldman was physically down (though I could still see files from windows explorer even at this point). I can't remember what the error message was, but I think it just kept coming back asking for the password even after I entered it.
2. After a short while, cifslogin managed to log me in. This is when I could look at files on goldman even from golgi. But it did not include all files that were physically present. It did not show some of the files that were created from windows after the problem with goldman first happened.
3. When I ran FSL, I got the following error msg:
mkdir custom_timing_files
cp /usr/local/sharity/var/mount/Goldman.Data/BIAC/Riskalloc.01/Analysis/FSL/46907/run06/46907_r06_rate.txt custom_timing_files/ev1.txt
cp: custom_timing_files/ev1.txt: A parameter must be a directory.
When I looked for the "custom_timing_files" folder, I could see it from windows explorer but not from golgi. I kept getting an error msg saying "custom_timing_files" does not exist.
4. This morning, everything worked again. I did not have to relogin or do a cifslogin, just had to go back to root directory and change to the data directory again.
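To pin down exactly which entries golgi is not showing (points 2 and 3 above), the comparison can be sketched as follows. This is only an illustration: the function name missing_on_mount is hypothetical, and the list of expected names is assumed to be typed in by hand from what windows explorer shows.

```python
import os

def missing_on_mount(expected_names, mount_dir):
    """Return the expected names that a listing of mount_dir does not show."""
    visible = set(os.listdir(mount_dir))
    return sorted(name for name in expected_names if name not in visible)
```

Called on golgi with the names seen from windows, an empty result means both sides agree for those names; anything returned is visible on windows but not through the mount.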
I hope this helps Vinod |
| vinod |
Posted - Aug 02 2007 : 11:15:49 AM >> A good test for that theory could happen tonight, when Goldman is taken down for maintenance.
You are referring to the "official" downtime tomorrow night and not tonight, right? |
| jimmy.dias |
Posted - Aug 02 2007 : 10:48:16 AM Hey all,
I haven't done a thorough investigation, but my guess is that Goldman goes down for some reason and then, after some kind of timeout, Sharity logs everyone off the server because Goldman is no longer visible. Jobs that are currently running expect that directory to always be there, but subsequent attempts to access it result in the permission denied error everyone's seeing. As a short-term fix, you can log back in to Goldman using cifslogin, and so long as Goldman doesn't go down for an extended period of time, you should be OK. A good test for that theory could happen tonight, when Goldman is taken down for maintenance.
Jimmy |
| luke.vicens |
Posted - Aug 02 2007 : 09:20:57 AM Sounds like problems I was encountering a lot last winter, which I was told were caused by instability of Sharity, the app that mounts goldman's (and other windows server) volumes on golgi. That would explain why they're accessible from windows but not from golgi. |
| vinod |
Posted - Aug 02 2007 : 07:45:41 AM It is running fine again this morning, but it would be great if someone could explain what actually happened. It does not look like the usual case of the server going down due to a power failure or something similar, since one could clearly see the file structure from windows and even modify files. |
| Bethany |
Posted - Aug 01 2007 : 11:49:39 PM Meanwhile, every time I try to do something on Goldman I either get the dreaded, "The file access permissions do not allow the specified action" error or nothing happens at all (no errors but the script does not run) although I can see all the directories and files just fine. |
| vinod |
Posted - Aug 01 2007 : 11:16:15 PM Just to add to that note, I can access all the files on Goldman using windows explorer just fine. I can go through directories on golgi using 'ls' and 'cd', but the files shown on golgi are only a subset of the files I can see from windows. And if I run fsl, I get an error saying the directory does not exist when it tries to copy the ev files from their location to the custom_timing_files folder. Exploring further, this directory again shows up in windows explorer but not from inside golgi.
Is this a random thing or is there a reason why this happens? I tried to close everything and use cifslogin to goldman again. It did not help. |
| vinod |
Posted - Aug 01 2007 : 9:14:15 PM Something bizarre is happening right now. The jobs that I was running on golgi just quit. And, oddly, there are some files on goldman that I can see using windows explorer but cannot see from golgi. Any idea what's happening? Something similar also happened last night, I guess when goldman went down temporarily, and most jobs got stopped.
And now I cannot even see directories from golgi (getting permission denied) but everything seems to work fine from windows. |