Hussein Badakhchani’s Blog April 12, 2006 5:27 AM
Have you ever wondered how a running WebLogic domain would cope with having the domain root directory deleted? Well, if you have been responsible for resilience testing, or use rm -rf to excess, you may have some idea of the problems you could face if you lose your domain root.
I have witnessed some pretty disastrous effects on WLS 5 and WLS 6.1 domains as a result of deleting the domain root, but a recent cleansing event on a WebLogic 8.1 domain proved to me just how resilient this product has become.
I was asked by a developer to look at a printout of a log file which showed an EmbeddedLDAPException indicating that configuration files could not be found. This was an odd error, but the obvious action was to check for the existence of the files. At this point I found that the entire domain was missing from the file system!
After checking the other machines in the cluster and finding that the domain was missing from those as well, I decided to try to identify exactly when the domain was deleted. Luckily our log files are kept outside the domain root. Searching through the server logs, I found that the domain appeared to have been deleted some three days earlier, based on the first instance of the Embedded LDAP error. I was shocked by this: how was it possible that deleting a WLS domain from a working environment could go unnoticed for three days? Examining the log files further, I found no other errors related to the loss of the domain; it seemed that everything had continued to function correctly even after the deletion of the domain root directory.
Finally I decided to approach the developers working on the ghost environment and explain the cause of the exception to them. They were as surprised as I was. We spent the next few hours double-checking the application functionality as well as monitoring the domain using the console. I tested the JDBC connection pools, took thread dumps of the servers, examined the memory consumption and used WLST to dump the states of EJB MBeans. Everything seemed to be OK.
The developers continued their testing on the ghost integration domain for another seven days. The only limitation was that the servers could not be restarted; we could even deploy applications using the console.
Of course the application deserves some credit for not having any dependency on the domain file system but I was well impressed by WLS. We still don’t know who it was that deleted the domain in the first instance, an individual we have named “The ghost in the Korn shell” and to whom we have since attributed a number of other odd events, but at least now I know what happens if I delete the domain root directory. I wonder how JBoss and WebSphere would handle such events?
On a side note, standard nerds and geeks may be interested to know that this entry was written using “Das Keyboard”.
Comments
Comments are listed in date ascending order (oldest first)
-
If this happened under a *NIX environment (and it looks like it did), the reason WLS kept functioning may be a feature of the OS.
The OS keeps a reference count on all open inodes (files or directories) and only TRULY DELETES stuff when all references are removed. So if the WL server opened the files, you can go ahead and delete them – no-one will see them anymore, except for the WL server, because it still holds references to them. So WLS can read/write to these “ghost” files, but after the process exits, the reference counts will go to zero and the files will be gone forever.
Posted by: sLayer on April 12, 2006 at 10:09 AM
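sLayer's point about reference counts can be tried out with a few lines of Python. This is only a minimal sketch of the general POSIX behaviour, not a reproduction of what WebLogic does internally: the temporary directory, the `config.xml` file name and the `read_after_delete` function are hypothetical stand-ins, and the code assumes a *NIX system (on Windows the delete would normally fail while the file is open).

```python
import os
import shutil
import tempfile


def read_after_delete():
    """Open a file, delete its directory, then keep using the open handle."""
    # Hypothetical stand-in for a domain root directory.
    domain_root = tempfile.mkdtemp(prefix="ghost_domain_")
    config_path = os.path.join(domain_root, "config.xml")
    with open(config_path, "w") as f:
        f.write("<domain/>")

    # The "server" process holds the file open for reading and writing.
    handle = open(config_path, "r+")

    # The equivalent of rm -rf on the domain root.
    shutil.rmtree(domain_root)

    # The name is gone from the file system...
    visible = os.path.exists(config_path)

    # ...but the open handle still reaches the data (POSIX behaviour).
    content = handle.read()
    handle.write("<!-- still writable -->")
    handle.flush()

    # Closing the last reference lets the OS reclaim the file for good.
    handle.close()
    return visible, content


if __name__ == "__main__":
    visible, content = read_after_delete()
    print("path still visible:", visible)
    print("content read after delete:", content)
```

Note that this only covers files a process already has open at the moment of deletion; anything it tries to open by path afterwards will fail with a "file not found" error.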
-
I would certainly expect the OS not to kill the WLS processes just because the files they rely on were deleted. I’m not sure how WLS would behave under Windows, although I do remember that Windows tends to lock files that are in use by a process.
What impressed me though was the sheer lack of errors and the fact that, apart from the LDAP exception, everything else seemed to be working fine. I can only explain this by assuming that much of the server’s work is carried out in memory, and that’s why removing these files had so little effect.
Posted by: hoos on April 13, 2006 at 2:36 AM
-
Great story Hoos! I agree with sLayer though – the files were probably not *really* deleted, but simply absent from your directory listings (and still available to the running processes). Still, that is neat.
I think that keyboard is kinda funny. It sort of illustrates the difficulty some folk have in training themselves not to look at the keyboard. So, now they can simply avoid the training by removing the key labels, effectively doing the same thing at greater cost but with ease and absolutely no chance to cheat.
The strength zones sound interesting though. My pinky is powerless compared to my thunderous thumb!
Posted by: jonmountjoy on April 13, 2006 at 5:10 AM
-
I can now type without looking at the keyboard, but it’s the shift key, punctuation and numbers that catch me out. Still, I’ve only been using it for the last few days.
Posted by: hoos on April 13, 2006 at 7:48 AM
-
Actual email received by me from Hoos on the first day of using Das Keyboard.
>> HELP.I.CANT.FIND.THE.SPACE.BAR11
My current client doesn’t allow you to bring your own keyboard in, and my machine at home already has 2 keyboards attached (usb/wireless and ps/2 for Lilo), so my “das keyboard” is gathering dust.
Posted by: simonvc on May 8, 2006 at 2:47 AM
-
If only your current employers knew how you used to “build weblogic domains”, just make sure you don’t turn that green field into a mine field…
On a more serious note, like some sort of crazy fool I bought the U.S. version of “Das Keyboard” and as a result had to learn how to map the missing / and | key. In, erm, true hacker style I won’t let a missing key stop me!
Posted by: hoos on May 8, 2006 at 8:28 AM