linux
JBoss woes
cornet — Sat, 2009-11-07 02:09
JBoss, on the whole, does hold up surprisingly well. This can probably be attributed to our skilled developers who, to be fair, I don't always give enough credit to.
However, every so often JBoss plays up, and the symptoms presented seem to point to obvious problems. All is not as it appears.
It's release day and shiny new software is just itching to be deployed. The sysadmin gets up early and rocks up to the office to go through the standard deployment procedure. If only things always went to plan!
We have a cluster of a number of servers and use the JBoss Farm Deployment service to deploy applications. It's fairly straightforward: you build and deploy .ear, .war, .spring, etc. files to the $JBOSS_HOME/server/default/farm/ directory and all the nodes pick up the new code.
Here comes the first gotcha, which we have known about for quite a while now. If you re-farm an already running package then, by default, the old classes are not freed from the PermGen heap, so repeated redeploys will eventually run you out of PermGen memory.
The solution we have is to make sure we restart every node after deployment to clear out this memory.
What I've found out recently is this issue can be resolved by setting the following Java options:
-XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled
which will free up the PermGen memory.
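As a minimal sketch of where these options might go: JBoss start scripts normally read JAVA_OPTS from $JBOSS_HOME/bin/run.conf, so a fragment like the one below could be added there. The file location is an assumption about the install, and -XX:+UseConcMarkSweepGC is included on the assumption that the CMS collector is not already enabled; as far as I know, the two flags above only take effect when CMS is in use.

```shell
# Sketch of a fragment for $JBOSS_HOME/bin/run.conf (assumed location).
# Assumption: CMS is not already enabled, so turn it on here; the two
# CMS* flags from the post have no effect without it.
JAVA_OPTS="${JAVA_OPTS:-} -XX:+UseConcMarkSweepGC"

# The two options from the post, appended to whatever is already set.
JAVA_OPTS="$JAVA_OPTS -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled"
```

Appending to the existing JAVA_OPTS rather than overwriting it keeps any heap or PermGen size settings that are already in place.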
This will be going into testing on our dev and test environments shortly and hopefully onto live (once we are sure there are no adverse effects). This should mean no restart is required in most cases.
I say in most cases as we do have some applications we can't "hot deploy". We have been instructed to do the following:
- Shut down the build node
- Build and farm the application
- Shut down all other nodes
- Bring up the build node
- Bring up the other nodes
This obviously leads to a complete outage lasting a few minutes, but we can live with that for the most part.
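The ordering of those steps can be sketched as a small script. Everything site-specific here is an assumption for illustration: the host names, ssh access to the nodes, and an /etc/init.d/jboss init script are all hypothetical, and the build-and-farm step is left as a placeholder. It defaults to a dry run that only prints what it would do.

```shell
#!/bin/sh
# Sketch only: host names, ssh access and /etc/init.d/jboss are assumptions,
# not details from the post.
BUILD_NODE="${BUILD_NODE:-jboss1}"
OTHER_NODES="${OTHER_NODES:-jboss2 jboss3}"
DRY_RUN="${DRY_RUN:-1}"   # default: print the commands instead of running them

# Run a command on a node, or just print it when DRY_RUN=1.
on_node() {
    node="$1"; shift
    if [ "$DRY_RUN" = "1" ]; then
        printf '[%s] %s\n' "$node" "$*"
    else
        ssh "$node" "$@"
    fi
}

cold_deploy() {
    # 1. Shut down the build node.
    on_node "$BUILD_NODE" /etc/init.d/jboss stop
    # 2. Build and farm the application (placeholder, site-specific).
    on_node "$BUILD_NODE" echo "build and farm the application here"
    # 3. Shut down all other nodes.
    for n in $OTHER_NODES; do
        on_node "$n" /etc/init.d/jboss stop
    done
    # 4. Bring up the build node first.
    on_node "$BUILD_NODE" /etc/init.d/jboss start
    # 5. Then bring up the other nodes.
    for n in $OTHER_NODES; do
        on_node "$n" /etc/init.d/jboss start
    done
}
```

Calling cold_deploy with DRY_RUN=1 prints one line per step, which is a cheap way to check the order before pointing it at real nodes.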
One such morning a co-worker followed this procedure and all appeared to go fine. However, not long after, some of our applications started throwing "Broken Pipe" exceptions. The applications in question were communicating with our JBoss cluster using RMI. From the exceptions this initially looked like some network issue. The load balancers (LVS ones) were checked but showed no issues. More investigation required...
The nodes throwing the exception were part of a 6 node Tomcat cluster communicating with a 3 node JBoss cluster. On closer inspection, only nodes 1 and 4 of the Tomcat cluster were throwing exceptions. These were restarted but to no avail.
Then I remembered that we do source hashing on our LVS nodes. Source hashing makes sure the same clients always hit the same servers, normally for session tracking purposes. This helped with diagnosis, as it meant the failing Tomcat nodes could be traced to a specific JBoss node.
I found that one JBoss cluster node was at fault, but there were no exceptions in its logs. Furthermore, most transactions were working fine. Just to be safe, JBoss was restarted on the offending node, but it made no difference. On with the investigation, I guess...
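To illustrate why source hashing narrows things down, here is a toy sketch of the idea. It is not the hash LVS uses (its "sh" scheduler has its own function), and pick_node is a hypothetical helper; the point is only that a given client address always maps to the same backend index.

```shell
# Toy illustration of source hashing. Assumption: cksum stands in for the
# real hash, so the mapping differs from what LVS would actually choose.
pick_node() {
    client_ip="$1"
    node_count="$2"
    # cksum returns the same CRC for the same input, so a given client
    # address always lands on the same backend index.
    crc=$(printf '%s' "$client_ip" | cksum | cut -d ' ' -f 1)
    echo $(( crc % node_count ))
}
```

For example, running pick_node 10.0.0.1 3 repeatedly always prints the same index between 0 and 2, so a client that keeps failing keeps failing against the same backend.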
Eventually I found something that didn't make sense. $JBOSS_HOME/server/default/tmp/deploy had a timestamp older than I expected.
This directory is used to hold the expanded files from $JBOSS_HOME/server/default/farm/ and should disappear when JBoss is shut down. I shut down JBoss again and, for whatever reason, it still remained. So I deleted the directory by hand and started up JBoss. Sure enough, the "Broken Pipe" exceptions disappeared.
I shut down JBoss on the offending node again and this time it removed the directory. I started it up again and all was fine.
After much playing around I've no idea what causes this. I know that if JBoss doesn't shut down correctly then this directory can remain, causing clustering issues (which really don't make sense to me), but I've seen a number of occasions on our Test Environment where this directory has remained after a successful shutdown.
To make sure this doesn't happen again I've modified our start scripts to check for the presence of this directory and refuse to start up if it exists.
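A check along these lines could look like the sketch below. This is not the actual start script: the function name check_stale_deploy and the /opt/jboss default for JBOSS_HOME are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical pre-start guard; the JBOSS_HOME default is an assumption.
JBOSS_HOME="${JBOSS_HOME:-/opt/jboss}"

# Returns 0 if it is safe to start, 1 if a stale deploy directory exists.
check_stale_deploy() {
    deploy_dir="$1"
    if [ -d "$deploy_dir" ]; then
        echo "Refusing to start: $deploy_dir still exists." >&2
        echo "Make sure JBoss is stopped, remove it by hand, then retry." >&2
        return 1
    fi
    return 0
}

# In a start script this would run before launching JBoss, e.g.:
#   check_stale_deploy "$JBOSS_HOME/server/default/tmp/deploy" || exit 1
```

Refusing to start, rather than deleting the directory automatically, avoids removing it from under a JBoss instance that is still running.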
Fingers crossed we won't see this issue again.
Failover hosts using Xen, DRBD and Heartbeat
cornet — Wed, 2009-04-08 20:19
After quite a lot of reading and a morning playing I managed to get failover Xen hosts working.
The idea was to have 2 physical servers to run 2 (or more) Xen hosts between them. If one server were to die or needed some work doing on it then the domU would automatically move to the other node.
I've done some testing and all appears to work fine. However, let me stress that this is not live migration, so you would suffer an outage of about a minute or so (not really a big deal in the grand scheme of things).
Click the "Read More" button for full details on the setup.
Sun Java on Debian/Ubuntu
cornet — Thu, 2008-09-18 15:09
It would appear that even if you install ONLY the Sun Java package it drags in some GNU Java stuff as well.
You can go through everything in /etc/alternatives and update it manually but that is somewhat time consuming.
Instead just run:
sudo /usr/sbin/update-java-alternatives --set java-1.5.0-sun
and this will sort everything :)
Dropbox
cornet — Wed, 2008-09-17 23:15
Someone appears to have created an online storage and sync thing that actually works:
Dropbox - Home - Secure backup, sync and sharing made easy.
A few gripes though:
- It requires Nautilus
- Not obvious how to create an account (it's not on the website, the client does it)
- The initial dialog box (which you use to sign up for an account) takes an age to appear
All that aside, it works wonderfully: copy a file into ~/Dropbox and it syncs automatically :)
... oh and it versions files too
... and only sends binary diffs
Another blog added
cornet — Thu, 2008-08-07 06:18
Another one I've just added to my feedlist,
some interesting stuff, especially on the MySQL front.
He is also the author of the MySQL Master-Master Replication Manager
which can be found here:
http://code.google.com/p/mysql-master-master/
definitely something I'll be looking at.