Plan for Oracle clustering now, not after the outage

Backups are better than no backups, but every business-critical database server should also have a clustering strategy that can realistically be adopted, particularly if you use Oracle.

As a DBA I quickly learnt that untested database backups and untested system restore plans were almost as worthless to the business as having no backups at all. Every morning I made sure that the overnight backups had run and that their dump files restored successfully to warm standby, reporting or development servers. By 10am I would know whether the backup files sitting on disk were worth the bytes they were taking up. For the rest of the day the transaction logs were regularly shipped to a standby server, keeping data changes secure until the next night’s backups.
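To make that morning routine concrete, here is a minimal Python sketch of the check I am describing. It is an illustration only: the `BackupRecord` type, the `untrusted_backups` function and the 24-hour freshness window are all assumptions of mine, not part of any Oracle or SQL Server tooling, and a real version would read its results from your backup and restore job logs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BackupRecord:
    """Hypothetical summary of one overnight backup and its test restore."""
    database: str
    finished_at: datetime      # when the overnight dump completed
    restore_succeeded: bool    # did the dump restore onto standby/reporting/dev?


def untrusted_backups(records, now, max_age=timedelta(hours=24)):
    """Return the databases whose backup is stale or failed its test restore."""
    problems = []
    for rec in records:
        too_old = now - rec.finished_at > max_age
        if too_old or not rec.restore_succeeded:
            problems.append(rec.database)
    return sorted(problems)


# Example run at 10am with made-up data.
now = datetime(2024, 1, 10, 10, 0)
records = [
    BackupRecord("sales", datetime(2024, 1, 10, 2, 30), True),
    BackupRecord("hr", datetime(2024, 1, 8, 2, 30), True),        # stale dump
    BackupRecord("billing", datetime(2024, 1, 10, 3, 0), False),  # restore failed
]
print(untrusted_backups(records, now))  # ['billing', 'hr']
```

The point of the sketch is that a backup only counts once it has been both taken recently and restored somewhere; a file that merely exists on disk proves nothing.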

However, hardware failures taught me this wasn’t enough. The minute a business-critical service went offline, questions were asked about when it would be back up again, even though the business had said it could tolerate up to 30 minutes of downtime in business hours. What I quickly learnt was that no amount of scripting and syncing could stop a standalone database server being a single point of failure. It’s at that point that the DBA may be the first person in your business to say “it’s time we got a cluster”.

With the majority of my DBA time spent on SQL Server, moving to a failover cluster was relatively painless. Microsoft has made the integration of Windows Cluster Services and SQL Server so tight that, as a DBA, virtually nothing changes once clustering is deployed. Detached data files can still be copied from server to server and databases restored from live to dev without any conversions. In essence, Microsoft has created a turnkey upgrade path that, with the right hardware and licensing in place, removes a SQL Server instance’s dependency on a specific operating system instance or server chassis.

However, in the Oracle database server world things aren’t as straightforward. Operating system portability and cost mean there isn’t always an obvious choice of clustering technology as there is in the Microsoft world, and each option comes with a different price tag and a different degree of flexibility. For the remainder of this article I am going to discuss three different technologies that achieve a similar goal, increasing the availability of your Oracle database instance:

Native Oracle tools – Real Application Clusters (RAC)
Third party commercial tools – Symantec Cluster Server
Native operating system tools – Red Hat Cluster Suite

The only platform-agnostic option comes from Oracle itself, through the Real Application Clusters (RAC) add-on. In the past RAC had a reputation for being expensive, complex and poorly performing; however, over the last few revisions it has matured, and the install base has grown large enough to make DBAs with RAC skills easier to find. You will need someone with RAC experience to install it and coach you enough to get started, especially as it will be your critical systems using it, but it’s a single-vendor solution that also offers true active/active clustering. Deeper down in the internals you’ll need a logical volume manager for data file storage, typically Oracle’s ASM. That means planning for the loss of direct file access for cold backups, and making database copies from clustered to standalone systems also requires minor tweaks to your processes, but it’s all manageable with practice. In summary, RAC delivers native clustering functionality that you know will work and that Oracle will support, but potentially at a high licensing cost. Because of this I’ll now look at two traditionally cheaper options.

There are third-party commercial technologies that take a standalone Oracle database instance and make it highly available. That is, should the underlying host server or operating system suffer an outage, Oracle will be restarted on a standby server, all without Oracle itself being aware of what has actually happened. Symantec Cluster Server is one example of these commercially developed products. Sold with credible, dependable enterprise support, Symantec’s product offers a potentially more cost-effective solution than RAC: while it may not supply the active/active functionality of RAC, it can also be used to cluster other applications, perhaps the web or application server services on a multi-role server. Therefore, like the Microsoft solution, this line of clustering products can take your existing standalone instance and, with the right hardware infrastructure, give you an out-of-the-box way to increase the availability of your Oracle instance, backed by the vendor credibility that some organisations look for.

The final option I’m going to discuss is one that is becoming increasingly popular with technical Linux sysadmins: the Red Hat Cluster Suite. Essentially a collection of scripts and monitoring processes, the Red Hat solution’s configuration and capabilities are limited only by the sysadmin’s scripting skills. The node monitoring tools and failover scripts, included as part of the operating system, can be used to cluster any suitable application, giving applications such as MySQL a similar level of availability to a clustered Oracle instance. This solution obviously isn’t for the faint-hearted, and the low financial cost is reflected in the level of operating system knowledge it assumes. However, for organisations confident in their in-house Red Hat skills this may be a very cost-effective solution.
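For readers who have not seen this style of clustering before, the idea behind node monitoring plus failover scripts can be sketched in a few lines of Python. This is a toy model of the general active/passive pattern, not Red Hat Cluster Suite’s actual implementation or API; the `FailoverMonitor` class, the node names and the three-missed-checks threshold are all hypothetical.

```python
class FailoverMonitor:
    """Toy active/passive monitor: fail over after N consecutive failed checks."""

    def __init__(self, nodes, max_missed=3):
        self.nodes = list(nodes)      # the first node starts as the active one
        self.active = self.nodes[0]
        self.max_missed = max_missed
        self.missed = 0               # consecutive failed checks on the active node

    def record_check(self, healthy):
        """Feed in one health-check result; return the active node afterwards."""
        if healthy:
            self.missed = 0
            return self.active
        self.missed += 1
        if self.missed >= self.max_missed:
            self._fail_over()
        return self.active

    def _fail_over(self):
        standbys = [n for n in self.nodes if n != self.active]
        if not standbys:
            return  # nowhere to go; stay put and keep counting
        # A real failover script would do the hard work here: fence the failed
        # node, move shared storage and the service IP, then start the database.
        self.active = standbys[0]
        self.missed = 0


monitor = FailoverMonitor(["node-a", "node-b"])
for result in [True, False, False, True, False, False, False]:
    active = monitor.record_check(result)
print(active)  # node-b
```

Requiring several consecutive failures before acting is a deliberate choice in the sketch: a single missed check followed by a healthy one resets the counter, so a brief network blip does not trigger an unnecessary restart of the database on another node.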

I will conclude by saying that Oracle, unlike SQL Server, does not have such an obvious clustering path; however, there are options to suit every pocket depth, buying strategy and technical ability. All should achieve your goal. The key point is to decide your strategy at your own pace, not in a rush as a knee-jerk reaction to a recent system failure.

