Salesforce's Oracle Grid Database cluster crash!

This must have meant a lot of sweaty nights for the Salesforce folks, one of Oracle's customers!
But believe me, I know how things go.
  • The database is on Grid Control, who wants a standby?
  • 10g is easy, who wants a backup DBA? Well, can we fire the DBA too? Sysadmins can do it, no?
  • Everything is on the SAN, someone will restore it.
  • Cluster crash? Never heard of it. Funny, the series that I'm writing (part V) spoke about understanding your architecture and proactively working towards its continuity.
  • MTTR, what is that?
  • MTTR, we have it set to 3 mins! (Hey, have you ever tested it? Anywhere? Somewhere?)
  • Backup restore, have you tested it?
  • Do you have a valid test environment?
  • Do you have anything that looks like a test environment? Anything? Something?
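On the "have you ever tested it?" point, here is a minimal sketch of what testing an MTTR target could look like: time a few recovery drills and compare the measured mean against the 3 minutes you claim. The drill timestamps and the function names below are made up for illustration; they are not from any real incident or tool.

```python
from datetime import datetime

# Hypothetical drill log: (failure injected, service restored).
# These timestamps are invented for illustration only.
DRILLS = [
    ("2009-01-10 02:00:00", "2009-01-10 02:02:30"),
    ("2009-01-17 03:15:00", "2009-01-17 03:21:00"),
    ("2009-01-24 01:30:00", "2009-01-24 01:33:30"),
]

MTTR_TARGET_SECONDS = 180  # the "3 mins" target mentioned in the list above

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def recovery_seconds(start, end):
    """Return the elapsed seconds between failure and restored service."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return delta.total_seconds()


def measured_mttr(drills):
    """Mean time to recover, in seconds, over all recorded drills."""
    times = [recovery_seconds(start, end) for start, end in drills]
    return sum(times) / len(times)


def meets_target(drills, target=MTTR_TARGET_SECONDS):
    """True only if the measured mean recovery time is within the target."""
    return measured_mttr(drills) <= target


if __name__ == "__main__":
    print("measured MTTR (s):", measured_mttr(DRILLS))
    print("meets target:", meets_target(DRILLS))
```

A configured target is only a wish until numbers like these come out of a real drill, on something that at least looks like a test environment.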
I know the management team there is looking hard for someone to blame. I just hope the poor sysadmin or DBA isn't the only one who will take the heat! Management ought to stand up and take its responsibility as well.

We need to understand together that:
  • disks fail
  • clusters crash (with all kinds of errors which need desperate attention all the time!)
  • backups fail
  • restores fail
  • it always happens when you're asleep
  • it happens most of the time on weekends
Technologies like grid computing and RAC are thoroughly tested. What we do need to realize is that we cannot rely on technology alone; we also need a proper plan for business continuity!

And I don't think hatred had anything to do with it, or did it?

