One Long and Nasty Recovery

One of the occupational hazards of being a production DBA is the proverbial 2am call when the production Oracle database crashes.

Oracle has steadily gotten better at self-correcting the issues that cause crashes, and in a well-maintained environment with properly sized servers, UPS power, solid backups and Change Control, such outages are mercifully rare.

When they do occur, however, they quickly expose previously unforeseen gaps in the production support plan and test the professionalism of all concerned.

The following is a summary of a recent production outage: the symptoms, the causes and the resolutions. Although Oracle’s Metalink system and Oracle Support Services assisted in identifying the solution, a great deal of work was required from the production DBA staff. Some of these notes may therefore prove useful to others, either as a case study or, in the worst case, as verification for anyone facing the same scenario.
