A disk volume on the Solaris zone that hosts the REPL, RS25 and CAS databases ran out of space at 05:17 on 11-22-2008.
Existing Oracle database connections were unaffected. Some or most new database connections were denied. Campus applications that use persistent, pooled database connections were likely unaffected.
Our logs indicate failed connection attempts between 07:00 and 08:30.
Campuses whose REPL-related processes failed during the 07:00 - 08:30 window can probably assume that this outage was the cause.
The volume filled up while we were debugging various backup-related issues. Detailed logging was enabled, but there was insufficient space for the additional log volume.
Please forward the above to potentially affected customers.
Details:
The Legato backup software is not flexible about installation locations. When combined with our use of lightweight zones, the constraints imposed by zone disk layout, and the limitations of Oracle's RMAN backup agents, some Legato-related logs end up on the root file system. This is normally not good practice on Unix-like systems, but limitations of the various software stacks more or less force us into this configuration.
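Since logs on the root file system can fill it without warning, a periodic free-space check is one possible safeguard. The following is a minimal sketch only, not something we currently run: the function names and the 10% threshold are assumptions chosen for illustration.

```python
import shutil


def free_fraction(path="/"):
    """Return the fraction of the file system holding `path` that is free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # treat an unreadable or empty file system as full
    return usage.free / usage.total


def is_low_on_space(path="/", min_free=0.10):
    """Return True when less than `min_free` of the file system is free.

    The 10% default is an illustrative assumption, not a tuned value.
    """
    return free_fraction(path) < min_free


if __name__ == "__main__":
    if is_low_on_space("/"):
        print("WARNING: root file system is below 10% free")
```

A check like this could be run from cron inside the database zone, with the warning routed to whatever alerting is already in place.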
The action taken was to log into the global zone, increase the ZFS quota for the database zone, clean up the logs, and change the log rotation frequency and compression parameters.
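The log cleanup step can be illustrated with a short sketch that compresses log files older than a given age. This is not the procedure that was actually used, and the directory, the ".log" suffix and the one-day age limit are hypothetical; the real Legato log locations are not given here.

```python
import gzip
import os
import shutil
import time


def compress_old_logs(log_dir, max_age_days=1, suffix=".log"):
    """Gzip files in `log_dir` that end in `suffix` and are older than
    `max_age_days`. Returns the list of compressed file paths.

    All parameters are illustrative assumptions, not real Legato settings.
    """
    cutoff = time.time() - max_age_days * 86400
    compressed = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not (name.endswith(suffix) and os.path.isfile(path)):
            continue
        if os.path.getmtime(path) > cutoff:
            continue  # recently written; may still be in active use
        # Write the compressed copy first, then remove the original.
        with open(path, "rb") as src, gzip.open(path + ".gz", "wb") as dst:
            shutil.copyfileobj(src, dst)
        os.remove(path)
        compressed.append(path + ".gz")
    return compressed
```

Note that compression briefly needs room for both the original and the compressed copy, so on a completely full file system some space has to be freed, or the quota raised, before a job like this can succeed.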
We will re-evaluate zone mount points, Legato agent log locations, and Oracle RMAN configuration to see if it is possible to fool Legato into moving its logs off the root slice without breaking RMAN.