Twice now in the last 4 days, at some point during the night, the websites have gone down because the server is unable to connect to the database. Everything else is still running (Apache etc.); only the database is dead.
When I log in over SSH as root to investigate, I have read-only permissions everywhere, which I suspect is the cause of the MySQL server dying.
I've checked the MySQL logs, the system logs, basically every log file I can find, and nothing indicates an error anywhere around the time the problem starts (or even during the entire day). It's as if a switch is just flipped; I restart the system and things are fine again... until a few days later.
There was 2 GB of free RAM the last time this happened, 1.5 GB free the first time, and minimal CPU usage (< 30%).
Any ideas?
Answer
Disk errors are one possible cause of the "read-only permissions
everywhere" condition. Some types of hardware or kernel-level disk faults can lead to an
inconsistent and corrupted filesystem, so the kernel will protectively force the
filesystem into "read-only" mode when it detects such a fault. If the disk containing
your root filesystem has a hiccup, anything trying to write to the disk would start to
see permission errors. Programs that don't need to write to the disk (like Apache or
SSHd) will probably continue to work just
fine.
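One quick way to confirm this the next time it happens is to check how the root filesystem is currently mounted. A rough sketch (the device name shown in the comment is just an example):

    grep ' / ' /proc/mounts
    # e.g. "/dev/sda1 / ext4 ro,relatime 0 0" -- "ro" instead of "rw" means
    # the kernel has remounted the root filesystem read-only
    touch /root/write-test
    # a failure with "Read-only file system" confirms the same thing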
The next time the database fault happens,
check your kernel log message buffer BEFORE YOU REBOOT for any indications of a disk
error. You'll probably have to use the 'dmesg' command, because if your '/var/log'
directory is part of the root filesystem, the syslog daemon won't be able to write the
error messages to the '/var/log/messages' file on the disk. Also, the contents of the
kernel log buffer will be lost when you reboot, so you might want to use 'ssh' or 'scp'
to copy that data elsewhere.
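For example, something along these lines should work; "user@otherhost" is a placeholder for another machine you control, since the whole point is to get the data off the affected disk:

    dmesg | grep -iE 'error|ata|i/o|read-only' | less
    # scan the kernel ring buffer for disk or filesystem faults
    dmesg | ssh user@otherhost 'cat > dbserver-dmesg.txt'
    # copy the full buffer off-box, because /var/log may not be writable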