Discussion:
Auto fsck filesystems upon reboot if disk corrupt?
Josh Rivel
2008-10-17 19:11:53 UTC
So here's the scenario. We have roughly 800 boxes out in various locations running OpenSolaris snv_81. There is no technical staff at these locations. Often if there's a connectivity issue, the staff will just power cycle the machine, which is not ideal. Sometimes there is disk corruption as a result of these repeated power cycles and the machine will go into system maintenance mode.

I'm wondering if there's any way to configure the OS to automatically fsck the filesystem on reboot if it detects the drives as being "unclean". As it stands, if a box goes into maintenance mode, we need to build a new one and ship it out to the remote location, and as a result the site is down for 24 hours or so.

There are no technical people on site at these locations, and asking them to console in with a PC using Hyperterminal or whatever isn't really an option unfortunately. (We've requested that they don't power cycle it, but they don't seem to listen to that request)

I'm not sure if this is possible at all, but figured I'd check to see.

Thanks!
--
This message posted from opensolaris.org
Frank Batschulat (Home)
2008-10-18 10:48:39 UTC
it would be possible if you decide to hack around in the
/sbin/mountall script and its function checkfs(). for UFS a preen fsck (fsck -p) is done,
that is, fsck will attempt to fix anything obvious and safe by itself after the
log has been rolled; anything dangerous or questionable is deferred and requires operator
intervention.

you may change this to be 'fsck -y -o f' instead, that way you tell fsck to do
everything automatically (-y assumes yes to all questions fsck would usually raise to the operator,
and -o f forces the check even if the file system is marked clean)

BUT - that is a dangerous thing to do in general and should be used with caution
and awareness that fsck will possibly discard everything it does not like or
does not understand, and as such you may lose the ability to back up data from
a corrupted file system before letting fsck fix everything that way.
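
As a rough illustration of the trade-off described above, here is a minimal sketch of a "preen first, force only as a last resort" check. It is not the real checkfs() from /sbin/mountall. The function name check_ufs, the FSCK override variable, and the status messages are all hypothetical. The preen pass is written as '-o p', which is how fsck_ufs(1M) spells the preen option on Solaris as far as I recall; verify that against the man page on your build. Treating every nonzero exit status as a failure is also a simplification.

```shell
#!/bin/sh
# Sketch only: NOT the real checkfs() from /sbin/mountall.
# FSCK is a hypothetical override so the logic can be tried without touching a disk.
FSCK=${FSCK:-fsck}

# check_ufs RAWDEV  (hypothetical helper)
# Try a preen pass first; escalate to a forced, answer-yes pass only if that fails.
check_ufs() {
    rawdev=$1

    # Pass 1: preen mode repairs only what fsck considers safe.
    if "$FSCK" -F ufs -o p "$rawdev"; then
        echo "preen ok: $rawdev"
        return 0
    fi

    # Pass 2: -y answers yes to every question, -o f forces the check.
    # This may discard damaged data, so it is kept as the last resort.
    echo "preen failed, forcing repair: $rawdev" >&2
    if "$FSCK" -F ufs -y -o f "$rawdev"; then
        echo "forced repair ok: $rawdev"
        return 0
    fi

    # Assumption: any nonzero status from the forced pass counts as failure.
    echo "forced repair failed: $rawdev" >&2
    return 1
}
```

Running the preen pass first means the aggressive pass is only reached on the machines that would otherwise have dropped into maintenance mode, which limits how often data gets thrown away unattended.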

you may want to consider a more recent build, OpenSolaris 2008.05 (build 86)
or the upcoming OpenSolaris 2008.11, and move to a ZFS root file system instead.

cheers
frankB
Francis Liu
2008-10-19 12:43:01 UTC
A bit of a silly question, but perhaps there is an alternative to filesystems-that-go-bung-on-reboot...

Would you consider mounting the drives ro? Perhaps mount most fs readonly, and make the fs that gets written to separate from the others, so that it will almost always reboot clean.

Then at least, you could fix the data volumes remotely...
Frank Batschulat
2008-10-19 12:51:58 UTC
in general a good idea; minor nit: you cannot mount UFS root R/O.

---
frankB
Richard L. Hamilton
2008-10-19 19:20:23 UTC
Sure you can (it mounts read-only and changes that to read-write if/when it's clean).
What you can't do is _use_ it like that, since a number of files are expected to be
writable.

LiveCDs solve that problem by various means; perhaps similar solutions could be adapted.
But the result would be a rather non-standard configuration...
