mdadm - manage MD devices, also known as Linux Software RAID.
mdadm [mode] <raiddevice> [options] <component-devices>
Linux Software RAID devices are implemented through the md (Multiple Devices) device driver.
Currently, Linux supports LINEAR md devices, RAID0 (striping), RAID1 (mirroring), RAID4, RAID5, RAID6, RAID10, MULTIPATH, and FAULTY.
MULTIPATH is not a Software RAID mechanism, but does involve multiple devices. For MULTIPATH each device is a path to one common physical storage device.
FAULTY is also not true RAID, and it only involves one device. It provides a layer over a true device that can be used to inject faults.
Options are:
When creating an array, the homehost will be recorded in the superblock. For version-1 superblocks, it will be prefixed to the array name. For version-0.90 superblocks, part of the SHA1 hash of the hostname will be stored in the latter half of the UUID.
When reporting information about an array, any array which is tagged for the given homehost will be reported as such.
When using Auto-Assemble, only arrays tagged for the given homehost will be assembled.
This value can be set with --grow for RAID level 1/4/5/6. If the array was created with a size smaller than the currently active drives, the extra space can be accessed using --grow. The size can be given as max which means to choose the largest size that fits on all current drives.
When used with --build, only linear, stripe, raid0, 0, raid1, multipath, mp, and faulty are valid.
Not yet supported with --grow.
The layout of the raid5 parity block can be one of left-asymmetric, left-symmetric, right-asymmetric, right-symmetric, la, ra, ls, rs. The default is left-symmetric.
When setting the failure mode for faulty the options are: write-transient, wt, read-transient, rt, write-persistent, wp, read-persistent, rp, write-all, read-fixable, rf, clear, flush, none.
Each mode can be followed by a number which is used as a period between fault generation. Without a number, the fault is generated once on the first relevant request. With a number, the fault will be generated after that many requests, and will continue to be generated every time the period elapses.
Multiple failure modes can be active simultaneously by using the "--grow" option to set subsequent failure modes.
"clear" or "none" will remove any pending or periodic failure modes, and "flush" will clear any persistent faults.
To set the parity with "--grow", the level of the array ("faulty") must be specified before the fault mode is specified.
Finally, the layout options for RAID10 are one of 'n', 'o' or 'f' followed by a small number. The default is 'n2'.
n signals 'near' copies. Multiple copies of one data block are at similar offsets in different devices.
o signals 'offset' copies. Rather than the chunks being duplicated within a stripe, whole stripes are duplicated but are rotated by one device so duplicate blocks are on different devices. Thus subsequent copies of a block are in the next drive, and are one chunk further down.
f signals 'far' copies (multiple copies have very different offsets). See md(4) for more detail about 'near' and 'far'.
The number is the number of copies of each data block. 2 is normal, 3 can be useful. This number can be at most equal to the number of devices in the array. It does not need to divide evenly into that number (e.g. it is perfectly legal to have an 'n2' layout for an array with an odd number of devices).
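The 'near' placement, including the case where the copy count does not divide the device count, can be sketched in shell. This is an illustrative model only, not mdadm or kernel code: the function name raid10_near_devices is hypothetical, and it assumes that copies of each chunk are laid out consecutively, row by row, across the devices, which is how md(4) describes 'near' layouts.

```shell
# raid10_near_devices CHUNK COPIES NDEVICES
# Print the zero-based device index holding each copy of data chunk CHUNK
# under a 'near' layout (for 'n2', COPIES is 2). Hypothetical helper for
# illustration; it models the layout and never touches an md device.
raid10_near_devices() {
    chunk=$1 copies=$2 ndev=$3
    # The number of copies can be at most the number of devices.
    if [ "$copies" -gt "$ndev" ]; then
        echo "copies ($copies) cannot exceed devices ($ndev)" >&2
        return 1
    fi
    out=""
    i=0
    while [ "$i" -lt "$copies" ]; do
        # Copies occupy consecutive positions, wrapping from one row to the next.
        pos=$(( chunk * copies + i ))
        out="$out${out:+ }$(( pos % ndev ))"
        i=$(( i + 1 ))
    done
    echo "$out"
}
```

Under these assumptions, an 'n2' layout on three devices places chunk 0 on devices 0 and 1, chunk 1 on devices 2 and 0, and chunk 2 on devices 1 and 2, so the two copies of a chunk never share a device.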
To help catch typing errors, the filename must contain at least one slash ('/') if it is a real file (not 'internal' or 'none').
Note: external bitmaps are only known to work on ext2 and ext3. Storing bitmap files on other filesystems may result in serious problems.
The argument can also come immediately after "-a". e.g. "-ap".
If --scan is also given, then any auto= entries in the config file will override the --auto instruction given on the command line.
For partitionable arrays, mdadm will create the device file for the whole array and for the first 4 partitions. A different number of partitions can be specified at the end of this option (e.g. --auto=p7). If the device name ends with a digit, the partition names add a 'p', and a number, e.g. "/dev/home1p3". If there is no trailing digit, then the partition names just have a number added, e.g. "/dev/scratch3".
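The partition naming rule can be expressed as a small shell function. This is a sketch of the rule as stated in the text, not code taken from mdadm; the function name md_partition_name is hypothetical.

```shell
# md_partition_name DEVICE PARTNUM
# Print the partition device name for a partitionable array, following the
# rule described above: insert a 'p' only when DEVICE ends with a digit.
# Hypothetical helper for illustration.
md_partition_name() {
    case $1 in
        *[0-9]) echo "${1}p${2}" ;;   # e.g. /dev/home1   -> /dev/home1p3
        *)      echo "${1}${2}" ;;    # e.g. /dev/scratch -> /dev/scratch3
    esac
}
```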
If the md device name is in a 'standard' format as described in DEVICE NAMES, then it will be created, if necessary, with the appropriate number based on that name. If the device name is not in one of these formats, then an unused minor number will be allocated. The minor number will be considered unused if there is no active array for that number, and there is no entry in /dev for that number and with a non-standard name.
Giving the literal word "dev" for --super-minor will cause mdadm to use the minor number of the md device that is being assembled. e.g. when assembling /dev/md0, mdadm will look for superblocks with a minor number of 0.
The sparc2.2 option will adjust the superblock of an array that was created on a Sparc machine running a patched 2.2 Linux kernel. This kernel got the alignment of part of the superblock wrong. You can use the --examine --sparc2.2 option to mdadm to see what effect this would have.
The super-minor option will update the preferred minor field on each superblock to match the minor number of the array being assembled. This can be useful if --examine reports a different "Preferred Minor" to --detail. In some cases this update will be performed automatically by the kernel driver. In particular the update happens automatically at the first write to an array with redundancy (RAID level 1 or greater) on a 2.6 (or later) kernel.
The uuid option will change the uuid of the array. If a UUID is given with the "--uuid" option that UUID will be used as a new UUID and will NOT be used to help identify the devices in the array. If no "--uuid" is given, a random uuid is chosen.
The name option will change the name of the array as stored in the superblock. This is only supported for version-1 superblocks.
The homehost option will change the homehost as recorded in the superblock. For version-0 superblocks, this is the same as updating the UUID. For version-1 superblocks, this involves updating the name.
The resync option will cause the array to be marked dirty, meaning that any redundancy in the array (e.g. parity for raid5, copies for raid1) may be incorrect. This will cause the raid system to perform a "resync" pass to make sure that all redundant information is correct.
The byteorder option allows arrays to be moved between machines with different byte-order. When assembling such an array for the first time after a move, giving --update=byteorder will cause mdadm to expect superblocks to have their byteorder reversed, and will correct that order before assembling the array. This is only valid with original (Version 0.90) superblocks.
The summaries option will correct the summaries in the superblock, that is, the counts of total, working, active, failed, and spare devices.
Each of these options requires that the first device listed is the array to be acted upon, and the remainder are component devices to be added, removed, or marked as faulty. Several different operations can be specified for different devices, e.g.
mdadm /dev/md0 --add /dev/sda1 --fail /dev/sdb1 --remove /dev/sdb1
Each operation applies to all devices listed until the next operation.
If an array is using a write-intent bitmap, then devices which have been removed can be re-added in a way that avoids a full reconstruction and instead just updates the blocks that have changed since the device was removed. For arrays with persistent metadata (superblocks) this is done automatically. For arrays created with --build, mdadm needs to be told with --re-add that this device was removed recently.
Devices can only be removed from an array if they are not in active use, i.e. they must be spares or failed devices. To remove an active device, it must be marked as faulty first.
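Because an active device has to be marked faulty before it can be removed, a replacement chains the operations in that order. The following sketch only prints the command so that it can be reviewed before being run; the function name print_replace_cmd and the device names in the usage note are hypothetical.

```shell
# print_replace_cmd ARRAY OLD_DEVICE NEW_DEVICE
# Print (do not run) a single mdadm command that marks OLD_DEVICE as faulty,
# removes it, and adds NEW_DEVICE to ARRAY. Hypothetical dry-run helper.
print_replace_cmd() {
    # --fail must come before --remove: only spare or failed devices
    # can be removed from an array.
    echo "mdadm $1 --fail $2 --remove $2 --add $3"
}
```

For instance, print_replace_cmd /dev/md0 /dev/sdb1 /dev/sdc1 prints one command line acting on /dev/md0 only, consistent with each manage command affecting a single array.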
This usage assembles one or more raid arrays from pre-existing components. For each array, mdadm needs to know the md device, the identity of the array, and a number of component-devices. These can be found in a number of ways.
In the first usage example (without the --scan) the first device given is the md device. In the second usage example, all devices listed are treated as md devices and assembly is attempted. In the third (where no devices are listed) all md devices that are listed in the configuration file are assembled.
If precisely one device is listed, but --scan is not given, then mdadm acts as though --scan was given and identity information is extracted from the configuration file.
The identity can be given with the --uuid option, with the --super-minor option, can be found in the config file, or will be taken from the super block on the first component-device listed on the command line.
Devices can be given on the --assemble command line or in the config file. Only devices which have an md superblock which contains the right identity will be considered for any array.
The config file is only used if explicitly named with --config or requested with (a possibly implicit) --scan. In the latter case, /etc/mdadm.conf is used.
If --scan is not given, then the config file will only be used to find the identity of md arrays.
Normally the array will be started after it is assembled. However if --scan is not given and insufficient drives were listed to start a complete (non-degraded) array, then the array is not started (to guard against usage errors). To insist that the array be started in this case (as may work for RAID1, 4, 5, 6, or 10), give the --run flag.
If an auto option is given, either on the command line (--auto) or in the configuration file (e.g. auto=part), then mdadm will create the md device if necessary or will re-create it if it doesn't look usable as it is.
This can be useful for handling partitioned devices (which don't have a stable device number - it can change after a reboot) and when using "udev" to manage your /dev tree (udev cannot handle md devices because of the unusual device initialisation conventions).
If the option to "auto" is "mdp" or "part" or (on the command line only) "p", then mdadm will create a partitionable array, using the first free one that is not in use, and does not already have an entry in /dev (apart from numeric /dev/md* entries).
If the option to "auto" is "yes" or "md" or (on the command line) nothing, then mdadm will create a traditional, non-partitionable md array.
It is expected that the "auto" functionality will be used to create device entries with meaningful names such as "/dev/md/home" or "/dev/md/root", rather than names based on the numerical array number.
When using this option to create a partitionable array, the device files for the first 4 partitions are also created. If a different number is required it can be simply appended to the auto option. e.g. "auto=part8". Partition names are created by appending a digit string to the device name, with an intervening "p" if the device name ends with a digit.
The --auto option is also available in Build and Create modes. As those modes do not use a config file, the "auto=" config option does not apply to these modes.
If a homehost has been specified (either in the config file or on the command line), mdadm will look further for possible arrays and will try to assemble anything that it finds which is tagged as belonging to the given homehost. This is the only situation where mdadm will assemble arrays without being given a specific device name or identity information for the array.
If mdadm finds a consistent set of devices that look like they should comprise an array, and if the superblock is tagged as belonging to the given home host, it will automatically choose a device name and try to assemble the array. If the array uses version-0.90 metadata, then the minor number as recorded in the superblock is used to create a name in /dev/md/ so for example /dev/md/3. If the array uses version-1 metadata, then the name from the superblock is used to similarly create a name in /dev/md. The name will have any 'host' prefix stripped first.
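The name selection described above can be sketched as follows. This is an illustration of the stated rule rather than mdadm's own logic: the function name auto_assemble_path is hypothetical, and the sketch assumes that the host prefix in a version-1 name is separated from the array name by a colon.

```shell
# auto_assemble_path METADATA_VERSION VALUE
# For version-0.90 metadata, VALUE is the minor number from the superblock.
# For version-1 metadata, VALUE is the name from the superblock, possibly
# with a "host:" prefix (the colon separator is an assumption of this sketch).
# Hypothetical helper for illustration.
auto_assemble_path() {
    case $1 in
        0.90)  echo "/dev/md/$2" ;;
        1|1.*) echo "/dev/md/${2#*:}" ;;   # strip any host prefix first
        *)     echo "unknown metadata version: $1" >&2; return 1 ;;
    esac
}
```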
If mdadm cannot find any array for the given host at all, and if --auto-update-homehost is given, then mdadm will search again for any array (not just an array created for this host) and will assemble each assuming --update=homehost. This will change the host tag in the superblock so that on the next run, these arrays will be found without the second pass. The intention of this feature is to support transitioning a set of md arrays to using homehost tagging.
The reason for requiring arrays to be tagged with the homehost for auto assembly is to guard against problems that can arise when moving devices from one host to another.
This usage is similar to --create. The difference is that it creates an array without a superblock. With these arrays there is no difference between initially creating the array and subsequently assembling the array, except that hopefully there is useful data there in the second case.
The level may be raid0, linear, multipath, or faulty, or one of their synonyms. All devices must be listed and the array will be started once complete.
This usage will initialise a new md array, associate some devices with it, and activate the array.
If the --auto option is given (as described in more detail in the section on Assemble mode), then the md device will be created with a suitable device number if necessary.
As devices are added, they are checked to see if they contain raid superblocks or filesystems. They are also checked to see if the variance in device size exceeds 1%.
If any discrepancy is found, the array will not automatically be run, though the presence of a --run can override this caution.
To create a "degraded" array in which some devices are missing, simply give the word "missing" in place of a device name. This will cause mdadm to leave the corresponding slot in the array empty. For a RAID4 or RAID5 array at most one slot can be "missing"; for a RAID6 array at most two slots. For a RAID1 array, only one real device needs to be given. All of the others can be "missing".
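The limits on "missing" slots can be checked before building a command line. The sketch below encodes only the levels mentioned in the text; the function name max_missing_slots is hypothetical.

```shell
# max_missing_slots LEVEL NDEVICES
# Print how many slots may be given as "missing" when creating an array of
# the given level with NDEVICES slots, per the rules described above.
# Hypothetical helper; levels not mentioned in the text are rejected.
max_missing_slots() {
    case $1 in
        1|raid1)         echo $(( $2 - 1 )) ;;  # only one real device is needed
        4|raid4|5|raid5) echo 1 ;;
        6|raid6)         echo 2 ;;
        *)               echo "level $1 not covered by this sketch" >&2
                         return 1 ;;
    esac
}
```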
When creating a RAID5 array, mdadm will automatically create a degraded array with an extra spare drive. This is because building the spare into a degraded array is in general faster than resyncing the parity on a non-degraded, but not clean, array. This feature can be overridden with the --force option.
When creating an array with version-1 metadata a name for the array is required. If this is not given with the --name option, mdadm will choose a name based on the last component of the name of the device being created. So if /dev/md3 is being created, then the name 3 will be chosen. If /dev/md/home is being created, then the name home will be used.
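The two examples above suggest the following rule, sketched here in shell. The function name default_array_name is hypothetical, and stripping a leading "md" from a numeric device name is an assumption inferred from the /dev/md3 example, not a statement of how mdadm is implemented.

```shell
# default_array_name DEVICE
# Print the name that would be chosen when --name is not given, based on
# the last component of DEVICE. Hypothetical helper for illustration.
default_array_name() {
    base=${1##*/}                        # last path component
    case $base in
        md[0-9]*) echo "${base#md}" ;;   # /dev/md3     -> 3 (assumed rule)
        *)        echo "$base" ;;        # /dev/md/home -> home
    esac
}
```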
The General Management options that are valid with --create are:
This usage will allow individual devices in an array to be failed, removed or added. It is possible to perform multiple operations with one command. For example:
mdadm /dev/md0 -f /dev/hda1 -r /dev/hda1 -a /dev/hda1
will firstly mark /dev/hda1 as faulty in /dev/md0 and will then remove it from the array and finally add it back in as a spare. However only one md array can be affected by a single command.
MISC mode includes a number of distinct operations that operate on distinct devices. The operations are:
Having --scan without listing any devices will cause all devices listed in the config file to be examined.
This usage causes mdadm to periodically poll a number of md arrays and to report on any events noticed. mdadm will never exit once it decides that there are arrays to be checked, so it should normally be run in the background.
As well as reporting events, mdadm may move a spare drive from one array to another if they are in the same spare-group and if the destination array has a failed drive but no spares.
If any devices are listed on the command line, mdadm will only monitor those devices. Otherwise all arrays listed in the configuration file will be monitored. Further, if --scan is given, then any other md devices that appear in /proc/mdstat will also be monitored.
The result of monitoring the arrays is the generation of events. These events are passed to a separate program (if specified) and may be mailed to a given E-mail address.
When passing an event to the program, the program is run once for each event and is given 2 or 3 command-line arguments. The first is the name of the event (see below). The second is the name of the md device which is affected, and the third is the name of a related device if relevant, such as a component device that has failed.
If --scan is given, then a program or an E-mail address must be specified on the command line or in the config file. If neither is available, then mdadm will not monitor anything. Without --scan mdadm will continue monitoring as long as something was found to monitor. If no program or email is given, then each event is reported to stdout.
The different events are:
If mdadm was told to monitor an array which is RAID0 or Linear, then it will report DeviceDisappeared with the extra information Wrong-Level. This is because RAID0 and Linear do not support the device-failed, hot-spare and resync operations which are monitored.
Only Fail, FailSpare, DegradedArray, SparesMissing, and TestMessage cause Email to be sent. All events cause the program to be run. The program is run with two or three arguments, they being the event name, the array device and possibly a second device.
Each event has an associated array device (e.g. /dev/md1) and possibly a second device. For Fail, FailSpare, and SpareActive the second device is the relevant component device. For MoveSpare the second device is the array that the spare was moved from.
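A program that receives these events could be structured around a function like the one below. This is a minimal sketch: the function name handle_md_event and the output format are hypothetical, and treating the events that trigger Email as "alert" and all others as "info" is a choice made for this example, not something mdadm prescribes.

```shell
# handle_md_event EVENT MD_DEVICE [RELATED_DEVICE]
# Format one monitor event as a single log line. The arguments mirror the
# two or three arguments described above. Hypothetical helper.
handle_md_event() {
    event=$1 md=$2 related=${3:-}
    case $event in
        Fail|FailSpare|DegradedArray|SparesMissing|TestMessage)
            severity=alert ;;   # the events that also cause Email to be sent
        *)
            severity=info ;;
    esac
    if [ -n "$related" ]; then
        echo "$severity: $event on $md ($related)"
    else
        echo "$severity: $event on $md"
    fi
}
```

A script used as the alert program could define this function and finish with handle_md_event "$@", redirecting the output to a log file of its choosing.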
For mdadm to move spares from one array to another, the different arrays need to be labelled with the same spare-group in the configuration file. The spare-group name can be any string. It is only necessary that different spare groups use different names.
When mdadm detects that an array which is in a spare group has fewer active devices than necessary for the complete array, and has no spare devices, it will look for another array in the same spare group that has a full complement of working drives and a spare. It will then attempt to remove the spare from the second array and add it to the first. If the removal succeeds but the adding fails, then it is added back to the original array.
Currently the only support available is to
Note that when an array changes size, any filesystem that may be stored in the array will not automatically grow to use the space. The filesystem will need to be explicitly told to use the extra space.
A RAID1 array can work with any number of devices from 1 upwards (though 1 is not very useful). There may be times when you want to increase or decrease the number of active devices. Note that this is different to hot-add or hot-remove which changes the number of inactive devices.
When reducing the number of devices in a RAID1 array, the slots which are to be removed from the array must already be vacant. That is, the devices which were in those slots must be failed and removed.
When the number of devices is increased, any hot spares that are present will be activated immediately.
Increasing the number of active devices in a RAID5 is much more effort. Every block in the array will need to be read and written back to a new location. From 2.6.17, the Linux Kernel is able to do this safely, including restarting an interrupted "reshape".
When relocating the first few stripes on a raid5, it is not possible to keep the data on disk completely consistent and crash-proof. To provide the required safety, mdadm disables writes to the array while this "critical section" is reshaped, and takes a backup of the data that is in that section. This backup is normally stored in any spare devices that the array has, however it can also be stored in a separate file specified with the --backup-file option. If this option is used, and the system does crash during the critical period, the same file must be passed to --assemble to restore the backup and reassemble the array.
A write-intent bitmap can be added to, or removed from, an active array. Either internal bitmaps, or bitmaps stored in a separate file can be added. Note that if you add a bitmap stored in a file which is in a filesystem that is on the raid array being affected, the system will deadlock. The bitmap must be on a separate filesystem.
mdadm --query /dev/name-of-device
This will find out if a given device is a raid array, or is part of one, and will provide brief information about the device.
mdadm --assemble --scan
This will assemble and start all arrays listed in the standard config file. This command will typically go in a system startup file.
mdadm --stop --scan
This will shut down all arrays that can be shut down (i.e. are not currently in use). This will typically go in a system shutdown script.
mdadm --follow --scan --delay=120
If (and only if) there is an Email address or program given in the standard config file, then monitor the status of all arrays listed in that file by polling them every 2 minutes.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hd[ac]1
Create /dev/md0 as a RAID1 array consisting of /dev/hda1 and /dev/hdc1.
echo 'DEVICE /dev/hd*[0-9] /dev/sd*[0-9]' > mdadm.conf
mdadm --detail --scan >> mdadm.conf
This will create a prototype config file that describes currently active arrays that are known to be made from partitions of IDE or SCSI drives. This file should be reviewed before being used as it may contain unwanted detail.
echo 'DEVICE /dev/hd[a-z] /dev/sd*[a-z]' > mdadm.conf
mdadm --examine --scan --config=mdadm.conf >> mdadm.conf
This will find what arrays could be assembled from existing IDE and SCSI whole drives (not partitions) and store the information in the format of a config file. This file is very likely to contain unwanted detail, particularly the devices= entries. It should be reviewed and edited before being used as an actual config file.
mdadm --examine --brief --scan --config=partitions
mdadm -Ebsc partitions
Create a list of devices by reading /proc/partitions, scan these for RAID superblocks, and print out a brief listing of all that was found.
mdadm -Ac partitions -m 0 /dev/md0
Scan all partitions and devices listed in /proc/partitions and assemble /dev/md0 out of all such devices with a RAID superblock with a minor number of 0.
mdadm --monitor --scan --daemonise > /var/run/mdadm
If the config file contains a mail address or alert program, run mdadm in the background in monitor mode, monitoring all md devices. Also write the pid of the mdadm daemon to /var/run/mdadm.
mdadm --create --help
Provide help about the Create mode.
mdadm --config --help
Provide help about the format of the config file.
mdadm --help
Provide general help.
If you're using the /proc filesystem, /proc/mdstat lists all active md devices with information about them. mdadm uses this to find arrays when --scan is given in Misc mode, and to monitor array reconstruction in Monitor mode.
The config file lists which devices may be scanned to see if they contain MD super block, and gives identifying information (e.g. UUID) about known MD arrays. See mdadm.conf(5) for more details.
While entries in the /dev directory can have any format you like, mdadm has an understanding of 'standard' formats which it uses to guide its behaviour when creating device files via the --auto option.
The standard names for non-partitioned arrays (the only sort of md array available in 2.4 and earlier) are either of
where NN is a number. The standard names for partitionable arrays (as available from 2.6 onwards) are one of
Partition numbers should be indicated by adding "pMM" to these, thus "/dev/md/d1p2".
The latest version of mdadm should always be available from
mdadm.conf(5), md(4).
raidtab(5), raid0run(8), raidstop(8), mkraid(8).