|s i s t e m a o p e r a c i o n a l m a g n u x l i n u x||~/ · documentação · suporte · sobre|
In order to decide how to get the most of your devices you need to know what technologies are available and their implications. As always there can be some tradeoffs with respect to speed, reliability, power, flexibility, ease of use and complexity.
Many of the techniques described below can be stacked in a number of ways to maximise performance and reliability, though at the cost of added complexity.
This is a method of increasing reliability, speed or both by using multiple disks in parallel thereby decreasing access time and increasing transfer speed. A checksum or mirroring system can be used to increase reliability. Large servers can take advantage of such a setup but it might be overkill for a single user system unless you already have a large number of disks available. See other documents and FAQs for more information.
For Linux one can set up a RAID system using either software
A summary of available hardware RAID solutions for Linux is available at Linux Consulting.
SCSI-to-SCSI controllers are usually implemented as complete cabinets with drives and a controller that connects to the computer with a second SCSI bus. This makes the entire cabinet of drives look like a single large, fast SCSI drive and requires no special RAID driver. The disadvantage is that the SCSI bus connecting the cabinet to the computer becomes a bottleneck.
A significant disadvantage for people with large disk farms is that there is a limit to how many SCSI entries there can be in the /dev directory. In these cases using SCSI-to-SCSI will conserve entries.
Usually they are configured via the front panel or with a terminal connected to their on-board serial interface.
PCI-to-SCSI controllers are, as the name suggests, connected to the high speed PCI bus and is therefore not suffering from the same bottleneck as the SCSI-to-SCSI controllers. These controllers require special drivers but you also get the means of controlling the RAID configuration over the network which simplifies management.
Currently only a few families of PCI-to-SCSI host adapters are supported under Linux.
A number of operating systems offer software RAID using ordinary disks and controllers. Cost is low and performance for raw disk IO can be very high. As this can be very CPU intensive it increases the load noticeably so if the machine is CPU bound in performance rather then IO bound you might be better off with a hardware PCI-to-RAID controller.
Real cost, performance and especially reliability of software vs. hardware RAID is a very controversial topic. Reliability on Linux systems have been very good so far.
The current software RAID project on Linux is the
RAID comes in many levels and flavours which I will give a brief overview of this here. Much has been written about it and the interested reader is recommended to read more about this in the Software RAID HOWTO.
There are also hybrids available based on RAID 0 or 1 and one other level. Many combinations are possible but I have only seen a few referred to. These are more complex than the above mentioned RAID levels.
RAID 0/1 combines striping with duplication which gives very high transfers combined with fast seeks as well as redundancy. The disadvantage is high disk consumption as well as the above mentioned complexity.
RAID 1/5 combines the speed and redundancy benefits of RAID5 with the fast seek of RAID1. Redundancy is improved compared to RAID 0/1 but disk consumption is still substantial. Implementing such a system would involve typically more than 6 drives, perhaps even several controllers or SCSI channels.
Volume management is a way of overcoming the constraints of fixed sized partitions and disks while still having a control of where various parts of file space resides. With such a system you can add new disks to your system and add space from this drive to parts of the file space where needed, as well as migrating data out from a disk developing faults to other drives before catastrophic failure occurs.
The system developed by Veritas has become the defacto standard for logical volume management.
Volume management is for the time being an area where Linux is lacking.
One is the virtual partition system project VPS that will reimplement many of the volume management functions found in IBM's AIX system. Unfortunately this project is currently on hold.
Another project is the Logical Volume Manager project that is similar to a project by HP.
The Linux Multi Disk (md) provides a number of block level features in various stages of development.
RAID 0 (striping) and concatenation are very solid and in production quality and also RAID 4 and 5 are quite mature.
It is also possible to stack some levels, for instance mirroring (RAID 1) two pairs of drives, each pair set up as striped disks (RAID 0), which offers the speed of RAID 0 combined with the reliability of RAID 1.
In addition to RAID this system offers (in alpha stage) block level
volume management and soon also translucent file space.
Since this is done on the block level it can be used in combination
with any file system, even for
Think very carefully what drives you combine so you can operate all drives
in parallel, which gives you better performance and less wear. Read more
about this in the documentation that comes with
Unfortunately The Linux software RAID has split into two trees, the old stable versions 0.35 and 0.42 which are documented in the official Software-RAID HOWTO and the newer less stable 0.90 series which is documented in the unofficial Software RAID HOWTO which is a work in progress.
Hint: if you cannot get it to work properly you have forgotten to set
Disk compression versus file compression is a hotly debated topic especially regarding the added danger of file corruption. Nevertheless there are several options available for the adventurous administrators. These take on many forms, from kernel modules and patches to extra libraries but note that most suffer various forms of limitations such as being read-only. As development takes place at neck breaking speed the specs have undoubtedly changed by the time you read this. As always: check the latest updates yourself. Here only a few references are given.
Access Control List (ACL) offers finer control over file access
on a user by user basis, rather than the traditional owner, group
and others, as seen in directory listings (
This uses part of a hard disk to cache slower media such as CD-ROM. It is available under SunOS but not yet for Linux.
This is a copy-on-write system where writes go to a different system than the original source while making it look like an ordinary file space. Thus the file space inherits the original data and the translucent write back buffer can be private to each user.
There is a number of applications:
SunOS offers this feature and this is under development for Linux.
There was an old project called the Inheriting File Systems (
Sun has an informative page on translucent file system.
It should be noted that Clearcase (now owned by Rational) pioneered and popularized translucent filesystems for software configuration management by writing their own UNIX filesystem.
This trick used to be very important when drives were slow and small, and some file systems used to take the varying characteristics into account when placing files. Although higher overall speed, on board drive and controller caches and intelligence has reduced the effect of this.
To understand the strategy we need to recall this near ancient piece of knowledge and the properties of the various track locations. This is based on the fact that transfer speeds generally increase for tracks further away from the spindle, as well as the fact that it is faster to seek to or from the central tracks than to or from the inner or outer tracks.
Most drives use disks running at constant angular velocity but use (fairly) constant data density across all tracks. This means that you will get much higher transfer rates on the outer tracks than on the inner tracks; a characteristics which fits the requirements for large libraries well.
Newer disks use a logical geometry mapping which differs from the actual physical mapping which is transparently mapped by the drive itself. This makes the estimation of the "middle" tracks a little harder.
In most cases track 0 is at the outermost track and this is the general assumption most people use. Still, it should be kept in mind that there are no guarantees this is so.
Hence seek time reduction can be achieved by positioning frequently
accessed tracks in the middle so that the average seek distance and
therefore the seek time is short. This can be done either by using
The latter trick is suitable for news spools where the empty directory structure can be placed in the middle before putting in the data files. This also helps reducing fragmentation a little.
This little trick can be used both on ordinary drives as well as RAID systems. In the latter case the calculation for centring the tracks will be different, if possible. Consult the latest RAID manual.
The speed difference this makes depends on the drives, but a 50 percent improvement is a typical value.
The same mechanical head disk assembly (HDA) is often available with a number of interfaces (IDE, SCSI etc) and the mechanical parameters are therefore often comparable. The mechanics is today often the limiting factor but development is improving things steadily. There are two main parameters, usually quoted in milliseconds (ms):
After voice coils replaced stepper motors for the head movement the improvements seem to have levelled off and more energy is now spent (literally) at improving rotational speed. This has the secondary benefit of also improving transfer rates.
Some typical values:
This shows that the very high end drives offer only marginally better access times then the average drives but that the old drives based on stepper motors are significantly worse.
As latency is the average time taken to reach a given sector, the formula is quite simply
Clearly this too is an example of diminishing returns for the efforts put into development. However, what really takes off here is the power consumption, heat and noise.
There is also a
Linux Yoke Driver
available in beta which
is intended to do hot-swappable transparent binding of
one Linux block device to another. This means that if you
bind two block devices together,
One of the advantages of a layered design of an operating system
is that you have the flexibility to put the pieces together
in a number of ways.
For instance you can cache a CD-ROM with
There is a near infinite number of combinations available but my recommendation is to start off with a simple setup without any fancy add-ons. Get a feel for what is needed, where the maximum performance is required, if it is access time or transfer speed that is the bottle neck, and so on. Then phase in each component in turn. As you can stack quite freely you should be able to retrofit most components in as time goes by with relatively few difficulties.
RAID is usually a good idea but make sure you have a thorough grasp of the technology and a solid back up system.
Next Previous Contents