Anatomy of Hard Disks (Hard Drives) - What They Are and How They Work

A hard disk is part of a unit, often called a "disk drive," "hard drive," or "hard disk drive," that stores and provides relatively quick access to large amounts of data on an electromagnetically charged surface or set of surfaces. Today's computers typically come with a hard disk that contains several billion bytes (gigabytes) of storage.

A hard disk is really a set of stacked "disks," each of which, like phonograph records, has data recorded electromagnetically in concentric circles or "tracks" on the disk. A "head" (something like a phonograph arm but in a relatively fixed position) records (writes) or reads the information on the tracks. Two heads, one on each side of a disk, read or write the data as the disk spins. Each read or write operation requires that data be located, which is an operation called a "seek." (Data already in a disk cache, however, will be located more quickly.)

A hard disk/drive unit comes with a set rotation speed, typically between 4500 and 7200 rpm. Disk access time is measured in milliseconds. Although a physical location can be identified by cylinder, head, and sector numbers, these are actually mapped to a logical block address (LBA) that works with the larger address range on today's hard disks.
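The mapping between a cylinder/head/sector triple and a logical block address can be sketched in a few lines of Python. This is only an illustration using the conventional formula for a uniform logical geometry; the function names and the 16-head, 63-sector geometry in the usage lines are assumptions chosen for the example, not values taken from any particular drive.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Convert a CHS triple to an LBA, assuming a uniform logical geometry.

    Cylinders and heads are numbered from 0; sectors are numbered from 1.
    """
    if cylinder < 0:
        raise ValueError("cylinder must be non-negative")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Inverse of chs_to_lba for the same assumed geometry."""
    if lba < 0:
        raise ValueError("lba must be non-negative")
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


# Hypothetical logical geometry: 16 heads, 63 sectors per track.
first_block_of_cylinder_1 = chs_to_lba(1, 0, 1, 16, 63)
round_trip = lba_to_chs(first_block_of_cylinder_1, 16, 63)
```

With this numbering, the very first sector on the disk (cylinder 0, head 0, sector 1) is block 0, and block numbers increase sector by sector, then head by head, then cylinder by cylinder.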

Physical Geometry

The physical geometry of a hard disk is the actual physical number of heads, cylinders and sectors used by the disk. On older disks this is the only type of geometry that is ever used--the physical geometry and the geometry used by the PC are one and the same. The original setup parameters in the system BIOS are designed to support the geometries of these older drives. Classically, there are three figures that describe the geometry of a drive: the number of cylinders on the drive ("C"), the number of heads on the drive ("H") and the number of sectors per track ("S"). Together they comprise the "CHS" method of addressing the hard disk. This method is covered in more detail in the description of CHS mode addressing.

At the time the PC BIOS interfaces to the hard disk were designed, hard disks were simple. They had only a few hundred cylinders, a few heads and all had the same number of sectors in each track. Today's drives do not have simple geometries; they use zoned bit recording and therefore do not have the same number of sectors for each track, and they use defect mapping to remove bad sectors from use. As a result, their geometry can no longer be described using simple "CHS" terms. These drives must be accessed using logical geometry figures, with the physical geometry hidden behind routines inside the drive controller. For a comparison of physical and logical geometry, see this page on logical geometry.
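To make the effect of zoned bit recording concrete, here is a rough sketch of how capacity would be totalled when the number of sectors per track differs from zone to zone. The zone table, the number of recording surfaces, and the 512-byte sector size are all made-up illustrative values, not the layout of any real drive.

```python
SECTOR_BYTES = 512  # assumed sector size for this example


def zoned_capacity_bytes(surfaces, zones, sector_bytes=SECTOR_BYTES):
    """Total capacity of a drive whose tracks are grouped into zones.

    `zones` is a list of (tracks_in_zone, sectors_per_track) pairs, where
    tracks_in_zone counts the tracks on one surface. Outer zones hold more
    sectors per track than inner ones.
    """
    total_sectors = 0
    for tracks_in_zone, sectors_per_track in zones:
        total_sectors += surfaces * tracks_in_zone * sectors_per_track
    return total_sectors * sector_bytes


# Hypothetical drive: 6 recording surfaces, three zones from outer to inner.
example_zones = [(1000, 400), (1000, 300), (1000, 200)]
capacity = zoned_capacity_bytes(6, example_zones)
```

Because the sectors-per-track figure is a property of each zone and not of the whole drive, a single "S" value cannot describe the real layout; the drive has to present a logical geometry instead.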

Often, you have to request detailed specifications for a modern drive to find out the true physical geometry. Even then you might have problems--I called one major drive manufacturer when first writing the site, and the technician had no idea what I was talking about. He kept giving me the logical parameters and insisting they were the physical ones. Finally, I asked him how his drive could have 16 heads when it had only 3 platters, and he got very confused.

Tip: It's easy to tell if you are looking at physical or logical hard disk geometry numbers. Since no current hard drive has the same number of sectors on each track, if you are given a single number for "sectors per track", that must be a logical parameter. Also, I am aware of no current hard disk product that uses 8 platters and either 15 or 16 heads. However, all modern, larger IDE/ATA hard disks have a nominal logical geometry specification of 15 or 16 heads, so either of those numbers is a dead giveaway.


"Every drive dies; not every drive really lives."
-- Braveheart meets 21st century technology.

A hard disk's performance is its most important characteristic--right up until the point where it stops working. Then, suddenly, you don't care how fast it is--or rather, was. You just want it to start working again. (OK, stop groaning already about the quote.)

Hard Disk Reliability

Many people take their hard disk drives for granted, and don't think about their reliability much (other than worrying about their disk crashing some day). While the technology that hard disks use is very advanced, and reliability today is much better than it has ever been before, the nature of hard drives is that every one will, some day, fail. It is important to understand how and why drives fail, and how to interpret what manufacturers' claims about reliability really mean.

There are a number of different specifications used by hard disk drive manufacturers to indicate the quality and reliability of their products. Some of these, such as MTBF, are frequently discussed (but not always all that well understood). Others are obscure and typically of interest only to hard drive aficionados. All are important to those who care about hard disk quality--which should be anyone who stores data on a hard disk. In this section I discuss the most important of these specifications, what they mean, and perhaps most importantly, what they don't mean! You'll also find some discussion of specifications in the section on quality and reliability issues, particularly temperature specifications and noise specifications.
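As an example of what an MTBF figure does and does not mean, the sketch below converts an MTBF into an approximate annualized failure rate. It assumes a constant failure rate (an exponential model) and drives that run around the clock; both are simplifying assumptions, and the 500,000-hour figure is an illustrative value, not a specification quoted from any manufacturer.

```python
import math

HOURS_PER_YEAR = 8760  # 24 hours x 365 days, i.e. continuous operation


def annualized_failure_rate(mtbf_hours, powered_hours_per_year=HOURS_PER_YEAR):
    """Probability that a drive fails within one year of use.

    Assumes a constant failure rate, so the chance of surviving t hours
    is exp(-t / MTBF).
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    return 1.0 - math.exp(-powered_hours_per_year / mtbf_hours)


def expected_failures(drive_count, mtbf_hours):
    """Expected number of failures per year across a population of drives."""
    return drive_count * annualized_failure_rate(mtbf_hours)


afr = annualized_failure_rate(500_000)       # illustrative MTBF value
per_thousand = expected_failures(1000, 500_000)
```

Under these assumptions, a 500,000-hour MTBF works out to a bit under 2% of drives failing per year of continuous use. That is a statement about a large population of drives, not a promise that any single drive will run for 57 years.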

Note: In addition to the hard-disk-specific numbers explained in this section, hard disks usually come with a number of environmental specifications that dictate how they should and should not be used in order to operate reliably. These are essentially the same as those provided for power supplies. The only caveat about applying the power supply environmental specifications here is that hard drives are more sensitive to altitude than most components and can fail when operated at altitudes over 10,000 feet.

"KILLS - BUGS - DEAD!"
-- TV commercial for RAID bug spray

There are many applications, particularly in a business environment, where there are needs beyond what can be fulfilled by a single hard disk, regardless of its size, performance or quality level. Many businesses can't afford to have their systems go down for even an hour in the event of a disk failure; they need large storage subsystems with capacities in the terabytes; and they want to be able to insulate themselves from hardware failures to any extent possible. Some people working with multimedia files need fast data transfer exceeding what current drives can deliver, without spending a fortune on specialty drives. These situations require that the traditional "one hard disk per system" model be set aside and a new system employed. This technique is called Redundant Arrays of Inexpensive Disks or RAID. ("Inexpensive" is sometimes replaced with "Independent", but the former term is the one that was used when the term "RAID" was first coined by the researchers at the University of California at Berkeley, who first investigated the use of multiple-drive arrays in 1987.)

The fundamental principle behind RAID is the use of multiple hard disk drives in an array that behaves in most respects like a single large, fast one. There are a number of ways that this can be done, depending on the needs of the application, but in every case the use of multiple drives allows the resulting storage subsystem to exceed the capacity, data security, and performance of the drives that make up the system, to one extent or another. The tradeoffs--remember, there's no free lunch--are usually in cost and complexity.
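One way to picture several drives behaving like a single large one is simple striping, where consecutive chunks of the logical address space are dealt out to the member drives in turn. The sketch below shows only that address mapping; it is an assumed, minimal layout with no redundancy, and the function name and parameters are hypothetical rather than the scheme of any particular RAID controller.

```python
def stripe_map(logical_block, num_disks, blocks_per_stripe_unit=1):
    """Map a logical block of the array to (disk_index, block_on_disk).

    Consecutive stripe units are assigned to disks 0, 1, ..., num_disks - 1
    and then wrap around to disk 0 again.
    """
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    if num_disks < 1 or blocks_per_stripe_unit < 1:
        raise ValueError("num_disks and blocks_per_stripe_unit must be >= 1")
    unit_number, offset_in_unit = divmod(logical_block, blocks_per_stripe_unit)
    disk_index = unit_number % num_disks
    # Each full pass over all disks advances every disk by one stripe unit.
    block_on_disk = (unit_number // num_disks) * blocks_per_stripe_unit + offset_in_unit
    return disk_index, block_on_disk


# Three disks, one block per stripe unit: blocks 0..3 land on disks 0, 1, 2, 0.
layout = [stripe_map(block, num_disks=3) for block in range(4)]
```

Because neighbouring chunks sit on different drives, a large transfer can keep several drives busy at once, which is where the performance gain comes from. Arrays that also protect data store extra redundant information on top of a mapping like this.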

Redundant Arrays of Inexpensive Disks

Originally, RAID was almost exclusively the province of high-end business applications, due to the high cost of the hardware required. This has changed in recent years, and as "power users" of all sorts clamor for improved performance and better up-time, RAID is making its way from the "upper echelons" down to the mainstream. The recent proliferation of inexpensive RAID controllers that work with consumer-grade IDE/ATA drives--as opposed to expensive SCSI units--has increased interest in RAID dramatically. This trend will probably continue. I predict that more and more motherboard manufacturers will begin offering support for the feature on their boards, and within a couple of years PC builders will start to offer systems with inexpensive RAID setups as standard configurations. This interest, combined with my long-time interest in this technology, is the reason for my recent expansion of the RAID coverage on this site from one page to 80.

Unfortunately, RAID in the computer context doesn't really kill bugs dead. It can, if properly implemented, "kill down-time dead", which is still pretty good.

The operation of your hard disk drives is controlled by the interface from the system to the hard disk itself. This interface is the conduit for addressing instructions and commands, sent to the hard disk to select what data is requested, and then a conduit for the data itself, flowing to and from the system. The system BIOS plays a role in the operation of the hard disk, as it provides the standard software routines that allow applications and operating systems such as DOS to access the hard disk. It is also the cause of many configuration and capacity limitation problems that many users have when setting up their hard disks, especially newer ones on older systems.

BIOS and operating system

The BIOS and operating system play an important role in how your hard disk is used. While the BIOS itself has taken more of a "back seat" role to direct access by the operating system over the last few years, it is still there "in the mix" in several ways. This section takes a brief look at the impact of the BIOS on hard disk setup and access.

This section takes a look at issues related to how the BIOS and operating system interact with the hard disk, and BIOS-related issues and problems. This includes a full look at the many capacity limitations inherent in using IDE/ATA interface drives, and other BIOS restrictions on hard disk capacity. Many of the items in this section are really of relevance only to IDE/ATA drives; SCSI drives use their own BIOS and a different addressing mechanism from IDE/ATA, and so suffer from fewer of these problems. However, some BIOS issues affect SCSI as well, because of problems associated with operating system limitations.
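A capacity limitation of this kind arises when two layers each cap the cylinder, head and sector numbers differently, so that only the smaller value of each field is usable. The sketch below works through the arithmetic using the commonly cited limits for the BIOS Int 13h interface and the IDE/ATA interface; those figures and the 512-byte sector size are supplied here for illustration and do not come from this page.

```python
SECTOR_BYTES = 512  # assumed sector size

# Commonly cited maximum counts for each addressing field (assumed values).
INT13H_LIMITS = {"cylinders": 1024, "heads": 256, "sectors": 63}
ATA_LIMITS = {"cylinders": 65536, "heads": 16, "sectors": 255}


def chs_capacity_bytes(cylinders, heads, sectors, sector_bytes=SECTOR_BYTES):
    """Capacity addressable with the given cylinder, head and sector counts."""
    return cylinders * heads * sectors * sector_bytes


def combined_limit(*limit_sets):
    """Field-by-field minimum: the geometry every layer can express."""
    fields = ("cylinders", "heads", "sectors")
    return {field: min(limits[field] for limits in limit_sets) for field in fields}


combined = combined_limit(INT13H_LIMITS, ATA_LIMITS)
barrier_bytes = chs_capacity_bytes(**combined)      # 1024 x 16 x 63 x 512
int13h_bytes = chs_capacity_bytes(**INT13H_LIMITS)  # 1024 x 256 x 63 x 512
```

With these figures the combined limit is 528,482,304 bytes (504 MiB), far less than either interface could address on its own, which is why translation schemes that remap the geometry were introduced.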

Hard Disk Interfaces and Configuration

The interface that the hard disk uses to connect to the rest of the PC is in some ways as important as the characteristics of the hard disk itself. The interface is the communication channel over which all the data flows that is read from or written to the hard disk. The interface can be a major limiting factor in system performance. The choice of interface also has an essential impact on system configuration, compatibility, upgradability and other factors.

Over time, several different standards have evolved to control how hard disks are connected to the other major system components used in the PC. These have tended to build upon one another, and often use confusing and overlapping terminology. The result has been a great deal of confusion surrounding the entire subject. Each time a new variant or enhancement of an interface is introduced, the interface becomes just a bit more confusing, particularly for those trying to use older hardware, or to mix newer and older devices.

To help you understand what can be a baffling subject, this section of the site takes a comprehensive look at the different interfaces used to connect hard disks to the PC. I begin by discussing two obsolete interfaces no longer used, and also provide brief coverage of some "alternative" interfaces that are not commonly employed by typical PC users, but are important for special applications. Most of the focus is on the two interfaces most often used on the hard disk. I discuss in detail IDE/ATA and its enhancements, with a focus on clarifying the confusion that surrounds the use of this most popular PC interface. I then cover SCSI, the more advanced and flexible interface that dominates the business workstation and server world, and is becoming the choice of a growing number of performance-oriented desktop PC users.

Note: This part of the site is in the discussion of hard disks, and so they will be my primary focus. However, many other devices use the same interfaces that hard disks do; where appropriate, distinctions between how hard disks and other devices use the interfaces will be specified. Otherwise, you can assume that using the interface for optical drives and similar storage devices will be similar to how hard disks use the interface.

Hard Disk Interface(s)

A hard disk can connect to the rest of the system through one of a few interfaces:
  • (A)dvanced (T)echnology (A)ttachment (also known as IDE, ATAPI and Parallel ATA)
  • (S)erial ATA
  • SCSI (pronounced "scuzzy")
There are variants of each interface, and this article cannot do justice to all the different types of ATA, SATA and SCSI. It therefore highlights only the more common interfaces used by home users.

ATA (IDE, ATAPI, PATA)
  • ATA is a common interface used in many personal computers before the emergence of SATA. It is the least expensive of the interfaces.
Disadvantages
  • Older ATA adapters will limit transfer rates according to the slower attached device (debatable)
  • Only ONE device on the ATA cable is able to read/write at one time
  • Limited standard for cable length (up to 18 inches/46 cm)
Advantages
  • Low costs
  • Large capacity
SATA
  • SATA is basically an advancement of ATA.
Disadvantages
  • Slower transfer rates compared to SCSI
  • Not supported in older systems without the use of additional components
Advantages
  • Low costs
  • Large capacity
  • Faster transfer rates compared to ATA (difference is marginal at times though)
  • Smaller cables for better heat dissipation
SCSI

SCSI is commonly used in servers, and is found more often in industrial applications than in home systems.

Disadvantages
  • Higher cost
  • Not widely supported
  • Many, many different kinds of SCSI interfaces
  • SCSI drives often run at a higher RPM, creating more noise and heat
Advantages
  • Faster
  • Wide range of applications
  • Better scalability and flexibility in arrays (RAID)
  • Backward compatible with older SCSI devices
  • Better for storing and moving large amounts of data
  • Tailor-made for 24/7 operation
  • Reliability