
Hard Disk Drive Optimization

by Herb Wong, ocug@singularitytechnology.com - June 01, 2002 at 22:59:01:


Hard disk drives have evolved into highly reliable and extremely inexpensive mass storage devices. A few simple ideas may help improve your daily computing experiences.

Size Matters Computer science has a standard set of definitions for quantities. A thousand bytes is 1,000 bytes (ten to the third power). A kilobyte is 1,024 bytes (two to the tenth power). A million bytes is 1,000,000 bytes (ten to the sixth power; 1,000 times 1,000) and a megabyte is 1,048,576 bytes (two to the twentieth power; 1,024 times 1,024). Finally, a billion bytes is 1,000,000,000 bytes (ten to the ninth power; 1,000 times 1,000 times 1,000), at least in the United States, and a gigabyte is 1,073,741,824 bytes (two to the thirtieth power; 1,024 times 1,024 times 1,024).

Computer marketing sells you short. Advertising redefines many standard computer science terms. A kilobyte becomes 1,000 bytes, a megabyte becomes 1,000,000 bytes, and a gigabyte becomes 1,000,000,000 bytes.

In other words, advertising treats 1,000 as equal to 1,024, 1,000,000 as equal to 1,048,576, and 1,000,000,000 as equal to 1,073,741,824. Of course, this works in the favor of the manufacturer and shortchanges you. An advertised “twenty gigabyte” drive should be almost one and a half billion bytes larger.
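The arithmetic behind the “twenty gigabyte” figure can be checked with a short Python sketch. The function names are illustrative only and are not part of any utility.

```python
# Decimal (marketing) units versus binary (computer science) units.
DECIMAL_GB = 1000 ** 3   # 1,000,000,000 bytes
BINARY_GB = 1024 ** 3    # 1,073,741,824 bytes


def shortfall_bytes(advertised_gb):
    """Bytes missing when 'gigabyte' is taken to mean one billion bytes."""
    return advertised_gb * (BINARY_GB - DECIMAL_GB)


def binary_gigabytes(advertised_gb):
    """Capacity of an advertised drive, measured in binary gigabytes."""
    return advertised_gb * DECIMAL_GB / BINARY_GB


print(f"{shortfall_bytes(20):,}")     # 1,474,836,480
print(f"{binary_gigabytes(20):.2f}")  # 18.63
```

So a drive sold as twenty gigabytes holds about 18.63 gigabytes in the binary sense.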

To be clear and accurate, distinguish between one thousand bytes and one kilobyte, one million bytes and one megabyte, and one billion bytes and one gigabyte whenever possible.

Partitions The foundation of a hard disk drive's file system is a partition. The framework is the formatting of logical drives (C:, D:, E:, etc.). There are two types of partitions under Windows, primary and extended.

A primary partition can contain the boot drive (which can load the operating system at startup) after formatting (with the FORMAT utility). An extended partition can contain one or more logical drives that cannot be boot drives. The partitions and logical drives are created using the FDISK utility. The logical drives must be formatted with the FORMAT utility before they can contain data.

Under older versions of Windows, a physical hard disk drive can contain a single primary partition, a single extended partition, or both a primary partition and an extended partition. Technically, those are the only things that Windows can recognize. The rest of the world is decades more advanced.

Under older versions of Windows (and DOS), logical drive letters are assigned (at least initially during OS installation) to hard disk drives according to a few simple rules. The physical hard drives are inspected in sequence (port 0's master, port 0's slave, then port 1's master, port 1's slave, etc.) and logical drive letters are first assigned to each primary partition that is found.

Next, the first physical hard drive is inspected again and logical drive letters are assigned to any logical drives contained in its extended partition (if one exists). Each of the remaining physical hard disk drives is then inspected the same way, in sequence (port 0's master, port 0's slave, then port 1's master, port 1's slave, etc.).
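The two passes described above can be sketched in a few lines of Python. This is only a model of the rules as stated here, not the actual Windows or DOS code; the tuple layout and the disk names are assumptions made for illustration.

```python
from string import ascii_uppercase


def assign_letters(disks):
    """Assign letters using the two passes described in the text.

    disks: list of (name, has_primary, logical_count) tuples, already in
    port order (port 0 master, port 0 slave, port 1 master, ...).
    """
    letters = iter(ascii_uppercase[2:])  # hard disk letters start at C
    assigned = []
    # Pass 1: every primary partition, in port order.
    for name, has_primary, _ in disks:
        if has_primary:
            assigned.append((next(letters) + ":", name, "primary"))
    # Pass 2: logical drives inside each extended partition, in port order.
    for name, _, logical_count in disks:
        for number in range(1, logical_count + 1):
            assigned.append((next(letters) + ":", name, f"logical {number}"))
    return assigned


# Two disks, each with a primary partition and two logical drives.
for row in assign_letters([("first disk", True, 2), ("second disk", True, 2)]):
    print(*row)
```

With these two disks, C: and D: go to the two primary partitions, E: and F: to the first disk's logical drives, and G: and H: to the second disk's logical drives.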

I suggest labeling logical drives with a naming convention to facilitate having a collection of drives in a computer or network. I abbreviate the manufacturer's name, drive capacity, number of the drive (first, second, third, etc.), and logical drive number (a primary partition's logical drive number is given 0 and an extended partition's logical drive starts at 1).

Suppose a computer contains a Seagate 80 gigabyte drive and a Maxtor 40 gigabyte drive. Both contain a single primary partition and an extended partition with two logical drives. The drive letters C: through H: would go to: SEA80GB#1P0, MX40GB#1P0, SEA80GB#1P1, SEA80GB#1P2, MX40GB#1P1, and MX40GB#1P2, respectively.

Suppose the same computer contains a Seagate 80 gigabyte drive and a Maxtor 40 gigabyte drive. Now the Seagate contains a single primary partition and an extended partition with two logical drives, and the Maxtor contains only an extended partition with three logical drives. The drive letters C: through H: would go to: SEA80GB#1P0, SEA80GB#1P1, SEA80GB#1P2, MX40GB#1P1, MX40GB#1P2, and MX40GB#1P3, respectively.

This (or a similar) naming convention helps to easily identify where data resides. On many occasions, the label has told me that I was in the wrong folder on the wrong drive.
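A small Python sketch of building such labels follows. The function name and its parameters are made up for illustration. The length check reflects the 11-character limit on FAT volume labels, which the longest label above just fits.

```python
MAX_LABEL_LENGTH = 11  # FAT volume labels hold at most 11 characters


def make_label(maker, capacity_gb, drive_number, logical_number):
    """Build a label such as SEA80GB#1P0.

    logical_number is 0 for the primary partition's drive and counts up
    from 1 for logical drives inside the extended partition.
    """
    label = f"{maker}{capacity_gb}GB#{drive_number}P{logical_number}"
    if len(label) > MAX_LABEL_LENGTH:
        raise ValueError(f"{label!r} is longer than {MAX_LABEL_LENGTH} characters")
    return label


print(make_label("SEA", 80, 1, 0))  # SEA80GB#1P0
print(make_label("MX", 40, 1, 2))   # MX40GB#1P2
```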

Change Drive Letter And Path The first thing that I do after installing Windows is to change the drive letters of all the CD-ROM and DVD-ROM (read only memory) drives and CD-R/RW drives. I change the first CD-ROM to drive R:, the DVD to drive V: (or to drive R: if there is no CD-ROM), and the CD-R/RW to drive W:. Now I can put an optical disc into any machine and know what drive letter it is.

If you don't change the drive letters as above, bad things can happen. If you add another hard disk drive, that hard disk drive's letters will appear after the CD-ROM's.

ScanDisk Microsoft's ScanDisk is a lightweight utility that is best known for automatically running after an improper exit from Windows. The most common causes are pressing the hardware reset button due to a frozen system, a “blue screen of death” crash, or worse.

A sector is the basic storage unit of 512 bytes. A cluster is a series of logical sectors whose total size is predetermined by the operating system’s file system. For example, FAT32 (ideally) uses 4-kilobyte clusters on partitions that are less than 8 gigabytes, 8-kilobyte clusters on partitions that are less than 16 gigabytes, 16-kilobyte clusters on partitions that are less than 32 gigabytes, and 32-kilobyte clusters on partitions that are greater than 32 gigabytes. The hard disk drive may not physically store data in this manner, but may perform translations as needed.
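The cluster sizes listed above can be written as a small lookup, sketched here in Python. The function names are illustrative, and treating a partition of exactly 32 gigabytes like a larger one is an assumption, since the text does not cover that case.

```python
SECTOR_BYTES = 512


def fat32_cluster_kb(partition_gb):
    """Cluster size in kilobytes for a FAT32 partition of the given size."""
    if partition_gb < 8:
        return 4
    if partition_gb < 16:
        return 8
    if partition_gb < 32:
        return 16
    return 32  # assumption: exactly 32 GB is treated like larger partitions


def sectors_per_cluster(partition_gb):
    """Number of 512-byte sectors that make up one cluster."""
    return fat32_cluster_kb(partition_gb) * 1024 // SECTOR_BYTES


for size_gb in (6, 12, 20, 80):
    print(size_gb, fat32_cluster_kb(size_gb), sectors_per_cluster(size_gb))
```

For example, an 80-gigabyte partition would use 32-kilobyte clusters of 64 sectors each, so even a one-byte file occupies 32 kilobytes on it.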

Windows uses a FAT (file allocation table) to keep track of the clusters that belong to each file. Every file has a directory entry that contains a pointer to the first cluster of the file itself. If that cluster is not large enough to contain the whole file, the FAT entry for the cluster points to the next cluster in the chain. It is possible for a file to be composed of millions of clusters.
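Following a chain of clusters can be sketched in Python. This is a simplified model: the dictionary `table` and the `None` end marker are assumptions for illustration, whereas a real FAT stores fixed-width entries on disk with reserved end-of-chain values.

```python
END_OF_CHAIN = None  # hypothetical end marker for this sketch


def follow_chain(next_cluster, first_cluster):
    """Return the clusters of one file, in order, starting at first_cluster."""
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster is not END_OF_CHAIN:
        if cluster in seen:
            raise ValueError(f"chain loops back to cluster {cluster}")
        seen.add(cluster)
        chain.append(cluster)
        cluster = next_cluster[cluster]
    return chain


# A three-cluster file scattered over clusters 2, 5, and 3.
table = {2: 5, 5: 3, 3: END_OF_CHAIN}
print(follow_chain(table, 2))  # [2, 5, 3]
```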

ScanDisk’s primary function is to test the chain of clusters that make up each file. A file that contains a pointer to a cluster owned by another file is a cross-linked file. An allocated cluster that is not contained in a valid file is a lost cluster. A FAT entry may be invalid if it does not point to a file.
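The two checks just described, cross-linked files and lost clusters, can be illustrated with a toy model in Python. This is not how ScanDisk itself is implemented; the table layout, file names, and function names are assumptions made for the sketch.

```python
def follow_chain(next_cluster, first_cluster):
    """Walk one file's chain; stop at the end marker (None) or on a loop."""
    chain = []
    cluster = first_cluster
    while cluster is not None and cluster not in chain:
        chain.append(cluster)
        cluster = next_cluster.get(cluster)
    return chain


def check_disk(next_cluster, files):
    """files maps a filename to its first cluster.

    Returns (cross_linked, lost): clusters claimed by more than one file,
    and allocated clusters that no file reaches.
    """
    owners = {}
    for name, first in files.items():
        for cluster in follow_chain(next_cluster, first):
            owners.setdefault(cluster, []).append(name)
    cross_linked = {c: names for c, names in owners.items() if len(names) > 1}
    lost = sorted(set(next_cluster) - set(owners))
    return cross_linked, lost


# Cluster 4 is shared by both files; clusters 8 and 9 belong to no file.
table = {2: 3, 3: 4, 4: None, 5: 4, 8: 9, 9: None}
files = {"A.TXT": 2, "B.TXT": 5}
print(check_disk(table, files))  # ({4: ['A.TXT', 'B.TXT']}, [8, 9])
```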

Run ScanDisk after occasions when Windows was not able to complete a “Shutdown” instruction by itself.

Defragmentation Files are created and deleted in the normal course of operation. Some files are persistent and some are temporary (for use during the current session or for some short operation).

Imagine a wall with many random groups of bricks (deleted file clusters) removed. When a large new batch of bricks (the clusters in a single new file) comes in, the bricks fill the random openings wherever they fit. These new bricks now appear in groups on various different rows.

In a hard disk drive, bricks are clusters and the many rows of bricks are tracks. The read/write heads take a long time (for computers) to step from track to track. A file read/write operation is fastest if the file is in consecutive clusters and in adjacent tracks. If the clusters in a single file are scattered randomly about the disk, reading/writing will take a large amount of time since stepping is so slow.
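One rough way to put a number on fragmentation is to count how many separate runs of consecutive clusters a file occupies. The Python sketch below uses made-up cluster numbers and is not how any defragmentation utility measures it.

```python
def count_fragments(clusters):
    """Count runs of consecutive cluster numbers in a file's chain."""
    if not clusters:
        return 0
    fragments = 1
    for previous, current in zip(clusters, clusters[1:]):
        if current != previous + 1:
            fragments += 1  # the heads must jump to a new location here
    return fragments


print(count_fragments([10, 11, 12, 13]))        # 1
print(count_fragments([10, 11, 40, 41, 7, 8]))  # 3
```

A file in one fragment can be read in a single pass, while each additional fragment costs at least one extra head movement.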

Periodically running the defragmentation utility removes the gaps in fragmented files. Disk performance can be noticeably improved. Do not reset or turn off the power to the computer until you have terminated the defragmentation. Nasty things can happen if you disrupt a defragmentation.

Microsoft claims to enhance load times on some of their operating systems by moving boot files to the outer tracks of the disk (since they have the fastest data transfer rate). In addition, they claim that some files (such as .DLL) are intentionally split up so that important sections of the file are moved to the outer tracks for speed. All of this rearranging of files is “done in the background.” This explains some of the mysterious and otherwise inexplicable hard disk drive activity that occurs on some computers.

Many people maintain a small logical drive as a place to hold files before burning them to a CD-R/RW. The files are subsequently deleted. Rumor has it that some file systems do not efficiently clean up after such use. It is claimed that the logical drive must be formatted (or defragmented) again to be sure that the drive is truly defragmented.

Two Heads Are Better Than One A system that has two physical hard disk drives can be slightly optimized by changing a few system defaults. Under Windows, the boot drive (drive C:) is the default drive for temporary files (ex. - C:\temp) and system memory-page swap-files (ex. – pagefile.sys). Writing these files can be significantly faster if these defaults are changed to another physical hard disk drive.

The explanation is quite simple. As the first physical hard disk drive’s read/write heads are locating specific files, the second physical hard disk drive’s read/write heads can quickly and efficiently move to other files without the interference that would result if only a single drive were serving the same purpose.

Cable Select The low cost of hard disk drives allows even basic systems to have two devices. Since there is not a Microsoft operating system that fills everyone’s needs, many people want to run two different operating systems on one computer.

The lowest hardware cost technique is to use two existing drives and select the boot operating system through a minor change in the BIOS. This circumvents the need to purchase additional software or drive bays with removable drive chassis.

Follow these procedures. Remove every hard disk drive, except for the target hard disk drive. Set the jumpers on the hard disk drive to master. Create the primary and extended partitions as desired. Install your operating system of choice on the primary partition and format the logical drives as needed. Test this configuration to your satisfaction.

Repeat this procedure for the second target drive. You can install any operating system you choose on the second drive.

Change the jumpers on both hard disk drives to the cable select setting. The jumper settings are printed on most newer hard disk drives. Install both of these hard disk drives on the same ribbon cable on the primary port (usually labeled “port 0”) of the motherboard.

You can now change the hard disk drive to boot from by selecting the appropriate drive from the BIOS (CMOS setup). In the AwardBIOS Setup Utility, select the “Boot” menu. Under the “IDE Hard Disk” item, select the desired hard disk drive as the onscreen instructions indicate. For example, I have a Seagate drive indicated by “[ST380021A]” next to the “IDE Hard Disk” text. The secondary drive might be indicated by something like “[MX548075A].” To boot from this device, change the setting to “IDE Hard Disk [MX548075A].”

Using the cable select feature of the ATA hard disk drive to change boot devices may not be the fastest way to change between operating systems. However, if you want a very inexpensive hardware solution, it does not get any better than this.

I left out some minor installation details. If you have further questions and do not think that you can get this to work, then you probably cannot. Any questions about using this cable select technique will be answered with instructions to read this article. There are too many variables involved in this type of system configuration to attempt to provide useful answers. However, any further information about this little-documented feature will be used to update this page.

Performance A hard disk drive transfers data faster from the outer tracks than the inner tracks. The outer tracks are longer and contain more data. As a result, for each revolution, outer tracks transfer more data. The outer tracks are filled first.

Sooner or later, you’ll fill up your hard disk drive. It seems to have gotten slower and slower. It has! The inner tracks on a hard disk drive may transfer data about half as fast as the outer tracks!
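The reason is simple arithmetic: the platters spin at a constant speed, so the transfer rate of a track is its capacity times the number of revolutions per second. The Python sketch below uses hypothetical figures (7,200 rpm, 800 sectors on an outer track, 400 on an inner track) that are not taken from any real drive's specifications.

```python
SECTOR_BYTES = 512


def track_rate_mb_per_s(sectors_per_track, rpm):
    """Rate for one track: bytes per revolution times revolutions per second."""
    bytes_per_revolution = sectors_per_track * SECTOR_BYTES
    return bytes_per_revolution * (rpm / 60) / 1_000_000


# Hypothetical drive: an outer track holds twice as many sectors as an inner one.
outer = track_rate_mb_per_s(800, 7200)
inner = track_rate_mb_per_s(400, 7200)
print(f"outer {outer:.1f} MB/s, inner {inner:.1f} MB/s, ratio {inner / outer:.2f}")
```

With half as many sectors per track, the inner track moves half as much data per revolution, which is the "about half as fast" effect described above.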

One way to minimize this effect is to create several logical drives (on a physical disk) and use the second one (ex. – drive D:). Now you can put all of the other junk files on the inner tracks of drive E:, F:, G:, etc. Drive D: remains towards the outer tracks and your important data is faster.

When you buy a hard disk drive, look for the sustained transfer rate (data transfer rate). Of any single specification, this one will give you the best indication of a hard disk drive's performance. Average access time is also important for random access applications (such as databases).

Safety First Do not mix the data that you create and cherish with installed program files! Create several logical drives and store the valuable things on a separate drive. Now it is easy to identify all of your important data. It is anything on drive D:!

Going one step further, I create a subdirectory (folder) called D:\Herbert Wong, Jr. (you might call your folder something else) that contains other folders of the supremely important data (.\telephone numbers, .\finances, etc.). Another folder (D:\ASUSP3V4X-SystemInstallationFiles) might contain every driver and document file needed to reinstall that computer.

Now I know where to find everything easily. I can easily determine if it will fit on a CD-R. And, on those many occasions over the years, when Windows self destructs, I can confidently format and reinstall on drive C: without fear of losing anything of much importance.
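Checking whether a folder will fit on a CD-R is easy to automate. The Python sketch below assumes a 650-megabyte disc and ignores the overhead of the disc's own file system, so treat the answer as approximate. The folder path in the final comment is hypothetical.

```python
import os

CD_R_BYTES = 650 * 1024 * 1024  # assumption: a 650-megabyte CD-R


def folder_bytes(root):
    """Total size of every file under root, in bytes."""
    total = 0
    for folder, _subfolders, filenames in os.walk(root):
        for filename in filenames:
            total += os.path.getsize(os.path.join(folder, filename))
    return total


def fits_on_cd_r(root, capacity=CD_R_BYTES):
    """True if everything under root is no larger than the disc capacity."""
    return folder_bytes(root) <= capacity


# Hypothetical usage: fits_on_cd_r(r"D:\data")
```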

Relocate your MY DOCUMENTS and Favorites folders to another drive (ex. – E:\MY DOCUMENTS). Then you will not even lose those files during an emergency.

The best way to back up a hard disk drive is to use another hard disk drive as the storage medium. The price of tape backup drives and blank tapes is much too great to be effective in a home environment. CD-R/RW and DVD-R/RW drives and media are also too expensive.

Conclusion Hard disk drives are larger, less expensive, and more reliable than ever before. With a little planning and a lot of maintenance, a new hard disk drive will work well for years.

This article first appeared in the North Orange County Computer Club’s (http://www.noccc.org/) Orange Bytes newsmagazine for May 2002. You can find the latest version of this article at http://www.singularitytechnology.com/articles/HDDOptimization.html. You can contact me, Herbert Wong, Jr. at NOCCCArticles@SingularityTechnology.com.





Last update: 6/01/2002

Copyright © 1995-2002 by North Orange County Computer Club. All rights reserved. Articles by NOCCC authors may be reprinted by other user groups without permission provided they are unaltered and the publication acknowledges the author thereof and NOCCC. Articles contained herein by authors from other organizations retain their original copyright.