Anatomy of Hard Disk Drives – A Deep Look into Hard Drives

By: Brenton Blawat

Before getting into which hard disk technologies are the best to pursue, I am going to explain the different parts of a hard disk. Understanding how the disks work is essential once we get into configuring them. This is really simple stuff!

http://brentblawat.files.wordpress.com/2008/08/disk2-thumb.jpg?w=881&h=536

Shown above is the basic architecture of a hard disk drive. There are tracks, sectors and heads:

  • Tracks are the concentric rings on the disk surface. They can be thought of as guides for where the data should be placed on the hard drive.
  • Sectors are segments of a track where the 1s and 0s are stored. Dividing the tracks this way allows the disk head to SKIP over data that it doesn’t need to read.
  • The Head is what reads the data (1s and 0s) from each sector.

Simple enough?

 

http://brentblawat.files.wordpress.com/2008/08/disk3-thumb.jpg?w=842&h=444

Ever wonder why hard drives are so thick? Shown above is a 3D look into a hard drive. Most modern hard drives (not solid state drives) are made with multiple Platters. Each Platter in this topology has its own Tracks, Sectors, and Head.

Still Simple?

http://brentblawat.files.wordpress.com/2008/08/diskspeed-thumb1.jpg?w=789&h=725

Now we need to talk about hard disk Speed. The “Speed”, as most people refer to it, is measured in revolutions per minute (RPM): the number of times a platter completely rotates in a 60 second period of time.

The actual Speed is a more complex matter which involves RPM, Seek Time, Rotational Latency, Interface Type, and Access Time. With these individual items combined, one can calculate the theoretical transfer rate of the hard drive.

So what is the difference between all of these things?

  • Revolutions Per Minute – RPM is the number of times a platter completely rotates in a 60 second period of time. (yes I repeated it)
  • Rotational Latency (delay) – RPM and Rotational Latency are directly related. The Rotational Latency is the average time it takes for the desired sector to rotate around to the Head, which works out to the time for half a revolution. The faster the hard drive is spinning, the less time the Head spends waiting.

    RPM (measured at spindle)    Average Rotational Latency
    4200                         7.14 ms
    5400                         5.56 ms
    7200                         4.17 ms
    10000                        3.00 ms
    15000                        2.00 ms

 

  • Seek Time – Seek Time is the average time it takes for the mechanical arm to move between different Tracks on a hard drive. This is important when data is scattered between different tracks on a hard disk. The faster the mechanical arm can move between the tracks, the faster the data can be accessed for use.

 

  • Interface Type – The Interface Type was once thought to be the biggest “speed” issue with hard disk drives. Since the interfaces were so slow (Ultra ATA/133 topped out at 133 MB/s), the theoretical speed was limited by the interface. However, with the introduction of Serial ATA, the interface rate has increased to 3.0 Gb/s (roughly 300 MB/s), which currently (2008) is not touched by the sustained transfer rates of modern hard disk drives.

 

 

  • Access Time – The Access Time is a combined metric. It is roughly the Seek Time plus the Rotational Latency (which in turn is set by the RPM) of a hard disk drive. This is why critics are so concerned about the Access Time of hard drives: it provides the best rounded number for measuring hard drive performance. Hardware manufacturers like to quote Seek Time, as it’s a mechanical measurement, much like horsepower is for automobiles.
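To make the numbers in the list above concrete, here is a minimal Python sketch. It assumes that average rotational latency is the time for half a revolution (which matches the table) and that access time can be approximated as average seek time plus average rotational latency, ignoring controller overhead and transfer time. The function names and the 8.5 ms seek figure are illustrative assumptions, not values from any datasheet.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: the time for half a revolution."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def avg_access_time_ms(avg_seek_ms: float, rpm: float) -> float:
    """Approximate access time: average seek plus average rotational latency."""
    return avg_seek_ms + avg_rotational_latency_ms(rpm)


# Illustrative seek time only (an assumption, not a measured value).
ASSUMED_SEEK_MS = 8.5

for rpm in (4200, 5400, 7200, 10000, 15000):
    latency = avg_rotational_latency_ms(rpm)
    access = avg_access_time_ms(ASSUMED_SEEK_MS, rpm)
    print(f"{rpm:>6} RPM  latency {latency:5.2f} ms  access {access:5.2f} ms")
```

With the seek time held constant, going from 7200 to 15000 RPM only trims about 2 ms from each access, which is why Seek Time matters as much as RPM.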

 

What else can affect the “Speed” of a hard disk?

 http://brentblawat.files.wordpress.com/2008/08/disk5-thumb.jpg?w=755&h=587

When hard disk manufacturers determined that the hard disk is the slowest part of a computer, they started to integrate Read/Write Caches on the hard disk itself. A Read/Write Cache allows small amounts of data to be held in memory and written to the disk later, when the hard drive is no longer busy performing other tasks. This significantly improved system performance, as the CPU doesn’t have to wait on the write operation to complete prior to continuing with other processes.

Well, there is an issue with this… Since the hard drive is powered from the power supply, if the power fails or a power surge occurs, any data that is stored in the read/write cache is gone. That is why the industry is currently not releasing hard drives with 1 GB of write cache. The loss of 1 GB of unwritten data is significant to the stability of a system (and your sanity). Motherboard manufacturers are now providing the ability to integrate a Battery Backed Cache on the motherboard to hold the write cache. This not only significantly improves the speed of the system, but even in the event of a power failure, the data is safe for 72+ hours. This is similar to what GOOD RAID controllers (discussed in a different article) do.

 

http://brentblawat.files.wordpress.com/2008/08/disk6-thumb.jpg?w=965&h=701

I always like to use the “compiling a program in Visual Studio” example when explaining battery backed cache (BBC), because compiling code is a very hard disk intensive operation. (Please keep in mind this example is a very high level look at what a write cache does.)

There are three core operations that occur when compiling code:

  1. Reading the lines of code line by line
  2. Processing those lines of code and outputting their results
  3. Writing the results back to the hard disk

In systems without battery backed cache, the computer will read a line of code from the hard drive, process the line, write the result to the hard drive, then “rinse and repeat”. This causes the hard drive to STOP reading from the current sector, jump to a different track, write to that track, and then, when the next line of code is needed, jump back to the original track. This creates a situation called Disk Thrashing, where the disk keeps alternating reads and writes between two different physical locations.

With the write cache, the computer no longer has to wait on the disk for each write, so all three of the operations can overlap while compiling code.

1. Compiling code

  • Reading the Lines of code line by line
  • Processing those lines of code and outputting their results
  • Writing the results to the battery backed write cache

2. Writing the Write Cache to the hard disk.

Your system no longer has to wait for the hard drive to complete a write to the disk prior to reading more information from the hard disk. There is more complexity to the write cache operations but to save your sanity, I’ve chosen to leave them out.
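The difference can be illustrated with a toy model in Python that counts how often the head has to change tracks. This is only a sketch under simplifying assumptions: the source file and the output file each live on a single track, the cache flushes all buffered writes in one batch once it is full, and the track numbers, cache size, and function name are made up for the example.

```python
def count_seeks(operations, cache_size=0):
    """Count track changes for a sequence of (kind, track) operations.

    kind is "read" or "write". With cache_size > 0, writes are buffered
    and flushed to the disk in one batch when the buffer fills up, and
    once more at the end.
    """
    head = None      # track the head is currently over
    seeks = 0
    pending = []     # tracks of writes waiting in the cache

    def move_to(track):
        nonlocal head, seeks
        if head is not None and head != track:
            seeks += 1
        head = track

    def flush():
        for track in pending:
            move_to(track)
        pending.clear()

    for kind, track in operations:
        if kind == "write" and cache_size > 0:
            pending.append(track)
            if len(pending) >= cache_size:
                flush()
        else:
            move_to(track)
    flush()
    return seeks


SOURCE_TRACK, OUTPUT_TRACK = 10, 500   # hypothetical track numbers
compile_ops = []
for _ in range(100):                   # 100 "lines of code"
    compile_ops.append(("read", SOURCE_TRACK))
    compile_ops.append(("write", OUTPUT_TRACK))

print("seeks without cache:", count_seeks(compile_ops))
print("seeks with cache:   ", count_seeks(compile_ops, cache_size=25))
```

In this model the uncached run moves the head on almost every operation, while the cached run only moves it when a batch of writes is flushed.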

 

Circling back to Disk Thrashing for a second… It’s important!

http://brentblawat.files.wordpress.com/2008/08/diskthrash-thumb.jpg?w=545&h=672

 

Disk Thrashing actually occurs more frequently than most people think. In fact, as long as you have a page file in use on your computer, you will have Disk Thrashing. Why? When a page file is enabled on a computer, memory that does not fit in RAM is written out to a file on the root of c:\ named pagefile.sys. As shown in the image above, when utilizing a program which uses a lot of memory, the head has to move between the pagefile.sys and code.vb files.

This is Disk Thrashing: the head thrashes between the two tracks on the same platter. This is what makes that clicking or “thinking” noise in your hard drive. Recalling Seek Time / Access Time, you will know that when using multiple files, the head has to seek between the different sectors on the different tracks. Every time it seeks to a new track, it takes X number of ms to get to the sector, which means you have to wait X number of ms each time the drive switches between files.

This significantly reduces the performance of the computer and ultimately reduces the longevity of the hard disk. The actuator that moves the mechanical arm on the hard drive can fail when excessive disk thrashing occurs over multiple years.
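To put a rough number on the “X number of ms” wait, here is a small Python sketch. It assumes each file sits on its own track, that every switch between files costs one average seek, and it uses an illustrative 9 ms seek time. The function name and the access patterns are invented for the example.

```python
def thrashing_wait_ms(access_pattern, seek_ms):
    """Total time spent seeking, given the order in which files are accessed.

    Assumes each file lives on its own track, so every change of file
    costs one seek of seek_ms milliseconds.
    """
    switches = sum(
        1 for prev, cur in zip(access_pattern, access_pattern[1:]) if prev != cur
    )
    return switches * seek_ms


ASSUMED_SEEK_MS = 9.0  # illustrative value only

# 100 accesses alternating between the two files (thrashing)...
interleaved = ["code.vb", "pagefile.sys"] * 50
# ...versus the same 100 accesses grouped by file.
grouped = ["code.vb"] * 50 + ["pagefile.sys"] * 50

print("interleaved:", thrashing_wait_ms(interleaved, ASSUMED_SEEK_MS), "ms")
print("grouped:    ", thrashing_wait_ms(grouped, ASSUMED_SEEK_MS), "ms")
```

The same amount of data is touched in both cases; only the ordering changes how much time is lost to moving the arm.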

NOTE: This is NOT a reason to disable your page file. A page file is required in systems which frequently run out of RAM. Also, this issue may not occur if the page file is on a different platter than the data being accessed, and it does not occur if the page file is stored on a different physical drive than the drive being accessed.