Sunday, August 3, 2008

Byte My Hard Drive

My name is Dan, and I am a music addict. My musical tastes range from blues to reds…That was a nerdy visual spectra pun…Probably too early for that kind of garbage…Come to think of it, there’s never a good time for a visual spectra pun…I’m sorry…You should be getting my formal apology in the mail. Anyway, let’s just say that my music collection is vast. I most likely have whatever you’re thinking of right now. Well, maybe not that. We’re talking about music here, Jesus. For reasons of national security I will not fully disclose the extent of my music library in this forum. Recently I’ve been backing up my older CDs to the computer, and because of that I’m always in the market for more storage space. Hard drives are getting larger and less expensive every day. I ran across this press release the other day, and it got me thinking. What exactly is 1.5 terabytes of information? What can I store on a drive with that much storage space? The amount of information that can be stored on a hard drive is almost made trivial by the drive’s physical size. Something that can fit in my hand can’t be too impressive after all (that’s what she said…heh, heh…ahem…). Or can it? Let’s walk our way up to a terabyte in order to get some perspective on the amount of information that we are talking about, shall we?

Computers make decisions based on two pretty benign numbers: 1 and 0. Similar to a switch, 1 is “on”, and 0 is “off”. It’s obviously more complicated, but that is the general idea. This switch is called 1 bit of information. A byte is made up of 8 bits. One byte is roughly equivalent to one character of text. Not too interesting. The next step on our walk is from bytes to kilobytes. A kilobyte (KB) is 1000 bytes. Each of the posts on this blog contains about 20 KB of information, or 20 KB of B.S. depending on who you talk to. Not particularly impressive, either. So let’s step from KBs to megabytes. Each of these steps is an increase of a factor of 1000, so it follows that there are 1000 KB in a megabyte (meg). At the megabyte scale, we start talking about information in terms of everyday use. The first hard drives available for personal computers were 5 to 10 megs. The pictures from your digital camera are about two megs each. Songs are about 3 or 4 megs when they are compressed. Everything Shakespeare ever wrote can be stored in a 5 meg file. A CD can hold about 700 megs of info. Our next stop along the way is at the gigabyte (gig) scale. Now we’re talking about our usual storage devices. DVDs contain about 4 gigs of data. Standard hard drives on new computers are 80-160 gigs. Media players hold from 1 to 5 gigs or so. One gig can store about 300 songs, so a 250 gig hard drive can hold around 75,000 songs. That’s roughly equivalent to listening to the radio 24/7 for 6 months without any repeat songs or commercials. It’s important to mention here that 1 gig is 1 billion bytes, since the next step on our walk lands us in a place where things can get interesting.
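For anyone who wants to check the song math, here is a rough Python sketch. It uses the decimal units from the walk so far and the 300 songs per gig rule of thumb. The 3.5 minute average song length is my own assumption, not a measured figure, so treat the playback number as a ballpark.

```python
# Decimal storage units: each step up is a factor of 1000.
KB = 1000
MB = 1000 * KB
GB = 1000 * MB
TB = 1000 * GB

SONGS_PER_GB = 300       # rule of thumb from the post
MINUTES_PER_SONG = 3.5   # assumption: typical length of a compressed pop song


def songs_on_drive(drive_bytes: int) -> int:
    """Songs that fit on a drive, using the songs-per-gig rule of thumb."""
    return drive_bytes * SONGS_PER_GB // GB


def days_of_playback(song_count: int) -> float:
    """Days of nonstop listening for the given number of songs."""
    return song_count * MINUTES_PER_SONG / 60 / 24


songs = songs_on_drive(250 * GB)
print(songs)                            # 75000
print(round(days_of_playback(songs)))   # 182, i.e. about six months
```

Change `MINUTES_PER_SONG` to match your own library and the six month figure moves with it.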

One terabyte is 1000 times larger than a gig. That’s 1000 billion bytes of information, or 1 with 12 zeros after it. Just as a reference, the distance from the earth to the sun is about 93 million miles. Multiply that by 10,000, and we’re now in the ballpark of the scale of a terabyte. Here is where it starts to get weird. One terabyte is the equivalent of recording about one meg of information per second for a little under twelve days, or roughly 400 KB per second over the course of a month. This means that with a steady supply of one terabyte hard drives you could conceivably record every second of your life with compressed digital audio and video. On top of that you would still have enough space to include speech-to-text voice recognition, visual text recognition, and keystroke recording for every piece of text that you type. So everything that you say, read or type will be fully searchable and able to be indexed or bookmarked. Every book, sign, newspaper, email, and conversation will be right there for you to recall at a stroke of the keypad: essentially a complete external memory that is 100% accurate. Did she tell me that last week? When did I see that article? Was I talking in my sleep last night? But wait! There’s more! There’s still a little space left over. You can also record some sort of telemetry like heart rate or blood pressure monitoring or GPS tracking on top of it all. Did I like that movie? Does her driving scare me? What store was I in when I saw those assless chaps for $20? All stored and searchable, one terabyte at a time, for the low, low price of a thousand bucks. By the way, by one commonly quoted estimate, ten terabytes will hold the entire printed collection of the Library of Congress.
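Here is a back-of-the-envelope sketch of how long a drive lasts when you record at a steady rate, and what rate you can afford if you want the drive to last a given number of days. The function names are made up for illustration, and it assumes decimal units and a constant data rate, which real audio and video recordings won't have.

```python
TB = 10**12   # one terabyte in bytes (decimal)
MB = 10**6    # one megabyte in bytes (decimal)

SECONDS_PER_DAY = 24 * 60 * 60


def days_to_fill(capacity_bytes: float, bytes_per_second: float) -> float:
    """Days of continuous recording before the drive is full."""
    return capacity_bytes / bytes_per_second / SECONDS_PER_DAY


def rate_for_duration(capacity_bytes: float, days: float) -> float:
    """Steady recording rate (bytes per second) that fills the drive in `days`."""
    return capacity_bytes / (days * SECONDS_PER_DAY)


print(round(days_to_fill(TB, MB), 1))            # 11.6 days at one meg per second
print(round(rate_for_duration(TB, 30) / 1000))   # 386 KB per second to last 30 days
```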

From what’s available in the present we can take a peek into the future. The next step from a terabyte is a petabyte. Servers of this size should be available in 5 or 10 years, I would guess. One petabyte is enough storage to dedicate roughly 150 KB of information to every human on the planet; giving everyone a full meg would take about seven petabytes. That would be the equivalent (by today’s standards) of a directory of every person on the planet with enough room for a picture and a sound clip or video. Two petabytes will store the information in all of the academic research libraries in the U.S. If we stretch a bit further, we reach the exabyte, a billion billion bytes (10^18). Nothing too interesting about an exabyte…oh, except that 5 exabytes of storage will hold enough information to record all of the words ever spoken by humans…fully searchable. All compressed into a storage device that can fit into the palm of my hand. It may take a couple of years to perform a search with today’s technology, but I think you get the point.
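If you want to play with the per-person numbers yourself, here is a small sketch. The world population figure is my assumption (roughly the 2008 estimate of 6.7 billion), and the function names are just for illustration.

```python
MB = 10**6    # megabyte (decimal)
PB = 10**15   # petabyte (decimal)

WORLD_POPULATION = 6_700_000_000   # assumption: rough 2008 estimate


def bytes_per_person(capacity_bytes: int, population: int = WORLD_POPULATION) -> int:
    """Whole bytes available to each person if the storage is split evenly."""
    return capacity_bytes // population


def petabytes_needed(bytes_each: int, population: int = WORLD_POPULATION) -> float:
    """Petabytes required to give every person `bytes_each` bytes."""
    return population * bytes_each / PB


print(bytes_per_person(PB))    # 149253 bytes, about 150 KB each
print(petabytes_needed(MB))    # 6.7 petabytes for one meg each
```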

Now that we have a grasp on the numbers, let’s talk about the implications of something as seemingly benign as digital storage capacity. Apart from storing a brazillion songs, all of your ripped DVDs, porn, and digital publications like journals and books, there are unintended consequences of throwing around this much information, as I hinted at above. Our entire lives, or the lives of our children, will be capable of being stored, searched, and indexed without consent. The trends found in that information can be used to track movements and to make predictions about things like future spending, voting habits, or healthcare expenses. Information about every purchase, digital mammogram, x-ray, or MRI will be (already is) available at the touch of a button and will be passed around like a bag of Oreos at fat camp. Prices and availability of services can then be tailored to a specific individual (your cost is $42.37, or sorry, we can’t offer that insurance package to you). It’s not too far of a leap to suppose that small concepts like “private” conversations or “secret” information will fade away eventually, and the idea of personal privacy will be distorted beyond the point of recognition relative to what we consider private today. All directly affected by something as simple as hard drive storage space. Not too much of a leap, is it? Hey, but at least I can listen to the complete Milli Vanilli recordings in three languages whenever I want to.
