The Next Unsolved Problem in Computing

I have a prediction: Within two years, someone will solve the next big unsolved problem in computing - What the hell do we do about storage?

Laptop drives are tanking. Perpendicular recording helps a bit, but speeds are way down. There are standard 3.5″ terabyte hard drives now, and two HD blue laser disc formats are about to become recordable. We’re filling up hard drives just like we did yesterday. I am assured that sizable drives will keep coming in volume (pun unintended), but there’s a problem that’s much more urgent than before.

You have a database-backed music library. (Shouldn’t be hard to postulate.) Thanks to the internet (legal, illegal, doesn’t matter), thanks to MP3 ripping from CDs, tapes and LPs, and thanks to better quality, it’s exploding in size. You can no longer fit it on your internal drive. On a desktop computer, it’s no problem, but on a laptop you’ll want to stuff it on an external drive. How do you keep the rest of the database from being completely crippled when part of the library is absent?

(Yes, yes, you can opt to not use a database for your music. At this point, the whole database vs tree of folders argument is irrelevant: the music library is just a symptom, and the same problem shows up at larger scales.)
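
To make the music library case concrete, here is a minimal sketch of a catalog that records which drive each track lives on and keeps answering searches when one of those drives is unplugged. Everything in it is hypothetical and made up for illustration (the `Track` and `Library` names, the volume labels, the mount table); it is not how any existing music player works, and it only covers the single-user case.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Track:
    title: str
    volume: str    # hypothetical label for the drive holding the file
    rel_path: str  # path relative to that drive's mount point


class Library:
    """Catalog that stays searchable when some volumes are absent."""

    def __init__(self, mounts: Dict[str, Path]) -> None:
        # Assumption: each volume label maps to the directory where
        # that drive appears when it is plugged in.
        self.mounts = mounts
        self.tracks: List[Track] = []

    def add(self, track: Track) -> None:
        self.tracks.append(track)

    def is_online(self, volume: str) -> bool:
        mount = self.mounts.get(volume)
        return mount is not None and mount.is_dir()

    def locate(self, track: Track) -> Optional[Path]:
        """Return the file's path if it can be reached right now, else None."""
        if not self.is_online(track.volume):
            return None
        path = self.mounts[track.volume] / track.rel_path
        return path if path.is_file() else None

    def search(self, text: str) -> List[Tuple[Track, bool]]:
        """Search the whole catalog and flag each hit as playable or not."""
        needle = text.lower()
        return [
            (track, self.locate(track) is not None)
            for track in self.tracks
            if needle in track.title.lower()
        ]
```

The point of the sketch is that the metadata lives apart from the bytes, so an absent drive turns tracks into "known but not playable right now" instead of breaking the catalog. It says nothing about several people or machines writing to the same catalog.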

This problem requires deep thought. It requires smart file systems, smart applications, smart hardware, smart standards and a very smart solution. It may not require all of them, but it’s going to have to be solved.

Comments

  1. I don’t really understand the problem. I currently house my music library on an external drive, yet the database stays in my home folder on the computer. If a song is on the external drive and inaccessible (because it is not plugged in), then the song doesn’t play. But songs do not have to reside exclusively on the external hard drive, so if the song happens to be on my computer, I can play it without the external HD!

    So, what exactly is the problem?

    By http://kyle.rove.myopenid.com/ · 2007.06.26 16:45

  2. Right - but you can’t keep a song on both the external drive and the internal drive. Consider a music library you’d want to move between two computers. Or even better, consider a music library you’d want to share within the house or with a neighbor (over wireless) that you’d all like to add music to. (The database itself would be the problem file here.)

    By Jesper · 2007.06.26 17:23

  3. To clarify a bit more what I’m after (and I’m not talking about backup): Almost no one stores everything on just one hard drive anymore. We use several hard drives. We also share files with others a lot more.

    The fundamental model for most software is still that it’s okay to assume that all data is available in one place and only one place, and that the app is the only one dealing with it right now.

    I foresee a shift from this level to another level, very similar to the shift from single-threaded to multithreaded programming, actually. If I could say exactly what this entailed I’d probably be a lot richer, but I have a feeling it’s going to solve most of the problems we have today where everything would work if it weren’t for how file storage currently works.

    By Jesper · 2007.06.26 17:53

  4. ZFS should move us in that direction, moving away from physical drive == logical volume. Of course, how the OS handles removing a physical drive from the system remains to be seen.

    By http://kyle.rove.myopenid.com/ · 2007.06.26 19:30

  5. I was looking at something like this a while back, loosely based on LBFS. The idea was that somewhere in The Cloud we would store your file index, and then just cache the bits you were actually using locally. If people shared clouds, transferring a file would just mean copying the appropriate signatures into the user’s account.

    By benstiglitz · 2007.06.26 23:50

  6. Right. That’s something like what I mean.

    By Jesper · 2007.06.27 00:58

  7. I think that what you are looking for is a P2P-style database of sorts. I would almost call this a P2P RAID 5: the ability to store the files in a “cloud”, as mentioned above, such that any one point (or more, depending on the algorithm) could be offline and you would still be able to get at the file you are interested in. I actually think this is the next use case for P2P networking, and a case that could be made to large data centers for maintaining the integrity of their companies’ data. It could work in a disaster recovery setup, and so on.

    By http://openid.aol.com/vwdiesel · 2007.07.02 13:58
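
A rough sketch of the parity idea from the last comment, under stated assumptions: a file is split into equal shards, one XOR parity shard is added, and each shard would live on a different peer, so any single peer can be offline and the file can still be rebuilt. The function names are made up for illustration. For simplicity this uses one dedicated parity shard, whereas real RAID 5 rotates parity across drives, and the peer-to-peer transport is left out entirely.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, n_data: int) -> List[bytes]:
    """Split data into n_data equal shards plus one XOR parity shard."""
    shard_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(shard_len * n_data, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(n_data)]
    parity = reduce(xor_bytes, shards)
    return shards + [parity]


def recover(shards: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original data when at most one shard is missing (None)."""
    missing = [i for i, shard in enumerate(shards) if shard is None]
    if len(missing) > 1:
        raise ValueError("single parity cannot recover more than one lost shard")
    restored = list(shards)
    if missing:
        # XOR of every surviving shard reproduces the lost one.
        present = [shard for shard in shards if shard is not None]
        restored[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity shard and the zero padding added when splitting.
    return b"".join(restored[:-1])[:length]
```

For example, `split_with_parity(data, 3)` yields four shards, and `recover` returns the original bytes when any one of the four is replaced by `None`. Losing two at once raises an error, which is the "or more, depending on the algorithm" caveat: tolerating more failures needs a stronger code than single parity.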
