星期四, 五月 14, 2009

[Readings]A Conversation with Jim Gray

A Conversation with Jim Gray.
http://queue.acm.org/detail.cfm?id=864078

DAVE PATTERSON What is the state of storage today?

JIM GRAY We have an embarrassment of riches in that we're able to store more than we can access. Capacities continue to double each year, while access times are improving at 10 percent per year. So, we have a vastly larger storage pool, with a relatively narrow pipeline into it.

...

DP Will serial attached disks make it easier?

JG That should be a huge help. And Firewire is a help. Another option is to send whole computers. I've been sending NTFS disks (the Windows file system format), and not every Linux system can read NTFS. So lately I'm sending complete computers. We're now into the 2-terabyte realm, so we can't actually send a single disk; we need to send a bunch of disks. It's convenient to send them packaged inside a metal box that just happens to have a processor in it. I know this sounds crazy—but you get an NFS or CIFS server and most people can just plug the thing into the wall and into the network and then copy the data.

...

DP What are storage costs today?

JG ... Our chore is to figure out how to waste storage space to save administration. That includes things like RAID, disk snapshots, disk-to-disk backup, and much simpler administration tools.

...

DP To change the subject, I think one of the reasons you received the Turing Award was for your contributions in transactions, or for making the databases work better. Is that a fair characterization?

JG It's hard to know. There is this really elegant theory about transactions, having to do with concurrency and recovery. A lot of people had implemented these ideas implicitly. I wrote a mathematical theory around it and explained it and did a fairly crisp implementation of the ideas. The Turing Award committee likes crisp results. The embarrassing thing is that I did it with a whole bunch of other people. But I wrote the papers, and my name was often first on the list of authors. So, I got the credit.

But to return to your question, the fundamental premise of transactions is that we needed exception handling in distributed computations. Transactions are the computer equivalent of contract law. If anything goes wrong, we'll just blow away the whole computation. Absent some better model, that's a real simple model that everybody can understand. There's actually a mathematical theory lying underneath it.

In addition, there is a great deal of logic about keeping a log of changes so that you can undo things, and keeping a list of updates, called locks, so that others do not see your uncommitted changes.The theory says: "If that's what you want to do, this is what you have to do to get it."

The algorithms are simple enough so most implementers can understand them, and they are complicated enough so most people who can't understand them will want somebody else to figure it out for them. It has this nice property of being both elegant and relevant.

没有评论: