August 3rd, 2010 | Published in Google Research
Of the three themes of our 2010 Faculty Summit, cloud computing was the one that pervaded all others, from security in the cloud to the presumption of cloud infrastructure behind the social web. But in our more focused discussion on cloud computing last Thursday, we started with the premise of “prodigiousness,” a concept introduced by Alfred Spector, VP of Research and Special Initiatives.
While we all know that systems are huge and will get even huger, the implications of this size for programmability, manageability, power, etc. are hard to comprehend. Alfred noted that the Internet is predicted to be carrying a zettabyte (10^21 bytes) per year in just a few years. And growth in the number of processing elements per chip may give rise to warehouse computers having 10^10 or more processing elements. To use systems at this scale, we need new solutions for storage and computation. It was these solutions we focused on throughout our discussions.
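For a sense of scale, the zettabyte-per-year figure can be turned into a sustained rate. The following is a back-of-the-envelope sketch in Python. It assumes traffic spread evenly over a 365-day year, which is an assumption made here for illustration and not a figure from the talk.

```python
# Back-of-the-envelope: one zettabyte per year as a sustained rate.
# Assumption (illustrative only): traffic is spread evenly over a 365-day year.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds

ZETTABYTE = 10**21  # bytes
TERABYTE = 10**12   # bytes

bytes_per_second = ZETTABYTE / SECONDS_PER_YEAR
terabytes_per_second = bytes_per_second / TERABYTE

print(f"{terabytes_per_second:.1f} TB/s")  # about 31.7 TB/s
```

In other words, a zettabyte per year averages out to roughly 32 terabytes every second, around the clock.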
In the plenary talk, Andrew Fikes spoke on storage system opportunities. Among many topics, he talked about shifting engineering foci to storage management and optimization not just on an individual cluster of co-located systems, but across geographically distributed clusters. The goal is so-called planetary-scale systems. This brings up all manner of challenges, ranging from the need to continually balance storage vs. transmission costs, to the need to account for variable network latency characteristics, to the desire to optimize storage (e.g., by physically storing only one copy of a file that many users feel they own or have rights to).
We had a few roundtables in the afternoon for deeper discussions. At the table I led, we discussed two systems for “programming the data center” developed by systems researchers at Google Seattle/Kirkland. The first, Dremel, is a scalable, interactive ad-hoc query system for analysis of read-only nested databases. Dremel was recently presented in a paper at VLDB (Dremel: Interactive Analysis of Web-Scale Datasets, Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, Theo Vassilakis. In Proceedings of the 36th Int'l Conf on Very Large Data Bases, 2010). The system serves as the foundational technology behind BigQuery, a product launched in limited preview mode at Google I/O in May.
We also discussed FlumeJava, a Java library that makes it easy to develop, test and run efficient data-parallel pipelines at data center scale. FlumeJava was developed by programming languages researchers at Google Seattle, and is currently in widespread use within Google. It was presented at the recent PLDI conference (FlumeJava: easy, efficient data-parallel pipelines, Craig Chambers, Ashish Raniwala, Frances Perry, Stephen Adams, Robert R. Henry, Robert Bradshaw, Nathan Weizenbaum. In Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation). The work reflects Google’s commitment to programming language and compiler technologies at scale.
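To give a flavor of what a pipeline-style abstraction looks like, here is a minimal single-machine sketch in Python. It is not FlumeJava's API, and it does none of the optimization or distribution a real system performs. The `Pipeline` class and its method names (`flat_map`, `group_by_key`, `combine_values`, `run`) are hypothetical, chosen only to illustrate the idea of describing a data-parallel computation as a chain of operations that is recorded first and executed later.

```python
from collections import defaultdict
from itertools import chain


class Pipeline:
    """Hypothetical deferred pipeline: records steps, executes only on run()."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def _then(self, kind, fn=None):
        # Each operation returns a new Pipeline with one more recorded step.
        return Pipeline(self._source, self._steps + ((kind, fn),))

    def flat_map(self, fn):
        return self._then("flat_map", fn)

    def group_by_key(self):
        return self._then("group_by_key")

    def combine_values(self, fn):
        return self._then("combine_values", fn)

    def run(self):
        # A real system could inspect and optimize the recorded steps here
        # before running them in parallel; this sketch runs them in order.
        data = list(self._source)
        for kind, fn in self._steps:
            if kind == "flat_map":
                data = list(chain.from_iterable(fn(x) for x in data))
            elif kind == "group_by_key":
                groups = defaultdict(list)
                for key, value in data:
                    groups[key].append(value)
                data = list(groups.items())
            elif kind == "combine_values":
                data = [(key, fn(values)) for key, values in data]
        return data


def word_count(lines):
    """Word count written as one chained expression over the pipeline."""
    return dict(
        Pipeline(lines)
        .flat_map(lambda line: [(word, 1) for word in line.split()])
        .group_by_key()
        .combine_values(sum)
        .run()
    )


counts = word_count(["the cloud", "the data center"])
print(counts["the"])  # 2
```

Because nothing executes until `run()` is called, the whole chain is available for inspection beforehand, which is the property that lets a pipeline library plan the work before launching it.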
The field of data center programming has progressed substantially in the last 10 years. The Dremel and FlumeJava systems represent abstractions of a higher level than the MapReduce construct we previously introduced, and we think they are easier to use (within their domain of applicability) and more automatically optimizable. With time, the field will discover new “instructions” and even better abstractions, leading us to a point where computations that run on a nearly unlimited number of processors can be expressed as easily as sequential programs. We are working hard to make progress here, and I look forward to reporting on our progress in the future.