April 23rd, 2008 | Published in Google Research
The emergence of extremely large datasets, well beyond the capacity of almost any single computer, has challenged traditional and contemporary methods of analysis in the research world. While a simple spreadsheet or modest database remains sufficient for some research, problems in the domain of "computational science," which explores mathematical models via computational simulation, require systems that provide huge amounts of data storage and computer processing. Current research areas in computational science include climate modeling, gene sequencing, protein mapping, materials science and many more. As an added hurdle, this level of computational infrastructure is often unaffordable for research teams, who usually work with significant budgetary restrictions.
Fortunately, as the Internet technology industry expands its global infrastructure, accessing world-class distributed computational and storage resources can be as simple as visiting a website. Building on the Academic Cloud Computing Initiative (ACCI) announced last October, Google and IBM, together with the National Science Foundation, announced the CluE initiative in February to address this particular need. After coordinating the technical details with Google and IBM, the NSF posted the official solicitation of proposals last week.
Our primary goal in participating in the CluE initiative is to encourage the understanding, further refinement and, importantly, targeted application of the latest distributed computing technology and methods across many academic disciplines. Engaging educators and researchers with the new potential of distributed computing for processing and analyzing extremely large datasets is an invaluable investment for any technology company to make, and Google in particular is pleased to make a contribution to the academic community that has enabled so many recent advances in the industry.
We're looking forward to an eclectic collection of proposals from the NSF's solicitation. We believe many will leverage the power of distributed computing to produce a diverse range of knowledge that will provide long-term benefit to both the research community and the public at large. We also hope that Google's contribution to this low-cost, open-source approach to distributed computing will allow many more in the academic community to take advantage of this pervasive technological shift.
More details, including information on how to apply for access to these resources, are available on the NSF site.