More Google Cluster Data

November 29th, 2011 | Published in Google Research, Uncategorized

Posted by John Wilkes, Principal Software Engineer

Google has a strong interest in promoting high quality systems research, and we believe that providing information about real-life workloads to the academic community can help.

In support of this we published a small (7-hour) sample of resource-usage information from a Google production cluster in 2010 (research blog on Google Cluster Data). Approximately a dozen researchers at UC Berkeley, CMU, Brown, NCSU, and elsewhere have made use of it.

Recently, we released a larger dataset. It covers a longer period of time (29 days) for a larger cell (about 11k machines) and includes significantly more information, including:

the original resource requests, to permit scheduling experiments
request constraints and machine attriibutes
machine availability and failure events
some of the reasons for task exits
(obfuscated) job and job-submitter names, to help identify repeated or related jobs
more types of usage information
CPI (cycles per instruction) and memory traffic for some of the machines

Note that this trace primarily provides data about resource requests and usage. It contains no information about end users, their data, or access patterns to storage systems and other services.

More information can be found via this link, which will (after a short questionnaire) take you to a site that provides access instructions, a description of the data schema, and information about how the data was derived and its meaning.

We hope this data will facilitate a range of research in cluster management. Let us know if you find it useful, are willing to share tools that analyze it, or have suggestions for how to improve it.

Google Data

More Google Cluster Data

Leave a Response

Categories

Tags