<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Google Data &#187; Research Admin</title>
	<atom:link href="https://googledata.org/author/research-admin/feed/" rel="self" type="application/rss+xml" />
	<link>https://googledata.org</link>
	<description>Everything Google: News, Products, Services, Content, Culture</description>
	<lastBuildDate>Thu, 19 Mar 2015 22:49:02 +0000</lastBuildDate>
	<language>en-US</language>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.7.5</generator>
	<item>
		<title>Academic Successes in Cluster Computing</title>
		<link>https://googledata.org/google-research/academic-successes-in-cluster-computing/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=academic-successes-in-cluster-computing</link>
		<comments>https://googledata.org/google-research/academic-successes-in-cluster-computing/#comments</comments>
		<pubDate>Thu, 22 Dec 2011 23:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=beba9c09955e14772c7e23d5747f278a</guid>
		<description><![CDATA[Posted by Alfred Spector, VP of ResearchAccess to massive computing resources is foundational to Research and Development. Fifteen awardees of the National Science Foundation (NSF) Cluster Exploratory Service (CLuE) program have been applying large sca...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Alfred Spector, VP of Research</span><br /><br />Access to massive computing resources is foundational to Research and Development. Fifteen awardees of the National Science Foundation (NSF) <a href="http://www.nsf.gov/awardsearch/progSearch.do?SearchType=progSearch&page=2&QueryText=&ProgOrganization=&ProgOfficer=&ProgEleCode=7782&BooleanElement=false&ProgRefCode=&BooleanRef=false&ProgProgram=&ProgFoaCode=&RestrictActive=on&Search=Search#results">Cluster Exploratory Service</a> (CLuE) program have been applying large-scale computational resources <a href="http://googleblog.blogspot.com/2008/02/supporting-cluster-computing-in.html">donated by Google and IBM</a>. <br /><br />Overall, 1,328 researchers have used the cluster to perform over 120 million computing tasks and, in the process, have published 49 scientific publications, educated thousands of students on parallel computing and supported numerous post-doctoral candidates in their academic careers. Researchers have used the program in fields as diverse as astronomy, oceanography and linguistics. Besides validating <a href="http://research.google.com/archive/mapreduce.html">MapReduce</a> as a useful tool in academic research, the program has also generated significant scientific knowledge. <br /><br />Three years later, there are many viable, affordable alternatives to the Academic Cloud Computing Initiative, so we have decided to bring our part of the program to a close. It has been a great opportunity to collaborate with IBM, the NSF and the many universities on this program. The initiative was state-of-the-art when it started four years ago; now, academic cloud computing is a worldwide phenomenon with many low-cost options.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-1809234908579026219?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/academic-successes-in-cluster-computing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ACM Fellows for 2011</title>
		<link>https://googledata.org/google-research/acm-fellows-for-2011/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=acm-fellows-for-2011</link>
		<comments>https://googledata.org/google-research/acm-fellows-for-2011/#comments</comments>
		<pubDate>Thu, 08 Dec 2011 15:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=cd59abf3d4d49800881aefdc8b651349</guid>
		<description><![CDATA[Posted by Alfred Spector, Google ResearchCross-posted with the Official Google BlogCongratulations to three Googlers elected ACM FellowsIt gives me great pleasure to share that the Association for Computing Machinery (ACM) has announced that three Goog...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Alfred Spector, Google Research</span><br /><br /><i>Cross-posted with the <a href="http://googleblog.blogspot.com/2011/12/congratulations-to-three-googlers.html">Official Google Blog</a></i><br /><br /><b>Congratulations to three Googlers elected ACM Fellows</b><br /><br />It gives me great pleasure to share that the <a href="http://www.acm.org/">Association for Computing Machinery</a> (ACM) has <a href="http://www.acm.org/press-room/news-releases/2011/fellows-2011/">announced</a> that three Googlers have been elected ACM Fellows in 2011. The ACM is the world’s largest educational and scientific computing society, and the Fellows Program celebrates the exceptional contributions of leaders in the computing field. This year the society has selected <a href="https://plus.google.com/115744399689614835150/about">Amit Singhal</a>, <a href="https://plus.google.com/110401818717224273095/posts">Peter S. Magnusson</a> and <a href="http://cseweb.ucsd.edu/~vahdat/">Amin Vahdat</a> for their outstanding work, which has provided fundamental knowledge to the field.<br /><br />The recently named Fellows join 14 <a href="http://googleblog.blogspot.com/2010/12/four-googlers-elected-acm-fellows-this.html">prior Googler ACM Fellows</a> and other professional society honorees in exemplifying our extraordinarily talented people. On behalf of Google, I congratulate our colleagues. They embody Google’s commitment to innovation with impact, and I hope that they’ll serve as inspiration to students as well as the broader community of computer scientists.<br /><br />You can read more detailed summaries of their achievements below, including the official citations from the ACM. <br /><br /><b>Dr. Amit Singhal, Google Fellow</b><br /><br /><i>For contributions to search and information retrieval</i><br /><br />Since 2000, Dr. Amit Singhal has been pioneering search as the technical lead for Google's core search algorithms. He is credited with most of the information retrieval design decisions in Google Search – a massive system that has responded to hundreds of billions of queries. More than anyone, Amit has a deep understanding of Google’s entire algorithmic system. He is a clear thought leader and manager who has led critically important initiatives at the company. Among many other things, Amit catalyzed Universal Search, which returns multi-modal results from all available corpora; he was the force behind Realtime Search, which returns results from dynamic corpora with low latency; and he championed Google Instant, which returns search results as the user types. <br /><br />Prior to joining Google, Amit boasted a prolific publication record, averaging five publications per year from 1996 to 1999 while at AT&amp;T Labs. Since that time, you could say Google Search has been one long, sustained publication demonstrating a constant advancement in the state of the art of information retrieval. <br /><br /><br /><b>Peter S. Magnusson, Engineering Director</b><br /><br /><i>For contributions to full-system simulation</i><br /><br />Peter has made a tremendous impact by driving full-system simulation. His approach was so advanced that it could be used in real-world production of commercial CPUs and in prototyping of system software. 
Starting in 1991, Peter challenged the notion that simulators could not be made fast enough to run large workloads, nor accurate enough to run commercial operating systems. His innovations in simulator design culminated in Simics, the first academic simulator that could boot and run commercial multiprocessor workloads. Simics saw huge academic success and has been used to run simulations for research presented in several hundred subsequent publications. <br /><br />Peter founded Virtutech in 1998 to commercially develop Simics, and he ultimately forged a new market segment for software tools and became its leader. With Peter at the helm, Virtutech pushed Simics beyond several performance barriers to make it the first simulator to exceed 1 billion instructions per second and the first simulator to model over 1,000 processors. Peter joined Google in 2010 to work on cloud computing.<br /><br /><br /><b>Dr. Amin Vahdat, Principal Engineer</b><br /><br /><i>For contributions to data center scalability and management</i><br /><br />Amin’s work made an impact at Google long before he arrived here. Amin is known for conducting research through bold, visionary projects that combine creativity with careful consideration of the engineering constraints needed to make them viable in real-world applications. Amin’s infrastructure ideas have underpinned the shift in the computing field from the pure client-server paradigm to a landscape in which major web services are hosted “in the cloud” across multiple data centers. In addition to pioneering “third-party cloud computing” through his work on WebOS and Rent-A-Server in the mid-90s, Amin has made important advancements in managing wide-area consistency between data centers, scalable modeling of data center applications, and building scalable data center networks. <br /><br />Amin’s innovations have penetrated and broadly influenced the networking community within academia and industry, including Google, and his research has been recapitulated and expanded upon in a number of publications. Conferences that formerly did not even cover data centers now have multiple sessions covering variants of what Amin and his team have proposed. At Google, Amin continues to drive next-generation data center infrastructure focusing on Software Defined Networking and new opportunities from optical technologies. This is emblematic of Amin’s ability to build real systems and, perhaps more significantly, to convince people of their value.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-1347738637523703608?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/acm-fellows-for-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Our second round of Google Research Awards for 2011</title>
		<link>https://googledata.org/google-research/our-second-round-of-google-research-awards-for-2011/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=our-second-round-of-google-research-awards-for-2011</link>
		<comments>https://googledata.org/google-research/our-second-round-of-google-research-awards-for-2011/#comments</comments>
		<pubDate>Tue, 06 Dec 2011 16:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=f081be95537828b6ca714f9cbe113723</guid>
		<description><![CDATA[Posted by Maggie Johnson, Director of Education &#38; University RelationsWe’ve just finished the review process for the latest round of the Google Research Awards, which provide funding to full-time faculty working on research in areas of mutual int...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Maggie Johnson, Director of Education &amp; University Relations</span><br /><br />We’ve just finished the review process for the latest round of the <a href="http://research.google.com/university/relations/research_awards.html">Google Research Awards</a>, which provide funding to full-time faculty working on research in areas of mutual interest with Google. We are delighted to be funding 119 awards across 21 different focus areas for a total of $6 million. The subject areas that received the highest level of support this time were systems and infrastructure, human-computer interaction, social, and mobile. In addition, 24% of the funding was awarded to universities outside the U.S.<br /><br />One way in which we measure the impact of the research award program is through surveys of Principal Investigators (PIs) and their Google sponsors (a Googler with whom grantees can discuss research directions, provide progress updates, engage in knowledge transfer, etc.). Here are some highlights from our most recent survey, covering projects funded over the last two years:<br /><br /><ul><li>433 papers were published as a result of a Google research award</li><li>126 projects made data sets or software publicly available</li><li>63 research talks were given by sponsored PIs at Google offices</li></ul><br />An important aspect of the program is that it often gives early-career academics a head start on their research agenda.  Many new PIs commented on how a Google research award allowed them to explore their initial ideas and build a foundation for obtaining more significant funding from other sources. This type of seed funding is especially hard to get in the current economic environment.  <br /><br />The goal of the research award program is to initiate and sustain strong collaborations with our academic colleagues. The collaborations take many forms, from working on a project together, to co-writing a paper, to coming to Google to give a research talk. Whatever the form, the most important aspect is building strong relationships that last. Case in point: many of our <a href="http://research.google.com/university/relations/focused_research_awards.html">focused awards</a> (multi-year, unrestricted grants that include access to Google’s tools, technology and expertise) started as Google research awards.<br /><br />Congratulations to the <a href="http://services.google.com/fh/files/blogs/2011R2%20recipients%20for%20blog.pdf">well-deserving recipients of this round’s awards</a>, and if you are interested in applying for the next round (deadline is April 15), please visit <a href="http://research.google.com/university/relations/research_awards.html">our website</a> for more information.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-2153916703336370099?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/our-second-round-of-google-research-awards-for-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2011 Google China Faculty Summit in Hangzhou</title>
		<link>https://googledata.org/google-research/2011-google-china-faculty-summit-in-hangzhou/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=2011-google-china-faculty-summit-in-hangzhou</link>
		<comments>https://googledata.org/google-research/2011-google-china-faculty-summit-in-hangzhou/#comments</comments>
		<pubDate>Fri, 02 Dec 2011 19:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[education]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=b5d970122ed0d03ca0679b8b4363af47</guid>
		<description><![CDATA[Posted by Aimin Zhu, University Relationship Manager, Google ChinaWe just wrapped up a highly successful 2011 Google China Faculty Summit in Hangzhou, China. On November 17 and 18, Googlers from China and the U.S. gathered with more than 80 faculty mem...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Aimin Zhu, University Relationship Manager, Google China</span><br /><br />We just wrapped up a highly successful 2011 Google China Faculty Summit in <a href="http://maps.google.com/maps?q=hangzhou,china&amp;hl=en&amp;ll=30.221102,120.146484&amp;spn=45.110857,79.013672&amp;sll=34.01609,-117.856509&amp;sspn=0.086083,0.154324&amp;vpsrc=6&amp;hnear=Hangzhou,+Zhejiang,+China&amp;t=m&amp;z=4">Hangzhou, China</a>. On November 17 and 18, Googlers from China and the U.S. gathered with more than 80 faculty members representing more than 45 universities and institutes, including Tsinghua University, Peking University and The Chinese Academy of Sciences. The two-day event revolved around the theme of “Communication, Exploration and Expansion,” with day one covering research and day two focusing on academic development. <br /><br />The summit provided a unique setting for both sides to share the results of their research and exchange ideas. Speakers included: <br /><br /><ul><li>Maggie Johnson, director of education and university relations at Google, presenting on innovation in Google research and global university relations programs,</li><li>Dr. Boon-Lock Yeo, head of engineering and research for Google China, providing an overview of innovation in China engineering and corporate social responsibility efforts and accomplishments, and</li><li>Prof. Edward Chang, director of research for Google China, delivering a keynote on mobile information management and retrieval.</li></ul><br />The discussions on November 17 focused on two tracks, mobile computing and natural language processing, while discussions on November 18 focused on curriculum development with a special focus on Android app development. The attendees also spent time discussing joint research and development between universities and industry.<br /><br />This summit is part of a continuing effort to collaborate with Chinese universities in order to support education in China. Click <a href="http://www.google.com/intl/zh-CN/corporate/university/en/index.html">here</a> for a list of the education programs we have launched there in recent years. We look forward to expanding partnership opportunities in the future.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-621494373520213050?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/2011-google-china-faculty-summit-in-hangzhou/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Discovering Talented Musicians with Acoustic Analysis</title>
		<link>https://googledata.org/youtube/discovering-talented-musicians-with-acoustic-analysis/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=discovering-talented-musicians-with-acoustic-analysis</link>
		<comments>https://googledata.org/youtube/discovering-talented-musicians-with-acoustic-analysis/#comments</comments>
		<pubDate>Wed, 02 Nov 2011 15:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Youtube]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=0af35c1e77e438c9f7db760b7e3b5a0c</guid>
		<description><![CDATA[Posted by Charles DuHadway, YouTube Slam Team, Google Research In an earlier post we talked about the technology behind Instant Mix for Music Beta by Google. Instant Mix uses machine hearing to characterize music attributes such as its timbre, mood and...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Charles DuHadway, YouTube Slam Team, Google Research </span><br /><br />In an <a href="http://googleresearch.blogspot.com/2011/06/instant-mix-for-music-beta-by-google.html">earlier post</a> we talked about the technology behind Instant Mix for <a href="http://music.google.com/">Music Beta by Google</a>. Instant Mix uses machine hearing to characterize music attributes such as timbre, mood and tempo. Today we would like to talk about acoustic and visual analysis -- this time on YouTube. A fundamental part of YouTube's mission is to allow anyone anywhere to showcase their talents -- occasionally leading to <a href="http://www.youtube.com/watch?v=eQOFRZ1wNLw">life-changing success</a> -- but many talented performers are never discovered. Part of the problem is the sheer volume of videos: forty-eight hours of video are uploaded to YouTube every minute (that’s eight years of content every day). We wondered if we could use acoustic analysis and machine learning to pore over these videos and automatically identify talented musicians.<br /><br />First we analyzed audio and visual features of videos being uploaded. We wanted to find “singing at home” videos -- often correlated with features such as ambient indoor lighting, head-and-shoulders view of a person singing in front of a fixed camera, few instruments and often a single dominant voice. Here’s a sample set of videos we found.<br /><br /><a href="http://2.bp.blogspot.com/-uIV4jRrylSI/TrB14eH590I/AAAAAAAAASc/xrHHAIj50X0/s1600/cover_wall_short.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://2.bp.blogspot.com/-uIV4jRrylSI/TrB14eH590I/AAAAAAAAASc/xrHHAIj50X0/s1600/cover_wall_short.png" /></a><br /><br />Then we estimated the quality of singing in each video. Our approach is based on acoustic analysis similar to that used by Instant Mix, coupled with a small set of singing quality annotations from human raters. Given these data we used machine learning to build a ranker that predicts if an average listener would like a performance. <br /><br />While machines are useful for weeding through thousands of not-so-great videos to find potential stars, we know they alone can't pick the next great star. So we turn to YouTube users to help us identify the real hidden gems by playing a voting game called <a href="http://www.youtube.com/slam">YouTube Slam</a>. We're putting an equal amount of effort into the game itself -- how do people vote? What makes it fun? How do we know when we have a true hit? We're looking forward to your feedback to help us refine this process: <a href="http://www.youtube.com/slam/music/vote">give it a try</a>*.  You can also check out singer and voter <a href="http://www.youtube.com/slam/music">leaderboards</a>. Toggle “All time” to “Last week” to find emerging talent in fresh videos or all-time favorites. <br /><br />Our “Music Slam” has only been running for a few weeks and we have already found some very talented musicians. 
Many of the videos have less than 100 views when we find them.<br /><br /><div><table><tr><td><object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://2.gvt0.com/vi/ZMZr83rwdNI/0.jpg" height="266" width="320"><param name="movie" value="http://www.youtube.com/v/ZMZr83rwdNI&fs=1&source=uds" /><param name="bgcolor" value="#FFFFFF" /><embed width="320" height="266"  src="http://www.youtube.com/v/ZMZr83rwdNI&fs=1&source=uds" type="application/x-shockwave-flash"></embed></object></td><td><object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://3.gvt0.com/vi/OY2vWMtSsIM/0.jpg" height="266" width="320"><param name="movie" value="http://www.youtube.com/v/OY2vWMtSsIM&fs=1&source=uds" /><param name="bgcolor" value="#FFFFFF" /><embed width="320" height="266"  src="http://www.youtube.com/v/OY2vWMtSsIM&fs=1&source=uds" type="application/x-shockwave-flash"></embed></object></td></tr><tr><td><object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://0.gvt0.com/vi/wRCPCNtViGA/0.jpg" height="266" width="320"><param name="movie" value="http://www.youtube.com/v/wRCPCNtViGA&fs=1&source=uds" /><param name="bgcolor" value="#FFFFFF" /><embed width="320" height="266"  src="http://www.youtube.com/v/wRCPCNtViGA&fs=1&source=uds" type="application/x-shockwave-flash"></embed></object></td><td><object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://2.gvt0.com/vi/gBfynvifkOY/0.jpg" height="266" width="320"><param name="movie" value="http://www.youtube.com/v/gBfynvifkOY&fs=1&source=uds" /><param name="bgcolor" value="#FFFFFF" /><embed width="320" height="266"  src="http://www.youtube.com/v/gBfynvifkOY&fs=1&source=uds" type="application/x-shockwave-flash"></embed></object></td></tr><tr><td><object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://2.gvt0.com/vi/LVFe6P-C7iY/0.jpg" height="266" width="320"><param name="movie" value="http://www.youtube.com/v/LVFe6P-C7iY&fs=1&source=uds" /><param name="bgcolor" value="#FFFFFF" /><embed width="320" height="266"  src="http://www.youtube.com/v/LVFe6P-C7iY&fs=1&source=uds" type="application/x-shockwave-flash"></embed></object></td><td><object class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://3.gvt0.com/vi/Qh-4qF07V1s/0.jpg" height="266" width="320"><param name="movie" value="http://www.youtube.com/v/Qh-4qF07V1s&fs=1&source=uds" /><param name="bgcolor" value="#FFFFFF" /><embed width="320" height="266"  src="http://www.youtube.com/v/Qh-4qF07V1s&fs=1&source=uds" type="application/x-shockwave-flash"></embed></object></td></tr></table></div><br /><br />And while we're excited about what we've done with music, there's as much undiscovered potential in almost any 
subject you can think of. Try our other slams: <a href="http://www.youtube.com/slam/cute">cute</a>, <a href="http://www.youtube.com/slam/bizarre">bizarre</a>, <a href="http://www.youtube.com/slam/comedy">comedy</a>, and <a href="http://www.youtube.com/slam/dance">dance</a>*. Enjoy!<br /><br />Related work by Google Researchers:<br />“<a href="http://research.google.com/pubs/pub35638.html">Video2Text: Learning to Annotate Video Content</a>”, <a href="http://research.google.com/pubs/author37818.html">Hrishikesh Aradhye</a>, <a href="http://research.google.com/pubs/author38233.html">George Toderici</a>, <a href="http://research.google.com/pubs/author36197.html">Jay Yagnik</a>, ICDM Workshop on Internet Multimedia Mining, 2009.<br /><br />* Music and dance slams are currently available only in the US.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-5952642640319180625?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
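<!--
The post above describes a two-stage pipeline: acoustic and visual features identify "singing at home" uploads, then a ranker learned from a small set of human singing-quality ratings predicts whether an average listener would like a performance. The Python below is a minimal sketch of that ranking step under stated assumptions: the feature set, the synthetic data, and the model choice (gradient boosted trees) are all illustrative, since the post does not disclose the actual features or learning algorithm.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Stand-in acoustic features per video (e.g. timbre, pitch stability, tempo);
# a real system would compute these from the uploaded audio track.
features = rng.normal(size=(500, 8))

# Stand-in 1-5 singing-quality ratings from human raters on a small subset.
ratings = np.clip(3.0 + features[:, 0] + 0.5 * rng.normal(size=500), 1.0, 5.0)

# Learn to predict the rating an average listener would give.
ranker = GradientBoostingRegressor().fit(features, ratings)

# Score a batch of fresh uploads and surface the top candidates for human
# voting; per the post, the YouTube Slam game does the final judging.
uploads = rng.normal(size=(100, 8))
top10 = np.argsort(ranker.predict(uploads))[::-1][:10]
print("candidate indices to send to Slam:", top10)
-->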
			<wfw:commentRss>https://googledata.org/youtube/discovering-talented-musicians-with-acoustic-analysis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fresh Perspectives about People and the Web from Think Quarterly</title>
		<link>https://googledata.org/uncategorized/fresh-perspectives-about-people-and-the-web-from-think-quarterly/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=fresh-perspectives-about-people-and-the-web-from-think-quarterly</link>
		<comments>https://googledata.org/uncategorized/fresh-perspectives-about-people-and-the-web-from-think-quarterly/#comments</comments>
		<pubDate>Wed, 28 Sep 2011 19:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=7e3109ebd8c899f9ed8aaa02afe32177</guid>
		<description><![CDATA[Posted by Allison Mooney, Christina Park, and Caroline McCarthy, The Think Quarterly TeamThere’s a lot of research, analysis and insights—from inside and outside Google—that we use in building our products and making decisions. To share what we]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Allison Mooney, Christina Park, and Caroline McCarthy, The Think Quarterly Team</span><br /><br />There’s a lot of research, analysis and insights—from inside and outside Google—that we use in building our products and making decisions. To share what we’ve learned with our partners, we created <a href="http://www.google.com/url?q=http://www.thinkwithgoogle.com/quarterly%23utm_medium=Blogs&utm_campaign=Google+Research+Blog&utm_source=Google">Think Quarterly</a>. It’s intended to be a snapshot of what Google and other industry leaders are talking about and inspired by right now.<br /><br />Today we’re launching our second edition, the <a href="http://www.google.com/url?q=http://www.thinkwithgoogle.com/quarterly%23utm_medium=Blogs&utm_campaign=Google+Research+Blog&utm_source=Google">“People” issue</a>, exploring the latest technologies connecting us and the big ideas driving society forward. It also includes some of the research and analysis that helps us shape our strategies. <br /><br />For those who love data as much as we do, here are a few articles worth reading:<br /><br /><ul><li>“Following Generation Z,” in which Google research scientist Ed Chi details what he’s learned from monitoring the course of digital innovation and mapping patterns of digital technology use in the future</li></ul><ul><li>“Predicting the Present,” by chief economist Hal Varian, about how publicly available search tools can help anyone gain valuable insights into the behavior of web users and predict what they might do next</li></ul><ul><li>“Power to the People,” by Meg Pickard, anthropologist turned head of digital engagement at Guardian News and Media, about tracking the influence and power of online communities</li></ul><ul><li>“From Cash to Contentment,” about the use of happiness as a measurable metric of success, with insights coming from Nobel Prize winner Joseph Stiglitz</li></ul><br /><a href="http://www.google.com/url?q=http://www.thinkwithgoogle.com/quarterly%23utm_medium=Blogs&utm_campaign=Google+Research+Blog&utm_source=Google">Click here</a> to read all the articles, and if you have a suggestion for our next issue please tell us <a href="https://services.google.com/fb/forms/contactthinkquarterlyus/#utm_medium=Blogs&amp;utm_campaign=Google+Research+Blog&amp;utm_source=Google">here</a>. We hope you enjoy (and +1) it!<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-751723561172313008?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/uncategorized/fresh-perspectives-about-people-and-the-web-from-think-quarterly/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sorting Petabytes with MapReduce &#8211; The Next Episode</title>
		<link>https://googledata.org/uncategorized/sorting-petabytes-with-mapreduce-the-next-episode/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=sorting-petabytes-with-mapreduce-the-next-episode</link>
		<comments>https://googledata.org/uncategorized/sorting-petabytes-with-mapreduce-the-next-episode/#comments</comments>
		<pubDate>Wed, 07 Sep 2011 23:50:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=a054571029d32be4fd79898888d24151</guid>
		<description><![CDATA[Posted by Grzegorz Czajkowski, Marián Dvorský, Jerry Zhao, and Michael Conley, Systems InfrastructureAlmost three years ago we announced results of the first ever "petasort" (sorting a petabyte-worth of 100-byte records, following the Sort Benchmark ...]]></description>
				<content:encoded><![CDATA[<span itemscope itemtype="http://schema.org/Article"><meta itemprop="name" content="Sorting Petabytes with MapReduce - The Next Episode"><meta itemprop="description" content="Sorting a ten petabyte input set took 6 hours and 27 minutes to complete on 8000 computers. We are not aware of any other sorting experiment successfully completed at this scale."></span><span class="byline-author">Posted by Grzegorz Czajkowski, Marián Dvorský, Jerry Zhao, and Michael Conley, Systems Infrastructure</span><br /><br />Almost three years ago we announced <a href="http://googleblog.blogspot.com/2008/11/sorting-1pb-with-mapreduce.html">results of the first ever "petasort"</a> (sorting a petabyte-worth of 100-byte records, following the <a href="http://sortbenchmark.org/">Sort Benchmark</a> rules).  It completed in just over six hours on 4000 computers.  Recently we repeated the experiment using 8000 computers.  The execution time was 33 minutes, an order of magnitude improvement.<br /><br />Our sorting code is based on <a href="http://labs.google.com/papers/mapreduce.html">MapReduce</a>, which is a key framework for running multiple processes simultaneously at Google. Thousands of applications, supporting most services offered by Google, have been expressed in MapReduce.  While not many MapReduce applications operate at a petabyte scale, some do.  Their scale is likely to continue growing quickly.  The need to help such applications scale motivated us to experiment with data sets larger than one petabyte.  In particular, sorting a ten petabyte input set took 6 hours and 27 minutes to complete on 8000 computers. We are not aware of any other sorting experiment successfully completed at this scale.<br /><br />We are excited by these results.  While internal improvements to the MapReduce framework contributed significantly, a large part of the credit goes to numerous advances in Google's hardware, cluster management system, and storage stack.  <br /><br />What would it take to scale MapReduce by further orders of magnitude and make processing of such large data sets efficient and easy? One way to find out is to join Google’s systems infrastructure team.  If you have a passion for distributed computing, are an expert or plan to become one, and feel excited about the challenges of exascale, then definitely consider applying for a <a href="http://www.google.com/intl/en/jobs/swe/#src=storageresearchblog">software engineering position</a> with Google.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-1105872079913092982?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
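<!--
For readers unfamiliar with how MapReduce sorts: mappers emit (key, record) pairs, the shuffle range-partitions keys so that every key in partition i is smaller than every key in partition i+1, and each reducer sorts its own partition; reading the partitions in order then yields a globally sorted output. Below is a minimal single-process Python sketch of that pattern; the record format, key range, and partition count are illustrative assumptions, not Google's implementation.

from collections import defaultdict

# Toy records with a numeric sort key, standing in for the benchmark's
# 100-byte records.
records = [b"%05d:payload" % k for k in (42, 7, 99999, 13, 500)]

# Map: emit (key, record) pairs.
mapped = [(rec.split(b":")[0], rec) for rec in records]

# Shuffle: range-partition so partition i holds strictly smaller keys
# than partition i+1 (keys here fall in [0, 100000)).
NUM_REDUCERS = 4
partitions = defaultdict(list)
for key, rec in mapped:
    partitions[int(key) * NUM_REDUCERS // 100000].append((key, rec))

# Reduce: each reducer sorts only its own partition; concatenating the
# partitions in order gives a fully sorted output, which is what the
# petasort benchmark measures at scale.
output = []
for i in range(NUM_REDUCERS):
    output.extend(rec for _, rec in sorted(partitions[i]))
print(output)
-->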
			<wfw:commentRss>https://googledata.org/uncategorized/sorting-petabytes-with-mapreduce-the-next-episode/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google at the Joint Statistical Meetings in Miami</title>
		<link>https://googledata.org/google-research/google-at-the-joint-statistical-meetings-in-miami/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-at-the-joint-statistical-meetings-in-miami</link>
		<comments>https://googledata.org/google-research/google-at-the-joint-statistical-meetings-in-miami/#comments</comments>
		<pubDate>Mon, 22 Aug 2011 22:06:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=e494fc42223cfef89976ce9bda0a7872</guid>
		<description><![CDATA[Posted by Marianna Dizik, StatisticianThe Joint Statistical Meetings (JSM) were held in Miami, Florida, this year. Nearly 5,000 participants from academia and industry came to present and discuss the latest in statistical research, methodology, and app...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Marianna Dizik, Statistician</span><br /><br />The Joint Statistical Meetings (JSM) were held in Miami, Florida, this year. Nearly 5,000 participants from academia and industry came to present and discuss the latest in statistical research, methodology, and applications.  Similar to previous years, several Googlers shared expertise in large-scale experimental design and implementation, statistical inference with massive datasets and forecasting, data mining, parallel computing, and much more.<br /><br />Our session "Statistics: The Secret Weapon of Successful Web Giants" attracted over one hundred people, a surprising turnout for an 8:30 AM session!  Revolution Analytics reviewed this in their official blog post <a href="http://blog.revolutionanalytics.com/2011/08/google-r-effective-ads.html">"How Google uses R to make online advertising more effective"</a>.<br /><br />The following talks were given by Googlers at JSM 2011.  Please check the upcoming Proceedings of the JSM 2011 for the full papers.<br /><br /><ul><li><a href="http://www.amstat.org/meetings/jsm/2011/onlineprogram/AbstractDetails.cfm?abstractid=300146">Statistical Plumbing: Effective use of classical statistical methods for large scale applications</a></li><ul><li>Author(s): Ni Wang, Yong Li, Daryl Pregibon, and Rachel Schutt</li></ul><li><a href="http://www.amstat.org/meetings/jsm/2011/onlineprogram/AbstractDetails.cfm?abstractid=300570">Parallel Computations in R, with Applications for Statistical Forecasting</a></li><ul><li>Author(s): Murray Stokely and Farzan Rohani and Eric Tassone</li></ul><li><a href="http://www.amstat.org/meetings/jsm/2011/onlineprogram/AbstractDetails.cfm?abstractid=300008">Conditional Regression Models</a></li><ul><li>Author(s): William D. Heavlin</li></ul><li><a href="http://www.amstat.org/meetings/jsm/2011/onlineprogram/AbstractDetails.cfm?abstractid=300051">The Effectiveness of Display Ads</a></li><ul><li>Author(s): Tim Hesterberg and Diane Lambert and David X. Chan and Or Gershony and Rong Ge</li></ul><li><a href="http://www.amstat.org/meetings/jsm/2011/onlineprogram/AbstractDetails.cfm?abstractid=300257">Measuring Ad Effectiveness Using Continuous Geo Experiments</a></li><ul><li>Author(s): Jon Vaver and Deepak Kumar and Jim Koehler</li></ul><li><a href="http://www.amstat.org/meetings/jsm/2011/onlineprogram/AbstractDetails.cfm?abstractid=300200">Post-Stratification and Network Sampling</a></li><ul><li>Author(s): Rachel Schutt and Andrew Gelman and Tyler McCormick</li></ul></ul>Google has participated in JSM each year since 2004.  We have been increasing our involvement significantly by providing sponsorship, organizing and giving talks at sessions and roundtables, teaching courses and workshops, hosting a booth with demos of new Google products, submitting posters, and more.  This year Googlers participated in sessions sponsored by ASA sections for Statistical Learning and Data Mining, Statistics and Marketing, Statistical Computing, Bayesian Statistical Science, Health Policy Statistics, Statistical Graphics, Quality and Productivity, Physical and Engineering Sciences, and Statistical Education.<br /><br />We also hosted the Google faculty reception, which was well-attended by faculty and their promising students. Google hires a growing number of statisticians and we were happy to participate in JSM again this year. People had a chance to talk to Googlers, ask about working here, encounter elements of Google culture (good food! T-shirts! 
3D puzzles!), meet old and make new friends, and just have fun!<br /><br />Thanks to everyone who presented, attended, or otherwise engaged with the statistical community at JSM this year.  We’re looking forward to seeing you in San Diego next year.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-9079268293716491067?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/google-at-the-joint-statistical-meetings-in-miami/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A new MIT center for mobile learning, with support from Google</title>
		<link>https://googledata.org/google-research/a-new-mit-center-for-mobile-learning-with-support-from-google/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=a-new-mit-center-for-mobile-learning-with-support-from-google</link>
		<comments>https://googledata.org/google-research/a-new-mit-center-for-mobile-learning-with-support-from-google/#comments</comments>
		<pubDate>Tue, 16 Aug 2011 16:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=0278ecd55c03d334292467da6949e699</guid>
		<description><![CDATA[Posted by Hal Abelson, Professor of Computer Science and Engineering, MITMIT and Google have a long-standing relationship based on mutual interests in education and technology. Today, we took another step forward in our shared goals with the establishm...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Hal Abelson, Professor of Computer Science and Engineering, MIT</span><br /><br />MIT and Google have a <a href="http://googleblog.blogspot.com/2011/05/celebrating-150-years-of-mit.html">long-standing relationship</a> based on mutual interests in education and technology. Today, we took another step forward in our shared goals with the establishment of the MIT Center for Mobile Learning, which will strive to transform learning and education through innovation in mobile computing. The new center will be actively engaged in studying and extending <a href="http://appinventor.googlelabs.com/about/">App Inventor for Android</a>, which Google recently announced it will be open sourcing.<br /><br />The new center, housed at MIT’s Media Lab, will focus on designing and studying new mobile technologies that enable people to learn anywhere, anytime, with anyone. The center was made possible in part by support from <a href="http://research.google.com/university/">Google University Relations</a> and will be run by me and two distinguished MIT colleagues: Professors Eric Klopfer (science education) and Mitchel Resnick (media arts and sciences).<br /><br />App Inventor for Android—a programming system that makes it easy for learners to create mobile apps for Android smartphones—currently supports a community of about 100,000 educators, students and hobbyists. Through the new initiatives at the MIT Center for Mobile Learning, App Inventor will be connected to MIT’s premier research in educational technology and MIT’s long track record of creating and supporting open software.<br /><br />Google first launched App Inventor internally in order to move it forward with speed and focus, and then developed it to a point where it started to gain critical mass. Now, its impact can be amplified by collaboration with a top academic institution. At MIT, App Inventor will adopt an enriched research agenda with increased opportunities to influence the educational community. In a way, App Inventor has now come full circle, as I actually initiated App Inventor at Google by proposing it as a project during my sabbatical with the company in 2008. The core code for App Inventor came from Eric Klopfer’s <a href="http://education.mit.edu/projects">lab</a>, and the inspiration came from Mitch Resnick’s <a href="http://scratched.media.mit.edu/">Scratch project</a>. The new center is a perfect example of how industry and academia can collaborate effectively to create change enabled by technology, and we look forward to seeing what we can do next, together.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-5622208624483313484?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/a-new-mit-center-for-mobile-learning-with-support-from-google/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Our Faculty Institute brings faculty back to the drawing board</title>
		<link>https://googledata.org/uncategorized/our-faculty-institute-brings-faculty-back-to-the-drawing-board/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=our-faculty-institute-brings-faculty-back-to-the-drawing-board</link>
		<comments>https://googledata.org/uncategorized/our-faculty-institute-brings-faculty-back-to-the-drawing-board/#comments</comments>
		<pubDate>Fri, 12 Aug 2011 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[education]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=74cbabfc14f5de662985fff936fc7689</guid>
		<description><![CDATA[Posted by Nina Kim Schultz, Google Education ResearchCross-posted with the Official Google BlogSchool may still be out for summer, but teachers remain hard at work. This week, we hosted Google’s inaugural Faculty Institute at our Mountain View, Calif...]]></description>
				<content:encoded><![CDATA[Posted by Nina Kim Schultz, Google Education Research<br /><br /><i>Cross-posted with the <a href="http://googleblog.blogspot.com/2011/08/faculty-institute-brings-faculty-back.html">Official Google Blog</a></i><br /><br />School may still be out for summer, but teachers remain hard at work. This week, we hosted Google’s inaugural Faculty Institute at our Mountain View, Calif. headquarters. The three-day event was created for esteemed faculty from schools of education and math and science to explore teaching paradigms that leverage technology in K-12 classrooms. Selected via a rigorous nomination and application process, the 39 faculty members hail from 19 California State Universities (CSUs), as well as Stanford and UC Berkeley, and teach high school STEM (Science, Technology, Engineering and Math) teachers currently getting their teaching credentials. CSU programs credential 60 percent of California’s teachers—or 10 percent of all U.S. K-12 teachers—and one CSU campus alone can credential around 1,000 new teachers in a year. The purpose of gathering together at the Institute was to ensure our teachers’ teachers have the support they need to help educators adjust to a changing landscape.<br /><br />There is so much technology available to educators today, but unless they learn how to use it effectively, it does little to change what is happening in our classrooms. Without the right training and inspiration, interactive displays become merely expensive projection screens, and laptops simply replace paper rather than shifting the way teachers teach and students learn. Although the possibilities for technology use in schools are endless, teacher preparation for the 21st century classroom also has many constraints. For example: beyond the expense involved, there’s the time it costs educators to match a technological innovation to the improvement of pedagogy and curriculum; there’s a distinct shift in thinking that needs to take place to change classrooms; and there’s an essential challenge to help teachers develop the dispositions and confidence to be lifelong evaluators, learners and teachers of technology, instead of continuing to rely on traditional skill sets that will soon be outdated.<br /><br />The Institute featured keynote addresses from respected professors from Stanford and Berkeley, case studies from distinguished high school teachers from across California, hands-on technology workshops with a variety of Google and non-Google tools, and panels with professionals in the tech-education industry. Notable guests included representatives from <a href="http://www.teachforamerica.org/">Teach for America</a>, <a href="http://tntp.org/">The New Teacher Project</a>, the <a href="http://www.ed.gov/">Department of Education</a> and <a href="http://www.edutopia.org/">Edutopia</a>. Topics covered the ability to distinguish learning paths, how to use technology to transform classrooms into project-based, collaborative spaces and how to utilize a more interactive teaching style rather than the traditional lecture model.<br /><br />On the last day of the Institute, faculty members were invited to submit grant proposals to scale best practices outside of the meeting. Deans of the participating universities will convene at the end of the month to further brainstorm ways to scale new ideas in teacher preparation programs. 
Congratulations to all of the faculty members who were accepted into the inaugural Institute, and thank you for all that you do to help bring technology and new ways of thinking into the classroom. <br /><br /><div style="clear: both; text-align: center;">
<a href="http://1.bp.blogspot.com/-_JrercX4bfo/TkVnu72IZwI/AAAAAAAAIYs/6ear903_-80/s1600/Faculty+Institute.jpeg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://1.bp.blogspot.com/-_JrercX4bfo/TkVnu72IZwI/AAAAAAAAIYs/6ear903_-80/Faculty+Institute.jpeg" width="500" /></a>
This program is a part of Google’s continued commitment to supporting STEM education.  Details on our other programs can be found on <a href="http://www.google.com/education">www.google.com/education</a>.</div><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-8929157003305149385?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/uncategorized/our-faculty-institute-brings-faculty-back-to-the-drawing-board/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Culturomics, Ngrams and new power tools for Science</title>
		<link>https://googledata.org/uncategorized/culturomics-ngrams-and-new-power-tools-for-science/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=culturomics-ngrams-and-new-power-tools-for-science</link>
		<comments>https://googledata.org/uncategorized/culturomics-ngrams-and-new-power-tools-for-science/#comments</comments>
		<pubDate>Wed, 10 Aug 2011 22:51:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Books]]></category>
		<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=f1d882a81aa84e8ad4675e2db8464d45</guid>
		<description><![CDATA[Posted by Erez Lieberman Aiden and Jean-Baptiste Michel, Visiting Faculty at GoogleFour years ago, we set out to create a research engine that would help people explore our cultural history by statistically analyzing the world’s books. In January 201...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Erez Lieberman Aiden and Jean-Baptiste Michel, Visiting Faculty at Google</span><br /><br />Four years ago, we set out to create a research engine that would help people explore our cultural history by statistically analyzing the world’s books. In January 2011, the resulting method, <a href="http://www.culturomics.org/">culturomics</a>, was featured on the cover of the journal <i><a href="http://www.sciencemag.org/content/331/6014/176">Science</a></i>. More importantly, Google implemented and launched a web-based version of our prototype research engine, the Google Books Ngram Viewer.<br /><br />Now scientists, scholars, and web surfers around the world can take advantage of the Ngram Viewer to study a vast array of phenomena. And that's exactly what they've done. Here are a few of our favorite examples.<br /><br /><b>Poverty</b><br />Martin Ravallion, head of the Development Research Group at the World Bank, has been using the ngrams to study the history of poverty. In a <a href="http://www.psocommons.org/ppp/vol3/iss2/art2/">paper</a> published in the journal Poverty and Public Policy, he argues for the existence of two ‘poverty enlightenments’ marked by increased awareness of the problem: one towards the end of the 18th century, and another in the 1970s and 80s. But he makes the point that only the second of these enlightenments brought with it a truly enlightened idea: that poverty can be and should be completely <a href="http://ngrams.googlelabs.com/graph?content=eradicate+poverty&amp;year_start=1800&amp;year_end=2000&amp;corpus=0&amp;smoothing=3">eradicated</a>.<br /><br /><a href="http://ngrams.googlelabs.com/chart?content=eradicate%20poverty&amp;corpus=0&amp;smoothing=3&amp;year_start=1800&amp;year_end=2000"><img alt="" border="0" src="http://ngrams.googlelabs.com/chart?content=eradicate%20poverty&amp;corpus=0&amp;smoothing=3&amp;year_start=1800&amp;year_end=2000" style="cursor: hand; cursor: pointer; float: center; height: 247px; margin: 0 10px 10px 0; width: 550px;" /></a><br /><br /><b>The Science Hall of Fame</b><br />Adrian Veres and John Bohannon wondered who the most famous scientists of the past two centuries were. But there was no hall of fame for scientists, or a committee that determines who deserves to get into such a hall. So they used the ngrams data to define a metric for celebrity – the milliDarwin – and algorithmically created a <a href="http://www.sciencemag.org/site/feature/misc/webfeat/gonzoscientist/episode14/index.xhtml">Science Hall of Fame</a> listing the most famous scientists born since 1800. They found that things like a popular book or a major controversy did more to increase discussion of a scientist than, for instance, winning a Nobel Prize.<br /><br />(Other users have been exploring the history of particular sciences with the Ngram Viewer, covering everything from <a href="http://egosumdaniel.blogspot.com/2011/02/brief-history-of-neuroscience-in-google.html">neuroscience</a> to the <a href="http://www.theatlantic.com/technology/archive/2011/03/the-nuclear-century-in-google-ngrams/72461/">nuclear</a> age.)<br /><br /><br /><b style="font-weight: bold;">The History of Typography</b><br />When we introduced the Ngram Viewer, we pointed out some potential pitfalls with the data. 
For instance, the ‘medial s’ ( ſ ), an older form of the letter s that looked like an integral sign and appeared in the beginning or middle of words, tends to be classified as an instance of the letter ‘f’ by the OCR algorithm used to create our version of the data. Andrew West, blogging at <a href="http://babelstone.blogspot.com/2006/06/rules-for-long-s.html">Babelstone</a>, found a clever way to exploit this error: using queries like ‘husband’ and ‘hufband’ to study the history of medial s typography, he pinned down the precise moment when the medial s disappeared from English (around 1800), French (1780), and Spanish (1760).<br /><br />People are clearly having a good time with the Ngram Viewer, and they have been learning a few things about science and history in the process. Indeed, the tool has proven so popular and so useful that Google recently announced its imminent graduation from Google Labs to become a permanent part of Google Books.<br /><br />Similar ‘big data’ approaches can also be applied to a wide variety of other problems. From books to maps to the structure of the web itself, 'the world's information' is one amazing dataset. <br /><i><br />Erez Lieberman Aiden is Visiting Faculty at Google and a Fellow of the Harvard Society of Fellows. Jean-Baptiste Michel is Visiting Faculty at Google and a Postdoctoral Fellow in Harvard's Department of Psychology.</i><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-6009201590116295031?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
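<!--
A small Python sketch of the 'husband' vs. 'hufband' medial-s analysis described above, assuming you already have per-year relative frequencies for both spellings (for example, exported from the Ngram Viewer or computed from the raw ngram corpus files). The numbers below are invented for illustration only.

# Relative frequency of each spelling by year (made-up values).
freq = {
    "husband": {1740: 2e-5, 1760: 2.5e-5, 1780: 3e-5, 1800: 6e-5},
    "hufband": {1740: 5e-5, 1760: 4e-5, 1780: 3e-5, 1800: 1e-7},
}

# The medial s has effectively vanished once the OCR'd 'f' spelling falls
# below a small share of the combined frequency for the word.
for year in sorted(freq["husband"]):
    s_form, f_form = freq["husband"][year], freq["hufband"][year]
    share = f_form / (s_form + f_form)
    print(year, "medial-s share: %.1f%%" % (100 * share))
    if share < 0.05:
        print("medial s effectively gone by", year)
        break
-->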
			<wfw:commentRss>https://googledata.org/uncategorized/culturomics-ngrams-and-new-power-tools-for-science/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What You Capture Is What You Get: A New Way for Task Migration Across Devices</title>
		<link>https://googledata.org/uncategorized/what-you-capture-is-what-you-get-a-new-way-for-task-migration-across-devices/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=what-you-capture-is-what-you-get-a-new-way-for-task-migration-across-devices</link>
		<comments>https://googledata.org/uncategorized/what-you-capture-is-what-you-get-a-new-way-for-task-migration-across-devices/#comments</comments>
		<pubDate>Tue, 12 Jul 2011 21:45:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[android]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=3f9959847a30fd61e99f8cdf6849cd7a</guid>
		<description><![CDATA[Posted by Yang Li, Research Scientist  We constantly move from one device to another while carrying out everyday tasks. For example, we might find an interesting article on a desktop computer at work, then bring the article with us on a mobile phone du...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Yang Li, Research Scientist</span>  <br /><br />We constantly move from one device to another while carrying out everyday tasks. For example, we might find an interesting article on a desktop computer at work, then bring the article with us on a mobile phone during the commute and keep reading it on a laptop or a TV when we get home. Cloud computing and web applications have made it possible to access the same data and applications on different devices and platforms. However, there are not many ways to easily move tasks across devices that are as intuitive as drag-and-drop in a graphical user interface.<br /><br />Over the past year, our research team has been developing new technologies that let users easily migrate their tasks across devices. In a project named Deep Shot, we demonstrated how a user can easily move web pages and applications, such as Google Maps directions, between a laptop and an Android phone by using the phone camera. With Deep Shot, a user can simply take a picture of their monitor with a phone camera, and the captured content automatically shows up and becomes instantly interactive on the mobile phone.<br /><br />This project was inspired by our observations that many people tend to take a picture of map directions on the monitor using their mobile phone camera, rather than using other approaches such as email. Taking pictures feels more direct and convenient, and fits well with our everyday activities, which are often opportunistic. Instead of just capturing raw pixels, Deep Shot recovers the actual contents and applications on the mobile phone based on these pixels. You can find out how Deep Shot keeps user interaction simple and what happens behind the scenes <a href="http://research.google.com/pubs/archive/37153.pdf">here</a>. Similar to WYSIWYG—What You See Is What You Get—for graphical user interfaces, Deep Shot demonstrates WYCIWYG—What You Capture Is What You Get—for cross-device interaction. We are exploring this interaction style for various task migration situations in our everyday life.<br /><br /><object width="320" height="266" class="BLOGGER-youtube-video" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" data-thumbnail-src="http://1.gvt0.com/vi/iGTM6xs2sck/0.jpg"><param name="movie" value="http://www.youtube.com/v/iGTM6xs2sck&fs=1&source=uds" /><param name="bgcolor" value="#FFFFFF" /><embed width="320" height="266"  src="http://www.youtube.com/v/iGTM6xs2sck&fs=1&source=uds" type="application/x-shockwave-flash"></embed></object><br /><br />Deep Shot remains a research project at Google. With increasing capabilities of mobile phones and fast-growing web applications, we hope to explore more exciting ways to help users carry out their everyday activities.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-3074129224865597838?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/uncategorized/what-you-capture-is-what-you-get-a-new-way-for-task-migration-across-devices/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Languages of the World (Wide Web)</title>
		<link>https://googledata.org/google-research/languages-of-the-world-wide-web/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=languages-of-the-world-wide-web</link>
		<comments>https://googledata.org/google-research/languages-of-the-world-wide-web/#comments</comments>
		<pubDate>Fri, 08 Jul 2011 00:15:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=064334517a0622437d5c0ec4068baa70</guid>
		<description><![CDATA[Posted by Daniel Ford and Josh BatsonThe web is vast and infinite. Its pages link together in a complex network, containing remarkable structures and patterns. Some of the clearest patterns relate to language.Most web pages link to other pages on the s...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Daniel Ford and Josh Batson</span><br /><br />The web is vast and infinite. Its pages link together in a complex network, containing remarkable structures and patterns. Some of the clearest patterns relate to language.<br /><br />Most web pages link to other pages on the same web site, and the few off-site links they have are almost always to other pages in the same language.&nbsp;It's as if each language has its own web which is loosely linked to the webs of other languages.&nbsp;However, there are a small but significant number of off-site links between languages. These give tantalizing hints of the world beyond the virtual.<br /><br />To see the connections between languages, start by taking the several billion most important pages on the web in 2008, including all pages in smaller languages, and look at the off-site links between these pages. The particular choice of pages in our corpus here reflects decisions about what is ‘important’. For example, in a language with few pages every page is considered important, while for languages with more pages some selection method is required, based, for example, on PageRank.<br /><br />We can use our corpus to draw a very simple graph of the web, with a node for each language and an edge between two languages if more than one percent of the&nbsp;offsite links in the first language land on pages in the second. To make things a little clearer, we only show the languages which have at least a hundred thousand pages and have a strong link with another language, meaning at least 1% of off-site links go to that language. We also leave out English, which we'll discuss more in a moment. (Figure 1)<br /><br />Looking at the language web in 2008, we see a surprisingly clear map of Europe and Asia. <br />The language linkages invite explanations around geopolitics, linguistics, and historical associations.<br /><br /><div><a href="http://1.bp.blogspot.com/-icTwf3P6f58/ThZM092RU2I/AAAAAAAAAdw/5nrMSXNz-5o/s1600/PR_mapped.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="259" src="http://1.bp.blogspot.com/-icTwf3P6f58/ThZM092RU2I/AAAAAAAAAdw/5nrMSXNz-5o/s400/PR_mapped.gif" width="400" /></a></div><br /><div style="text-align: center;"><i>Figure 1: Language links on the web.</i>&nbsp;</div><div style="text-align: center;"><br /></div>The outlines of the Iberian and Scandinavian Peninsulas are clearly visible, suggesting geographic rather than purely linguistic associations.<br /><br />Examining links between other languages, it seems that many are explained by people and communities which speak both languages.<br /><br />The language webs of many former Soviet republics link back to the Russian web, with the strongest link from Ukrainian.&nbsp;While Russia is the major importer of Ukrainian products, the bilingual nature of Ukraine is a more plausible explanation. Most Ukrainians speak both languages, and Russian is even the dominant language in large parts of the country.<br /><br />The link from Arabic to French speaks to the long connection between France and its former colonies. In many of these countries Arabic and French are now commonly spoken together, and there has been significant emigration from these countries to France.<br /><br />Another strong link is between the Malay/Malaysian and Indonesian webs. 
Malaysia and Indonesia share a border, but more importantly the languages are nearly eighty percent cognate, meaning speakers of one can easily understand the other.<br /><br />What about the sizes of each language web?&nbsp;Both the number of sites in each language and the number of urls seen by Google's crawler follow an exponential distribution, although the ordering for each is slightly different (Figure 2).&nbsp;The exact number of pages in each language in 2008 is unknown, since multiple urls may point to the same page and some pages may not have been seen at all.&nbsp;However, the language of an un-crawled url can be guessed by the dominant language of its site.&nbsp;In fact, calendar pages and other infinite spaces mean that there really are an unlimited number of pages on the web, though some are more useful than others.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://1.bp.blogspot.com/-DtvO5pZqfgM/ThZCN1pto-I/AAAAAAAAAdI/IaPE0e0xINg/s1600/language_sizes.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="229" src="http://1.bp.blogspot.com/-DtvO5pZqfgM/ThZCN1pto-I/AAAAAAAAAdI/IaPE0e0xINg/s400/language_sizes.gif" width="400" /></a></div><div style="text-align: center;"><i>Figure 2: The number of sites and seen urls per language are roughly exponentially distributed.</i>&nbsp;</div><br />The largest language on the web, in terms of size and centrality, has always been English, but where is it on our map? <br /><br />Every language on the web has strong links to English, usually with around twenty percent of offsite links and occasionally&nbsp;over forty five percent, such as from Tagalog/Filipino, spoken in the Philippines, and Urdu, principally spoken in Pakistan (Figure 3).&nbsp;In both the Philippines and Pakistan, English is one of the two official languages.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://1.bp.blogspot.com/-JjCwlKx32vY/ThZD6XszS9I/AAAAAAAAAdQ/yc1SlARd23s/s1600/EnglishLanguageGraph_.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="http://1.bp.blogspot.com/-JjCwlKx32vY/ThZD6XszS9I/AAAAAAAAAdQ/yc1SlARd23s/s400/EnglishLanguageGraph_.gif" width="379" /></a></div><div style="text-align: center;"><i>Figure 3: Language links to and from English</i>&nbsp;</div><br />You might wonder whether off-site links landing on English pages can be explained simply by the number of English pages available to be linked to. The webs of other languages in our corpus typically have sixty to eighty percent of their out-language links to English pages. However, only 38 percent of the pages and 42 percent of sites in our set are English, while English attracts 79 percent of all out-language links from other languages.<br /><br />Chinese and Japanese also seem unusual because there are relatively few links from pages in these languages to pages in English. This is despite the fact that Japanese&nbsp;and Chinese&nbsp;sites are the most popular non-English sites for English sites to link to.&nbsp;However, the number of sites in a language is a strong predictor of its ‘introversion’, or fraction of off-site links to pages in the same language. 
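<br /><br />To pin down the bookkeeping, here is a minimal sketch in Python of the two quantities used throughout this post: the one-percent rule for drawing an edge between language webs, and a language web's introversion. The link counts below are invented for illustration; they are not measurements from our corpus.<br /><br /><pre>
# Toy off-site link counts: links[src][dst] is the number of off-site
# links from pages in language src that land on pages in language dst.
links = {
    "fr": {"fr": 700, "en": 250, "ar": 30, "de": 20},
    "ar": {"ar": 400, "fr": 150, "en": 450},
    "en": {"en": 450, "fr": 300, "ar": 250},
}

def introversion(src):
    """Fraction of off-site links from src that stay within src."""
    total = sum(links[src].values())
    return links[src].get(src, 0) / total

def edges(threshold=0.01):
    """Directed edges where more than threshold of off-site links land."""
    found = []
    for src, dsts in links.items():
        total = sum(dsts.values())
        for dst, n in dsts.items():
            if dst != src and n / total > threshold:
                found.append((src, dst, round(n / total, 3)))
    return found

print(introversion("fr"))  # 0.7: most French off-site links stay French
print(edges(0.10))         # only the strong cross-language links remain
</pre>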
Taking this into account shows that Chinese and Japanese webs are not unusually introverted given their size.&nbsp;In general, language webs with more sites are more introverted, perhaps due to better availability of content.&nbsp;(Figure 4)<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/-OCZPh30IEwI/ThZEe9v0BDI/AAAAAAAAAdY/aFzEWL0aQmM/s1600/introversion.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="http://4.bp.blogspot.com/-OCZPh30IEwI/ThZEe9v0BDI/AAAAAAAAAdY/aFzEWL0aQmM/s400/introversion.gif" width="400" /></a></div><div style="text-align: center;"><i>Figure 4: Language size vs introversion.</i>&nbsp;</div><br />There is a roughly linear relationship between the (log) number of sites in a language and the fraction of off-site links which point to pages in the same language, with a correlation of 0.9 if English is removed.&nbsp;However, only 45 percent of off-site links from English pages are to other English pages, making English the most extroverted web language given its size. Other notable outliers are the Hindi web, which is unusually introverted, and the Tagalog and Malay webs, which are unusually extroverted.<br /><br />We can generate another map by connecting languages if the number of links from one to the other is 50 times greater than expected given the number of out-of-language links and the size of the language linked to&nbsp;(Figure 5).&nbsp;This time, the native languages of India show up clearly.&nbsp;Surprising links include those from Hindi to Ukrainian, Kurdish to Swedish, Swahili to Tagalog and Bengali, and Esperanto to Polish.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://3.bp.blogspot.com/-PYI7CG2COS4/ThZEm3MvTzI/AAAAAAAAAdg/ozF6XJeYAgo/s1600/GNP_mapped.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="259" src="http://3.bp.blogspot.com/-PYI7CG2COS4/ThZEm3MvTzI/AAAAAAAAAdg/ozF6XJeYAgo/s400/GNP_mapped.gif" width="400" /></a></div><div style="text-align: center;"><i>Figure 5: Unexpected connections, given the size of each language.</i>&nbsp;</div><br />What's happened since 2008? The languages of the web have become more densely connected. There is now significant content in even more languages, and these languages are more closely linked. We hope that tools like Google page translation, voice translation, and other services will accelerate this process and bring more people in the world closer together, whichever languages they speak.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-8351332047248996003?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/languages-of-the-world-wide-web/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Google Translate welcomes you to the Indic web</title>
		<link>https://googledata.org/google-research/google-translate-welcomes-you-to-the-indic-web-2/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-translate-welcomes-you-to-the-indic-web-2</link>
		<comments>https://googledata.org/google-research/google-translate-welcomes-you-to-the-indic-web-2/#comments</comments>
		<pubDate>Tue, 21 Jun 2011 16:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=983c96aa57e09d3acb923dfbcd6d690a</guid>
		<description><![CDATA[Posted by Ashish Venugopal, Research Scientist   (Cross-posted on the Translate Blog and the Official Google Blog)&#160;Beginning today, you can explore the linguistic diversity of the Indian sub-continent with Google Translate, which now supports five...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Ashish Venugopal, Research Scientist </span>  <br /><br /><i>(Cross-posted on the <a href="http://googletranslate.blogspot.com/2011/06/google-translate-welcomes-you-to-indic.html">Translate Blog</a> and the <a href="http://googleblog.blogspot.com/2011/06/google-translate-welcomes-you-to-indic.html">Official Google Blog</a>)</i><br /><br /><a href="http://translate.google.com/?sl=ta&amp;tl=en&amp;q=%E0%AE%A8%E0%AE%B2%E0%AF%8D%E0%AE%B5%E0%AE%B0%E0%AE%B5%E0%AF%81" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="50" id=":current_picnik_image" src="http://1.bp.blogspot.com/-ipbCokO5Zn8/TgC-igicMJI/AAAAAAAAIK8/6xRSNefw6bE/translate1.jpg" /></a><a href="http://translate.google.com/?sl=te&amp;tl=en&amp;q=%E0%B0%B8%E0%B1%81%E0%B0%B8%E0%B1%8D%E0%B0%B5%E0%B0%BE%E0%B0%97%E0%B0%A4%E0%B0%82" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="50" src="http://4.bp.blogspot.com/--LBOgNAXEIM/TgC-h7dSlUI/AAAAAAAAIK4/g8AjAbD_Vd4/translate2.jpg" /></a><a href="http://translate.google.com/?sl=kn&amp;tl=en&amp;q=%E0%B2%B8%E0%B3%81%E0%B2%B8%E0%B3%8D%E0%B2%B5%E0%B2%BE%E0%B2%97%E0%B2%A4" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="50" src="http://1.bp.blogspot.com/-oi-gX0KyC2Q/TgC-hHnwT3I/AAAAAAAAIK0/x_Xb-1OmUPA/translate3.jpg" /></a><br /><br /><div style="text-align: center;"><a href="http://translate.google.com/?sl=bn&amp;tl=en&amp;q=%E0%A6%B8%E0%A7%8D%E0%A6%AC%E0%A6%BE%E0%A6%97%E0%A6%A4%E0%A6%AE" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="50" src="http://4.bp.blogspot.com/-25bN2mnY02c/TgC-gLgqtmI/AAAAAAAAIKw/SsbTnYFiJuk/translate4.jpg" style="cursor: move;" /></a>&nbsp;<a href="http://translate.google.com/?sl=gu&amp;tl=en&amp;q=%E0%AA%B8%E0%AB%8D%E0%AA%B5%E0%AA%BE%E0%AA%97%E0%AA%A4" imageanchor="1"><img border="0" height="50" src="http://4.bp.blogspot.com/-f3Lk44n8UNs/TgC-fD657yI/AAAAAAAAIKs/SfwJGtwsBSk/translate5.jpg" /></a></div><br />Beginning today, you can explore the linguistic diversity of the Indian sub-continent with <a href="http://translate.google.com/">Google Translate</a>, which now supports five new experimental alpha languages: Bengali, Gujarati, Kannada, Tamil and Telugu. In India and Bangladesh alone, more than 500 million people speak these five languages. Since 2009, we’ve launched a total of 11 alpha languages, bringing the current number of languages supported by Google Translate to 63.<br /><br /><a href="http://en.wikipedia.org/wiki/Languages_of_South_Asia">Indic languages</a> differ from English in many ways, presenting several exciting challenges when developing their respective translation systems. Indian languages often use the <a href="http://en.wikipedia.org/wiki/Subject_Object_Verb">Subject Object Verb (SOV) ordering</a>&nbsp;to form sentences, unlike English, which uses <a href="http://en.wikipedia.org/wiki/Subject_Verb_Object">Subject Verb Object (SVO) ordering</a>. This difference in sentence structure makes it harder to produce fluent translations; the more words that need to be reordered, the more chance there is to make mistakes when moving them. Tamil, Telugu and Kannada are also highly <a href="http://en.wikipedia.org/wiki/Agglutinative_language">agglutinative</a>, meaning a single word often includes affixes that represent additional meaning, like tense or number. 
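<br /><br />To make the word-order challenge above concrete, consider a toy word-for-word rendering: turning English SVO order into SOV order means every verb must leap over the entire object, and the longer the object, the farther it must travel. Here is a minimal, purely illustrative sketch in Python; real translation systems learn reordering statistically rather than from hand-written rules like this one, and the role tags are an assumption of the toy setup.<br /><br /><pre>
def svo_to_sov(tagged_phrases):
    """Naive reordering for illustration: move the verb after the object.

    tagged_phrases is a list of (phrase, role) pairs with roles
    'S', 'V' and 'O'. The farther the verb has to move, the more
    opportunities a real translation system has to make mistakes.
    """
    subject = [p for p, role in tagged_phrases if role == "S"]
    verb = [p for p, role in tagged_phrases if role == "V"]
    obj = [p for p, role in tagged_phrases if role == "O"]
    return subject + obj + verb

english = [("the boy", "S"), ("ate", "V"), ("an apple", "O")]
print(svo_to_sov(english))  # ['the boy', 'an apple', 'ate']
</pre>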
Fortunately, our research to improve Japanese (an SOV language) translation helped us with the word order challenge, while our work translating languages like German, Turkish and Russian provided insight into the agglutination problem.<br /><br />You can expect translations for these new alpha languages to be less fluent and include many more untranslated words than some of our more mature languages—like Spanish or Chinese—which have much more of the web content that powers our <a href="http://www.youtube.com/watch?v=_GdSC1Z1Kzs&amp;feature=player_embedded">statistical machine translation approach</a>. Despite these challenges, we release alpha languages when we believe that they help people better access the multilingual web. If you notice incorrect or missing translations for any of our languages, please <a href="http://googletranslate.blogspot.com/2010/12/when-one-translation-just-isnt-enough.html">correct us</a>; we enjoy learning from our mistakes and your feedback helps us graduate new languages from alpha status. If you’re a translator, you’ll also be able to take advantage of our machine translated output when using the <a href="http://translate.google.com/support/toolkit/bin/answer.py?hl=en&amp;answer=147809">Google Translator Toolkit</a>.<br /><br />Since these languages each have their own unique scripts, we’ve enabled a transliterated input method for those of you without Indian language keyboards. For example, if you type in the word “nandri,” it will generate the Tamil word  நன்றி (<a href="http://translate.google.com/?sl=ta&amp;tl=en&amp;q=%E0%AE%A8%E0%AE%A9%E0%AF%8D%E0%AE%B1%E0%AE%BF">see what it means</a>). To see all these beautiful scripts in action, you’ll need to install fonts* for each language.<br /><br />We hope that the launch of these new alpha languages will help you better understand the Indic web and encourage the publication of new content in Indic languages, taking us five alpha steps closer to a web without language barriers.<br /><br />*Download the fonts for each language: <a href="http://salrc.uchicago.edu/resources/fonts/available/tamil/">Tamil</a>, <a href="http://salrc.uchicago.edu/resources/fonts/available/telugu/">Telugu</a>, <a href="http://salrc.uchicago.edu/resources/fonts/available/bengali/">Bengali</a>, <a href="http://salrc.uchicago.edu/resources/fonts/available/gujarati/">Gujarati</a> and <a href="http://salrc.uchicago.edu/resources/fonts/available/kannada/">Kannada</a>.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-6387100852149730726?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/google-translate-welcomes-you-to-the-indic-web-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths</title>
		<link>https://googledata.org/youtube/auto-directed-video-stabilization-with-robust-l1-optimal-camera-paths/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=auto-directed-video-stabilization-with-robust-l1-optimal-camera-paths</link>
		<comments>https://googledata.org/youtube/auto-directed-video-stabilization-with-robust-l1-optimal-camera-paths/#comments</comments>
		<pubDate>Mon, 20 Jun 2011 15:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Youtube]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=b65a0e624b39ca8ea56f0a48f6867686</guid>
		<description><![CDATA[Posted by Matthias Grundmann, Vivek Kwatra, and Irfan Essa, Research TeamEarlier this year, we announced the launch of new features on the YouTube Video Editor, including stabilization for shaky videos, with the ability to preview them in real-time. Th...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by <a href="http://research.google.com/pubs/author38919.html">Matthias Grundmann</a>, <a href="http://research.google.com/pubs/author38000.html">Vivek Kwatra</a>, and <a href="http://www.irfanessa.com/Work/Welcome.html">Irfan Essa</a>, Research Team</span><br /><br />Earlier this year, we <a href="http://youtube-global.blogspot.com/2011/03/lights-camera-edit-new-features-for.html">announced</a> the launch of new features on the <a href="http://www.youtube.com/editor">YouTube Video Editor</a>, including stabilization for shaky videos, with the ability to preview them in real-time. The core technology behind this feature is detailed in <a href="http://research.google.com/pubs/pub37041.html">this paper</a>, which will be presented at the IEEE International Conference on Computer Vision and Pattern Recognition (<a href="http://www.cvpr2011.org/">CVPR 2011</a>).<br /><br />Casually shot videos captured by handheld or mobile cameras suffer from a significant amount of shake. Existing in-camera stabilization methods dampen high-frequency jitter but do not suppress low-frequency movements and bounces, such as those observed in videos captured by a walking person. On the other hand, most professionally shot videos rely on carefully designed camera configurations, using specialized equipment such as tripods or camera dollies, and employ ease-in and ease-out for transitions. Our goal was to devise a completely automatic method for converting casual shaky footage into more pleasant and professional-looking videos.<br /><br /><iframe width="540" height="330" src="http://www.youtube.com/embed/0MiY-PNy-GU" frameborder="0" allowfullscreen></iframe><br /><br />Our technique mimics the cinematographic principles outlined above by automatically determining the best camera path using a robust optimization technique. The original, shaky camera path is divided into a set of segments, each approximated by either a constant, linear or parabolic motion. Our optimization finds the best of all possible partitions using a computationally efficient and stable algorithm.<br /><br />To achieve real-time performance on the web, we distribute the computation across multiple machines in the cloud. This enables us to provide users with a real-time preview and interactive control of the stabilized result. Above we provide a video demonstration of how to use this feature on the YouTube Editor. We will also demo this live at <a href="http://googleresearch.blogspot.com/2011/06/google-at-cvpr-2011.html">Google’s exhibition booth</a> at CVPR 2011.<br /><br />For further details, please read our <a href="http://research.google.com/pubs/archive/37041.pdf">paper</a>.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-7399904819779185895?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/youtube/auto-directed-video-stabilization-with-robust-l1-optimal-camera-paths/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Google at CVPR 2011</title>
		<link>https://googledata.org/uncategorized/google-at-cvpr-2011/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-at-cvpr-2011</link>
		<comments>https://googledata.org/uncategorized/google-at-cvpr-2011/#comments</comments>
		<pubDate>Thu, 16 Jun 2011 21:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=27a512479c1b0989d1f19495c9b6cebb</guid>
		<description><![CDATA[Posted by Mei Han and Sergey Ioffe, Research TeamThe computer vision community will get together in Colorado Springs the week of June 20th for the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2011). This year will see ...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Mei Han and Sergey Ioffe, Research Team</span><br /><br />The computer vision community will get together in Colorado Springs the week of June 20th for the <a href="http://www.cvpr2011.org/">IEEE International Conference on Computer Vision and Pattern Recognition</a> (CVPR 2011). This year will see a record number of people attending the conference and its 27 co-located workshops and tutorials. Registration was capped at 1,500 attendees even before the conference started.<br /><br />Computer Vision is at the core of many Google products, such as <a href="http://images.google.com/">Image Search</a>, <a href="http://www.youtube.com/">YouTube</a>, <a href="http://maps.google.com/help/maps/streetview/">Street View</a>, <a href="https://picasaweb.google.com/">Picasa</a>, and <a href="http://www.google.com/mobile/goggles/#text">Goggles</a>, and as always, Google is involved in several ways with CVPR. <a href="http://research.google.com/pubs/author37792.html">Andrew Senior</a> is serving as an area chair of CVPR 2011 and many Googlers are reviewers. Googlers also co-authored these papers:<br /><br /><ul><li><a href="http://www.cs.washington.edu/homes/rahul/data/cvpr2011.pdf">Where's Waldo: Matching People in Images of Crowds</a> by Rahul Garg, Deva Ramanan, Steve Seitz, Noah Snavely</li><li><a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/37065.pdf">Visual and Semantic Similarity in ImageNet</a> by Thomas Deselaers, Vittorio Ferrari</li><li><a href="http://grail.cs.washington.edu/projects/mcba/pba.pdf">Multicore Bundle Adjustment</a> by Changchang Wu, Sameer Agarwal, Brian Curless, Steve Seitz</li><li><a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/37125.pdf">A Hierarchical Conditional Random Field Model for Labeling and Segmenting Images of Street Scenes</a> by Qixing Huang, Mei Han, Bo Wu, Sergey Ioffe</li><li><a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/36985.pdf">Kernelized Structural SVM Learning for Supervised Object Segmentation</a> by Luca Bertelli, Tianli Yu, Diem Vu, Salih Gokturk</li><li><a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/36931.pdf">Discriminative Tag Learning on YouTube Videos with Latent Sub-tags</a> by Weilong Yang, George Toderici</li><li><a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/37041.pdf">Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths</a> by Matthias Grundmann, Vivek Kwatra, Irfan Essa</li><li><a href="https://docs.google.com/viewer?a=v&amp;pid=explorer&amp;chrome=true&amp;srcid=0B_szji3RgxrBYjUzMTYyOGUtY2M1ZC00MmY1LTgxMjAtNmJjMTVlM2FlNjI5&amp;hl=en_US">Image Saliency: From Local to Global Context</a> by Meng Wang, Janusz Konrad, Prakash Ishwar, Yushi Jing, Henry Rowley</li></ul><br />If you are attending the conference, stop by Google’s exhibition booth. 
In addition to talking with Google researchers, you will get to see examples of exciting computer vision research that has made it into Google products including, among others, the following:<br /><br /><ul><li><i>Google Earth Facade Shadow Removal</i> by Mei Han, Vivek Kwatra, and Shengyang Dai<br />We will demonstrate our technique for removing shadows and other lighting/texture artifacts from building facades in Google Earth. We obtain cleaner, clearer, and more uniform textures, which provide users with an improved visual experience.</li><li><i>Video Stabilization on YouTube Editor</i> by Matthias Grundmann, Vivek Kwatra, and Irfan Essa<br />Casually shot videos captured by handheld or mobile cameras suffer from a significant amount of shake. In contrast, professionally shot video usually employs stabilization equipment such as tripods or camera dollies, and employs ease-in and ease-out for transitions. Our technique mimics these cinematographic principles by optimally dividing the original, shaky camera path into a set of segments and approximating each with either constant, linear or parabolic motion using a computationally efficient and stable algorithm. We will showcase a live version of our algorithm, featuring real-time performance and interactive control, which is publicly available at youtube.com/editor.</li><li><i>Tag Suggest for YouTube</i> by George Toderici and Mehmet Emre Sargin<br />YouTube offers millions of users the opportunity to upload videos and share them with their friends. Many users would love to have their videos discoverable but don't annotate them properly. One new feature on YouTube that seeks to address this problem is tag prediction, based on video content and, independently, on text metadata.</li></ul><br /><i><b>6/17/2011 UPDATE: "Posted by" was changed to include Sergey Ioffe.</b></i><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-8277213501716138712?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/uncategorized/google-at-cvpr-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Our first round of Google Research Awards for 2011</title>
		<link>https://googledata.org/uncategorized/our-first-round-of-google-research-awards-for-2011/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=our-first-round-of-google-research-awards-for-2011</link>
		<comments>https://googledata.org/uncategorized/our-first-round-of-google-research-awards-for-2011/#comments</comments>
		<pubDate>Thu, 09 Jun 2011 13:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=984804d96ec245f44a33774fcd8c0cce</guid>
		<description><![CDATA[Posted by Maggie Johnson, Director of Education &#38; University Relations  We’ve just finished awarding the latest round of Google Research Awards, which provide funding to full-time faculty working on research in areas of mutual interest with Googl...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Maggie Johnson, Director of Education &amp; University Relations</span>  <br /><br />We’ve just finished awarding the latest round of <a href="http://research.google.com/university/relations/research_awards.html">Google Research Awards</a>, which provide funding to full-time faculty working on research in areas of mutual interest with Google. A record number of submissions came in this round, and we are delighted to be funding 112 awards across 21 different focus areas for a total of more than $6.75 million. The subject areas that received the highest level of support were systems and infrastructure, human computer interaction, Geo/maps and machine learning. Thanks to strong international collaborations, 23% of the funding in this round was awarded to universities outside the U.S. <br /><br />In prior years, we’ve used this blog post to highlight some of our top-ranked projects, but this year, we’d like to give you an inside look into how we determine the award recipients. <br /><br />Designating the awards involves a careful and detailed review process. First, we have a set of internal research leads, each a well-known expert in their field, review all the proposals in their area. They assess the proposals on merit, innovation, connection to Google’s products and services and fit with our overall research agenda. The research leads then assign several volunteer reviewers—culled from experts on their team or other Google engineers holding PhDs—to weigh each proposal.<br /><br />All these reviews are recorded in an internal grant administration system, and the research leads make their funding recommendations. These recommendations are aggregated and a series of committee meetings are run, one for each research area. The research lead attends, along with members of the university relations team and executives in research. This committee reviews each proposal that the research lead has recommended for funding, using the same criteria mentioned above. This additional review process may change the proposal rankings and sometimes brings back other proposals for reconsideration.<br /><br />Once the committee meetings are complete, we make the final funding decisions, which are based on the available budget and balancing the funding across research areas and geographic regions. The final decisions are reviewed one last time by research management, and then we distribute the awards to the selected faculty.<br /><br />As the number of submissions for these research awards continues to grow, we remain committed to a merit-based review process with effective checks and balances. Congratulations to the well-deserving <a href="http://services.google.com/fh/files/blogs/google_2011researchawards.pdf">recipients</a> of this round’s awards, and if you are interested in applying for the next round (deadline is August 1), please visit our <a href="http://research.google.com/university/relations/research_awards.html">website</a> for more information.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-6801816548280445029?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/uncategorized/our-first-round-of-google-research-awards-for-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Instant Mix for Music Beta by Google</title>
		<link>https://googledata.org/google-research/instant-mix-for-music-beta-by-google/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=instant-mix-for-music-beta-by-google</link>
		<comments>https://googledata.org/google-research/instant-mix-for-music-beta-by-google/#comments</comments>
		<pubDate>Wed, 08 Jun 2011 21:16:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[android]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=8613934c70ab80f7f3834c275c1cb209</guid>
		<description><![CDATA[Posted by Douglas Eck, Research ScientistMusic Beta by Google was announced at the Day One Keynote of  Google I/O 2011.  This service allows users to stream their music collections from the cloud to any supported device, including a web browser.  It’...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Douglas Eck, Research Scientist</span><br /><br /><a href="http://music.google.com/">Music Beta</a> by Google was announced at the <a href="http://www.google.com/events/io/2011/sessions/android-momentum-mobile-and-more-at-google-i-o.html">Day One Keynote</a> of  <a href="http://www.google.com/events/io/2011/index-live.html">Google I/O 2011</a>.  This service allows users to stream their music collections from the cloud to any supported device, including a web browser.  It’s a first step in creating a platform that gives users a range of compelling music experiences. One key component of the product, <span style="font-weight: bold;">Instant Mix</span>, is a playlist generator developed by Google Research.  Instant Mix uses <span style="font-weight: bold;">machine hearing</span> to extract attributes from audio which can be used to answer questions such as  “Is there a Hammond B-3 organ?” (instrumentation / timbre), “Is it angry?” (mood), “Can I jog to it?” (tempo / meter) and so on.   <span style="font-weight: bold;">Machine learning</span> algorithms relate these audio features to what we know about music on the web, such as the fact that Jimmy Smith is a jazz organist or that Arcade Fire and Wolf Parade are similar artists.   From this we can predict similar tracks for a seed track and, with some additional sequencing logic, generate Instant Mix playlists from songs in a user’s locker.<br /><br />Because we combine audio analysis with information about which artists and albums go well together, we can use both dimensions of similarity to compare songs. If you pick a mellow track from an album, we will make a mellower playlist than if you pick a high energy track from the same album.   For example, here we compare short Instant Mixes made from two very different tracks by U2. 
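<br /><br />An aside before the two examples: here is a toy sketch in Python of how two similarity signals might be blended to rank candidates for a seed track. Every number in it is hypothetical; the audio feature vectors, the artist affinities and the blending weight are invented for illustration, and the actual Instant Mix models are far richer than a single cosine similarity.<br /><br /><pre>
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical audio features per track: (energy, percussiveness, tempo).
audio = {
    "U2/Mysterious Ways": (0.9, 0.8, 0.7),
    "David Bowie/Fame": (0.8, 0.7, 0.6),
    "The Beatles/And I Love Her": (0.2, 0.1, 0.3),
}
# Hypothetical artist-to-artist affinities mined from the web.
affinity = {("U2", "David Bowie"): 0.6, ("U2", "The Beatles"): 0.5}

def blended_score(seed, candidate, w=0.7):
    """Blend audio similarity with artist affinity; the weight is made up."""
    a = cosine(audio[seed], audio[candidate])
    artists = (seed.split("/")[0], candidate.split("/")[0])
    return w * a + (1 - w) * affinity.get(artists, 0.0)

seed = "U2/Mysterious Ways"
ranked = sorted((t for t in audio if t != seed),
                key=lambda t: blended_score(seed, t), reverse=True)
print(ranked)  # the upbeat Bowie track outranks the quiet ballad
</pre>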
The first Instant Mix comes from "Mysterious Ways," an upbeat, danceable track from <i>Achtung</i> <i>Baby</i> with electric guitar and heavy percussion.<br /><br /><object width="320" height="266" class="BLOG_video_class" id="BLOG_video-89e85fbd3185b23d" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="movie" value="http://www.youtube.com/get_player"><param name="bgcolor" value="#FFFFFF"><param name="allowfullscreen" value="true"><param name="flashvars" value="flvurl=http://v20.nonxt2.googlevideo.com/videoplayback?id%3D89e85fbd3185b23d%26itag%3D5%26app%3Dblogger%26ip%3D0.0.0.0%26ipbits%3D0%26expire%3D1319375092%26sparams%3Did,itag,ip,ipbits,expire%26signature%3DEF6FDBF69144446DE4A37FEA8DB5164692D4053.52BB21804AF4BEE55A31CBA2252DC20E2F856F94%26key%3Dck1&amp;iurl=http://video.google.com/ThumbnailServer2?app%3Dblogger%26contentid%3D89e85fbd3185b23d%26offsetms%3D5000%26itag%3Dw160%26sigh%3Dta2a3bWnTua08tbrESY9mNlTzzA&amp;autoplay=0&amp;ps=blogger"><embed src="http://www.youtube.com/get_player" type="application/x-shockwave-flash"width="320" height="266" bgcolor="#FFFFFF"flashvars="flvurl=http://v20.nonxt2.googlevideo.com/videoplayback?id%3D89e85fbd3185b23d%26itag%3D5%26app%3Dblogger%26ip%3D0.0.0.0%26ipbits%3D0%26expire%3D1319375092%26sparams%3Did,itag,ip,ipbits,expire%26signature%3DEF6FDBF69144446DE4A37FEA8DB5164692D4053.52BB21804AF4BEE55A31CBA2252DC20E2F856F94%26key%3Dck1&iurl=http://video.google.com/ThumbnailServer2?app%3Dblogger%26contentid%3D89e85fbd3185b23d%26offsetms%3D5000%26itag%3Dw160%26sigh%3Dta2a3bWnTua08tbrESY9mNlTzzA&autoplay=0&ps=blogger"allowFullScreen="true" /></object><br /><div><div><div><div><ol><li>U2 "Mysterious Ways"</li><li>David Bowie "Fame"</li><li>Oingo Boingo "Gratitude"</li><li><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px; font-size: small;">Infectious Grooves “Spreck”</span></li><li><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px; font-size: small;">Red Hot Chili Peppers “Special Secret Song Inside”</span></li></ol><div class="nobr">Compare this to a short Instant Mix made from a much more laid back U2 cut, "MLK" from the album<i> Unforgettable Fire.  
</i>This track has delicate vocals on top of a sparse synthesizer background and no percussion.</div><br /><object width="320" height="266" class="BLOG_video_class" id="BLOG_video-17b7d4070d48cde3" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="movie" value="http://www.youtube.com/get_player"><param name="bgcolor" value="#FFFFFF"><param name="allowfullscreen" value="true"><param name="flashvars" value="flvurl=http://v17.nonxt3.googlevideo.com/videoplayback?id%3D17b7d4070d48cde3%26itag%3D5%26app%3Dblogger%26ip%3D0.0.0.0%26ipbits%3D0%26expire%3D1319375092%26sparams%3Did,itag,ip,ipbits,expire%26signature%3D23EB208FEBE8F6F60416FC5BF958254C12011FFD.1CB44192D8C2FCF0BF7C3016EF4272870B39EDBA%26key%3Dck1&amp;iurl=http://video.google.com/ThumbnailServer2?app%3Dblogger%26contentid%3D17b7d4070d48cde3%26offsetms%3D5000%26itag%3Dw160%26sigh%3DKxXq1nfHw0NeL0HiMLIn29Za93k&amp;autoplay=0&amp;ps=blogger"><embed src="http://www.youtube.com/get_player" type="application/x-shockwave-flash"width="320" height="266" bgcolor="#FFFFFF"flashvars="flvurl=http://v17.nonxt3.googlevideo.com/videoplayback?id%3D17b7d4070d48cde3%26itag%3D5%26app%3Dblogger%26ip%3D0.0.0.0%26ipbits%3D0%26expire%3D1319375092%26sparams%3Did,itag,ip,ipbits,expire%26signature%3D23EB208FEBE8F6F60416FC5BF958254C12011FFD.1CB44192D8C2FCF0BF7C3016EF4272870B39EDBA%26key%3Dck1&iurl=http://video.google.com/ThumbnailServer2?app%3Dblogger%26contentid%3D17b7d4070d48cde3%26offsetms%3D5000%26itag%3Dw160%26sigh%3DKxXq1nfHw0NeL0HiMLIn29Za93k&autoplay=0&ps=blogger"allowFullScreen="true" /></object><br /><div class="nobr"><ol><li><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px; font-size: small;">U2 "MLK"</span></li><li style="text-align: -webkit-auto;"><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px; font-size: small;">Jewel “Don’t”</span><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px; font-size: small;"><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; font-size: 16px;"></span></span></li><li style="text-align: -webkit-auto;"><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px; font-size: small;">Antony and the Johnsons “What Can I Do?”</span></li><li style="text-align: -webkit-auto;"><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px; font-size: small;">The Beatles “And I Love Her”</span></li><li style="text-align: -webkit-auto;"><span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px; font-size: small;">Van Morrison “Crazy Love”</span></li></ol></div>As you can hear, the “Mysterious Ways” Instant Mix is funky, with strong percussion and high-energy vocals while the “MLK” mix carries on with that track's laid-back lullaby feeling.</div><div><br /></div><div>Our approach also allows us to create mixes from music in the long tail.  Are you the lead singer in an unknown Dylan cover band?  
Even if your group is new or otherwise unknown,  Instant Mix can still use audio similarity to match your tracks to real Dylan tracks (provided, of course, that you sing like Bob and your band sounds like The Band).<br /><br />Our goal with Instant Mix is to build awesome playlists from your music collection.  We achieve this by using machine learning to blend a wide range of information sources, including features derived from the music audio itself. Though we’re still in beta, and still have a lot of work to do, we believe Instant Mix is a great tool for music discovery that stands out from the crowd. Give it a try!<br /><br />Further reading by Google Researchers:<br /><a href="http://research.google.com/pubs/archive/36608.pdf">Machine Hearing: An Emerging Field</a><br />Richard F. Lyon.<br /><br /><a href="http://research.google.com/pubs/archive/35269.pdf">Sound Ranking Using Auditory Sparse-Code Representations</a><br />Martin Rehn, Richard F. Lyon, Samy Bengio, Thomas C. Walters, Gal Chechik.<br /><br /><a href="http://arxiv.org/abs/1105.5196">Large-Scale Music Annotation and Retrieval: Learning to Rank in Joint  Semantic Spaces</a><br />Jason Weston, Samy Bengio, Philippe Hamel.</div></div></div></div><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-1907577272290907174?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/instant-mix-for-music-beta-by-google/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Scribe: Now with automatic text for links and faster formatting options</title>
		<link>https://googledata.org/uncategorized/google-scribe-now-with-automatic-text-for-links-and-faster-formatting-options/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-scribe-now-with-automatic-text-for-links-and-faster-formatting-options</link>
		<comments>https://googledata.org/uncategorized/google-scribe-now-with-automatic-text-for-links-and-faster-formatting-options/#comments</comments>
		<pubDate>Thu, 26 May 2011 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=113454a65f5cad165652bf98efa375fb</guid>
		<description><![CDATA[Posted by Kartik Singh and Kuntal Loya, Google Scribe teamSince Google Scribe's first release on Google Labs last year, we have been poring over your feedback and busy adding the top features you asked for. Today, we're excited to announce a new versio...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Kartik Singh and Kuntal Loya, Google Scribe team</span><br /><br />Since <a href="http://scribe.googlelabs.com/">Google Scribe's</a> first release on <a href="http://www.googlelabs.com/">Google Labs</a> last year, we have been poring over your feedback and busy adding the top features you asked for. Today, we're excited to announce a new version of Google Scribe that brings more features to word processing.<br /><br />Besides formatting, Google Scribe provides features that help you author high-quality documents quickly:<br /><ol><li><b>Automatic text for links<br /></b>Adding a hyperlink to your document has been a two-step process of choosing the link and the text to display for it. Google Scribe now makes it easier. Just paste or type any link into your document and Google Scribe will set appropriate link text.<br /><br /></li><li><b>Smart toolbar<br /></b>Do you repeatedly spend time reaching for the toolbar to format your document? To speed up formatting, Google Scribe now displays an abridged toolbar nearby when you select a portion of the document.<br /><br /><a href="http://3.bp.blogspot.com/-zCvHxd3QlEI/Td6UvKEa7MI/AAAAAAAAAZY/xa1SusYQ9pk/s1600/floating+toolbar.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 75px;" src="http://3.bp.blogspot.com/-zCvHxd3QlEI/Td6UvKEa7MI/AAAAAAAAAZY/xa1SusYQ9pk/s400/floating+toolbar.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5611085723627875522" /></a><br /><br /></li><li><b>Text completion in 12 languages<br /></b>Google Scribe auto-completes text as you type. In addition to saving keystrokes, the suggestions indicate correct or popular phrases to use. Google Scribe now auto-detects the document language, so you no longer need to choose a language.<br /><br /><a href="http://3.bp.blogspot.com/-2Yv-FlQxOSA/Td6VyXJ1aCI/AAAAAAAAAZg/o7uCpWiczcE/s1600/autocomplete.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 121px;" src="http://3.bp.blogspot.com/-2Yv-FlQxOSA/Td6VyXJ1aCI/AAAAAAAAAZg/o7uCpWiczcE/s400/autocomplete.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5611086878191478818" /></a><br />You can view other applicable suggestions by clicking on the options button next to the Google Scribe icon and choosing “Show Multiple Suggestions”.<br /><br /><a href="http://3.bp.blogspot.com/-67jgZUnGZr4/Td6WhyEDdhI/AAAAAAAAAZo/1_C8hBki3Rk/s1600/multiple+suggestions.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 278px; height: 60px;" src="http://3.bp.blogspot.com/-67jgZUnGZr4/Td6WhyEDdhI/AAAAAAAAAZo/1_C8hBki3Rk/s400/multiple+suggestions.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5611087692868843026" /></a><br />We have extended auto-complete support to Arabic, Dutch, French, German, Hungarian, Italian, Polish, Portuguese, Russian, Spanish and Swedish, in addition to the English we already supported.<br /><br /></li><li><b>Correct your document as you type<br /></b>Google Scribe now has basic support for checking spelling, punctuation and phrases in your document. 
Google Scribe underlines incorrect usage, and clicking on underlined words or phrases will display a menu of suggested corrections to choose from.<br /><br /><a href="http://3.bp.blogspot.com/-SZRdxXiQ7vY/Td6Xjp-0dAI/AAAAAAAAAZw/L13bTx21seM/s1600/proofreading.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 139px;" src="http://3.bp.blogspot.com/-SZRdxXiQ7vY/Td6Xjp-0dAI/AAAAAAAAAZw/L13bTx21seM/s400/proofreading.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5611088824570770434" /></a><br />We are continuously working on expanding the list of proofreading features. Stay tuned.</li></ol>Try out the new Google Scribe at <a href="http://scribe.googlelabs.com">scribe.googlelabs.com</a> and <a href="http://www.googlelabs.com/show_details?app_key=agtnbGFiczIwLXd3d3IVCxIMTGFic0FwcE1vZGVsGIvu5QEM">let us know what you think</a>.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-7406103754441681786?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/uncategorized/google-scribe-now-with-automatic-text-for-links-and-faster-formatting-options/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Google at ACL 2011</title>
		<link>https://googledata.org/google-research/google-at-acl-2011/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-at-acl-2011</link>
		<comments>https://googledata.org/google-research/google-at-acl-2011/#comments</comments>
		<pubDate>Thu, 19 May 2011 00:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=884140cc2e20ab6e228f3778ed2c2672</guid>
		<description><![CDATA[Posted by Ryan McDonald and Fernando Pereira, Research TeamThe Annual Meeting of the Association for Computational Linguistics is one of the premier conferences for language and text technologies. Many employees at Google have strong roots in the commu...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Ryan McDonald and Fernando Pereira, Research Team</span><br /><br />The Annual Meeting of the <a href="http://www.aclweb.org/">Association for Computational Linguistics</a> is one of the premier conferences for language and text technologies. Many employees at Google have strong roots in the community of researchers who attend this meeting, including many of our researchers working on <a href="http://translate.google.com/">machine translation</a> and <a href="http://www.google.com/mobile/voice-search/">speech</a>.<br /><br />At <a href="http://www.acl2011.org/">this year’s conference</a>, Google is particularly well represented. The General Chair is <a href="http://research.google.com/pubs/author108.html">Dekang Lin</a> and a few Googlers are serving as technical <a href="http://www.acl2011.org/call.shtml">Area Chairs</a> (in addition to the plethora of Googlers who reviewed papers for the conference). Google is also a <a href="http://www.acl2011.org/sponsors.shtml">Platinum Sponsor</a> of ACL this year. <br /><br />Research advances at Google can be seen throughout the conference’s technical content. Below is a complete list of Googler-authored or co-authored papers in the main conference. We want to give special emphasis to this year’s best paper award, given to “<a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/37071.pdf">Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections</a>” by CMU graduate student and Google intern <a href="http://www.cs.cmu.edu/~dipanjan/Home.html">Dipanjan Das</a> and his internship advisor <a href="http://research.google.com/pubs/author38945.html">Slav Petrov</a>. ACL is an extremely selective conference, and this award speaks volumes to the importance of <a href="http://googleresearch.blogspot.com/2011/03/building-resources-to-syntactically.html">syntactic analysis</a> and of using bilingual corpora to project syntactic resources from resource-rich languages (like English) to other languages. Congratulations Dipanjan and Slav!<br /><br />Googlers are also involved in two of this year’s tutorials. <a href="http://research.google.com/pubs/author107.html">Marius Pasca</a> will present “<a href="http://www.acl2011.org/tutorials_10pasca.shtml">Web Search Queries as a Corpus</a>” and <a href="http://www.seas.upenn.edu/~kuzman/">Kuzman Ganchev</a> and his colleagues will teach about “<a href="http://www.acl2011.org/tutorials_11druck.shtml">Rich Prior Knowledge in Learning for Natural Language Processing</a>”. 
Finally, <a href="http://research.google.com/pubs/author39008.html">Katja Filippova</a> and her colleagues are running a workshop on “<a href="https://sites.google.com/site/texttotext2011/">Monolingual Text-to-Text Generation</a>”.<br /><br />ACL will take place this year in Portland from June 19th to June 24th.<br /><br />Papers by Googlers (a * indicates a paper that will be linked to after the conference):<br /><br />Ranking Class Labels Using Query Sessions*<br />Marius Pasca<br /><br />Fine-Grained Class Label Markup of Search Queries*<br />Joseph Reisinger and Marius Pasca<br /><br /><a href="http://www.petrovi.de/data/acl11.pdf">Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections</a><br />Dipanjan Das and Slav Petrov<br /><br /><a href="http://1458142612872649252-a-1802744773732722657-s-sites.googlegroups.com/site/amarsubramanya/home/Pubs/clustorie-acl.pdf?attachauth=ANoY7cqSwbXkxnmC9UA62cXItr_zzOZLbemzaMAKbVxiChjOfEOViCt6MELyu2ezRn-8mDBfaRdmBsPnW75r9VZz1_C0rZ2Yq0HkdvnEMIkMN9vs1LiTIQ5jnGKX8vJyNISCE0O1S2RnwlEL3Kd1PLDym8Pc6mOTO-z3QBtBLLjLmZqyH4FJ_DSCIywZn0djcJl1ATp5Ft5Rr7YV6W-NDDT-gJmtUTJBkCZUvD21E1lXFrb3hpNbBE4=&attredirects=1">Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models</a><br />Sameer Singh, Amarnag Subramanya, Fernando Pereira and Andrew McCallum<br /><br /><a href="http://www.ims.uni-stuttgart.de/~schuetze/piggyback11/piggyback11.pdf">Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition</a><br />Stefan Rüd, Massimiliano Ciaramita, Jens Müller and Hinrich Schütze<br /><br /><a href="http://www.csee.ogi.edu/~bodensta/2011-beam-predict.pdf">Beam-Width Prediction for Efficient Context-Free Parsing</a><br />Nathan Bodenstab, Aaron Dunlop, Keith Hall and Brian Roark<br /><br /><a href="http://research.mtv.corp.google.com:4444/pubversion?fp=7092811193925625484">Language-independent compound splitting with morphological operations</a><br />Klaus Macherey, Andrew Dai, David Talbot, Ashok Popat and Franz Och<br /><br /><a href="http://www.denero.org/content/pubs/acl11_denero_dual.pdf">Model-Based Aligner Combination Using Dual Decomposition</a><br />John DeNero and Klaus Macherey<br /><br /><a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/37011.pdf">Binarized Forest to String Translation</a><br />Hao Zhang, Licheng Fang, Peng Xu and Xiaoyun Wu<br /><br /><a href="http://www.ryanmcd.com/papers/ssl-sentiment-acl2011.pdf">Semi-supervised Latent Variable Models for Fine-grained Sentiment Analysis</a><br />Oscar Tackstrom and Ryan McDonald<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-6474354340858041793?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/google-at-acl-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Make beautiful interactive maps even faster with new additions to the Fusion Tables API</title>
		<link>https://googledata.org/google-research/make-beautiful-interactive-maps-even-faster-with-new-additions-to-the-fusion-tables-api/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=make-beautiful-interactive-maps-even-faster-with-new-additions-to-the-fusion-tables-api</link>
		<comments>https://googledata.org/google-research/make-beautiful-interactive-maps-even-faster-with-new-additions-to-the-fusion-tables-api/#comments</comments>
		<pubDate>Tue, 10 May 2011 21:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=a7efd92e26cc073a025cff3a8825cb50</guid>
		<description><![CDATA[Posted by Rebecca Shapley, Jayant Madhavan, Rod McChesney, and Kathryn Hurley, Fusion Tables teamGoogle Fusion Tables is a modern data management and publishing web application that makes it easy to host, manage, collaborate on, visualize, and publish ...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Rebecca Shapley, Jayant Madhavan, Rod McChesney, and Kathryn Hurley, Fusion Tables team</span><br /><br /><a href="http://www.google.com/fusiontables/public/tour/index.html">Google Fusion Tables</a> is a modern data management and publishing web application that makes it easy to host, manage, collaborate on, visualize, and publish data tables online. Since we first <a href="http://googleresearch.blogspot.com/2009/06/google-fusion-tables.html">launched</a> Fusion Tables almost two years ago, we've seen tremendous interest and usage from dozens of areas, from journalists to scientists to open-data entrepreneurs, and have been excited to see the <a href="http://googleresearch.blogspot.com/2010/06/google-fusion-tables-celebrates-one.html">innovative applications</a> that our users have been able to rapidly build and publish.<br /><br />We've been working hard to enrich what Fusion Tables offers for customization and control of visual presentation. This past fall we added the ability to style the colors and icons of mapped data with a few clicks in the <a href="http://www.google.com/fusiontables">Fusion Tables web app</a>. This spring we made it easy to use HTML and customize what users see in the info window that appears after a click on the map. We’ve enjoyed seeing the impressive visualizations you have created.  Some, like the <a href="http://www.guardian.co.uk/news/datablog/2011/mar/31/deprivation-map-indices-multiple">Guardian’s map of deprivation in the UK</a>, were created strictly within the web app, while apps like the <a href="http://www.baycitizen.org/data/bike-accidents/">Bay Citizen’s Bike Accident tracker</a> and the <a href="http://www.texastribune.org/library/data/census-2010/">Texas Tribune’s Census 2010 interactive map</a> take advantage of the <a href="http://code.google.com/apis/fusiontables/">Fusion Tables SQL API</a> to do even more customization.  <br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.baycitizen.org/data/bike-accidents/"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 324px;" src="http://1.bp.blogspot.com/-iEUmqJmK0yw/TciJ2yRvhhI/AAAAAAAAAYM/YBTzhH9XyZ0/s400/Screen+shot+2011-05-08+at+6.23.00+PM.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5604881310564714002" /></a><br />Of course, it’s not always convenient to do everything through a web interface, and today we’re delighted to invite trusted testers to try out the new Fusion Tables Styling and Info Window API. Now developers will be able to set a table’s map colors and info windows with code. <br /><br />Even better, this new Styling and Info Window API will be part of the <a href="http://googlecode.blogspot.com/2010/11/introducing-google-apis-console-and-our.html">Google APIs Console</a>.  The Google APIs Console helps you manage projects and teams, provision access quotas, and view analytics and metrics on your API usage. It also offers sample code that supports the <a href="http://googlecode.blogspot.com/2011/03/making-auth-easier-oauth-20-for-google.html">OAuth 2.0</a> client key management flow you need to build secure apps for your users. <br /><br />So if you've been looking for a way to programmatically create highly-customizable map visualizations from data tables, check out our new APIs and let us know what you think! 
To become a trusted tester, please apply to join the <a href="https://groups.google.com/group/fusion-tables-api-trusted-testers">Google Group</a> and tell us a little bit about how you use the Fusion Tables API.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-8761725649509826373?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/make-beautiful-interactive-maps-even-faster-with-new-additions-to-the-fusion-tables-api/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Google at CHI 2011</title>
		<link>https://googledata.org/google-research/google-at-chi-2011/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-at-chi-2011</link>
		<comments>https://googledata.org/google-research/google-at-chi-2011/#comments</comments>
		<pubDate>Thu, 05 May 2011 14:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=6e2ede7d46e66569f6e3852067c3f80d</guid>
		<description><![CDATA[Posted by Yang Li, Research ScientistCross-posted with the Technical Programs and Events Blog   Google has an increasing presence at ACM CHI: Conference on Human Factors in Computing Systems, which is the premiere conference for Human Computer Interact...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Yang Li, Research Scientist</span><br /><br /><span style="font-style:italic;">Cross-posted with the <a href="http://googletechprograms.blogspot.com/2011/05/google-at-chi-2011.html">Technical Programs and Events Blog</a></span><br /><br /><div class="section">   Google has an increasing presence at <a href="http://chi2011.org/">ACM CHI: Conference on Human Factors in Computing Systems</a>, which is the premiere conference for Human Computer Interaction research. Eight Google papers will appear at the conference. These papers not only touch on our core areas such as Search, <a href="http://www.google.com/chrome/">Chrome</a> and <a href="http://www.android.com/">Android</a> but also demonstrate our growing effort in new areas where HCI is essential, such as new search user interfaces, <a href="http://gesturesearch.googlelabs.com/">gesture-based interfaces</a> and cross-device interaction. They showcase our efforts to address user experiences in diverse situations. Googlers are playing active roles in the conference in many other ways too: participating in conference committees, hosting panels, organizing workshops and teaching courses, as well as running demos and 1:1 sessions at Google's booth.<br />  </div><br /><div class="section">   This year's CHI takes place in Vancouver, BC, from May 7th - 12th.</div><br />  <div class="section_title"><b>PAPERS</b></div>   <div><a href="http://yangl.org/pdf/gestureavatar-chi2011.pdf">Gesture Avatar: A Technique for Operating Mobile User Interfaces Using Gestures</a>, by Hao Lü, Yang Li*</div><div>   <a href="http://yangl.org/pdf/motiongestures-chi2011.pdf"><br />  User-Defined Motion Gestures for Mobile Interaction</a> by Jaime Ruiz, Yang Li*, Edward Lank<br />  </div><br /><div>   <a href="http://yangl.org/pdf/gesturestudy-chi2011.pdf">Experimental Analysis of Touch-Screen Gesture Designs in Mobile Environments</a> by Andrew Bragdon, Eugene Nelson, Yang Li*, Ken Hinckley</div>   <div><br />  <a href="http://chi2011.org/program/program.html#paper2150">Many Bills: Engaging Citizens through Visualizations of Congressional Legislation</a>  by Yannick Assogba, Irene Ros, Joan DiMicco, Matt McKeon*</div>   <div><br />  <a href="http://social.cs.uiuc.edu/papers/pdfs/youPivot-CHI2011.pdf">YouPivot: Improving Recall with Contextual Search</a> by Joshua Hailpern, Nicholas Jitkoff*, Andrew Warr*, Karrie Karahalios, Robert Sesek, Nik Shkrob<br />  </div><br /><div>   <a href="http://www.cs.brown.edu/~sk/Publications/Papers/Published/eok-mit-rep-ac-err-fb/paper.pdf">Oops, I Did It Again: Mitigating Repeated Access Control Errors on Facebook</a> by Serge Egelman, Andrew Oates*, Shriram Krishnamurthi<br />  </div><br /><div>   <a href="http://yangl.org/pdf/deepshot-chi2011.pdf">Deep Shot: A Framework for Migrating Tasks Across Devices Using Mobile Phone Cameras</a> by Tsung-Hsiang Chang, Yang Li*<br />  </div><br /><div>   <a href="http://yangl.org/pdf/doubleflip-chi2011.pdf">DoubleFlip: A Motion Gesture Delimiter for Mobile Interaction</a> by Jaime Ruiz, Yang Li*<br />  </div><br /><div class="section_title"><b>   WORKSHOPS</b><br />  </div> <a href="http://crowdresearch.org/chi2011-workshop/">Crowdsourcing and Human Computation: Systems, Studies and Platforms</a> by Michael Bernstein, Ed H. Chi*, Lydia B. Chilton, Björn Hartmann, Aniket Kittur, Robert C. 
Miller<br /><br /><div class="section_title"><b>   PANELS</b></div><div>   <a href="http://chi2011.org/program/program.html#S1078">Designing for User Experience: Academia &amp; Industry</a> by Joseph 'Jofish' Kaye, Elizabeth Buie, Jettie Hoonhout, Kristina Höök, Virpi Roto, Scott Jenson*, Peter Wright<br />  </div><br /><a href="http://chi2011.org/program/program.html#S1114">Festschrift Panel in Honor of Stuart K. Card</a> by Ed H. Chi*, Peter Pirolli, Bonnie John, Judith S Olson, Dan Russell*, Tom Moran<br /><br /><a href="http://replichi.org/wp-content/uploads/2011/01/RepliCHI-panel_CR.pdf">CHI Should be Replicating and Validating Results More: Discuss</a> by Max L. Wilson, Wendy Mackay, Ed H. Chi*, Michael Bernstein, Dan Russell*, Harold Thimbleby<br /><br /><div>   <a href="http://chi2011.org/program/program.html#S1198">Transferability of Research Findings: Context-Dependent or Model-Driven</a> by Ed H. Chi*, Mary Czerwinski, David Millen, Dave Randall, Gunnar Stevens, Volker Wulf, John Zimmerman<br />  </div><br /><div>   <a href="http://chi2011.org/program/program.html#S1214">The Future of Child-Computer Interaction</a> by Allison Druin, Gary Knell, Elliot Soloway, Dan Russell*, Elizabeth Mynatt, Yvonne Rogers</div>   <div class="section_title"><br /><b>   CASE</b> <b>STUDIES</b></div><div>   <a href="http://www.google.com/research/pubs/pub37064.html">From Basecamp to Summit: Scaling Field Research Across 9 Locations</a> by Jens Riegelsberger*, Audrey Yang*, Konstantin Samoylov*, Elizabeth Nunge*, Molly Stevens*, Patrick Larvie*<br />  </div><br /><div class="section_title"><b>   COURSES</b></div><div>   <a href="http://research.microsoft.com/en-us/um/people/sdumais/Logs-talk-HCIC-2010.pdf">Design and Analysis of Large Scale Log Studies</a> by Susan Dumais, Robin Jeffries*, Dan Russell*, Diane Tang*, Jaime Teevan<br />  </div><br /><div class="section_title"><b>   SIG MEETING</b><br />  </div> <a href="http://www.cc.gatech.edu/~yardi/pubs/Yardi_CHI11_SIG.pdf">Participatory Culture in the Age of Social Media</a> by Dana Rotman, Sarah Vieweg, Sarita Yardi, Ed H. Chi*, Jenny Preece, Ben Shneiderman, Peter Pirolli, Tom Glaisyer<br /><br /><i>Note: * denotes a Googler</i><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-5820293800081400183?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/google-at-chi-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Partnering with Tsinghua University to support education in Western China</title>
		<link>https://googledata.org/google-research/partnering-with-tsinghua-university-to-support-education-in-western-china/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=partnering-with-tsinghua-university-to-support-education-in-western-china</link>
		<comments>https://googledata.org/google-research/partnering-with-tsinghua-university-to-support-education-in-western-china/#comments</comments>
		<pubDate>Thu, 14 Apr 2011 15:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=63a5058c0fa3d4350419dd31b4e87d9e</guid>
		<description><![CDATA[Posted by Aimin Zhu, China University RelationsWe’re excited to announce that we’ve teamed up with Tsinghua University to provide educational support to five major universities in Western China: Qinghai, Xinjiang, Guizhou, Ningxia, and Yunnan. Toge...]]></description>
				<content:encoded><![CDATA[<font class="byline-author">Posted by Aimin Zhu, China University Relations</font><br /><br />We’re excited to announce that we’ve teamed up with <a href="http://www.tsinghua.edu.cn/publish/th/index.html">Tsinghua University</a> to provide educational support to five major universities in Western China: <a href="http://www.qhu.edu.cn/">Qinghai</a>, <a href="http://www.xju.edu.cn/">Xinjiang</a>, <a href="http://www.gzu.edu.cn/">Guizhou</a>, <a href="http://www.nxu.edu.cn/english/index.htm">Ningxia</a>, and <a href="http://www.ynu.edu.cn/">Yunnan</a>. Together, we aim to:<br /><ul><li>Support faculty development by recognizing outstanding teachers, sponsoring published papers, and funding academic exchange and cooperation with other universities</li><li>Establish specialized curricula by creating new courses focused on advanced industrial and web technologies</li><li>Cultivate student talent by inspiring scientific and technological innovation through local activities and programs.</li></ul>A ceremony held at Tsinghua today kicked off what we expect to be a long and beneficial partnership to advance educational opportunities in the region.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-7249161153913544860?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/partnering-with-tsinghua-university-to-support-education-in-western-china/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>1 billion core-hours of computational capacity for researchers</title>
		<link>https://googledata.org/google-research/1-billion-core-hours-of-computational-capacity-for-researchers/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=1-billion-core-hours-of-computational-capacity-for-researchers</link>
		<comments>https://googledata.org/google-research/1-billion-core-hours-of-computational-capacity-for-researchers/#comments</comments>
		<pubDate>Thu, 07 Apr 2011 22:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=d2fd738ae3d58c801e526bdf39285518</guid>
		<description><![CDATA[Posted by Dan Belov, Principal Engineer and David Konerding, Software EngineerWe’re pleased to announce a new academic research grant program: Google Exacycle for Visiting Faculty. Through this program, we’ll award up to 10 qualified researchers wi...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Dan Belov, Principal Engineer and David Konerding, Software Engineer</span><br /><br />We’re pleased to announce a new academic research grant program: <a href="http://research.google.com/university/exacycle_program.html">Google Exacycle for Visiting Faculty</a>. Through this program, we’ll award up to 10 qualified researchers with at least 100 million core-hours each, for a total of 1 billion core-hours. The program is focused on large-scale, CPU-bound batch computations in research areas such as biomedicine, energy, finance, entertainment, and agriculture, amongst others. For example, projects developing large-scale genomic search and alignment, massively scaled Monte Carlo simulations, and sky survey image analysis could be an ideal fit.<br /><br />Exacycle for Visiting Faculty expands upon our current efforts through <a href="http://research.google.com/university/index.html">University Relations</a> to stimulate advances in science and engineering research, and awardees will participate through the <a href="http://research.google.com/university/relations/visiting_faculty.html">Visiting Faculty Program</a>.  We invite full-time faculty members from universities worldwide to apply. All grantees, including those outside of the U.S., will work on-site at specific Google offices in the U.S. or abroad. The exact Google office location will be determined at the time of project selection.<br /><br />We are excited to accept proposals starting today. The application deadline is 11:59 p.m. PST May 31, 2011.  Applicants are encouraged to send in their proposals early as awards will be granted starting in June.<br /><br />More information and details on how to apply for a Google Exacycle for Visiting Faculty grant can be found on the <a href="http://research.google.com/university/exacycle_program.html">Google Exacycle for Visiting Faculty website</a>.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-9183793493357473422?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/1-billion-core-hours-of-computational-capacity-for-researchers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Overlapping Experiment Infrastructure: More, Better, Faster Experimentation</title>
		<link>https://googledata.org/google-research/overlapping-experiment-infrastructure-more-better-faster-experimentation/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=overlapping-experiment-infrastructure-more-better-faster-experimentation</link>
		<comments>https://googledata.org/google-research/overlapping-experiment-infrastructure-more-better-faster-experimentation/#comments</comments>
		<pubDate>Tue, 05 Apr 2011 00:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=a7d404405ae9f3ca7f389861362dce28</guid>
		<description><![CDATA[Posted by Deirdre O'Brien and Diane Tang, Adwords TeamAt Google, experimentation is practically a mantra; we evaluate almost every change that potentially affects what our users experience. Such changes include not only obvious user-visible changes suc...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Deirdre O'Brien and Diane Tang, Adwords Team</span><br /><br />At Google, experimentation is practically a mantra; we evaluate almost every change that potentially affects what our users experience. Such changes include not only obvious user-visible changes such as modiﬁcations to a user interface, but also more subtle changes such as different machine learning algorithms that might affect ranking or content selection. Our insatiable appetite for experimentation has led us to tackle the problems of how to run more experiments, how to run experiments that produce better decisions, and how to run them faster.<br /><br />Google's infrastructure supports this vast experimentation by using orthogonal diversion criteria for experiments in different "layers" so that each event (e.g. a web search) can be assigned to multiple experiments. The treatment and population sample are easily specified in data files allowing for fast and accurate experiment set up. We have also developed analytical tools to do experiment sizing and a metrics dashboard which provides summarized data within hours of experiment set up. Decision making is improved by the consistency and accuracy in metrics assured by these tools. We believe that Google's experimental system and processes described in <a href="http://www.google.com/research/pubs/pub36500.html">this paper</a> can be generalized and applied by any entity interested in using experimentation to improve search engines and other web applications.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-1349033063210880743?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/overlapping-experiment-infrastructure-more-better-faster-experimentation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ig-pay Atin-lay Oice-vay Earch-say</title>
		<link>https://googledata.org/google-research/ig-pay-atin-lay-oice-vay-earch-say/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=ig-pay-atin-lay-oice-vay-earch-say</link>
		<comments>https://googledata.org/google-research/ig-pay-atin-lay-oice-vay-earch-say/#comments</comments>
		<pubDate>Fri, 01 Apr 2011 14:15:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=3dfb85e7a61a2068d131dcf8dab93749</guid>
		<description><![CDATA[Posted by Martin Jansche and Alex Salcianu, Google Speech TeamAs you might know, Google Voice Search is available in more than two dozen languages and dialects, making it easy to perform Google searches just by speaking into your phone.Today it is our ...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Martin Jansche and Alex Salcianu, Google Speech Team</span><br /><br />As you might know, <a href="http://www.google.com/mobile/voice-search/">Google Voice Search</a> is available in more than <a href="http://googleresearch.blogspot.com/2011/03/word-of-mouth-introducing-voice-search.html">two dozen languages and dialects</a>, making it easy to perform Google searches just by speaking into your phone.<br /><br />Today it is our pleasure to announce the launch of Pig Latin Voice Search!<br /><br />What is Pig Latin you may ask? <a href="http://en.wikipedia.org/wiki/Pig_Latin">Wikipedia describes it</a> as a language game where, for each English word, the first consonant (or consonant cluster) is moved to the end of the word and an “ay” is affixed (for example, “pig” yields “ig-pay” and “search” yields “earch-say”).<br /><br />Our Pig Latin Voice Search is even more fun than our other languages, because when you speak in Pig Latin, our speech recognizer not only recognizes your piggy speech but also translates it automagically to normal English and does a Google search.<br /><br /><iframe allowfullscreen="" frameborder="0" height="330" src="http://www.youtube.com/embed/Zg7Wayo1rLU" title="YouTube video player" width="540"></iframe><br /><br />To configure Pig Latin Voice Search in your Android phone just go to Settings, select “Voice input &amp; output settings”, and then “Voice recognizer settings”. In the list of languages you’ll see Pig Latin. Just select it and you are ready to roll in the mud!<br /><br />It also works on iPhone with the <a href="http://www.google.com/mobile/iphone/">Google Search app</a>. In the app, tap the Settings icon, then "Voice Search" and select Pig Latin.<br /><a href="http://market.android.com/search?q=pname:com.google.android.voicesearch" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img alt="" border="0" id="BLOGGER_PHOTO_ID_5590444973392737538" src="http://2.bp.blogspot.com/-_rpA8uUyhic/TZVAFXdxNQI/AAAAAAAAAWc/gxWgJcHKHiA/s400/qr-code-voice-search.png" style="cursor: hand; cursor: pointer; float: right; height: 230px; margin: 0 0 10px 10px; width: 230px;" /></a><br />Ave-hay un-fay ith-way Ig-pay Atin-lay.<br /><br /><br />Pig Latin Voice Search works on Android 2.2 (Froyo) and later Android versions. If you don't already have Google Voice Search on your Android phone, scan or tap this QR code to download it.<br /><br />The list of languages and dialects now supported by Google Voice Search includes:<br /><ul><li>US English, UK English, Australian English, Indian English, South African English</li><li>Spanish from Spain, Mexico, Argentina, and Latin America</li><li>French (France), Italian, and Portuguese (Brazil)</li><li>German (Germany) and Dutch</li><li>Russian, Polish, and Czech</li><li>Turkish</li><li>Japanese, Korean, Mandarin (Mainland China and Taiwan), and Cantonese</li><li>Bahasa Indonesia and Bahasa Malaysia</li><li>Afrikaans and isiZulu</li><li>Latin</li><li>Pig Latin</li></ul><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-7481575886778068116?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/ig-pay-atin-lay-oice-vay-earch-say/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Word of Mouth: Introducing Voice Search for Indonesian, Malaysian and Latin American Spanish</title>
		<link>https://googledata.org/google-research/word-of-mouth-introducing-voice-search-for-indonesian-malaysian-and-latin-american-spanish/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=word-of-mouth-introducing-voice-search-for-indonesian-malaysian-and-latin-american-spanish</link>
		<comments>https://googledata.org/google-research/word-of-mouth-introducing-voice-search-for-indonesian-malaysian-and-latin-american-spanish/#comments</comments>
		<pubDate>Wed, 30 Mar 2011 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=5c884f007b3cace8d38f50aa10a1a9ef</guid>
		<description><![CDATA[Posted by Linne Ha, International Program ManagerRead more about the launch of Voice Search in Latin American Spanish on the Google América Latina blog.Today we are excited to announce the launch of Voice Search in Indonesian, Malaysian, and Latin Ame...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Linne Ha, International Program Manager</span><br /><br />Read more about the launch of Voice Search in Latin American Spanish on the Google América Latina blog.<br /><br />Today we are excited to announce the launch of Voice Search in Indonesian, Malaysian, and Latin American Spanish, making Voice Search available in over two dozen languages and accents since our first launch in November 2008. This accomplishment would not have been possible without the help of local users in the region - really, we couldn’t have done it without them. Let me explain:<br /><br />In 2010 we launched Voice Search in Dutch, the first language where we used the “word of mouth” project, a crowd-sourcing effort to collect the most accurate voice data possible. The traditional method of acquiring voice samples is to license the data from companies that specialize in the distribution of speech and text databases. However, from day one we knew that to build the most accurate Voice Search acoustic models possible, the best data would come from the people who would use Voice Search once it launched - our users.<br /><br />Since then, in each country, we found small groups of people who were avid fans of Google products and were part of a large social network, either in local communities or online. We gave them phones and asked them to get voice samples from their friends and family. Everyone was required to sign a consent form and all voice samples were anonymized. When possible, they also helped to test early versions of Voice Search as the product got closer to launch.<br /><br />Building a speech recognizer is not just limited to localizing the user interface. We require thousands of hours of raw data to capture regional accents and idiomatic speech in all sorts of recording environments to mimic daily life use cases. For instance, when developing Voice Search for Latin American Spanish, we paid particular attention to Mexican and Argentinean Spanish. These two accents are more different from one another than any other pair of widely-used accents in all of South and Central America. Samples collected in these countries were very important bookends for building a version of Voice Search that would work across the whole of Latin America. We also chose key countries such as Peru, Chile, Costa Rica, Panama and Colombia to bridge the divergent accent varieties.<br /><br />As an International Program Manager at Google, I have been fortunate enough to travel around the world and meet many of our local Google users. They often have great suggestions for the products that they love, and word of mouth was created with the vision that our users could participate in developing the product. These Voice Search launches would not have been possible without the help of our users, and we’re excited to be able to work together on product development with the people who will ultimately use our products.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-8095909017213332862?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/word-of-mouth-introducing-voice-search-for-indonesian-malaysian-and-latin-american-spanish/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Reading tea leaves in the tourism industry: A Case Study in the Gulf Oil Spill</title>
		<link>https://googledata.org/google-research/reading-tea-leaves-in-the-tourism-industry-a-case-study-in-the-gulf-oil-spill/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=reading-tea-leaves-in-the-tourism-industry-a-case-study-in-the-gulf-oil-spill</link>
		<comments>https://googledata.org/google-research/reading-tea-leaves-in-the-tourism-industry-a-case-study-in-the-gulf-oil-spill/#comments</comments>
		<pubDate>Thu, 24 Mar 2011 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=d662a557beebde3f3b96d3548377af74</guid>
		<description><![CDATA[Posted by Hyunyoung Choi and Paul Liu, Senior EconomistsA few years ago, our in-house economists, Hal Varian and Hyunyoung Choi, demonstrated how to “predict the present” with monthly visitor arrivals to Hong Kong.  We took this idea further to see...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Hyunyoung Choi and Paul Liu, Senior Economists</span><br /><br />A few years ago, our in-house economists, Hal Varian and Hyunyoung Choi, demonstrated how to “<a href="http://googleresearch.blogspot.com/2009/04/predicting-present-with-google-trends.html">predict the present</a>” with monthly visitor arrivals to Hong Kong.  We took this idea further to see if search queries could predict the future. If users start to research their travel plans some weeks or months in advance, then intuitively shouldn’t we be able to extend "predicting the present" into "predicting the future?"  We decided to test it out by focusing on a region whose tourism was recently severely impacted: Florida’s gulf coast.<br /><br />With the travel industry still in the midst of recovering from a deep recession, the Gulf Oil spill had the potential to do significant economic damage. Our case study on the Gulf Oil spill helped find useful insight into people’s future travel plans to Florida; in fact, we found that travel search queries actually were good predictors for trips to Florida, and destinations within Florida, about 4 weeks later.<br /><br />The results we saw surprised us. <a href="http://www.google.com/insights/search/">Google Insights for Search</a> suggested that at least with respect to hotel bookings (using data from Smith Travel Research, Inc.), the aggregate effect of the oil spill was modest on Florida travel, since travelers tended to shift their destinations from the affected regions on the west coast to the east coast or central regions of Florida.  In particular, hotel bookings for affected areas along the Gulf coast were 4.25% less than predicted, and unaffected areas along the Atlantic coast were 4.89% greater than predicted.<br /><br />You can read the full case study <a href="http://www.google.com/url?q=http://www.google.com/googleblogs/pdfs/google_gulf_tourism_march2011.pdf">here</a> or <a href="http://www.google.com/insights/search/">try your own hand</a> at predicting the future!<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-4098509991295395138?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/reading-tea-leaves-in-the-tourism-industry-a-case-study-in-the-gulf-oil-spill/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Games, auctions and beyond</title>
		<link>https://googledata.org/google-research/games-auctions-and-beyond/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=games-auctions-and-beyond</link>
		<comments>https://googledata.org/google-research/games-auctions-and-beyond/#comments</comments>
		<pubDate>Wed, 16 Mar 2011 21:45:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=47352effd9161c6ecb9391a9ceb6d119</guid>
		<description><![CDATA[Posted by Yossi Matias, Senior Director, Head of Israel R&#38;D CenterIn an effort to advance the understanding of market algorithms and Internet economics, Google has launched an academic research initiative focused on the underlying aspects of online...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Yossi Matias, Senior Director, Head of Israel R&amp;D Center</span><br /><br />In an effort to advance the understanding of market algorithms and Internet economics, Google has launched an academic research initiative focused on the underlying aspects of online auctions, pricing, game-theoretic strategies, and information exchange. Twenty professors from three leading Israeli academic institutions - the <a href="http://www.huji.ac.il/huji/eng/">Hebrew University</a>, <a href="http://www.tau.ac.il/index-eng.html">Tel Aviv University</a> and the <a href="http://www1.technion.ac.il/en">Technion</a> - will receive a Google grant to conduct research for three years.<br /><br />In the past two decades, we have seen the Internet grow from a scientific network to an economic force that positively affects the global economy. E-commerce, online advertising, social networks and other new online business models present fascinating research questions and topics of study that can have a profound impact on society.<br /><br />Consider online advertising, which is based on principles from algorithmic game theory and online auctions.  The Internet has enabled advertising that is more segmented and measurable, making it more efficient than traditional advertising channels, such as newspaper classifieds, radio spots, and television commercials. These measurements have led to better pricing models, which are based on online real-time auctions. The original Internet auctions were designed by the industry, based on basic economic principles which have been known and appreciated for forty years.<br /><br />As the Internet grows, online advertising is becoming more sophisticated, with developments such as ad-exchanges, advertising agencies which specialize in online markets, and new analytic tools. Optimizing value for advertisers and publishers in this new environment may benefit from a better understanding of the strategies and dynamics behind online auctions, the main driving tool of Internet advertising.<br /><br />These grants will foster collaboration and interdisciplinary research by bringing together world renowned computer scientists, engineers, economists and game theorists to analyze complex online auctions and markets. Together, they will help bring this area of study into mainstream academic scientific research, ultimately advancing the field to the benefit of the industry at large.<br /><br />The professors who received research grants include:<br /><ul><li>Hebrew University: Danny Dolev, Jeffrey S. Rosenschein, Noam Nisan (<i>Computer Science and Engineering</i>); Liad Blumrosen, Alex Gershkov, Eyal Winter (<i>Economics</i>); Michal Feldman and Ilan Kremer (<i>Business</i>). The last six are also members of the <i>Center for the Study of Rationality</i>.</li><li>Tel Aviv University: Yossi Azar, Amos Fiat,  Haim Kaplan, and Yishay Mansour (<i>Computer Science</i>);  Zvika Neeman (<i>Economics</i>);  Ehud Lehrer and Eilon Solan (<i>Mathematics</i>); and Gal Oestreicher (<i>Business</i>). </li><li>Technion: Seffi Naor (<i>Computer Science</i>); Ron Lavi (<i>Industrial Engineering</i>); Shie Mannor and Ariel Orda (<i>Electrical Engineering</i>).</li></ul>In addition to providing the funds, Google will offer support by inviting the researchers to seminars, workshops, faculty summits and brainstorming events. 
The results of this research will be published for the benefit of the Internet industry as a whole, and will contribute to the evolving discipline of market algorithms.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-301139919925075951?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
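		<p>For readers new to the area, one classic mechanism from this literature is the sealed-bid second-price (Vickrey) auction, under which bidding one's true value is optimal. Here is a minimal illustrative Python sketch with a reserve price; it is a textbook toy, not a description of any production ad system:</p>
		<pre>
# Illustrative sealed-bid second-price (Vickrey) auction with a reserve price.
def second_price_auction(bids, reserve):
    """Return (winner, price), or None if no bid meets the reserve."""
    eligible = {name: bid for name, bid in bids.items() if bid >= reserve}
    if not eligible:
        return None
    winner = max(eligible, key=eligible.get)
    others = [bid for name, bid in eligible.items() if name != winner]
    price = max(others + [reserve])  # pay the runner-up bid, or the reserve
    return winner, price

print(second_price_auction({"alice": 2.50, "bob": 1.75, "carol": 0.90}, 1.00))
# ('alice', 1.75): the winner pays the second-highest eligible bid.
</pre>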
			<wfw:commentRss>https://googledata.org/google-research/games-auctions-and-beyond/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Large Scale Image Annotation: Learning to Rank with Joint Word-Image Embeddings</title>
		<link>https://googledata.org/google-research/large-scale-image-annotation-learning-to-rank-with-joint-word-image-embeddings/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=large-scale-image-annotation-learning-to-rank-with-joint-word-image-embeddings</link>
		<comments>https://googledata.org/google-research/large-scale-image-annotation-learning-to-rank-with-joint-word-image-embeddings/#comments</comments>
		<pubDate>Thu, 10 Mar 2011 16:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=db08f11486f6f7a5efb68f4cc3044867</guid>
		<description><![CDATA[Posted by Jason Weston and Samy Bengio, Research TeamIn our paper, we introduce a generic framework to find a joint representation of images and their labels, which can then be used for various tasks, including image ranking and image annotation.We foc...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Jason Weston and Samy Bengio, Research Team</span><br /><br />In our <a href="http://www.google.com/research/pubs/pub35780.html">paper</a>, we introduce a generic framework to find a joint representation of images and their labels, which can then be used for various tasks, including image ranking and image annotation.<br /><br />We focus on the task of automatic assignment of annotations (text labels) to images given only the pixel representation of the image (i.e., with no known metadata). This is achieved by a learning algorithm, that is, where the computer learns to predict annotations for new images given annotated training images. Such training datasets are becoming larger and larger, with tens of millions of images and tens of thousands of possible annotations. In this paper, we propose a strongly performing method that scales to such datasets by simultaneously learning to optimize precision at the top of the ranked list of annotations for a given image and learning a low-dimensional joint embedding vector space for both images and annotations. Our system learns an interpretable model, where annotations with alternate wordings ("president obama" or "barack"), different languages ("tour eiﬀel" or "eiffel tower"), or similar concepts (such as "toad" or "frog") are close in the embedding space. Hence, even when our model does not predict the exact annotation given by a human labeler, it often predicts similar annotations.<br /><br />Our system is trained on ~10 million images with ~100,000 possible annotation types and it annotates a single new image in ~0.17 seconds (not including feature processing) and consumes only 82MB of memory. Our method both outperforms all the methods we tested against and in comparison to them is faster and consumes less memory, making it possible to house such a system on a laptop or mobile device.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-7878514956378010370?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/large-scale-image-annotation-learning-to-rank-with-joint-word-image-embeddings/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Julia meets HTML 5</title>
		<link>https://googledata.org/google-research/julia-meets-html-5/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=julia-meets-html-5</link>
		<comments>https://googledata.org/google-research/julia-meets-html-5/#comments</comments>
		<pubDate>Mon, 31 Jan 2011 20:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Daniel Wolf, Software EngineerToday, we launched Julia Map on Google Labs, a fractal renderer in HTML 5. Julia sets are fractals that were studied by the French mathematician Gaston Julia in the early 1920s. Fifty years later, Benoît Mandelb...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Daniel Wolf, Software Engineer</span><br /><br />Today, we launched <a href="http://juliamap.googlelabs.com/">Julia Map</a> on Google Labs, a fractal renderer in HTML 5. <a href="http://en.wikipedia.org/wiki/Julia_set">Julia sets</a> are fractals that were studied by the French mathematician Gaston Julia in the early 1920s. Fifty years later, Benoît Mandelbrot studied the set z2 − c and popularized it by generating the first computer visualisation. Generating these images requires heavy computation resources. Modern browsers have optimized JavaScript execution up to the point where it is now possible to render in a browser fractals like Julia sets almost instantly.<br /><br />Julia Map uses the <a href="http://code.google.com/apis/maps/documentation/javascript/">Google Maps API</a> to zoom and pan into the fractals. The images are computed with <a href="http://www.whatwg.org/specs/web-apps/current-work/multipage/the-canvas-element.html">HTML 5 canvas</a>. Each image generally requires millions of floating point operations. <a href="http://www.whatwg.org/specs/web-workers/current-work/">Web workers</a> spread the heavy calculations on all cores of the machine.<br /><br />We hope you will enjoy exploring the different Julia sets, and share the URLs of the most artistic images you discovered. See what others have posted on Twitter under hashtag <a href="http://twitter.com/search?q=%23juliamap">#juliamap</a>. Click on the images below to dive in to infinity!<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.google.com/url?q=http://juliamap.googlelabs.com/%23ll=64.168107,-0.703125&z=0&p=ffffff,ffff00,ff,ff0000,ffff00&f=newton"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 179px;" src="http://4.bp.blogspot.com/_mol72XYv6Jk/TT4ZcLBTPNI/AAAAAAAAAUQ/3ylyUdSjhvk/s400/Julia4-scaled.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5565914161261788370" /></a><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.google.com/url?q=http://juliamap.googlelabs.com/%23ll=-34.813803,72.202148&z=4&p=ffffff,ffffff,ffffff,ffffff,ff0000,ffff00,ffff00,ff00,ff&f=julia3"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 181px;" src="http://3.bp.blogspot.com/_mol72XYv6Jk/TT4Zb5BLoYI/AAAAAAAAAUI/SZJLyUIbCVY/s400/Julia3-scaled.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5565914156429451650" /></a><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://juliamap.googlelabs.com/#ll=-3.640121,97.776783&amp;z=11&amp;p=fffc00,cecc00,6c7035,d619b9,ff21dd&amp;f=mandelbrot"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 229px;" src="http://2.bp.blogspot.com/_mol72XYv6Jk/TUcidyA9n8I/AAAAAAAAAUk/CsqLYHLZIQ4/s400/Elephant.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5568457359304269762" /></a><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.google.com/url?q=http://juliamap.googlelabs.com/%23ll=-62.296666,42.608414&z=11&p=7f7fe5,ffffff,ff00,663300&f=mandel321"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 179px;" src="http://3.bp.blogspot.com/_mol72XYv6Jk/TT4ZbYd50qI/AAAAAAAAAT4/GIwuSyrQ7dM/s400/Julia1-scaled.png" 
border="0" alt="" id="BLOGGER_PHOTO_ID_5565914147691549346" /></a><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-1613009550809054253?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/julia-meets-html-5/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Google at NIPS 2010</title>
		<link>https://googledata.org/google-research/google-at-nips-2010/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-at-nips-2010</link>
		<comments>https://googledata.org/google-research/google-at-nips-2010/#comments</comments>
		<pubDate>Thu, 27 Jan 2011 15:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Slav Petrov, Doug Aberdeen, and Lisa McCracken, Google ResearchThe machine learning community met in Vancouver in December for the 24th Neural Information Processing Systems Conference (NIPS). As always, the single-track program of the main c...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Slav Petrov, Doug Aberdeen, and Lisa McCracken, Google Research</span><br /><br />The machine learning community met in Vancouver in December for the 24th <a href="http://nips.cc/">Neural Information Processing Systems Conference (NIPS)</a>. As always, the single-track program of the main conference featured a number of outstanding talks, followed by interesting late night poster sessions. A record number of workshops covered a wide variety of topics, while allocating sufficient time for skiing in Whistler - after all, many of the most interesting research conversations happen while riding the lift in-between ski runs. This year’s conference also featured a symposium dedicated to <a href="http://samroweis1972-2010.blogspot.com/">Sam Roweis</a>, providing a retrospective on Sam’s life and work. Sam, a fellow Googler and professor at NYU, was at the heart of the NIPS community and is terribly missed.<br /><br />As always, Google was involved in various ways with NIPS. Here at Google, we take a data-driven approach when solving problems. Therefore, Machine Learning is in one way or another at the core of most of the things that we do. It is therefore unsurprising that many Googlers helped shape the program of the conference or were in the audience. This year, three Googlers served as area chairs and even more were reviewers. Googlers also co-authored the following papers:<br /><ul><li><a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/36578.pdf">Label Embedding Trees for Large Multi-Class Tasks</a> by Samy Bengio and Jason Weston</li><li><a href="http://www.cs.nyu.edu/~mohri/pub/importance.pdf">Learning Bounds for Importance Weighting</a> by Corinna Cortes, Yishay Mansour, and Mehryar Mohri</li><li><a href="http://books.nips.cc/papers/files/nips23/NIPS2010_1015.pdf">Online Learning in the Manifold of Low-Rank Matrices</a> by Uri Shalit, Daphna Weinshall, and Gal Chechik</li><li><a href="https://docs.google.com/a/google.com/viewer?url=http://books.nips.cc/papers/files/nips23/NIPS2010_0800.pdf">Deterministic Single–Pass Algorithm for LDA</a> by Issei Sato, Kenichi Kurihara, and Hiroshi Nakagawa</li><li><a href="https://docs.google.com/a/google.com/viewer?url=http://books.nips.cc/papers/files/nips23/NIPS2010_0423_slide.pdf">Distributed Dual Averaging In Networks</a> by John Duchi, Alekh Agarwal, and Martin Wainwright</li></ul><br />Additionally, Googlers co-organized three well attended workshops:<br /><ul><li><a href="http://learning.cis.upenn.edu/coarse2fine/">Coarse–to–Fine Learning and Inference</a> by Ben Taskar, David Weiss, Benjamin Sapp, and Slav Petrov</li><li><a href="http://nips.cc/Conferences/2010/Program/event.php?ID=1982">Low–rank Methods for Large–scale Machine Learning</a> by Arthur Gretton, Michael Mahoney, Mehryar Mohri, and Ameet Talwalkar</li><li><a href="http://lccc.eecs.berkeley.edu/">Learning on Cores, Clusters, and Clouds</a> by John Duchi, Ofer Dekel, John Langford, Lawrence Cayton, and Alekh Agarwal</li></ul><br />Finally, Yoram Singer gave a great talk on <a href="http://nips.cc/Conferences/2010/Program/event.php?ID=1966">Learning Structural Sparsity</a> at the Sam Roweis symposium and Googlers presented the following talks during the workshops:<br /><ul><li><a href="http://books.nips.cc/papers/files/nips23/NIPS2010_1015.pdf">Online Learning in the Manifold of Low–Rank Matrices</a> by Uri Shalit, Daphna Weinshall, and Gal 
Chechik</li><li><a href="http://lccc.eecs.berkeley.edu/Papers/singh-nipsws10-lccc.pdf">Distributed MAP Inference for Undirected Graphical Models</a> by Sameer Singh, Amar Subramanya, Fernando Pereira, and Andrew McCallum </li><li><a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/36948.pdf">MapReduce/Bigtable for Distributed Optimization</a> by Keith Hall, Scott Gilpin and Gideon Mann</li><li><a href="http://wimlworkshop.org/">Self-Pruning Prediction Trees</a> by Sally Goldman</li><li><a href="http://sites.google.com/site/mlngcvc/program">Web Scale Image Annotation: Learning to Rank with Joint Word-Image Embeddings</a> by Jason Weston, Samy Bengio, and Nicolas Usunier </li><li><a href="http://learning.cis.upenn.edu/coarse2fine/Main/Petrov">Coarse–to–fine Decoding for Parsing and Machine Translation</a> by Slav Petrov</li></ul><br />Overall, it was a very successful conference and it was good to be back in Vancouver one last time. This coming year <a href="http://nips.cc/Conferences/2011/">NIPS 2011</a> will be in Granada, Spain.  Hasta luego!<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-8051866178146583706?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/google-at-nips-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More Google Contributions to the Broader Scientific Community</title>
		<link>https://googledata.org/google-research/more-google-contributions-to-the-broader-scientific-community/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=more-google-contributions-to-the-broader-scientific-community</link>
		<comments>https://googledata.org/google-research/more-google-contributions-to-the-broader-scientific-community/#comments</comments>
		<pubDate>Wed, 26 Jan 2011 03:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Corinna Cortes and Alfred Spector, Google ResearchGooglers actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Corinna Cortes and Alfred Spector, Google Research</span><br /><br />Googlers actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and presentations, participating in ongoing technical debates, and much more. Our publications offer technical and algorithmic advances, demonstrate things we learn as we develop novel products and services, and shed light on some of the technical challenges we face at Google.<br /><br />In an effort to highlight some of our work, we periodically select a number of publications to be featured on this site.  We first posted a <a href="http://googleresearch.blogspot.com/2010/07/google-publications.html">set of papers</a> on this blog in mid-2010 and subsequently discussed them in more detail in the following blog postings.  This blog posting highlights a few new noteworthy papers authored or co-authored by Googlers from the later half of 2010. In the coming weeks we will be offering a more in-depth look at these publications, but here are some summaries:<br /><br /><span style="font-weight:bold;">Algorithms and Electronic Commerce</span><div><span></span><b><br /></b><a href="http://www.stanford.edu/~qiqiyan/papers/ec10b.pdf">Robust Mechanisms for Risk-Averse Sellers</a><br /><i>ACM Conference on Electronic Commerce (EC)</i><div><div>Mukund Sundararajan and  Qiqi Yan, Stanford University</div><div><br /></div><div>In his seminal Nobel prize-winning work, Roger Myerson identified the revenue-maximizing auction for a risk-neutral auctioneer. In contrast, this work identifies good mechanisms for risk-averse auctioneers. These mechanisms trade a little revenue for better certainty, in the best possible way. We expect this work will help guide reserve-price selection in auctions where auctioneers/sellers want better control over their revenue.<br /><br /><a href="http://www.google.com/search?lr=&amp;ie=UTF-8&amp;oe=UTF-8&amp;q=Monitoring+Algorithms+for+Negative+Feedback+Systems+Sandler+Muthukrishnan">Monitoring Algorithms for Negative Feedback Systems</a><br /><i>World Wide Web Conference (WWW)</i></div><div><div>Mark Sandler and S. Muthukrishnan</div><div><br />In  negative feedback systems, users report abusive content at a site to its owner for consideration or removal, but the users might not be honest.  For the site owners, this represents a trade-off between vetting such user reports by humans vs. accepting them without vetting. 
This paper presents a mathematical framework for design and analysis of such systems and presents algorithms with provably good trade-offs against malicious users.<br /><br /><span style="font-weight:bold;">HCI</span></div><div><span></span><b><br /></b><a href="http://research.google.com/pubs/pub36300.html">Children's Roles Using Keyword Search Interfaces in the Home</a><br /><i>Computer Human Interaction (CHI)</i></div><div><div>Allison Druin, University of Maryland, Elizabeth Foss, University of Maryland, Hilary Hutchinson, Evan Golub, University of Maryland, and Leshell Hatley, University of Maryland</div><div><br />In this paper, we describe seven search roles children display as information seekers using Internet keyword interfaces, based on a home study of 83 children ages 7, 9, and 11.<br /><br /><span style="font-weight:bold;">Machine Learning</span></div><div><span></span><b><br /></b><a href="http://www.google.com/research/pubs/pub35780.html">Large Scale Image Annotation: Learning to Rank with Joint Word-Image Embeddings</a><div><i>European Conference on Machine Learning (ECML) Best Paper</i></div><div>Jason Weston, Samy Bengio, and Nicolas Usunier, Université Paris 6 - LIP6</div><div><br />In this paper, we introduce a generic framework to find a joint representation of images and their labels, which can then be used for various tasks, including image ranking and image annotation. We simultaneously propose an efficient training algorithm that scales to tens of millions of images and hundreds of thousands of labels, while focusing training on making good predictions at the top of the ranked list. The models are both fast at prediction time and have low memory usage, making it possible to house such systems on a laptop or mobile device.<br /><br /><a href="http://www.google.com/research/pubs/pub36500.html">Overlapping Experiment Infrastructure: More, Better, Faster Experimentation</a></div><div><i>Knowledge Discovery and Data Mining (KDD)</i></div><div>Diane Tang, Ashish Agarwal, Deirdre O'Brien, and Mike Meyer</div><div><br /></div><div>Google's data-driven culture requires running a large number of live traffic experiments. This paper describes Google's overlapping experiment infrastructure, where a single event (e.g. a web search) can be assigned to multiple simultaneous large experiments. The infrastructure and supporting tools provide a framework that enables running experiments from design to decision making and launch, and can be generalized to many other web applications.<br /><br /><span style="font-weight:bold;">NLP</span></div><div><span></span><b><br /></b><a href="http://www.petrovi.de/data/naacl10.pdf">Products of Random Latent Variable Grammars</a></div><div><i>North American Chapter of the Association for Computational Linguistics (NAACL)</i></div><div>Slav Petrov</div><div><br />It is well known that the Expectation Maximization algorithm can converge to widely varying local maxima. This paper shows that this can be advantageous when learning latent variable grammars for syntactic parsing. 
By combining multiple state-of-the-art individual grammars into an unweighted product model, parsing accuracy can be improved from 90.2% to 91.8% for English, and from 80.3% to 84.5% for German.<br /><br /><span style="font-weight:bold;">Software Engineering</span></div><div><span></span><b><br /></b><a href="http://www.cs.virginia.edu/~jom5x/index.html%3Fp=641.html">Contention Aware Execution: Online Contention Detection and Response</a></div><div><i>International Symposium on Code Generation and Optimization (CGO)</i></div><div>Jason Mars, University of Virginia, Neil Vachharajani, Robert Hundt, Mary Lou Soffa, University of Virginia</div><div><br />This paper takes a big step forward in addressing an important and pressing problem in computing today: contention between applications sharing the same server hardware. This work presents a lightweight runtime solution that significantly improves the utilization of datacenter servers by up to 58% on average. This work also received the CGO 2010 Best Presentation Award.<br /><br /><span style="font-weight:bold;">Speech</span></div><div><span></span><b><br /></b><a href="http://www.google.com/research/pubs/pub36577.html">Say What? Why users choose to speak their web queries</a></div><div><i>Interspeech</i></div><div>Maryam Kamvar and Doug Beeferman</div><div><br />Say What? Have you been speaking your search queries into your mobile device rather than typing them? Spoken search is available on Android, iPhone and Blackberry devices, and we see an increasing number of searches coming in by voice on these phones. In our paper “Say What: Why users choose to speak their web queries”, we investigate, on an aggregate level, what factors are most predictive of spoken queries.  Understanding the context in which a speech-driven search is used (or not used) can help improve recognition engines and spoken interface design. So, save keystrokes and say your query!<br /><br /><a href="http://research.google.com/pubs/pub36732.html">Query Language Modeling for Voice Search</a></div><div><i>IEEE Workshop on Spoken Language Technology</i></div><div>Ciprian Chelba, Johan Schalkwyk, Thorsten Brants, Vida Ha, Boulos Harb, Will Neveitt, Carolina Parada*, Johns Hopkins University, and Peng Xu</div><div><br />The paper describes language modeling for google.com query data, and its application to speech recognition for Google Voice Search.<br />Our empirical findings include:<br /><ul><li>10% relative WER reductions from large scale modeling,</li><li>a lesser-known yet potentially quite detrimental interaction between Kneser-Ney smoothing and entropy pruning (approx. 10% relative increase in WER),</li><li>evidence that hints at non-stationarity of the query stream, and </li><li>surprisingly strong locale dependence across three English locales: USA, Britain and Australia.</li></ul><br /><span style="font-weight:bold;">Structured Data</span></div><div><span></span><b><br /></b><a href="http://research.google.com/pubs/pub36632.html">Dremel: Interactive Analysis of Web-Scale Datasets</a></div><div><i>Very Large Data Bases (VLDB)</i></div><div>Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, and Theo Vassilakis, Google Inc.</div><div><br />Dremel is a scalable, interactive ad-hoc query system. By combining multi-level execution trees and columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. 
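<br /><br />To give a rough feel for why a columnar layout helps (a toy sketch of our own with invented data, not Dremel's actual implementation), an aggregation over one field only has to scan that field's values rather than whole records:<br /><pre>
# Toy column store: fields are kept as separate arrays instead of
# row-by-row records. All data here is hypothetical.
columns = {
    "url":     ["a.com", "b.com", "c.com"],
    "latency": [120, 85, 240],       # ms
    "bytes":   [5120, 2048, 10240],
}

# An aggregate over "latency" touches one contiguous column and never
# reads the other fields, unlike a row-oriented scan.
avg_latency = sum(columns["latency"]) / len(columns["latency"])
print(avg_latency)
</pre>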
The system is widely used at Google and serves as the foundational technology behind BigQuery, a product launched in limited preview mode.<br /><br /><span style="font-weight:bold;">Systems and Infrastructure</span></div><div><span></span><b><br /></b><a href="http://www.google.com/research/pubs/pub36726.html">Large-scale Incremental Processing Using Distributed Transactions and Notifications</a></div><div><i>USENIX Symposium on Operating Systems Design and Implementation (OSDI)</i></div><div>Daniel Peng and Frank Dabek</div><div><br />In the past, Google accumulated a whole day’s worth of changes to the web and ran a series of enormous MapReduces to apply this batch of changes to our index of the web. This system led to a delay of several days between crawling a document and presenting it to users in search results. To meet our goal of reducing the indexing delay to minutes, we needed to update the index as each individual document was crawled, rather than in daily batches. No existing infrastructure supported this kind of incremental transformation at web scale, so we built Percolator: a framework for transforming a large repository using small ACID transactions.<br /><br /><a href="http://research.google.com/pubs/pub36737.html">Availability in Globally Distributed Storage Systems</a></div><div><i>USENIX Symposium on Operating Systems Design and Implementation (OSDI)</i></div><div>Daniel Ford, Francois Labelle, Florentina Popovici, Murray Stokely, Van-Anh Truong*, Columbia University, Luiz Barroso, Carrie Grimes, and Sean Quinlan</div><div><br />In our paper, we characterize the availability of cloud storage systems, based on extensive monitoring of Google's main storage infrastructure, and the sources of failure that affect availability. We also present statistical models for reasoning about the impact of design choices such as data placement, recovery speed, and replication strategies, including replication across multiple data centers.<br /><br /><span style="font-weight:bold;">Vision</span></div><div><span></span><b><br /></b><a href="http://www.google.com/research/pubs/pub36928.html">Improved Consistent Sampling, Weighted Minhash and L1 Sketching</a></div><div><i>IEEE International Conference on Data Mining (ICDM)</i></div><div>Sergey Ioffe</div><div><br />Given the huge amounts of very high-dimensional data, such as images and videos, we frequently need to "sketch" the data -- that is, represent it in a much more compact form, while still allowing us to accurately determine how different any two images or videos are. In this paper, we describe a sketching method for L1, one of the most common distance measures. It works by first hashing the data with a new algorithm, and then compressing each hash to a small number of bits, which is learned from data. This method is fast and allows the distances to be estimated accurately, while reducing the storage requirements by a factor of 100.<br /><br /><span class="Apple-style-span">*) work carried out while at Google</span></div></div></div></div></div></div><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-8381512424098004200?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/more-google-contributions-to-the-broader-scientific-community/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Robot hackathon connects with Android, browsers and the cloud</title>
		<link>https://googledata.org/google-research/robot-hackathon-connects-with-android-browsers-and-the-cloud/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=robot-hackathon-connects-with-android-browsers-and-the-cloud</link>
		<comments>https://googledata.org/google-research/robot-hackathon-connects-with-android-browsers-and-the-cloud/#comments</comments>
		<pubDate>Fri, 17 Dec 2010 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Ryan Hickman and Mamie Rheingold, 20% Robotics Task ForceWith a beer fridge stocked and music blasting, engineers from across Google—and the world—spent the month of October soldering and hacking in their 20% time to connect hobbyist and ...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Ryan Hickman and Mamie Rheingold, 20% Robotics Task Force</span><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mol72XYv6Jk/TQrpP1YvK1I/AAAAAAAAASg/ez8T1yDfKGE/s1600/IMG_2058.JPG"><img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 320px; height: 214px;" src="http://3.bp.blogspot.com/_mol72XYv6Jk/TQrpP1YvK1I/AAAAAAAAASg/ez8T1yDfKGE/s320/IMG_2058.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5551505948925700946" /></a>With a beer fridge stocked and music blasting, engineers from across Google—and the world—spent the month of October soldering and hacking in their 20% time to connect hobbyist and educational robots with Android phones.  Just two months later we’re psyched to announce three ways you can play with your iRobot Create(R), LEGO(R) MINDSTORMS(R) or VEX Pro(R) through the cloud:<br /><ul><li><a href="http://appinventor.googlelabs.com/about/">App Inventor for Android</a></li><li><a href="http://www.cellbots.com/software/java-app/">Cellbots for Android</a></li><li><a href="http://www.cellbots.com/software/python-library/">Python library</a> for the <a href="http://code.google.com/p/android-scripting/">Scripting Layer 4 Android</a></li></ul><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mol72XYv6Jk/TQrppfRMNFI/AAAAAAAAASo/T3JC8sZNE1g/s1600/IMG_2167.JPG"><img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 320px; height: 214px;" src="http://2.bp.blogspot.com/_mol72XYv6Jk/TQrppfRMNFI/AAAAAAAAASo/T3JC8sZNE1g/s320/IMG_2167.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5551506389665068114" /></a>For the month of October, we invited any Googler who wanted to contribute to connect robots to Google’s services in the cloud to pool their 20% time and participate in as much of the process as they could, from design to hard-core coding.<br /><br />Thanks to our hardware partners (<a href="http://www.irobot.com/create/">iRobot</a>, <a href="http://mindstorms.lego.com/">LEGO Group</a>, and <a href="http://www.vexrobotics.com/">VEX Robotics</a>), we never suffered a shortage of supplies. Designers flew in from London, and prototypes were passed between engineers in Tel-Aviv, Hyderabad, Zurich, Munich and California. In Mountain View, we gathered around every Thursday night, rigging up a projector against the wall to share our week’s worth of demos while chowing on pizza. 
And here is what we produced (so far!):<br /><ul><li><b>App Inventor:</b> Low-level <a href="http://appinventor.googlelabs.com/learn/reference/components/notready.html#BluetoothClient">Bluetooth support</a> for connecting with many serial-enabled robots, and of course <a href="http://appinventor.googlelabs.com/learn/reference/components/legomindstorms.html">tight integration with LEGO MINDSTORMS</a>.</li><li><b>Cellbots for Android:</b> Brand new <a href="http://www.cellbots.com/software/java-app/">Java app</a> from Cellbots.com, which is <a href="http://code.google.com/p/cellbots/source/browse/#svn/trunk/android/java">open source</a> and available for free in the Android Market.</li><li><b>Python library:</b> Modularized version of <a href="http://www.cellbots.com/software/python-library/">the popular Cellbots project</a>, which is all <a href="http://code.google.com/p/cellbots/source/browse/#svn/trunk/android/python">open source code</a>.</li></ul><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mol72XYv6Jk/TQrsuiwZPvI/AAAAAAAAATQ/IFTr7Rdio-Q/s1600/Profile+List.png"><img style="cursor:pointer; cursor:hand;width: 112px; height: 200px;" src="http://4.bp.blogspot.com/_mol72XYv6Jk/TQrsuiwZPvI/AAAAAAAAATQ/IFTr7Rdio-Q/s200/Profile+List.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5551509775035481842" /></a>   <a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mol72XYv6Jk/TQrsuOdUqfI/AAAAAAAAATI/E3IInRznN6g/s1600/Python+remote+menu.png"><img style="cursor:pointer; cursor:hand;width: 120px; height: 200px;" src="http://1.bp.blogspot.com/_mol72XYv6Jk/TQrsuOdUqfI/AAAAAAAAATI/E3IInRznN6g/s200/Python+remote+menu.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5551509769586780658" /></a>   <a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mol72XYv6Jk/TQrst_4DCnI/AAAAAAAAATA/D4gJS13OElM/s1600/AppInventorBluetooth.png"><img style="cursor:pointer; cursor:hand;width: 186px; height: 200px;" src="http://3.bp.blogspot.com/_mol72XYv6Jk/TQrst_4DCnI/AAAAAAAAATA/D4gJS13OElM/s200/AppInventorBluetooth.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5551509765672340082" /></a><br /><br />We hope these applications provide some fun and inspire you to build upon this lightweight connectivity between robots, Android, the cloud and your browser.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mol72XYv6Jk/TQrtQKbVyrI/AAAAAAAAATY/_9Rmqw1otGE/s1600/LEGO.JPG"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 307px; height: 400px;" src="http://2.bp.blogspot.com/_mol72XYv6Jk/TQrtQKbVyrI/AAAAAAAAATY/_9Rmqw1otGE/s400/LEGO.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5551510352620276402" /></a><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-5480635977004607229?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/robot-hackathon-connects-with-android-browsers-and-the-cloud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Letting everyone do great things with App Inventor</title>
		<link>https://googledata.org/google-research/letting-everyone-do-great-things-with-app-inventor/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=letting-everyone-do-great-things-with-app-inventor</link>
		<comments>https://googledata.org/google-research/letting-everyone-do-great-things-with-app-inventor/#comments</comments>
		<pubDate>Wed, 15 Dec 2010 18:45:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Karen Parker, App Inventor Program ManagerIn July, we announced App Inventor for Android, a Google Labs experiment that makes it easier for people to access the capabilities of their Android phone and create apps for their personal use.  We w...]]></description>
				<content:encoded><![CDATA[<font class="byline-author">Posted by Karen Parker, App Inventor Program Manager</font><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mol72XYv6Jk/TQjxk4g1ANI/AAAAAAAAASQ/Q2rnUDzq93Q/s1600/android_app_inventor-src+%25281%2529+%25281%2529.jpg"><img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 84px; height: 84px;" src="http://3.bp.blogspot.com/_mol72XYv6Jk/TQjxk4g1ANI/AAAAAAAAASQ/Q2rnUDzq93Q/s400/android_app_inventor-src+%25281%2529+%25281%2529.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5550952156681797842"></a>In July, we <a href="http://googleblog.blogspot.com/2010/07/app-inventor-for-android.html">announced</a> App Inventor for Android, a <a href="http://www.googlelabs.com/faq">Google Labs</a> experiment that makes it easier for people to access the capabilities of their Android phone and create apps for their personal use.  We were delighted (and honestly a bit overwhelmed!) by the interest that our announcement generated.  We were even more delighted to hear the stories of what you were doing with App Inventor.  All sorts of people (teachers and students, parents and kids, programming hobbyists and programming newbies) were building Android apps that perfectly fit their needs.<br /><br />For example, we’ve heard of people building <a href="http://dadhoc.com/2010/11/simple-sight-words-android-app/">vocabulary apps</a> for their children, SMS broadcasting apps for their community events, apps that <a href="https://sites.google.com/site/edricspage/final-project">track their favorite public transportation routes</a> and—our favorite—a <a href="http://www.jonq.com/jq/proposal/">marriage proposal app</a>.<br /><br />We are so impressed with the great things people have done with App Inventor,  we want to allow more people the opportunity to do great things. So we’re excited to announce that App Inventor (beta) is now available in <a href="http://www.googlelabs.com/faq">Labs</a> to anyone with a <a href="https://www.google.com/accounts/NewAccount">Google account</a>.<br /><br />Visit the <a href="http://appinventor.googlelabs.com/">App Inventor home page</a> to get set up and start building your first app.  And be sure to share your App Inventor story on the A<a href="http://appinventor.googlelabs.com/forum/">pp Inventor user forum</a>.  Maybe this holiday season you can make a new kind of homemade gift—an app perfectly designed for the recipient’s needs!<br /><br /><object width="640" height="385"><param name="movie" value="http://www.youtube.com/v/8ADwPLSFeY8?fs=1&amp;hl=en_US&amp;rel=0"><param name="allowFullScreen" value="true"><param name="allowscriptaccess" value="always"><embed src="http://www.youtube.com/v/8ADwPLSFeY8?fs=1&amp;hl=en_US&amp;rel=0" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="385"></object><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-6319278394435437570?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/letting-everyone-do-great-things-with-app-inventor/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>$6 million to faculty in Q4 Research Awards</title>
		<link>https://googledata.org/google-research/6-million-to-faculty-in-q4-research-awards/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=6-million-to-faculty-in-q4-research-awards</link>
		<comments>https://googledata.org/google-research/6-million-to-faculty-in-q4-research-awards/#comments</comments>
		<pubDate>Wed, 08 Dec 2010 22:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Maggie Johnson, Director of Education and University RelationsWe've just completed the latest round of Google Research Awards, our program which identifies and supports faculty pursuing research in areas of mutual interest. We had a record nu...]]></description>
				<content:encoded><![CDATA[<font class="byline-author">Posted by Maggie Johnson, Director of Education and University Relations</font><br /><br />We've just completed the latest round of Google Research Awards, our program which identifies and supports faculty pursuing research in areas of mutual interest. We had a record number of submissions this round, and are funding 112 awards across 20 different areas—for a total of more than $6 million. We’re also providing more than 150 Android devices for research and curriculum development to faculty whose projects rely heavily on Android hardware.<br /><br />The areas that received the highest level of funding, due to the large number of proposals in these areas, were systems and infrastructure, human computer interaction, security and multimedia. We also continue to support international research; in this round, 29 percent of the funding was awarded to universities outside the U.S.<br /><br />Some examples from this round of awards:<br /><ul><li><b>Injong Rhee, North Carolina State University</b>. Experimental Evaluation of Increasing TCP Initial Congestion Window (Systems)</li><li><b>James Jones, University of California, Irvine</b>. Bug Comprehension Techniques to Assist Software Debugging (Software Engineering)</li><li><b>Yonina Eldar, Technion, Israel</b>.  Semi-Supervised Regression with Auxiliary Knowledge (Machine Learning)</li><li><b>Victor Lavrenko, University of Edinburgh, United Kingdom</b>. Interactive Relevance Feedback for Mobile Search (Information Retrieval)</li><li><b>James Glass, MIT</b>. Crowdsourcing to Acquire Semantically Labelled Text and Speech Data for Speech Understanding (Speech)</li><li><b>Chi Keung Tang, The Hong Kong University of Science and Technology</b>. Quasi-Dense 3D Reconstruction from 2D Uncalibrated Photos (Geo/Maps)</li><li><b>Phil Blunsom, Oxford, United Kingdom</b>. Unsupervised Induction of Multi-Nonterminal Grammars for Statistical Machine Translation (Machine Translation)</li><li><b>Oren Etzioni, University of Washington</b>. Accessing the Web utilizing Android Phones, Dialogue, and Open Information Extraction (Mobile)</li><li><b>Matthew Salganik, Princeton</b>. Developments in Bottom-Up Social Data Collection (Social)</li></ul><br />The full list of this round’s award recipients can be found in this <a href="http://www.google.com/url?q=http://www.google.com/googleblogs/pdfs/google_research_awards_dec2010.pdf">PDF</a>. For more information on our research award program, visit our <a href="http://www.google.com/url?q=http://research.google.com/university/relations/research_awards.html">website</a>. And if you’re a faculty member, we welcome you to apply for one of next year’s two rounds. The deadline for the first round is February 1.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-5603455061012429213?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/6-million-to-faculty-in-q4-research-awards/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Four Googlers elected ACM Fellows this year</title>
		<link>https://googledata.org/google-research/four-googlers-elected-acm-fellows-this-year/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=four-googlers-elected-acm-fellows-this-year</link>
		<comments>https://googledata.org/google-research/four-googlers-elected-acm-fellows-this-year/#comments</comments>
		<pubDate>Wed, 08 Dec 2010 00:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Alfred Spector, VP of ResearchI am delighted to share with you that, like last year, the Association for Computing Machinery (ACM) has announced that four Googlers have been elected ACM Fellows in 2010, the most this year from any single corp...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Alfred Spector, VP of Research</span><br /><br />I am delighted to share with you that, <a href="http://googleresearch.blogspot.com/2009/12/four-googlers-elected-acm-fellows.html">like last year</a>, the Association for Computing Machinery (ACM) has <a href="http://www.acm.org/press-room/news-releases/2010/fellows-2010">announced</a> that four Googlers have been elected ACM Fellows in 2010, the most this year from any single corporation or institution.<br /><br /><a href="http://research.google.com/pubs/LuizBarroso.html">Luiz Barroso</a>, <a href="http://research.google.com/pubs/author35932.html">Dick Lyon</a>, <a href="http://research.google.com/pubs/author2318.html">Muthu Muthukrishnan</a> and <a href="http://research.google.com/pubs/author1092.html">Fernando Pereira</a> were chosen for their contributions to computing and computer science that have provided fundamental knowledge to the field and have generated multiple innovations.<br /><br />On behalf of Google, I congratulate our colleagues, who join the 10 other ACM Fellows and other professional society awardees at Google in exemplifying our extraordinarily talented people. I’ve been struck by the breadth and depth of their contributions, and I hope that they will serve as inspiration for students and computer scientists around the world.<br /><br />You can read more detailed summaries of their achievements below, including the official citations from ACM—although it’s really hard to capture everything they’ve accomplished in one paragraph!<br /><br /><b>Dr. Luiz Barroso: Distinguished Engineer</b><br /><i>For contributions to multi-core computing, warehouse scale data-center architectures, and energy proportional computing</i><br /><blockquote>Over the past decade, Luiz has played a leading role in the definition and implementation of Google’s cluster architecture which has become a blueprint for the computing systems behind the world’s leading Internet services. As the first manager of Google’s Platforms Engineering team, he helped deliver multiple generations of cluster systems, including the world’s first container-based data center.  His theoretical and engineering insights into the requirements of this class of machinery have influenced the processor industry roadmap towards more effective products for server-class computing. His book "The Datacenter as a Computer" (co-authored with Urs Hoelzle) was the first authoritative publication describing these so-called warehouse-scale computers for computer systems professionals and researchers.  Luiz was among the first computer scientists to recognize and articulate the importance of energy-related costs for large data centers, and identify energy proportionality as a key property of energy efficient data centers.  Prior to Google, at Digital Equipment's Western Research Lab, he worked on Piranha, a pioneering chip-multiprocessing architecture that inspired today’s popular multi-core products.  
As one of the lead architects and designers of Piranha, his papers, ideas and numerous presentations stimulated much of the research that led to products decades later.</blockquote><b>Richard Lyon: Research Scientist</b><br /><i>For contributions to machine perception and for the invention of the optical mouse</i><br /><blockquote>In the last four years at Google, Dick led the team developing new camera systems and improved photographic image processing for Street View, while leading another team developing technologies for machine hearing and their application to sound retrieval and ranking.  He is now writing a book with Cambridge University Press, and will teach a Stanford course this fall on "Human and Machine Hearing," returning to a line of work that he carried out at Xerox, Schlumberger, and Apple while also developing the optical mouse, bit-serial VLSI computing machines, and handwriting recognition.  The optical mouse (1980) is especially called out in the citation, because it exemplifies the field of "semi-digital" techniques that he developed, which also led to his work on the first single-chip Ethernet device.  And more recently, as chief scientist at Foveon, Dick invented and developed several new techniques for color image sensing and processing, and delivered acclaimed cameras and end-user software.  A hallmark of Dick’s work during his distinguished career has been a productive interplay between theory, including biological theory, and practical computing.</blockquote><b>Dr. S. Muthukrishnan: Research Scientist</b><br /><i>For contributions to efficient algorithms for string matching, data streams, and Internet ad auctions</i><br /><blockquote>Muthu has made significant contributions to the theory and practice of Internet ad systems during his more than four years at Google.  Muthu's breakthrough WWW’09 paper presented a general stable matching framework that produces a (desirable) truthful mechanism capturing all of the common variations and more, in contradiction to prevailing wisdom. In display ads, where image, video and other types of ads are shown as users browse, Muthu led Ad Exchange at Google to automate placement of display ads that were previously negotiated offline by sales teams.  Prior to Google, Muthu was well known for his pioneering work in the area of data stream algorithmics (including a definitive book on the subject), which led to theoretical and practical advances still in use today to monitor the health and smooth operation of the Internet.  Muthu has a talent for bringing new perspectives to longstanding open problems as exemplified in the work he did on string processing.  Muthu has made influential contributions to many other areas and problems including IP networks, data compression, scheduling, computational biology, distributed algorithms and database technology.   As an educator, Muthu’s avant-garde teaching style won him the Award for Excellence in Graduate Teaching at Rutgers CS, where he is on the faculty.  As a student remarked in his blog: "there is a magic in his class which kinda spellbinds you and it doesn't feel like a class. It’s more like a family sitting down for dinner to discuss some real world problems. It was always like that even when we were 40 people jammed in for cs-513." </blockquote><b>Dr. 
Fernando Pereira: Research Director</b><br /><i>For contributions to machine-learning models of natural language and biological sequences</i><br /><blockquote>For the past three years, Fernando has been leading some of Google’s most advanced natural language understanding efforts and some of the most important applications of machine learning technology. He has just the right mix of forward-thinking ideas and the ability to put ideas into practice.  With this balance, Fernando has helped his team of research scientists apply their ideas at the scale needed for Google.  From when he wrote the first Prolog compiler (for the PDP-10 with David Warren) to his days as Chair at the University of Pennsylvania, Fernando has demonstrated a unique understanding of the challenges and opportunities facing companies like Google, with their unprecedented access to massive data sets and their application to speech recognition, natural language processing and machine translation.  At SRI, he pioneered probabilistic language models at a time when logic-based models were more popular.  At AT&amp;T, his work on a toolkit for finite-state models became an industry standard, both as a useful piece of software and in setting the direction for building ever larger language models.  And his year at WhizBang had an influence on other leaders of the field, such as Andrew McCallum at the University of Massachusetts and John Lafferty and Tom Mitchell at Carnegie Mellon University, with whom Fernando developed the Conditional Random Field model for sequence processing that has become one of the leading tools of the trade.</blockquote><br />Finally, we also congratulate <a href="http://www.cs.cmu.edu/~christos/">Professor Christos Faloutsos</a> of Carnegie Mellon, who is on sabbatical and a <a href="http://www.google.com/research/university/relations/visiting_faculty.html">Visiting Faculty Member</a> at Google this academic year.  Professor Faloutsos is cited for contributions to data mining, indexing, fractals and power laws.<br /><br /><i><b>Update</b> 12/8</i>: Updated Dick Lyon's title and added information about Professor Faloutsos.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-4780925651881355452?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/four-googlers-elected-acm-fellows-this-year/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Launches Cantonese Voice Search in Hong Kong</title>
		<link>https://googledata.org/google-research/google-launches-cantonese-voice-search-in-hong-kong/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-launches-cantonese-voice-search-in-hong-kong</link>
		<comments>https://googledata.org/google-research/google-launches-cantonese-voice-search-in-hong-kong/#comments</comments>
		<pubDate>Thu, 02 Dec 2010 18:46:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Yun-hsuan Sung (宋雲軒) and Martin Jansche, Google ResearchOn November 30th 2010, Google launched Cantonese Voice Search in Hong Kong. Google Search by Voice has been available in a growing number of languages since we launched o...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Posted by Yun-hsuan Sung (宋雲軒) and Martin Jansche, Google Research</span><br /><br />On November 30th 2010, Google launched Cantonese Voice Search in Hong Kong. Google Search by Voice has been available in a growing number of languages since we launched our first US English system in 2008. In addition to US English, we already support Mandarin for Mainland China, Mandarin for Taiwan, Japanese, Korean, French, Italian, German, Spanish, Turkish, Russian, Czech, Polish, Brazilian Portuguese, Dutch, Afrikaans, and Zulu, along with special recognizers for English spoken with British, Indian, Australian, and South African accents.<br /><br />Cantonese is widely spoken in Hong Kong, where it is written using traditional Chinese characters, similar to those used in Taiwan. Chinese script is much harder to type than the Latin alphabet, especially on mobile devices with small or virtual keyboards. People in Hong Kong typically use either “<a href="http://en.wikipedia.org/wiki/Cangjie_input_method">Cangjie</a>” (<a href="http://zh.wikipedia.org/zh/%E5%80%89%E9%A0%A1%E8%BC%B8%E5%85%A5%E6%B3%95">倉頡</a>) or “Handwriting” (手寫輸入) input methods. Cangjie (倉頡) has a steep learning curve and requires users to break the Chinese characters down into sequences of graphical components. The Handwriting (手寫輸入) method is easier to learn, but slow to use. Neither is an ideal input method for people in Hong Kong trying to use Google Search on their mobile phones.<br /><br />Speaking is generally much faster and more natural than typing.  Moreover, some Chinese characters – like “滘” in “<a href="http://zh.wikipedia.org/zh-hk/%E6%BB%98%E8%A5%BF%E6%B4%B2">滘西州</a>” (<a href="http://en.wikipedia.org/wiki/Kau_Sai_Chau">Kau Sai Chau</a>) and “砵” in “<a href="http://zh.wikipedia.org/zh-hk/%E7%A0%B5%E5%85%B8%E4%B9%8D%E8%A1%97">砵典乍街</a>” (<a href="http://en.wikipedia.org/wiki/Pottinger_Street">Pottinger Street</a>) – are so rarely used that people often know only the pronunciation, and not how to write them. Our Cantonese Voice Search begins to address these situations by allowing Hong Kong users to speak queries instead of entering Chinese characters on mobile devices. We believe our development of Cantonese Voice Search is a step towards solving the text input challenge for devices with small or virtual keyboards for users in Hong Kong.<br /><br />There were several challenges in developing Cantonese Voice Search, some unique to Cantonese, some typical of Asian languages and some universal to all languages. Here are some examples of problems that stood out:<br /><ul><li><b>Data Collection</b>: In contrast to English, there are few existing Cantonese datasets that can be used to train a recognition system.  Building a recognition system requires both audio and text data so it can recognize both the sounds and the words. For audio data, our efficient <a href="http://www.google.com/research/pubs/pub36801.html">DataHound</a> collection technique uses smartphones to record and upload large numbers of audio samples from local Cantonese-speaking volunteers. For text data, we sample from anonymized search query logs from http://www.google.com.hk to obtain the large amounts of data needed to train language models.</li><li><b>Chinese Word Boundaries</b>: Chinese writing doesn’t use spaces to indicate word boundaries. 
To limit the size of the vocabulary for our speech recognizer and to simplify lexicon development, we use characters, rather than words, as the basic units in our system and allow multiple pronunciations for each character.</li><li><b>Mixing of Chinese Characters and English Words</b>: We found that Hong Kong users mix more English into their queries than users in Mainland China and Taiwan.  To build a lexicon for both Chinese characters and English words, we map English words to a sequence of Cantonese pronunciation units.</li><li><b>Tone Issues</b>: Linguists disagree on the number of tones in Cantonese – some say 6, some say 7, or 9, or 10.  In any case, it’s a lot. We decided to model tone-plus-vowel combinations as single units.  In order to limit the complexity of the resulting model, some rarely-used tone-vowel combinations are merged into single models.</li><li><b>Transliteration</b>: We found that some users use English words while others use the Cantonese transliteration (e.g., “<a href="http://en.wikipedia.org/wiki/Jordan,_Hong_Kong">Jordan</a>” vs. “<a href="http://zh-yue.wikipedia.org/wiki/%E4%BD%90%E6%95%A6">佐敦</a>”). This makes it challenging to develop and evaluate the system, since it’s often impossible for the recognizer to distinguish between an English word and its Cantonese transliteration. During development we use a metric that simply checks whether the correct search results are returned.</li><li><b>Different Accents and Noisy Environment</b>: People speak in different styles with different accents. They use our systems in a variety of environments, including offices, subways, and shopping malls. To make our system work in all these different conditions, we train it using data collected from many different volunteers in many different environments.</li></ul>Cantonese is Google’s third spoken language for Voice Search in the Chinese linguistic family, after Mandarin for Mainland China and Mandarin for Taiwan.  We plan to continue to use our data collection and language modeling technologies to help speakers of Chinese languages easily input text and look up information.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-980416476665192090?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/google-launches-cantonese-voice-search-in-hong-kong/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Voice Search in Underrepresented Languages</title>
		<link>https://googledata.org/google-research/voice-search-in-underrepresented-languages/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=voice-search-in-underrepresented-languages</link>
		<comments>https://googledata.org/google-research/voice-search-in-underrepresented-languages/#comments</comments>
		<pubDate>Tue, 09 Nov 2010 22:21:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Pedro J. Moreno, Staff Research Scientist and Johan Schalkwyk, Senior Staff EngineerWelkom*!Today we’re introducing Voice Search support for Zulu and Afrikaans, as well as South African-accented English. The addition of Zulu in particular r...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Pedro J. Moreno, Staff Research Scientist and Johan Schalkwyk, Senior Staff Engineer</span><br /><br />Welkom*!<br /><br />Today we’re introducing Voice Search support for Zulu and Afrikaans, as well as South African-accented English. The addition of Zulu in particular represents our first effort in building Voice Search for <i>underrepresented languages</i>.<br /><br />We define underrepresented languages as those which, while spoken by millions, have little presence in electronic and physical media, e.g., webpages, newspapers and magazines.  Underrepresented languages have also often received little attention from the speech research community.  Their phonetics, grammar, acoustics, etc., haven’t been extensively studied, making the development of ASR (automatic speech recognition) voice search systems challenging.<br /><br />We believe that the speech research community needs to start working on many of these underrepresented languages to advance progress and build speech recognition, translation and other Natural Language Processing (NLP) technologies. The development of NLP technologies in these languages is critical for enabling <a href="http://www.google.com/corporate/facts.html">information access for everybody</a>. Indeed, these technologies have the potential to break language barriers.<br /><br />We also think it’s important that researchers in these countries take a leading role in advancing the state of the art in their own languages. To this end, we’ve collaborated with the Multilingual Speech Technology group at <a href="http://www.nwu.ac.za/nwu/index.html">South Africa’s North-West University</a> led by Prof. Ettiene Barnard (also of the <a href="http://www.meraka.org.za/">Meraka Research Institute</a>), an authority in speech technology for South African languages.  Our development effort was spearheaded by Charl van Heerden, a South African intern and a student of Prof. Barnard. With the help of Prof. Barnard’s team, we collected acoustic data in the three languages, developed lexicons and grammars, and Charl and others used those to develop the three Voice Search systems. A team of language specialists traveled to several cities collecting audio samples from hundreds of speakers in multiple acoustic conditions such as street noise, background speech, etc. Speakers were asked to read typical search queries into an <a href="http://research.google.com/pubs/pub36801.html">Android app</a> specifically designed for audio data collection.<br /><br />For Zulu, we faced the additional challenge of few text sources on the web. We often analyze the search queries from local versions of Google to build our lexicons and language models. However, for Zulu there weren’t enough queries to build a useful language model. Furthermore, since it has few online data sources, native speakers have learned to use a mix of Zulu and English when searching for information on the web. So for our Zulu Voice Search product, we had to build a truly hybrid recognizer, allowing free mixture of both languages. Our phonetic inventory covers both English and Zulu and our grammars allow natural switching from Zulu to English, emulating speaker behavior.<br /><br />This is our first release of Voice Search in a native African language, and we hope that it won’t be the last. 
We’ll continue to work on technology for languages that have until now received little attention from the speech recognition community.<br /><br />Salani kahle!**<br /><br />* “Welcome” in Afrikaans<br />** “Stay well” in Zulu<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-3288065901288467706?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/voice-search-in-underrepresented-languages/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Suggesting a Better Remote Control</title>
		<link>https://googledata.org/google-research/suggesting-a-better-remote-control/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=suggesting-a-better-remote-control</link>
		<comments>https://googledata.org/google-research/suggesting-a-better-remote-control/#comments</comments>
		<pubDate>Thu, 04 Nov 2010 20:12:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Ullas Gargi and Rich Gossweiler, Research TeamIt seems clear that the TV is a growing source of online audio-video content that you select by searching. Entering characters of a search string one by one using a traditional remote control and ...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Ullas Gargi and Rich Gossweiler, Research Team</span><br /><br />It seems clear that the TV is a growing source of online audio-video content that you select by searching. Entering characters of a search string one by one using a traditional remote control and onscreen keyboard is extremely tiresome. People have been working on building better ways to search on the TV, ranging from small keyboards to voice input to interesting gestures you might make to let the TV know what you want. But currently the traditional left-right-up-down clicker dominates as the family room input device. To enter the letters of a show, you click over and over until you get to the desired letter on the on-screen keyboard and then you hit enter to select it. You repeat this mind-numbingly slow process until you type in your query string or at least enough letters that the system can put up a list of suggested completions. Can we instead use a Google AutoComplete style recommendation model and novel interface to make character entry less painful?<br /><br />We have developed an interaction model that reduces the distance to the predicted next letter without scrambling or moving letters on the underlying keyboard (which is annoying and increases the time it takes to find the next letter). We reuse the highlight ring around the currently selected letter and fill it with 4 possible characters that might be next, but we do not change the underlying keyboard layout. With 4 slots to suggest the next letter and a good prediction model trained on the target corpus, the next letter is often right where you are looking and just a click away.<br /><br />To learn more about this combination of User Experience and Machine Learning to address a growing problem with searching on TVs, check out our WWW 2010 publication,<a href="http://research.google.com/pubs/pub36239.html">QuickSuggest.</a><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-7297099746362425648?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/suggesting-a-better-remote-control/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Exploring Computational Thinking</title>
		<link>https://googledata.org/google-research/exploring-computational-thinking/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=exploring-computational-thinking</link>
		<comments>https://googledata.org/google-research/exploring-computational-thinking/#comments</comments>
		<pubDate>Mon, 25 Oct 2010 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Elaine Kao, Education Program ManagerOver the past year, a group of California-credentialed teachers along with our own Google engineers came together to discuss and explore ideas about how to incorporate computational thinking into the K-12 ...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Elaine Kao, Education Program Manager</span><br /><br />Over the past year, a group of California-credentialed teachers along with our own Google engineers came together to discuss and explore ideas about how to incorporate computational thinking into the K-12 curriculum to enhance student learning and build this critical 21st century skill in everyone.<br /><br />What exactly is computational thinking? Well, that would depend on who you ask as there are several existing <a href="http://www.google.com/edu/computational-thinking/resources.html#ct">resources</a> on the web that may define this term slightly differently. We define <a href="http://www.google.com/edu/computational-thinking/what-is-ct.html">computational thinking</a> (CT) as a set of skills that software engineers use to write the programs that underlay all of the computer applications you use every day. Specific CT techniques include:<br /><ul><li>Problem decomposition: the ability to break down a problem into sub-problems</li><li>Pattern recognition: the ability to notice similarities, differences, properties, or trends in data</li><li>Pattern generalization: the ability to extract out unnecessary details and generalize those that are necessary in order to define a concept or idea in general terms</li><li>Algorithm design: the ability to build a repeatable, step-by-step process to solve a particular problem</li></ul>Given the increasing prevalence of technology in our day-to-day lives and in most careers outside of computer science, we believe that it is important to raise this base level of understanding in everyone.<br /><br />To this end, we’d like to introduce you to a new resource: <a href="http://www.google.com/edu/computational-thinking/index.html">Exploring Computational Thinking</a>. Similar to some of our other initiatives in education, including <a href="http://cs4hs.com/">CS4HS</a> and <a href="http://code.google.com/edu/">Google Code University</a>, this program is committed to providing educators with access to our curriculum models, resources, and communities to help them learn more about CT, discuss it as a strategy for teaching and understanding core curriculum, as well as easily incorporate CT into their own curriculum, whether it be in math, science, language, history or beyond. The materials developed by the team reflect both the teachers’ expertise in pedagogy and K-12 curriculum as well as our engineers’ problem-solving techniques that are critical to our industry.<br /><br />Prior to launching this program, we reached out to several educators and classrooms and had them try our materials. 
Here’s some of the feedback we received:<br /><ul><li>CT as a strategy for teaching and student learning works well with many subjects, and can easily be incorporated to support the existing K-12 curriculum</li><li>Our models help to call out the specific CT techniques and provide more structure around the topics taught by educators, many of whom were already unknowingly applying CT in their classrooms</li><li>Including programming exercises in the classroom can significantly enrich a lesson by both challenging the advanced students and motivating the students who have fallen behind</li><li>Our examples provide educators with a means of re-teaching topics that students have struggled with in the past, without simply going through the same lesson that frustrated them before</li></ul>To learn more about our program or access CT curriculum materials and other resources, visit us at <a href="http://www.google.com/edu/computational-thinking/index.html">www.google.com/edu/ect</a>.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-4692792540142565485?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/exploring-computational-thinking/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google at the Conference on Empirical Methods in Natural Language Processing (EMNLP &#8217;10)</title>
		<link>https://googledata.org/google-research/google-at-the-conference-on-empirical-methods-in-natural-language-processing-emnlp-10/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-at-the-conference-on-empirical-methods-in-natural-language-processing-emnlp-10</link>
		<comments>https://googledata.org/google-research/google-at-the-conference-on-empirical-methods-in-natural-language-processing-emnlp-10/#comments</comments>
		<pubDate>Mon, 18 Oct 2010 21:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Slav Petrov, Research ScientistThe Conference on Empirical Methods in Natural Language Processing (EMNLP '10) was recently held at the MIT Stata Center in Massachusetts.  Natural Language Processing is at the core of many of the things that w...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Slav Petrov, Research Scientist</span><br /><br /><a href="http://www.lsi.upc.edu/events/emnlp2010/">The Conference on Empirical Methods in Natural Language Processing</a> (EMNLP '10) was recently held at the MIT Stata Center in Massachusetts.  Natural Language Processing is at the core of many of the things that we do here at Google. Googlers have therefore been traditionally part of this research community, participating as program committee members, paper authors and attendees.<br /><br />At this year's EMNLP conference Google Fellow, <a href="http://singhal.info/">Amit Singhal</a> gave an invited keynote talk on "<a href="http://www.lsi.upc.edu/events/emnlp2010/speakers.html">Challenges in running a commercial search engine</a>"  where he highlighted some of the exciting opportunities, as well as challenges, that Google is currently facing. Furthermore, Terry Koo (who recently joined Google), David Sontag (former Google PhD Fellowship recipient) and their collaborators from MIT received the Fred Jelinek Best Paper Award for their innovative work on syntactic parsing with the title "Dual Decomposition for Parsing with Non-Projective Head Automata".<br /><br />Here is a complete list of the papers presented by Googlers at the conference:<br /><ul><li><a href="http://www.aclweb.org/anthology/D/D10/D10-1125.pdf">Dual Decomposition for Parsing with Non-Projective Head Automata</a> (Fred Jelinek Best Paper Award) by Terry Koo, Alexander M. Rush, Michael Collins, Tommi Jaakkola, and David Sontag</li><li><a href="http://www.aclweb.org/anthology/D/D10/D10-1016.pdf">"Poetic" Statistical Machine Translation: Rhyme and Meter</a> (see also <a href="http://googleresearch.blogspot.com/2010/10/poetic-machine-translation.html">here</a>) by <a href="http://research.google.com/pubs/author14721.html">Dmitriy Genzel</a>, <a href="http://research.google.com/pubs/author37567.html">Jakob Uszkoreit</a>, and <a href="http://research.google.com/pubs/och.html">Franz Och</a> </li><li><a href="http://www.aclweb.org/anthology/D/D10/D10-1017.pdf">Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models</a> by <a href="http://research.google.com/pubs/author38552.html">Amarnag Subramanya</a>, <a href="http://research.google.com/pubs/author38945.html">Slav Petrov</a>, and <a href="http://research.google.com/pubs/author1092.html">Fernando Pereira</a></li><li><a href="http://www.aclweb.org/anthology/D/D10/D10-1069.pdf">Uptraining for Accurate Deterministic Question Parsing</a> by <a href="http://research.google.com/pubs/author38945.html">Slav Petrov</a>, Pi-Chuan Chang, <a href="http://research.google.com/pubs/author38358.html">Michael Ringgaard</a>, and <a href="http://research.google.com/pubs/author88.html">Hiyan Alshawi</a> </li><li><a href="http://www.aclweb.org/anthology/D/D10/D10-1002.pdf">Self-training with Products of Latent Variable Grammars</a> by Zhongqiang Huang, Mary Harper, and <a href="http://research.google.com/pubs/author38945.html">Slav Petrov</a></li></ul><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-2809016617578644981?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/google-at-the-conference-on-empirical-methods-in-natural-language-processing-emnlp-10/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Kuzman Ganchev Receives Presidential Award from the Republic of Bulgaria</title>
		<link>https://googledata.org/google-research/kuzman-ganchev-receives-presidential-award-from-the-republic-of-bulgaria/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=kuzman-ganchev-receives-presidential-award-from-the-republic-of-bulgaria</link>
		<comments>https://googledata.org/google-research/kuzman-ganchev-receives-presidential-award-from-the-republic-of-bulgaria/#comments</comments>
		<pubDate>Fri, 15 Oct 2010 13:45:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Slav Petrov, Research ScientistWe would like to congratulate Kuzman Ganchev for being the runner-up for the John Atanasoff award from the President of the Republic of Bulgaria. Kuzman recently joined our New York office as a research scientis...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Slav Petrov, Research Scientist</span><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mol72XYv6Jk/TLeBGz5GlvI/AAAAAAAAAP8/zkTyOtfI3iY/s1600/kuzman+pres1.jpg"><img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 400px; height: 290px;" src="http://3.bp.blogspot.com/_mol72XYv6Jk/TLeBGz5GlvI/AAAAAAAAAP8/zkTyOtfI3iY/s400/kuzman+pres1.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5528029021629486834" /></a>We would like to congratulate <a href="http://www.seas.upenn.edu/~kuzman/">Kuzman Ganchev</a> for being the runner-up for the <a href="http://www.president.bg/news.php?id=4017&amp;st=0">John Atanasoff award</a> from the President of the Republic of Bulgaria. Kuzman recently joined our New York office as a research scientist, after completing his doctoral studies at the University of Pennsylvania.<br /><br />The John Atanasoff award was established in 2003 and is given annually to a Bulgarian scientist under 35 for scientific or practical contributions to the development of computer and information technology worldwide that are of significant economic or social importance to Bulgaria. Kuzman received the award for his contributions to computational linguistics and machine learning. Kuzman is the co-author of more than 20 publications that have appeared at international conferences and in journals.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-4231255786923511142?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/kuzman-ganchev-receives-presidential-award-from-the-republic-of-bulgaria/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Korean Voice Input &#8212; Have you Dictated your E-Mails in Korean lately?</title>
		<link>https://googledata.org/google-research/korean-voice-input-have-you-dictated-your-e-mails-in-korean-lately/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=korean-voice-input-have-you-dictated-your-e-mails-in-korean-lately</link>
		<comments>https://googledata.org/google-research/korean-voice-input-have-you-dictated-your-e-mails-in-korean-lately/#comments</comments>
		<pubDate>Thu, 14 Oct 2010 16:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Mike Schuster &#38; Kaisuke Nakajima, Google ResearchGoogle Voice Search has been available in various flavors of English since 2008, in Mandarin and Japanese since 2009, in French, Italian, German and Spanish since June 2010 (see also in thi...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Mike Schuster &amp; Kaisuke Nakajima, Google Research</span><br /><br />Google Voice Search has been available in various flavors of <a href="http://googlemobile.blogspot.com/2008/11/google-mobile-app-for-iphone-now-with.html">English</a> since 2008, in <a href="http://googleresearch.blogspot.com/2009/11/google-search-by-voice-learns-mandarin.html">Mandarin</a> and <a href="http://googleresearch.blogspot.com/2009/12/teaching-computer-to-understand.html">Japanese</a> since 2009, in <a href="http://googlemobile.blogspot.com/2010/06/salut-willkommen-benvenuto-bienvenido.html">French, Italian, German and Spanish</a> since June 2010 (see also this <a href="http://googleresearch.blogspot.com/2010/06/google-search-by-voice-now-available-in.html">blog post</a>), and shortly after that in Taiwanese. On June 16th, 2010, we took the next step by launching our <a href="http://googlemobile.blogspot.com/2010/06/annyeong-haseyo-to-google-search-by.html">Korean Voice Search system</a>.<br /><br />Korean Voice Search, by focusing on finding the correct web page for a spoken query, has been quite successful since launch. We have improved the acoustic models several times, which has resulted in significantly higher accuracy and reduced latency, and we are committed to improving it even more over time.<br /><br />While voice search significantly simplifies input for search, especially for longer queries, there are numerous applications on any smartphone that could also benefit from general voice input, such as dictating an email or an SMS. Our experience with US English has taught us that voice input is as important as voice search, as the time savings from speaking rather than typing a message are substantial. Korean is the first non-English language where we are launching general voice input. This launch extends voice input to emails, SMS messages, and more on Korean Android phones. Now every text field on the phone will accept Korean speech input.<br /><br />Creating a general voice input service posed different requirements and technical challenges than voice search did. While voice search was optimized to give the user the correct web page, voice input was optimized to minimize (Hangul) character error rate. Voice inputs are usually longer than searches (short full sentences or parts of sentences), and the system had to be trained differently for this type of data. The current system’s language model was trained on millions of Korean sentences that are similar to those we expect to be spoken. In addition to the queries we used for training voice search, we also used parts of web pages, selected blogs, news articles and more. Because the system expects spoken data similar to what it was trained on, it will generally work well on normal spoken sentences, but may still have difficulty with random or rare word sequences -- we will work to keep improving on those.<br /><br />Korean voice input is part of Google’s long-term goal to make speech input an acceptable and useful form of input on any mobile device. As with voice search, our cloud computing infrastructure will help us to improve quality quickly, as we work to better support all noise conditions, all Korean dialects, and all Korean users.
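<br /><br />As a toy illustration of the training point above (a sketch included here for exposition, not Google's production system), the snippet below trains a tiny bigram language model on full sentences and uses it to score candidate transcriptions, preferring fluent word order. The example sentences (English stand-ins rather than Hangul), the add-one smoothing, and the vocabulary size are all assumptions:<br /><pre>
# Toy bigram language model: train on full sentences, then score candidate
# transcriptions. All data below is an illustrative stand-in.
from collections import defaultdict
import math

training_sentences = [
    "please send the report tomorrow morning",
    "i will be late for the meeting",
    "send me the meeting notes tomorrow",
]

bigram = defaultdict(lambda: defaultdict(int))
unigram = defaultdict(int)
for sentence in training_sentences:
    words = ["BOS"] + sentence.split()   # BOS marks the sentence start
    for prev, word in zip(words, words[1:]):
        bigram[prev][word] += 1
        unigram[prev] += 1

def log_prob(sentence, vocab_size=1000):
    """Add-one smoothed bigram log-probability of a candidate transcription."""
    words = ["BOS"] + sentence.split()
    total = 0.0
    for prev, word in zip(words, words[1:]):
        total += math.log((bigram[prev][word] + 1.0) /
                          (unigram[prev] + vocab_size))
    return total

# The acoustic model proposes alternatives; the language model prefers the
# fluent word order, because it looks like the sentences it was trained on.
print(log_prob("send the report tomorrow"))   # higher (less negative) score
print(log_prob("report the send tomorrow"))   # lower score: unlikely order
</pre><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-6584976533166381481?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>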
			<wfw:commentRss>https://googledata.org/google-research/korean-voice-input-have-you-dictated-your-e-mails-in-korean-lately/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Clustering Related Queries Based on User Intent</title>
		<link>https://googledata.org/google-research/clustering-related-queries-based-on-user-intent/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=clustering-related-queries-based-on-user-intent</link>
		<comments>https://googledata.org/google-research/clustering-related-queries-based-on-user-intent/#comments</comments>
		<pubDate>Wed, 13 Oct 2010 21:10:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Jayant Madhavan and Alon HalevyPeople today use search engines for all their information needs, but when they pose a particular search query, they typically have a specific underlying intent. However, when looking at any query in isolation, i...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Jayant Madhavan and Alon Halevy</span><br /><br />People today use search engines for all their information needs, but when they pose a particular search query, they typically have a specific underlying intent. However, when looking at any query in isolation, it might not be entirely clear what the underlying intent is. For example, when querying for <a href="http://www.google.com/search?q=mars">mars</a>, a user might be looking for more information about <a href="http://www.google.com/search?q=planet+mars">the planet Mars</a>, or the <a href="http://www.google.com/search?q=mercury+venus+earth+mars">planets in the solar system</a> in general, or the <a href="http://www.google.com/search?q=mars+candy">Mars candy bar</a>, or <a href="http://www.google.com/search?q=mars+god+of+war">Mars the Roman god of war</a>. The ambiguity in intent is most pronounced for queries that are inherently ambiguous and for queries about prominent entities about which there are many different types of information on the Internet. Given such ambiguity, modern search engines try to complement their search results with lists of related queries that can be used to further explore a particular intent.<br /><br />In a recent <a href="http://www.stanford.edu/~esadikov/www2010_paper.pdf">paper</a>, we explored the problem of clustering the related queries as a means of understanding the different intents underlying a given user query. We propose an approach that combines an analysis of anonymized document-click logs (what results do users click on) and query-session logs (what sequences of queries do users pose in a search session). We model typical user search behavior as a traversal of a graph whose nodes are related queries and clicked documents. We propose that the nodes in the graph, when grouped based on the probability of a typical user visiting them within a single search session, yield clusters that correspond to distinct user intents.<br /><br />Our results show that underlying intents (clusters of related queries) almost always correspond to well-understood, high-level concepts. For example, for mars, in addition to re-constructing each of the intents listed earlier, we also find distinct clusters grouping queries about <a href="http://www.google.com/search?q=mars+phoenix">NASA’s missions to the planet</a>, about specific interest in <a href="http://www.google.com/search?q=life+on+mars">life on Mars</a>, as well as a <a href="http://www.google.com/search?q=mars+manga">Japanese comic series</a>, and a <a href="http://www.google.com/search?q=mars+supermarket">grocery chain</a> named Mars. We found that our clustering approach yields better results than earlier approaches that used only document-click or only query-session information. More details about our proposed approach and an analysis of the resulting clusters can be found in our paper, which was presented at the <a href="http://www2010.org/www/">International World Wide Web Conference</a> earlier this year.
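<br /><br />For intuition, here is a minimal sketch of the random-walk idea. It is an illustration constructed for this post, not the algorithm from the paper; the toy click/session graph, the walk length, and the overlap threshold are all assumptions:<br /><pre>
# Illustrative random-walk grouping on a toy graph of related queries (for
# the ambiguous query "mars") and clicked documents.
from collections import defaultdict

edges = {
    "planet mars": {"doc:nasa.gov/mars": 5, "life on mars": 3},
    "life on mars": {"doc:nasa.gov/mars": 2},
    "mars candy": {"doc:mars.com": 6, "mars chocolate": 2},
    "mars chocolate": {"doc:mars.com": 3},
}

def neighbors(node):
    """Treat the click/session graph as undirected, with edge weights."""
    out = dict(edges.get(node, {}))
    for src, dsts in edges.items():
        if node in dsts:
            out[src] = out.get(src, 0) + dsts[node]
    return out

def walk_distribution(start, steps=3):
    """Where a typical user session ends up after a few random steps."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, mass in dist.items():
            nbrs = neighbors(node)
            total = float(sum(nbrs.values()))
            for nbr, weight in nbrs.items():
                nxt[nbr] += mass * weight / total
        dist = dict(nxt)
    return dist

def same_intent(q1, q2, overlap=0.2):
    """Group two queries when their walks land in similar places."""
    d1, d2 = walk_distribution(q1), walk_distribution(q2)
    shared = sum(min(d1.get(n, 0.0), d2.get(n, 0.0)) for n in set(d1) | set(d2))
    return shared >= overlap

print(same_intent("planet mars", "life on mars"))   # True: shared documents
print(same_intent("planet mars", "mars candy"))     # False: no shared clicks
</pre><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-8065684740198901350?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>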
			<wfw:commentRss>https://googledata.org/google-research/clustering-related-queries-based-on-user-intent/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google at USENIX Symposium on Operating Systems Design and Implementation (OSDI ‘10)</title>
		<link>https://googledata.org/google-research/google-at-usenix-symposium-on-operating-systems-design-and-implementation-osdi-%e2%80%9810/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-at-usenix-symposium-on-operating-systems-design-and-implementation-osdi-%25e2%2580%259810</link>
		<comments>https://googledata.org/google-research/google-at-usenix-symposium-on-operating-systems-design-and-implementation-osdi-%e2%80%9810/#comments</comments>
		<pubDate>Tue, 12 Oct 2010 18:01:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Murray Stokely, Software EngineerThe 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI ‘10) was recently held in Vancouver, B.C.  This biennial conference is one of the premiere forums for presenting innovative resea...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Murray Stokely, Software Engineer</span><br /><p>The <a href="http://usenix.com/events/osdi10/">9th USENIX Symposium on Operating Systems Design and Implementation</a> (OSDI ‘10) was recently held in Vancouver, B.C.  This biennial conference is one of the premier forums for presenting innovative research in distributed systems from both academia and industry, and we were glad to be a part of it.</p><p>In addition to sponsoring this conference since 2002, Googlers contributed to the exchange of scientific ideas through authoring or co-authoring three published papers, organizing workshops, and serving on the program committee.  A short summary of the contributions:</p><ul><li><a href="http://research.google.com/pubs/pub36726.html">Large-scale Incremental Processing Using Distributed Transactions and Notifications</a>.<br />Google replaced its batch-oriented indexing system with an incremental system, Percolator. Rather than running a series of high-latency map-reduces over large batches of documents, we now <a href="http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html">index individual documents at very low latency</a>. The result is a 50% reduction in search result age; our paper discusses this project and the implications of the result.</li><li><a href="http://research.google.com/pubs/pub36737.html">Availability in Globally Distributed Storage Systems</a>.<br />Reliable and efficient storage systems are a key component of cloud-based services. In this paper we characterize the availability properties of cloud storage systems based on extensive monitoring of Google's main storage infrastructure and present statistical models that enable further insight into the impact of multiple design choices, such as data placement and replication strategies. We demonstrate the utility of these models by computing data availability under a variety of replication schemes given the real patterns of failures observed in our fleet.<br /></li><li><a href="http://www.icsi.berkeley.edu/cgi-bin/pubs/publication.pl?ID=002971">Onix: A Distributed Control Platform for Large-scale Production Networks</a>.<br />There has been recent interest in a new networking paradigm called Software-Defined Networking (SDN). The crucial enabler for SDN is a distributed control platform that shields developers from the details of the underlying physical infrastructure and allows them to write sophisticated control logic against a high-level API. Onix provides such a control platform for large-scale production networks.<br /></li></ul><p>In addition to the papers presented by current Googlers, we were also happy to see that the recipient of the <a href="http://googleblog.blogspot.com/2009/05/best-and-brightest.html">2009 Google Ph.D. Fellowship in Cloud Computing</a>, <a href="http://www.cs.washington.edu/homes/roxana/">Roxana Geambasu</a>, presented her work on <a href="http://www.cs.washington.edu/homes/roxana/acads/projects/vanish/osdi2010comet.pdf">Comet: An active distributed key-value store</a>.</p><p>Videos of all of the talks from OSDI are available on the <a href="http://www.usenix.org/events/osdi10/tech/">conference website</a> for attendees and current USENIX members.
There is also a <a href="http://www.youtube.com/user/USENIXAssociation">USENIX YouTube channel</a> with a growing subset of the conference videos open to everyone.</p><p>Google is making substantial progress on many of the grand challenge problems in computer science and artificial intelligence as part of its mission to organize the world's information and make it useful.  Given the continuing increase in the scale of our distributed systems, it’s fair to say we’ll have some exciting new work to share at the next OSDI.  Hope to see you in 2012.</p><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-1735983275886289638?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/google-at-usenix-symposium-on-operating-systems-design-and-implementation-osdi-%e2%80%9810/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Making an Impact on a Thriving Speech Research Community</title>
		<link>https://googledata.org/google-research/making-an-impact-on-a-thriving-speech-research-community/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=making-an-impact-on-a-thriving-speech-research-community</link>
		<comments>https://googledata.org/google-research/making-an-impact-on-a-thriving-speech-research-community/#comments</comments>
		<pubDate>Mon, 11 Oct 2010 23:04:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Vincent Vanhoucke, Google ResearchWhile we continue to launch exciting new speech products--most recently Voice Actions and Google Search by Voice in Russian, Czech and Polish--we also strive to contribute to the academic research community b...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Vincent Vanhoucke, Google Research</span><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mol72XYv6Jk/TLObmixIKlI/AAAAAAAAAP0/uZoeiLx8lgw/s1600/thad.JPG"><img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 400px; height: 400px;" src="http://4.bp.blogspot.com/_mol72XYv6Jk/TLObmixIKlI/AAAAAAAAAP0/uZoeiLx8lgw/s400/thad.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5526932254183926354" /></a>While we continue to launch exciting new speech products--most recently <a href="http://www.google.com/mobile/voice-actions/">Voice Actions</a> and <a href="http://www.google.com/mobile/">Google Search by Voice</a> in Russian, Czech and Polish--we also strive to contribute to the academic research community by sharing both innovative techniques and experiences with large-scale systems.<br /><br />This year’s gathering of the world’s experts in speech technology research, <a href="http://www.interspeech2010.org/">Interspeech 2010</a> in Makuhari, Japan, which Google co-sponsored, was a fantastic demonstration of the momentum of this community, driven by new challenges such as mobile voice communication, voice search, and the increasing international reach of speech technologies.<br /><br />Googlers published papers that showcased the breadth and depth of our speech recognition research.  Our work addresses both fundamental problems in acoustic and language modeling and the practical issues of building scalable speech interfaces that real people use every day to make their lives easier.<br /><br />Here is a list of the papers presented by Googlers at the conference:<br /><ul><li><a href="http://research.google.com/pubs/pub36834.html">Direct Construction of Compact Context-Dependency Transducers From Data</a>, David Rybach and <a href="http://research.google.com/pubs/author125.html">Michael Riley</a> (Computer Speech &amp; Language Best Paper Award).</li><li><a href="http://research.google.com/pubs/pub36833.html">Voice Search for Development</a>, Etienne Barnard, Johan Schalkwyk, Charl van Heerden and <a href="http://research.google.com/pubs/author5289.html">Pedro J. Moreno</a>.</li><li><a href="http://research.google.com/pubs/pub36487.html">Unsupervised Discovery and Training of Maximally Dissimilar Cluster Models</a>, <a href="http://research.google.com/pubs/author21120.html">Françoise Beaufays</a>, <a href="http://research.google.com/pubs/author37534.html">Vincent Vanhoucke</a> and Brian Strope.</li><li><a href="http://research.google.com/pubs/pub36463.html">Search by Voice in Mandarin Chinese</a>, Jiulong Shan, Genqing Wu, Zhihong Hu, Xiliu Tang, <a href="http://research.google.com/pubs/author35845.html">Martin Jansche</a> and <a href="http://research.google.com/pubs/author5289.html">Pedro J. Moreno</a>.</li><li><a href="http://research.google.com/pubs/pub36756.html">On-Demand Language Model Interpolation for Mobile Speech Input</a>, Brandon Ballinger, <a href="http://research.google.com/pubs/author130.html">Cyril Allauzen</a>, <a href="http://research.google.com/pubs/author39407.html">Alexander Gruenstein</a>, and Johan Schalkwyk.</li><li><a href="http://research.google.com/pubs/pub36801.html">Building Transcribed Speech Corpora Quickly and Cheaply for Many Languages</a>, <a href="http://research.google.com/pubs/author37811.html">Thad Hughes</a>, Kaisuke Nakajima, Linne Ha, Atul Vasu, Pedro J.
Moreno and Mike LeBeau.</li><li><a href="http://research.google.com/pubs/pub36577.html">Say What? Why Users Choose to Speak their Web Queries</a>, <a href="http://research.google.com/pubs/author34.html">Maryam Kamvar</a> and Doug Beeferman.</li><li><a href="http://research.google.com/pubs/pub36472.html">Study on Interaction between Entropy Pruning and Kneser-Ney Smoothing</a>, <a href="http://research.google.com/pubs/author6342.html">Ciprian Chelba</a>, Thorsten Brants, Will Neveitt and Peng Xu.</li><li><a href="http://research.google.com/pubs/pub36828.html">Decision Tree State Clustering with Word and Syllable Features</a>, <a href="http://research.google.com/pubs/author37501.html">Hank Liao</a>, Chris Alberti, Michiel Bacchiani and Olivier Siohan.</li></ul><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-3666648602928390231?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/making-an-impact-on-a-thriving-speech-research-community/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bowls and Learning</title>
		<link>https://googledata.org/google-research/bowls-and-learning/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=bowls-and-learning</link>
		<comments>https://googledata.org/google-research/bowls-and-learning/#comments</comments>
		<pubDate>Thu, 07 Oct 2010 22:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Phil Long, Research TeamIt is easy to find the bottom of a bowl no matter where you start -- if you toss a marble anywhere into the bowl, it will roll downhill and find its way to the bottom.What does this have to do with Machine Learning?  A...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Phil Long, Research Team</span><br /><br />It is easy to find the bottom of a bowl no matter where you start -- if you toss a marble anywhere into the bowl, it will roll downhill and find its way to the bottom.<br /><br />What does this have to do with Machine Learning?  A natural way to try to construct an accurate classifier is to minimize the number of prediction errors the classifier makes on training data. The trouble is, even for moderate-sized data sets, minimizing the number of training errors is a computationally intractable problem.  A popular way around this is to assign different training errors different costs and to minimize the total cost.  If the costs are assigned in a certain way (according to a “<a href="http://www.stat.rutgers.edu/home/tzhang/papers/aos04_consistency.pdf">convex loss function</a>”), the total cost can be efficiently minimized the way a marble rolls to the bottom of a bowl.  <br /><br />In a recent <a href="http://www.phillong.info/publications/LS10_potential.pdf">paper</a>, Rocco Servedio and I show that no algorithm that works this way can achieve a simple and natural theoretical noise-tolerance guarantee that can be achieved by other kinds of algorithms.  A result like this is interesting for two reasons. First, it's important to understand what you cannot do with convex optimization in order to get a fuller understanding of what you can do with it. Second, this result may spur more research into noise-tolerant training algorithms using alternative approaches.
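<br /><br />For readers who want to watch the marble roll, here is a minimal sketch (our illustration for this post, not a construction from the paper) of the convex approach: gradient descent on the logistic loss, a convex stand-in for counting errors. The tiny dataset, step size, and iteration count are arbitrary assumptions:<br /><pre>
# Gradient descent on the logistic loss, a convex surrogate for the
# (intractable to minimize) count of training errors.
import math

# (feature vector, label) pairs; labels are +1 or -1.
data = [((1.0, 1.0), 1), ((2.0, 0.5), 1), ((-1.0, -1.5), -1), ((-0.5, -2.0), -1)]
w = [0.0, 0.0]   # linear classifier: predict sign(w . x)

for step in range(200):   # each step, the marble rolls a little downhill
    grad = [0.0, 0.0]
    for x, y in data:
        margin = y * (w[0] * x[0] + w[1] * x[1])
        # derivative of log(1 + exp(-margin)) with respect to the margin,
        # chained through margin = y * (w . x)
        g = -y / (1.0 + math.exp(margin))
        grad[0] += g * x[0]
        grad[1] += g * x[1]
    w = [w[0] - 0.1 * grad[0], w[1] - 0.1 * grad[1]]

print(w)   # a separating direction, found by convex minimization
</pre><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-807934366799730370?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>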
			<wfw:commentRss>https://googledata.org/google-research/bowls-and-learning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Remembering Fred Jelinek</title>
		<link>https://googledata.org/google-research/remembering-fred-jelinek/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=remembering-fred-jelinek</link>
		<comments>https://googledata.org/google-research/remembering-fred-jelinek/#comments</comments>
		<pubDate>Sat, 18 Sep 2010 01:01:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Ciprian Chelba, Research TeamIt is with great sadness that we note the passing of Fred Jelinek, teacher and colleague to many of us here at Google. His seminal contributions to statistical modeling of speech and language influenced not only u...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Ciprian Chelba, Research Team</span><br /><br />It is with great sadness that we note the passing of Fred Jelinek, teacher and colleague to many of us here at Google. His seminal contributions to statistical modeling of speech and language influenced not only us, but many more members of the research community.<br /><br />Several of us at Google remember Fred:<br /><br /><span style="font-style:italic;">Ciprian Chelba</span>:<br />Fred was my thesis advisor at CLSP. My ten years of work in the field after graduation led me to increasingly appreciate the values that Fred instilled by personal example: work on the hard problem because it simply cannot be avoided, bring fundamental and original contributions that steer clear of incrementalism, exercise your creativity despite the risks entailed, and pursue your ideas with determination.<br /><br />I recently heard a comment from a colleague,  “A natural born leader is someone you follow even if only out of curiosity.” I immediately thought of Fred. Working with him marked a turning point in my life, and his influential role will be remembered.<br /><br /><span style="font-style:italic;">Bob Moore</span>:<br />I first met Fred Jelinek in 1984 at an IBM-sponsored workshop on natural-language processing. Fred's talk was my first exposure to the application of statistical ideas to language, and about the only thing I understood was the basic idea of N-gram language modeling: estimate the probability of the next word in a sequence based on a small fixed number of immediately preceding words. At the time, I was so steeped in the tradition of linguistically-based formal grammars that I was sure Fred's approach could not possibly be useful.<br /><br />Starting about five years later, however, I began to interact with Fred often at speech and language technology meetings organized by DARPA, as well as events affiliated with the Association for Computational Linguistics. Gradually, I (along with much of the computational linguistics community) began to understand and appreciate the statistical approach to language technology that Fred and his colleagues were developing, to the point that it now dominates the field of computational linguistics, including my own research. The importance of Fred's technical contributions and visionary leadership in bringing about this revolution in language technology cannot be overstated. The field is greatly diminished by his passing.<br /><br /><span style="font-style:italic;">Fernando Pereira</span>:<br />I met Fred first at a DARPA-organized workshop where one of the main topics was how to put natural language processing research on a more empirical, data-driven path. Fred was leading the charge for the move, drawing from his successes in speech recognition. Although I had already started exploring those ideas, I was not fully convinced by Fred’s vision. Nevertheless, Fred’s program raised many interesting research questions, and I could not resist some of them. Working on search for speech recognition at AT&amp;T, I was part of the  small team that invented the finite-state transducer representation of recognition models. I gave what I think was the first public talk on the approach at a workshop session that Fred chaired. It was Fred’s turn to be skeptical, and we had a spirited exchange in the discussion period. 
At the time, I was disappointed that I had failed to interest Fred in the work, but later I was delighted when Fred became a strong supporter of our work after a JHU summer workshop where Michael Riley led the use of our software tools in successful experiments with a team of JHU researchers and students. Indeed, in hindsight, Fred was right to be skeptical before we had empirical validation for the approach, and his strong support when the results started coming in was thus much more meaningful and gratifying. Through these experiences and much more, I came to immensely respect Fred’s pioneering spirit, vision, and sharp mind. Many of my most successful projects benefited directly or indirectly from his ideas, his criticism, and his building of thriving institutions, from CLSP to links with the research team at Charles University in Prague. I saw Fred last at ACL in Uppsala. He was in great form, and we had a good discussion on funding for the summer workshops. I am very sad that he will not be with us to continue these conversations.<br /><br /><span style="font-style:italic;">Shankar Kumar</span>:<br />Fred was my academic advisor at CLSP/JHU, and I interacted with him throughout my Ph.D. program. I had the privilege of having him on my thesis committee. My very first exposure to research in speech and NLP was through an independent study that I did under him. A few years later, I was his teaching assistant for the speech recognition class. Fred's energy and passion for research made a strong impression on me back then and continue to influence my work to this day. I remember Fred carefully writing up his ideas and sending them out as a starting point for our discussions. While I found this curiously amusing at the time, I now think this was his unique approach to ensure clarity of thought and to steer the discussion without distractions. Fred's enthusiasm for learning new concepts was infectious! I attended several classes and guest lectures with him - graphical models, NLP, and many more. His insightful questions and his active participation in each one of these classes made them memorable for me. He epitomized what a lifelong learner should be. I will always recall Fred's advice on sharing credit generously. In his own words, "The contribution of a research paper does not get divided by the number of authors". With his passing, we have lost a role model who dedicated his life to research and whose contributions will continue to impact and shape the field for years to come.<br /><br /><span style="font-style:italic;">Michael Riley</span>:<br />I got to know Fred pretty well, having attended two of the CLSP six-week summer workshops, worked on a few joint grants, and visited CLSP in between. If there is a ‘father of speech recognition’, it's got to be Fred Jelinek - he led the IBM team that invented and popularized many of the key methods used today. His intellect, wide knowledge, and force of will served him well later as the leader of the JHU Center for Language and Speech Processing - a sort of academic hearth where countless speech/NLP researchers and students interacted over the years in seminars and workshops. I was impressed that, at an age when many have retired and after most of his IBM colleagues had gone into (very lucrative) financial engineering, he remained a vigorous, leading academic. Fernando mentioned the initial skepticism he had for our work on weighted FSTs for ASR.
Some years later, though, I heard that he had praised the work to my lab director, Larry Rabiner, on a plane ride, which likely helped my promotion shortly thereafter. And no discussion of Fred would be complete without a mention of his inimitable humor, delivered in that loud Czech-accented voice:<br /><blockquote><i>Riley</i> [at workshop planning meeting]: “Could they hold the summer workshop in some nicer place than Baltimore to help attract people?”<br /><i>Fred</i>: “Riley, we’ll hold it in Rome next year and get better people than you!”<br /><br /><i>Seminar presenter</i>: [fumbling with Windows configuration for minutes].<br /><i>Fred</i> [very loud]: “How long do we have to endure this high-tech torture?”</blockquote><br />The website of The Johns Hopkins University’s Center for Language and Speech Processing links to Fred’s own descriptions of his <a href="http://www.clsp.jhu.edu/people/jelinek/promoce.html">life</a> and <a href="http://www.mitpressjournals.org/doi/pdf/10.1162/coli.2009.35.4.35401">technical achievements</a>.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-5186426778493489804?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/remembering-fred-jelinek/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Frowns, Sighs, and Advanced Queries &#8212; How does search behavior change as search becomes more difficult?</title>
		<link>https://googledata.org/google-research/frowns-sighs-and-advanced-queries-how-does-search-behavior-change-as-search-becomes-more-difficult/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=frowns-sighs-and-advanced-queries-how-does-search-behavior-change-as-search-becomes-more-difficult</link>
		<comments>https://googledata.org/google-research/frowns-sighs-and-advanced-queries-how-does-search-behavior-change-as-search-becomes-more-difficult/#comments</comments>
		<pubDate>Fri, 17 Sep 2010 15:18:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Anne Aula, Rehan Khan, and Zhiwei Guan, User Experience TeamHow does search behavior change as search becomes more difficult?At Google, we strive to make finding information easy, efficient, and even fun. However, we know that once in a while...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Anne Aula, Rehan Khan, and Zhiwei Guan, User Experience Team</span><br /><br /><span style="font-weight:bold;"><a href="http://portal.acm.org/citation.cfm?id=1753326.1753333">How does search behavior change as search becomes more difficult?</a></span><br /><br />At Google, we strive to make finding information easy, efficient, and even fun. However, we know that once in a while, finding a specific piece of information turns out to be tricky. Based on dozens of user studies over the years, we know that it’s relatively easy for an observer to notice that a user is having problems finding information by watching changes in language, body language, and facial expressions:<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mol72XYv6Jk/TJOIEbs_YYI/AAAAAAAAAPk/TPfDrzCCpeg/s1600/Picture+52.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 297px;" src="http://2.bp.blogspot.com/_mol72XYv6Jk/TJOIEbs_YYI/AAAAAAAAAPk/TPfDrzCCpeg/s400/Picture+52.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5517903578196304258" /></a><br /><br />Computers, however, don’t have the luxury of observing a user the way another person would.  But would it be possible for a computer to somehow tell that the user is struggling to find information? <br /><br />We decided to find out. We first ran a study in the usability lab where we gave users search tasks, some of which we knew to be difficult. The first couple of searches always looked pretty much the same, independent of task difficulty: users formulated a query, quickly scanned the results, and either clicked on a result or refined the query. However, after a couple of unsuccessful searches, we started noticing interesting changes in behavior. In addition to many of them sighing or starting to bite their nails, users sometimes started to type their searches as natural language questions, they sometimes spent a very long time simply staring at the results page, and they sometimes completely changed their approach to the task. <br /><br />We were fascinated by these findings as they seemed to be signals that the computer could potentially detect while the user is searching. We formulated the initial findings from the usability lab study as hypotheses, which we then tested in a larger web-based user study. <br /><br />The overall findings were promising: we found five signals that seemed to indicate that users were struggling in the search task. Those signals were: use of question queries, use of advanced operators, spending more time on the search results page, formulating the longest query in the middle of the session, and spending a larger proportion of the time on the search results page. None of these signals alone is a strong enough predictor that a user is having problems with a search task. Taken together, however, we believe they can be used to build a model that will one day make it possible for computers to detect frustration in real time.<br /><br />You can read the full text of the paper <a href="http://dub.washington.edu/pubs/215">here</a>.
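<br /><br />As a rough illustration of how such signals might be combined -- the snippet below is a hypothetical sketch for this post, not the model from the study, and its weights are invented -- a simple logistic model can turn the five signals into a single score:<br /><pre>
# Combining the five session-level signals into one "struggling?" score.
# The session values and weights are made up for illustration; a real
# model would learn weights from labeled sessions.
import math

session = {
    "question_query": 1.0,               # query phrased as a question
    "advanced_operators": 0.0,           # quotes, site:, etc.
    "time_on_results_sec": 95.0,         # dwell time on the results page
    "longest_query_mid_session": 1.0,    # longest query came mid-session
    "results_page_time_fraction": 0.7,   # share of session on results pages
}
weights = {
    "question_query": 0.9,
    "advanced_operators": 0.6,
    "time_on_results_sec": 0.01,
    "longest_query_mid_session": 0.8,
    "results_page_time_fraction": 1.2,
}
bias = -3.0

score = bias + sum(weights[k] * session[k] for k in weights)
p_struggling = 1.0 / (1.0 + math.exp(-score))
print(round(p_struggling, 2))   # 0.62 here: weak alone, suggestive together
</pre><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-5775364719966833024?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>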
			<wfw:commentRss>https://googledata.org/google-research/frowns-sighs-and-advanced-queries-how-does-search-behavior-change-as-search-becomes-more-difficult/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Focusing on Our Users: The Google Health Redesign</title>
		<link>https://googledata.org/google-research/focusing-on-our-users-the-google-health-redesign/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=focusing-on-our-users-the-google-health-redesign</link>
		<comments>https://googledata.org/google-research/focusing-on-our-users-the-google-health-redesign/#comments</comments>
		<pubDate>Wed, 15 Sep 2010 13:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Hendrik Mueller, User Experience ResearcherWhen I relocated to New York City a few years ago, some of the most important health information for me to have on hand was my immunization history. At the time, though, my health records were scatte...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Hendrik Mueller, User Experience Researcher</span><br /><br />When I relocated to New York City a few years ago, some of the most important health information for me to have on hand was my immunization history. At the time, though, my health records were scattered, and it felt like a daunting task to organize them -- a common problem. For me, the solution came when Google Health became available in May of 2008, and I started using it to organize my health information and keep it more manageable. I also saw the potential to do much more within Google Health, such as tracking my overall fitness goals. When I joined the Google Health team as the lead user experience researcher, I was curious about the potential for Google Health to impact people’s lives beyond things like immunization tracking, and about how we could make the product a lot easier to use. So I set out to explore how to expand and improve Google Health.<br /><br />Here at Google, we focus on the user throughout the entire product development process. So before Google Health was first launched, we interviewed many people about how they managed their medical records and other health information to better understand their needs. We then iteratively created and tested multiple concepts and designs. After our initial launch, we followed up with actual Google Health users through surveys, interviews, and usability studies to understand how well we were meeting their needs.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mol72XYv6Jk/TI_f_GNEFTI/AAAAAAAAAOo/ja65Ya1lhIU/s1600/HealthSurvey_Cutout_Final.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 331px;" src="http://2.bp.blogspot.com/_mol72XYv6Jk/TI_f_GNEFTI/AAAAAAAAAOo/ja65Ya1lhIU/s400/HealthSurvey_Cutout_Final.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5516874343642305842" /></a><br /><br />From this user research, we learned what was working in the product and what needed to be improved. Here are some of the things our users found especially useful:<br /><ul><li>Organizing and tracking health-related information in a single place that is accessible from anywhere at any time</li><li>Sharing medical records easily with loved ones and health care providers, either by allowing online access or by printing out health summaries</li><li>Referencing rich information about <a href="https://health.google.com/health/ref/index.html">health topics</a>, aggregated from trusted sources and Google search results</li></ul><br />Our users also described to us the benefits they saw from using Google Health:<br /><blockquote><span style="font-style:italic;">“Google Health gives me many tools to research my prescriptions and symptoms, and to track all of the many tests I keep having. Google Health made several necessary and cumbersome tasks easy and worry free.”<br /><br />“For years now, I've tried to remember my son’s allergies and medications, but the list has grown so long, that I kept forgetting one or two when a doctor asked me about them. That can't happen again because I now have a single place to keep up with them. And I love the fact that I can print off information for situations when I really need it.”<br /><br />“I really like that I can share my profile with others.
I want my mom to know my medical information, just in case anything ever happens to me.”</span></blockquote><br />While we learned that our users were clearly getting positive results from using Google Health, our research also taught us that more was needed. We learned that we needed to make fundamental changes to fully meet the needs of all of our current and prospective users, such as those who are chronically ill, those who care for family members, and especially those users looking to track and improve their wellness and fitness.<br /><br />On this last point, our user surveys had already pointed out that there was more we could do to help our users track and manage their wellness, not just their sickness, so we conducted further research into how people collect, monitor, track, and analyze their wellness data. We interviewed several people in their homes and invited others into our usability labs. As a result, we identified several areas where we could improve Google Health to make it a more useful wellness tool, including:<br /><ul><li>Dedicated wellness tracking including pre-built and custom trackers</li><li>Efficient manual data entry as well as automatic data collection through devices</li><li>A customizable summary dashboard of wellness and other health topics</li><li>Goal setting and progress tracking using interactive charts</li><li>Personalized pages for each topic with rich charts, journaling, and related information</li></ul><br />These insights led us to a whole new set of design proposals. We gathered feedback on the resulting sketches, wire-frames, and screenshots from active and new Google Health users. The results throughout this process were eye-opening. While we were on the right track for some parts of the design, other parts had to be corrected or even redesigned. We went through several iterations until we had a design that tested well and that we felt met the user needs our research had uncovered. Finally, we conducted several usability studies with a functioning prototype throughout the product development process to continuously improve usability and function.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mol72XYv6Jk/TI_j8wnAR0I/AAAAAAAAAO4/UDjqUYCHCK4/s1600/LabPicture_left_Final.JPG"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 256px;" src="http://1.bp.blogspot.com/_mol72XYv6Jk/TI_j8wnAR0I/AAAAAAAAAO4/UDjqUYCHCK4/s400/LabPicture_left_Final.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5516878701532301122" /></a><br /><br />In the end, the collaboration between the user experience, engineering, and product management teams resulted in an entirely new user experience for Google Health, combined with a set of new functionality that is now available for you to try out at <a href="http://www.google.com/health">www.google.com/health</a>. See for yourself how the old and new versions compare.
Here is a screenshot of a health profile in the new version:<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mol72XYv6Jk/TI_khVVfxbI/AAAAAAAAAPA/1wRoHHAvA9I/s1600/Dashboard_newUI_Final.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 260px;" src="http://2.bp.blogspot.com/_mol72XYv6Jk/TI_khVVfxbI/AAAAAAAAAPA/1wRoHHAvA9I/s400/Dashboard_newUI_Final.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5516879329866270130" /></a><br /><br />And this is how the same account and profile looked in the old user interface:<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mol72XYv6Jk/TI_k8uNlUDI/AAAAAAAAAPI/ns7BhVvJozo/s1600/Dashboard_oldUI_Final.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 260px;" src="http://1.bp.blogspot.com/_mol72XYv6Jk/TI_k8uNlUDI/AAAAAAAAAPI/ns7BhVvJozo/s400/Dashboard_oldUI_Final.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5516879800400433202" /></a><br /><br />As a Google Health user, I am excited to take advantage of the new design and have already started using it for my own exercise and weight tracking. And on behalf of the user experience team and the entire Google Health team, we’re excited about being able to bring you a new design and more powerful tool that we think will meet more of your health and wellness needs.<br /><br />We look forward to continuing to explore how we can make Google Health even more useful and easier to use for people like you. As you use Google Health, you may see a link to a feedback survey at the top of the application. If you do, please take the time to fill it out - we will be listening to your input!<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-3067348915733480180?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/focusing-on-our-users-the-google-health-redesign/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Discontinuous Seam Carving for Video Retargeting</title>
		<link>https://googledata.org/google-research/discontinuous-seam-carving-for-video-retargeting/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=discontinuous-seam-carving-for-video-retargeting</link>
		<comments>https://googledata.org/google-research/discontinuous-seam-carving-for-video-retargeting/#comments</comments>
		<pubDate>Mon, 13 Sep 2010 19:24:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Matthias Grundmann and Vivek Kwatra, Google ResearchVideos come in different sizes, resolutions and aspect ratios, but the device used for playback, may it be your TV, mobile phone, or laptop, only has a fixed resolution and form factor. As a...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Matthias Grundmann and Vivek Kwatra, Google Research</span><br /><br />Videos come in different sizes, resolutions and aspect ratios, but the device used for playback, be it your TV, mobile phone, or laptop, only has a fixed resolution and form factor. As a result, you cannot watch your favorite old show that came in 4:3 on your new 16:9 HDTV without having black bars on the side, referred to as letterboxing. Likewise, widescreen movies and user videos uploaded to YouTube are shot using various cameras with wide-ranging formats, so they do not fit completely on the screen. As an alternative to letterboxing, several devices either <span style="font-style:italic;">upscale</span> the content uniformly, which changes the aspect ratio and makes everything look stretched out, or simply crop the frame, thereby discarding any content that cannot fit the screen after scaling.<br /><br />At Google Research, together with collaborators from Georgia Tech, we have developed an algorithm that resizes (or <span style="font-style:italic;">retargets</span>) videos to fit the form factor of a given device without cropping, stretching or letterboxing. Our approach uses all of the screen’s precious pixels, while striving to deliver as much video-content of the original as possible. The result is a video that adapts to your needs, so you don’t have to adapt to the video.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mol72XYv6Jk/TI6ybOhVHqI/AAAAAAAAAOg/H29dlCANEy8/s1600/Blogpostandhighlightsentences.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 246px;" src="http://2.bp.blogspot.com/_mol72XYv6Jk/TI6ybOhVHqI/AAAAAAAAAOg/H29dlCANEy8/s400/Blogpostandhighlightsentences.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5516542774399540898" /></a><br /><span class="Apple-style-span" style="font-size: small;">Six frames from the result of our retargeting algorithm applied to a sub-clip of “Apologize”, © 2006 One Republic. The original frame is shown on the left, our resized result on the right. The original content is fit to a new aspect ratio.</span><br /><br />The key insight is that we can separate the video into <i>salient</i> and <i>non-salient</i> content, which are then treated differently. Think of salient content as actors, faces, or structured objects, where the viewer anticipates specific, important details to perceive it as being correct and unaltered. We cannot change this content beyond uniform scaling without it being noticeable. On the other hand, non-salient content, such as sky, water, or a blurry out-of-focus background, can be squished or stretched without changing the overall appearance or the viewer noticing a dramatic change.<br /><br />Our technique, which we call <i>discontinuous seam carving</i> -- so named because it modifies the video by adding or removing disconnected seams (or chains) of pixels -- allows greater freedom in the resizing process than previous approaches. By optimizing for the retargeted video to be consistent with the original, we carefully preserve the <i>shape and motion</i> of the salient content while being less restrictive with non-salient content.
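<br /><br />For intuition, here is a minimal sketch of the classic single-image seam-carving primitive that approaches like ours build upon: dynamic programming finds the connected, least-energy vertical seam, whose removal shrinks the frame width by one pixel. The gradient-based energy function and the toy frame below are illustrative stand-ins, not our actual implementation:<br /><pre>
# Classic seam carving on one grayscale frame: remove the connected
# vertical seam of least "energy" (here, a toy gradient magnitude).
def remove_one_seam(img):
    h, w = len(img), len(img[0])
    # Toy energy: horizontal gradient magnitude (salient pixels cost more).
    energy = [[abs(img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)])
               for x in range(w)] for y in range(h)]
    # cost[y][x] = cheapest connected seam reaching (y, x) from the top row.
    cost = [row[:] for row in energy]
    for y in range(1, h):
        for x in range(w):
            cost[y][x] += min(cost[y - 1][max(x - 1, 0):min(x + 2, w)])
    # Backtrack the cheapest seam from the bottom row upward.
    x = cost[h - 1].index(min(cost[h - 1]))
    seam = [x]
    for y in range(h - 2, -1, -1):
        lo = max(x - 1, 0)
        window = cost[y][lo:min(x + 2, w)]
        x = lo + window.index(min(window))
        seam.append(x)
    seam.reverse()
    # Drop one pixel per row; low-energy (non-salient) regions shrink first.
    return [row[:seam[y]] + row[seam[y] + 1:] for y, row in enumerate(img)]

frame = [[10, 10, 200, 10], [10, 200, 10, 10], [200, 10, 10, 10]]
print(remove_one_seam(frame))   # width shrinks from 4 to 3
</pre>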
The key innovations of our research include: (a) a solution that maintains temporal continuity of the video in addition to preserving its spatial structure, (b) space-time smoothing for automatic as well as interactive (user-guided) salient content selection, and (c) sequential frame-by-frame processing conducive to arbitrary-length and streaming video. The outcome is a scalable system capable of retargeting videos featuring complex motions of actors and cameras, highly dynamic content, and camera shake. For more details, please refer to our <a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/36246.pdf">paper</a> or visit the <a href="http://cpl.cc.gatech.edu/projects/videoretargeting/">project website</a>.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-8324635245137123258?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/discontinuous-seam-carving-for-video-retargeting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Search by Voice:  A Case Study</title>
		<link>https://googledata.org/google-research/google-search-by-voice-a-case-study/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-search-by-voice-a-case-study</link>
		<comments>https://googledata.org/google-research/google-search-by-voice-a-case-study/#comments</comments>
		<pubDate>Thu, 09 Sep 2010 23:59:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Johan Schalkwyk, Google ResearchWind the clock back two years with your smart phone in hand. Try to recall doing a search for a restaurant or the latest scores of your favorite sports team. If you’re like me you probably won’t even bother...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Johan Schalkwyk, Google Research</span><br /><br />Wind the clock back two years with your smart phone in hand. Try to recall doing a search for a restaurant or the latest scores of your favorite sports team. If you’re like me, you probably won’t even bother, or you’ll suffer with tiny keys or fat fingers on a touch screen. With Google Search by Voice, all that has changed. Now you just tap the microphone, speak, and within seconds you see the result. No more fat fingers.<br /><br />Google Search by Voice is a result of many years of investment in speech at Google. We started by building our own recognizer (aka GReco) from the ground up. Our first foray into search by voice was doing local searches with <a href="http://www.google.com/goog411/index.html">GOOG-411</a>. Then, in November 2008, we launched <a href="http://googleblog.blogspot.com/2008/11/now-you-can-speak-to-google-mobile-app.html">Google Search by Voice</a>. Now you can search the entire Web using your voice.<br /><br />What makes search by voice really interesting is that it requires much more than just a good speech recognizer. You also need a good user interface and a good phone, like an Android device, in the hands of millions of people. Besides the excellent computational platform and the availability of data, the project succeeded due to Google’s culture, built around teams that wholeheartedly tackle such challenges with the conviction that they will set a new bar. <br /><br />In our book chapter, “Google Search by Voice: A Case Study”, we describe the basic technology, the supporting technologies, and the user interface design behind Google Search by Voice. We describe how we built it and what lessons we have learned. As the product required many helping hands to build, this chapter required many helping hands to write. We believe it provides a valuable contribution to the academic community.<br /><br />The book, <a href="http://www.springer.com/engineering/signals/book/978-1-4419-5950-8">Advances in Speech Recognition</a>, is available for purchase from Springer.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-2178210901062200452?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/google-search-by-voice-a-case-study/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Towards Energy-Proportional Datacenters</title>
		<link>https://googledata.org/google-research/towards-energy-proportional-datacenters/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=towards-energy-proportional-datacenters</link>
		<comments>https://googledata.org/google-research/towards-energy-proportional-datacenters/#comments</comments>
		<pubDate>Wed, 01 Sep 2010 19:18:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Dennis Abts, Michael R. Marty, Philip M. Wells, Peter Klausler, and Hong LiuThis is part of the series highlighting some notable publications by Googlers.  At Google, we operate large datacenters containing clusters of servers, networking swi...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Dennis Abts, Michael R. Marty, Philip M. Wells, Peter Klausler, and Hong Liu</span><br /><br /><span style="font-style:italic;">This is part of the <a href="http://googleresearch.blogspot.com/2010/07/google-publications.html">series</a> highlighting some notable publications by Googlers.  </span><br /><br />At Google, we operate large datacenters containing clusters of servers, networking switches, and more.  While this gear costs a lot of money, an increasingly important cost -- both in terms of dollars and environmental impact -- is the electricity that drives the computing clusters and the cooling infrastructure.  Since our clusters often do not run at full utilization, Google recently put forth a call to industry and researchers to develop energy-proportional computer systems.  With such systems, the power consumed by our clusters would be directly proportional to utilization. Servers consume the most electricity, and therefore researchers have responded to Google’s call by focusing their attention on servers.  As the servers become increasingly energy proportional, however, the “always on” network fabric that connects servers together will consume an increasing fraction of datacenter power unless it too becomes energy proportional.<br /><br />In a <a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/www.google.com/en/us/research/pubs/archive/36462.pdf">paper</a> recently published at the International Symposium on Computer Architecture (ISCA), we push further towards the goal of energy-proportional computing by focusing on the energy usage of high-bandwidth, highly scalable cluster networking fabrics.  This research considers a broad set of architectural and technological solutions to optimize energy usage without sacrificing performance. First, we show how the Flattened Butterfly network topology uses less power, since it uses fewer switching chips and fewer links than a comparable-performance network built using the more conventional Fat Tree topology.  Second, our approach takes advantage of the observation that when network demand is low, we can reduce the speed at which links transmit data.  We show via simulation that, by tuning the speeds of the links very rapidly, we can reduce power consumption with little impact on performance. Finally, our research is a further call to action for the academic and industry research communities to make energy efficiency, and energy proportionality in particular, a first-class citizen in networking research.  Put together, our proposed techniques can reduce energy cost for typical Google workloads seen in our production datacenters by millions of dollars!<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-2295552561482451008?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
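To make the link-speed idea concrete: a hedged sketch of the tuning policy described above, in which each link runs at the slowest supported rate that still covers its observed demand. The rate ladder and per-link power numbers are illustrative assumptions, not figures from the paper.

<pre>
# Illustrative rate ladder and per-link power draw (assumed, not from the paper).
RATES_GBPS = [2.5, 5.0, 10.0, 20.0, 40.0]
POWER_W = {2.5: 0.5, 5.0: 1.0, 10.0: 2.0, 20.0: 4.0, 40.0: 8.0}

def tuned_power(demands_gbps):
    """Total power when each link runs at the slowest rate covering its demand."""
    total = 0.0
    for d in demands_gbps:
        rate = next(r for r in RATES_GBPS if r >= d)   # assumes demand <= top rate
        total += POWER_W[rate]
    return total

demands = [1.2, 0.3, 18.0, 7.5]                        # instantaneous per-link load
print(tuned_power(demands), "W, vs", len(demands) * POWER_W[40.0], "W at full rate")
</pre>

The interesting systems question, which the paper studies via simulation, is how rapidly link rates can be switched without hurting performance when demand spikes.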
			<wfw:commentRss>https://googledata.org/google-research/towards-energy-proportional-datacenters/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Publications</title>
		<link>https://googledata.org/google-research/google-publications/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-publications</link>
		<comments>https://googledata.org/google-research/google-publications/#comments</comments>
		<pubDate>Fri, 30 Jul 2010 18:41:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Corinna Cortes and Alfred Spector, Google ResearchWe often get asked if Google scientists and engineers publish technical papers, and the answer is, “Most certainly, yes.”    Indeed, we have a formidable research capability, and we encour...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Corinna Cortes and Alfred Spector, Google Research</span><div><br />We often get asked if Google scientists and engineers publish technical papers, and the answer is, “Most certainly, yes.”  Indeed, we have a formidable research capability, and we encourage publications as well as other forms of technical dissemination--including our contributions to open source and standards and the introduction of new APIs and tools, which have sometimes proven to be foundational.<br /><br />Needless to say, with our great commitment to technical excellence in computer science and related disciplines, we find it natural and rewarding to contribute to the scientific community and to ongoing technical debates.  And we know that it is important for Google to help create the fundamental building blocks upon which continuing advances can occur.<br /><br />To be specific, Googlers publish hundreds of technical papers that appear in journals, books, and conference and workshop proceedings every year. These deal with specific applications and engineering questions, algorithmic and data structure problems, and important theoretical problems in computer science, mathematics, and other areas that can guide our algorithmic choices.  While the publications are interesting in their own right, they also offer a glimpse of some of the key problems we face when dealing with very large data sets and illustrate other questions that arise in our engineering design at Google.<br /><br />We’d like to highlight a few of the more noteworthy papers from the first trimester of this year.  The papers reflect the breadth and depth of the problems on which we work.  We find that virtually all aspects of computer science, from systems and programming languages, to algorithms and theory, to security, data mining, and machine learning, are relevant to our research landscape.  A more complete list of our publications can be found <a href="http://research.google.com/pubs/papers.html">here</a>.<br /><br />In the coming weeks we will be offering a more in-depth look at these publications, but here are some summaries:<br /><br /><b>Speech Recognition</b><br /><br /><span class="Apple-style-span" style="font-size: small;">"Google Search by Voice: A Case Study," by Johan Schalkwyk, Doug Beeferman, Francoise Beaufays, Bill Byrne, Ciprian Chelba, Mike Cohen, Maryam Garrett, Brian Strope, to appear in </span><i><span class="Apple-style-span" style="font-size: small;">Advances in Speech Recognition: Mobile Environments, Call Centers, and Clinics</span></i><span class="Apple-style-span" style="font-size: small;">, Amy Neustein (Ed.), Springer-Verlag 2010.</span><br /><br />Google Search by Voice is a result of many years of investment in speech at Google. In our book chapter, “Google Search by Voice: A Case Study,” we describe the basic technology, the supporting technologies, and the user interface design behind Google Search by Voice. We describe how we built it and what lessons we have learned. Google Search by Voice is growing rapidly and is being built in many languages. 
Along the way we constantly encounter new research problems, providing the perfect atmosphere for doing research on real-world problems.<br /><br /><span style="font-weight:bold;">Computer Architecture &amp; Networks &amp; Distributed Systems</span><br /><br /><span class="Apple-style-span" style="font-size: small;">"Energy-proportional Datacenter Networks," by Dennis Abts, Mike Marty, Philip Wells, Peter Klausler, Hong Liu, </span><span style="font-style:italic;"><span class="Apple-style-span" style="font-size: small;">International Symposium on Computer Architecture</span></span><span class="Apple-style-span" style="font-size: small;">, ISCA, June 2010.</span><br /><br />Google researchers have called on industry and academia to develop energy-proportional computing systems, where the energy consumed is directly proportional to the utilization of the system. In this work, we focus on the energy usage of high-bandwidth, highly scalable cluster networks. Through a combination of an energy-efficient topology and dynamic fine-grained control of link speeds, our proposed techniques show the potential to significantly reduce both electricity and environmental costs.<br /><br /><span style="font-weight:bold;">Economics &amp; Market Algorithms</span><br /><br /><span class="Apple-style-span" style="font-size: small;">"Quasi-Proportional Mechanisms: Prior-free Revenue Maximization," by Vahab S. Mirrokni, S. Muthukrishnan, Uri Nadav, </span><span style="font-style:italic;"><span class="Apple-style-span" style="font-size: small;">Latin American Theoretical Informatics Symposium</span></span><span class="Apple-style-span" style="font-size: small;">, LATIN, April 2010.</span><br /><br />Say a seller wishes to sell an item, but the buyers value it vastly differently. What is a suitable auction to sell the item, in terms of efficiency as well as revenue? First and second price auctions will be efficient but will only extract the lower value in equilibrium; if one knows the distributions from which values are drawn, then setting a reserve price will get optimal revenue but will not be efficient. This paper views this problem as a prior-free auction and proposes a quasi-proportional allocation in which the probability that an item is allocated to a bidder depends (quasi-proportionally) on their bids. The paper also proves existence of an equilibrium for quasi-proportional auctions and shows how to compute them efficiently. Finally, the paper shows that these auctions have high efficiency and revenue (a small numerical sketch of the allocation rule appears at the end of this post).<br /><br /><span class="Apple-style-span" style="font-size: small;">"Auctions with Intermediaries," Jon Feldman, Vahab Mirrokni, S. Muthukrishnan, Mallesh Pai, </span><span style="font-style:italic;"><span class="Apple-style-span" style="font-size: small;">ACM Conference on Electronic Commerce</span></span><span class="Apple-style-span" style="font-size: small;">, EC, June 2010. </span><br /><br />We study an auction where the bidders are middlemen, looking in turn to auction off the item if they win it.  This setting arises naturally in online advertisement exchange systems, where the participants in the exchange are ad networks looking to sell ad impressions to their own advertisers. We present optimal strategies for both the bidders and the auctioneer in this setting. 
In particular, we show that the optimal strategy for bidders is to choose a randomized reserve price, and the optimal reserve price of the central auctioneer may depend on the number of bidders (unlike the case when there are no middlemen).<br /><br /><span style="font-weight:bold;">Computer Vision</span><br /><br /><span class="Apple-style-span" style="font-size: small;">"Discontinuous Seam-Carving for Video Retargeting," Matthias Grundmann, Vivek Kwatra, Mei Han, Irfan Essa, </span><span style="font-style:italic;"><span class="Apple-style-span" style="font-size: small;">Computer Vision and Pattern Recognition</span></span><span class="Apple-style-span" style="font-size: small;">, CVPR, June 2010.</span><br /><br />Playing a video on devices with different form factors requires resizing (or <span style="font-style:italic;">retargeting</span>) the video to fit the resolution of the given device. We have developed a content-aware technique for video retargeting based on <span style="font-style:italic;">discontinuous seam-carving</span>, which, unlike standard methods such as uniform scaling and cropping, strives to retain salient content (such as actors, faces and structured objects) while discarding relatively unimportant pixels (such as the sky or a blurry background). The key innovations of our research include: (a) a solution that maintains temporal continuity of the video in addition to preserving its spatial structure, (b) space-time smoothing for automatic as well as interactive (user-guided) salient content selection, and (c) sequential frame-by-frame processing conducive to arbitrary-length and streaming video.<br /><br /><span style="font-weight:bold;">Machine Learning</span><br /><br /><span class="Apple-style-span" style="font-size: small;">"Random classification noise defeats all convex potential boosters," Philip M. Long, Rocco A. Servedio, Machine Learning, vol. 78 (2010), pp. 287-304.</span><br /><br />A popular recent approach to many machine learning problems is to formulate them as optimization problems in which the goal is to minimize some “convex loss function.”  This is an appealing formulation because these optimization problems can be solved in much the same way that a marble rolls to the bottom of a bowl.  However, it turns out that there are drawbacks to this formulation.  In "Random Classification Noise Defeats All Convex Potential Boosters," we show that any learning algorithm that works in this way can fail badly if there are noisy examples in the training data.  This research motivates further study of other approaches to machine learning, for which there are algorithms that are provably more robust in the presence of noise.<br /><br /><span style="font-weight:bold;">IR</span><br /><br /><span class="Apple-style-span" style="font-size: small;">"Clustering Query Refinements by User Intent," Eldar Sadikov, Jayant Madhavan, Lu Wang, Alon Halevy, </span><span style="font-style:italic;"><span class="Apple-style-span" style="font-size: small;">Proceedings of the International World Wide Web Conference</span></span><span class="Apple-style-span" style="font-size: small;">, WWW, April 2010.</span><br /><br />When users pose a search query, they usually have an underlying intent or information need, and the sequence of queries they pose in a single search session is usually determined by that intent. 
Our research demonstrates that there are typically only a small number of prominent underlying intents for a given user query. Further, these intents can be identified very accurately by an analysis of anonymized search query logs. Our results show that underlying intents almost always correspond to well-understood high-level concepts.<br /><br /><span style="font-weight:bold;">HCI</span><br /><br /><span class="Apple-style-span" style="font-size: small;">"How does search behavior change as search becomes more difficult?", Anne Aula, Rehan Khan, Zhiwei Guan, </span><span style="font-style:italic;"><span class="Apple-style-span" style="font-size: small;">Proceedings of the ACM Conference on Human Factors in Computing Systems</span></span><span class="Apple-style-span" style="font-size: small;">, CHI, April 2010.</span><br /><br />Seeing that someone is getting frustrated with a difficult search task is easy for another person--just look for the frowns, and listen for the sighs.  But could a computer tell that you're getting frustrated from just the limited behavior a search engine can observe?  Our study suggests that it can: our data show that when users get frustrated, they start to formulate question queries, they start to use advanced operators, and they spend a larger proportion of their time on the search results page. Taken together, these signals can be used to build a model that can potentially detect user frustration.<br /></div><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-5992154403033942385?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
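As promised above, a small numerical sketch of the quasi-proportional allocation rule: the item goes to bidder i with probability f(b_i) / Σ_j f(b_j). The particular choice f(b) = √b is an assumption for illustration; the paper analyzes a family of allocation functions of this shape.

<pre>
import math

def allocation_probs(bids, f=math.sqrt):
    """Win probability for each bidder under a quasi-proportional rule."""
    weights = [f(b) for b in bids]
    z = sum(weights)
    return [w / z for w in weights]

# Two bidders with vastly different values: the high bidder usually wins, but the
# low bidder wins often enough to create competition even without known priors.
print(allocation_probs([100.0, 1.0]))   # ~[0.909, 0.091]
</pre>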
			<wfw:commentRss>https://googledata.org/google-research/google-publications/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Googlers receive multiple awards at the 2010 International Conference on Machine Learning</title>
		<link>https://googledata.org/google-research/googlers-receive-multiple-awards-at-the-2010-international-conference-on-machine-learning/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=googlers-receive-multiple-awards-at-the-2010-international-conference-on-machine-learning</link>
		<comments>https://googledata.org/google-research/googlers-receive-multiple-awards-at-the-2010-international-conference-on-machine-learning/#comments</comments>
		<pubDate>Tue, 27 Jul 2010 23:13:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Fernando Pereira, Research DirectorGooglers were recognized in three of the four paper awards at ICML 2010: Sajid Siddiqi was co-recipient of the best paper award for Hilbert Space Embeddings of Hidden Markov Models with Le Song, Byron Boots,...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Fernando Pereira, Research Director</span><br /><br />Googlers were recognized in three of the four <a href="http://www.icml2010.org/awards.html">paper awards</a> at ICML 2010: <div><ul><li><a href="http://research.google.com/pubs/siddiqi.html">Sajid Siddiqi</a> was co-recipient of the best paper award for <span style="font-style:italic;"><a href="http://www.icml2010.org/abstracts.html#495">Hilbert Space Embeddings of Hidden Markov Models</a></span> with Le Song, Byron Boots, Geoff Gordon, and Alex Smola </li><li><a href="http://www.cs.berkeley.edu/~jduchi/">John Duchi</a>, who is also a graduate student at UC Berkeley, was co-recipient of the best student paper award for <span style="font-style:italic;"><a href="http://www.icml2010.org/abstracts.html#421">On the Consistency of Ranking Algorithms</a> <span class="Apple-style-span" style="font-style: normal;">with Lester Mackey and Michael Jordan</span></span></li><li><span>And last but not least, </span><a href="http://research.google.com/pubs/author28.html">Yoram Singer</a><span> was co-recipient of the best 10-year paper award for the most influential paper of ICML 2000, </span><a href="http://research.google.com/pubs/author28.html">Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers</a><span> (pdf) with Erin Allwein and Robert Schapire, which currently has </span><a href="http://scholar.google.com/scholar?hl=en&amp;as_sdt=2000&amp;q=reducing+Multiclass+to+Binary:+A+Unifying+Approach+for+Margin+Classifiers">852 citations</a><span> in Google Scholar. </span></li></ul>I feel a particular connection to this last paper, as Rob and Yoram were members of the technical staff and Erin was a student intern in the department I headed at AT&amp;T Labs when this work was done.</div><div><br /></div>Congratulations to all!<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-6044955422802897296?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/googlers-receive-multiple-awards-at-the-2010-international-conference-on-machine-learning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Announcing our Q2 Research Awards</title>
		<link>https://googledata.org/google-research/announcing-our-q2-research-awards/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=announcing-our-q2-research-awards</link>
		<comments>https://googledata.org/google-research/announcing-our-q2-research-awards/#comments</comments>
		<pubDate>Thu, 22 Jul 2010 16:23:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Maggie Johnson, Director of Education &#38; University RelationsWe’re excited to announce the latest round of Google Research Awards, our program which identifies and supports full-time faculty pursuing research in areas of mutual interest....]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Maggie Johnson, Director of Education &amp; University Relations</span><br /><br />We’re excited to announce the latest round of Google Research Awards, our program, which identifies and supports full-time faculty pursuing research in areas of mutual interest. From a record number of submissions, we are funding 75 awards across 18 different areas—a total of more than $4 million.<br /><br />The areas that received the highest level of funding for this round were systems and infrastructure, human-computer interaction, multimedia and security. We also continue to develop more collaborations internationally. In this round, 26 percent of the funding was awarded to universities outside the U.S.<br /><br />Here are some examples from this round of awards:<br /><ul><li><b>Jeremy Cooperstock, McGill University</b>. <i>A Spatialized Audio Map System for Mobile Blind Users</i> (Geo/maps): A mobile audio system that provides location-based information, primarily for use by the blind and visually impaired communities. </li><li><b>Alexander Pretschner, Karlsruhe Institute of Technology, Germany</b>. <i>Towards Operational Privacy </i>(Security and privacy): Provide a framework for precise semantic definitions in policies for domain-specific applications to give users a way to define the exact behavior they expect from a system in application-specific contexts.</li><li><b>Erik Brynjolfsson, Massachusetts Institute of Technology</b>.  <i>The Future of Prediction - How Google Searches Foreshadow Housing Prices and Quantities </i>(Economics and market algorithms): How data from search engines like Google provide a highly accurate but simple way to predict future business activities. </li><li><b>Stephen Pulman, Oxford University Computing Laboratory</b>. <i>Automatic Generation of Natural Language Descriptions of Visual Scenes</i> (Natural language processing): Develop a system that automatically generates a description of a visual scene.</li><li><b>Jennifer Rexford, Princeton</b>. <i>Rethinking Wide-Area Traffic Management </i>(Software and hardware systems infrastructure): Drawing on mature techniques from optimization theory, design new traffic-management solutions where the hosts, routers, and management system cooperate in a more effective way. </li><li><b>John Quinn, Makerere University, Uganda</b>.  <i>Mobile Crop Surveillance in the Developing World </i>(Multimedia search and audio/video processing): A computer vision system using camera-enabled mobile devices to monitor the spread of viral disease among staple crops. </li><li><b>Allison Druin, University of Maryland</b>. <i>Understanding how Children Change as Searchers </i>(Human-computer interaction): Do children change as searchers as they age?  How do searchers typically shift between roles over time? If children change, how many of them become Power Searchers? If children don’t change, what roles do they typically demonstrate?</li><li><b>Ronojoy Adhikari, The Institute of Mathematical Sciences, India</b>. <i>Machine Learning of Syntax in Undeciphered Scripts </i>(Machine learning): Devise algorithms that would learn to search for evidence of semantics in datasets such as the Indus script.</li></ul><br />You can find the full list of this round’s award recipients <a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/www.google.com/en/us/googleblogs/pdfs/google_research_awards_july2010.pdf">here</a> (pdf). 
More information on our research award program can be found on our <a href="http://research.google.com/university/relations/research_awards.html">website</a>.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-7811078160479700959?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/announcing-our-q2-research-awards/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google launches Korean Voice Search</title>
		<link>https://googledata.org/google-research/google-launches-korean-voice-search/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-launches-korean-voice-search</link>
		<comments>https://googledata.org/google-research/google-launches-korean-voice-search/#comments</comments>
		<pubDate>Wed, 30 Jun 2010 21:45:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Mike Schuster &#38; Martin Jansche, Google ResearchOn June 16th, we launched our Korean voice search system. Google Search by Voice has been available in various flavors of English since 2008, in Mandarin and Japanese since 2009, and in Frenc...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Mike Schuster &amp; Martin Jansche, Google Research</span><br /><br />On June 16th, we launched our <a href="http://googlemobile.blogspot.com/2010/06/annyeong-haseyo-to-google-search-by.html">Korean voice search system</a>. Google Search by Voice has been available in various flavors of <a href="http://googlemobile.blogspot.com/2008/11/google-mobile-app-for-iphone-now-with.html">English</a> since 2008, in <a href="http://googleresearch.blogspot.com/2009/11/google-search-by-voice-learns-mandarin.html">Mandarin</a> and <a href="http://googleresearch.blogspot.com/2009/12/teaching-computer-to-understand.html">Japanese</a> since 2009, and in <a href="http://googlemobile.blogspot.com/2010/06/salut-willkommen-benvenuto-bienvenido.html">French, Italian, German and Spanish</a> just a few weeks ago (some more details in a recent blog <a href="http://googleresearch.blogspot.com/2010/06/google-search-by-voice-now-available-in.html">post</a>).<br /><br />Korean speech recognition has received less attention than English, which has been studied extensively around the world by teams in both English and non-English speaking countries.  Fundamentally, our methodology for developing a Korean speech recognition system is similar to the process we have used for other languages. We created a set of statistical models: an acoustic model for the basic sounds of the language, a language model for the words and phrases of the language, and a dictionary mapping the words to their pronunciations. We trained our acoustic model using a large quantity of recorded and transcribed Korean speech. The language model was trained using anonymized Korean web search queries. Once these models are trained, given an audio input, we can compute and display the most likely spoken phrase, along with its search result.<br /><br />There were several challenges in developing a Korean speech recognition system, some unique to Korean, some typical of Asian languages and some universal to all languages. Here are some examples of problems that stood out:<br /><br /><ul><li>Developing a Korean dictionary:   Unlike English, where there are many publicly available dictionaries for mapping words to their pronunciations, there are very few available for Korean. Since our Korean recognizer knows several hundred thousand words, we needed to create these mappings ourselves. Luckily, Korean has one of the most elegant and simple writing systems in the world (created in the 15th century!), and this makes mapping Korean words to pronunciations relatively straightforward. However, we found that Koreans also use quite a few English words in their queries, which complicates the mapping process. To predict these pronunciations, we built a statistical model using data from an existing (smaller) Korean dictionary.</li><li>Korean word boundaries:   Although Korean orthography uses spaces to indicate word boundaries (unlike Japanese or Mandarin), we found that people use word boundaries inconsistently for search queries. To limit the size of the vocabulary generated from the search queries, we used statistical techniques to cut rare long words into smaller sub-words (similarly to the system we developed for Japanese).<br /></li><li>Pronunciation exceptions:  Korean (like all other languages) has many exceptions for pronunciations that are not immediately obvious. For example, numbers are often written as digit sequences but not necessarily spoken this way (2010 = 이천십). 
The same is true for many common alphanumeric sequences like “mp3”, “kbs2” or mixed queries like “삼성 tv”, which often contain spelled-out letters and possibly English spoken digits rather than Korean ones. (A toy sketch of the digit-spelling rule appears at the end of this post.)</li><li>Encoding issues:  Korean script (Hangul) is written in syllabic blocks, with each block containing at least two of the 24 modern Hangul letters (Jamo), at least one consonant and one vowel.  Including the normal ASCII characters, this brings the total number of possible basic characters to over 10,000, not including Hanja (used mostly in the formal spelling of names). So, despite its simple writing system, Korean still presents the same challenge of handling a large alphabet that is typical of Asian languages.</li><li>Script ambiguity:  We found that some users like to use native English words while others use the Korean transliteration (example: “ncis season 6” vs. “ncis 시즌6”). This makes it challenging to train and evaluate the system. We use a metric that estimates whether our transcription will give the correct web page result on the user’s smart phone screen, and such script variations make this tricky.</li><li>Recognizing rare words:  The recognizer is good at recognizing things users often type into the search engine, such as cities, shops, addresses, common abbreviations, common product model numbers and well-known names like “김연아”. However, rare words (like many personal names) are often harder for us to recognize. We continue to work on improving those.</li><li>Every speaker sounds different:  People speak in different styles, slow or fast, with an accent or without, have lower or higher pitched voices, etc. To make our system work for all these different conditions, we trained it using data from many different sources to capture as many conditions as possible.<br /></li></ul><br />When speech recognizers make errors, the reason is usually that the models are not good enough, and that often means they haven’t been trained on enough data. For Korean (and all other languages), our cloud computing infrastructure allows us to retrain our models frequently, using an ever-growing amount of data to continually improve performance. We are committed to improving the system regularly to make speech a user-friendly input method on mobile devices.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-5838927280305250053?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
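To illustrate the digit-spelling exception flagged in the list above (2010 = 이천십): a toy sketch of Sino-Korean number spelling for numbers under 10,000. A production dictionary would also need native Korean numerals, the 만/억 groupings for larger numbers, and context to choose between readings; none of that is shown here.

<pre>
DIGITS = ["", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]
UNITS = ["", "십", "백", "천"]

def sino_korean(n):
    """Spell 0..9999 in Sino-Korean digits, e.g. 2010 -> 이천십."""
    if n == 0:
        return "영"
    parts = []
    for pos, ch in enumerate(reversed(str(n))):
        d = int(ch)
        if d == 0:
            continue                       # zero places are simply skipped
        digit = "" if d == 1 and pos > 0 else DIGITS[d]
        parts.append(digit + UNITS[pos])   # leading 일 is dropped before 십/백/천
    return "".join(reversed(parts))

print(sino_korean(2010))   # 이천십, matching the example above
</pre>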
			<wfw:commentRss>https://googledata.org/google-research/google-launches-korean-voice-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Search by Voice now available in France, Italy, Germany and Spain</title>
		<link>https://googledata.org/google-research/google-search-by-voice-now-available-in-france-italy-germany-and-spain/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-search-by-voice-now-available-in-france-italy-germany-and-spain</link>
		<comments>https://googledata.org/google-research/google-search-by-voice-now-available-in-france-italy-germany-and-spain/#comments</comments>
		<pubDate>Mon, 14 Jun 2010 23:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Thad Hughes, Martin Jansche, and Pedro Moreno, Google ResearchGoogle’s speech team is composed of people from many different cultural backgrounds. Indeed, if we count the languages spoken by our teammates, the number comes to well over a do...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Thad Hughes, Martin Jansche, and Pedro Moreno, Google Research</span><br /><br />Google’s speech team is composed of people from many different cultural backgrounds. Indeed, if we count the languages spoken by our teammates, the number comes to well over a dozen. Given our own backgrounds and interests, we are naturally excited to extend our software to work with many different languages and dialects.  After testing the waters with <a href="http://googlemobile.blogspot.com/2008/11/google-mobile-app-for-iphone-now-with.html">English</a>, <a href="http://googleresearch.blogspot.com/2009/11/google-search-by-voice-learns-mandarin.html">Mandarin Chinese</a>, and <a href="http://googleresearch.blogspot.com/2009/12/teaching-computer-to-understand.html">Japanese</a>, we decided to tackle <a href="http://googlemobile.blogspot.com/2010/06/salut-willkommen-benvenuto-bienvenido.html">four main European languages</a>, which are often referred to as FIGS: French, Italian, German and Spanish.<br /><br />Developing Voice Search systems in each of these languages presented its own challenges.  French and Spanish required special work to deal with diacritic and accent marks (e.g. ç in French, ñ in Spanish).  When we develop a new language, we tweak our dictionaries based on user-generated content.  To our surprise, we found that a lot of this content in French and Spanish often uses non-standard orthography.  For example, a French speaker might type “francoise” into a search engine and still expect it to return results for “Françoise”.  Likewise, in Spanish a user might type “espana” and expect results for the term “España”.  Of course a lot of this has to do with the fact that, until recently, domain names (like www.elpais.es) did not allow diacritics, and that entering special characters is often painful but omitting diacritics is usually not an obstacle to communication.  However, non-standard spellings distort the intended pronunciations.  For example, if “francoise” were a real French word, one would expect it to be pronounced “franquoise”.  In order to capture the intended pronunciation of the non-standard spellings, we fixed the orthography in our dictionaries for Spanish and French automatically.  While this is not perfect, it deals with many of the offending cases.<br /><br />Since our Voice Search systems typically understand more than a million different words in each language, developing pronunciation dictionaries is one of the most critical tasks.  We need the dictionary to match what the user said with the written form.  Not surprisingly, we found dictionary development for some languages, like Spanish and Italian, to be extremely easy, as they have very regular orthographies.  In fact, the core of our Spanish pronunciation module consists of fewer than 100 lines of source code (a toy sketch in this spirit appears at the end of this post).  Other languages like German and French have more complex orthographies.  For example, in French “au”, “eaux” and “hauts” are all pronounced “o”.<br /><br />A notable aspect of German (especially “Internet German”) is that a lot of English words are in common usage.  We do our best to recognize thousands of English words, even though English contains some sounds that don’t exist in German, like “th” in “the”.  One of the trickiest examples we came across was when one of our volunteers read “nba playoffs 2009”, saying “nba playoffs” in English followed by “zwei tausend neun” in German.  
So go ahead and search for “Germany’s Next Topmodel” or “Postbank Online” and see if it works for you.<br /><br />German is also notorious for having long, complex words.  Our favorite examples include:<br /><ul><li><a href="http://de.wikipedia.org/wiki/BKrFQG#Berufskraftfahrer-Qualifikations-Gesetz">Berufskraftfahrerqualifikationsgesetz </a>(or shorter: BKrFQG)</li><li><a href="http://en.wiktionary.org/wiki/Eierschalensollbruchstellenverursacher">Eierschalensollbruchstellenverursacher</a></li><li><a href="http://de.wikipedia.org/wiki/Verkehrsinfrastrukturfinanzierungsgesellschaft">Verkehrsinfrastrukturfinanzierungsgesellschaft</a></li><li><a href="http://de.wikipedia.org/wiki/Stichpimpulibockforcelorum">Stichpimpulibockforcelorum</a></li><li>Hypothalamus-Hypophysen-Nebennierenrinde-Achse</li></ul><br />Just for fun, compare how long it takes you to say these to Voice Search vs. typing them.<br /><br />Even though a vocabulary size of one million words sounds like a large number, each of these languages has even more words, so we need a procedure to select which ones to model. We obviously do not do this manually and instead use statistical procedures to identify the list of words we will allow. We do this by looking at many sources of data and at the frequency of words. It is therefore surprising that our algorithms sometimes select really weird terms. For example, in Spanish we found these unusual words:<br /><ul><li><a href="http://en.wikipedia.org/wiki/Supercalifragilisticexpialidocious">supercalifragilisticespialidoso</a></li><li><a href="http://es.wikipedia.org/wiki/Los_Chiripitifl%C3%A1uticos">chiripitiflautico</a></li><li><a href="http://es.wikipedia.org/wiki/M%C3%BAsculo_esternocleidomastoideo">esternocleidomastoideo</a></li></ul> <br />So, in the unlikely event that you ever try a Spanish voice search query like “imágenes del músculo supercalifragilisticoespialidoso chiripitiflautico esternocleidomastoideo”, you may be surprised to see that it works.<br /><br />French, Italian, German, and Spanish are spoken in many parts of the world.  In this first release of Google Search by Voice in these languages, we initially only support the varieties spoken in France, Italy, Germany, and Spain, respectively.  The reason is that almost all aspects of a Voice Search system are affected by regional variation: French speakers from different regions have slightly different accents, use a number of different words, and will want to search for different things.  Eventually, we plan to support other regions as well, and we will work hard to make sure our systems work well for all of you.<br /><br />So, we hope you find these new voice search systems useful and fun to use. We definitely had a “supercalifragilisticoespialidoso chiripitiflautico” time developing them.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-4472073388895647452?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
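As promised above, in the spirit of the remark that regular orthography keeps the Spanish pronunciation module under 100 lines, here is a toy grapheme-to-phoneme sketch. The phone symbols, the seseo merger (c/z before e/i → “s”), and the absence of stress and syllable rules are all simplifying assumptions; this illustrates why the real module can be small, but is not that module.

<pre>
DIGRAPHS = {"ch": "tʃ", "ll": "ʝ", "rr": "r"}
SIMPLE = {"a": "a", "e": "e", "i": "i", "o": "o", "u": "u", "b": "b", "d": "d",
          "f": "f", "k": "k", "l": "l", "m": "m", "n": "n", "p": "p", "r": "ɾ",
          "s": "s", "t": "t", "v": "b", "y": "ʝ", "z": "s", "ñ": "ɲ",
          "j": "x", "x": "ks", "w": "w"}

def g2p_es(word):
    """Map a lowercase Spanish word to a rough phone sequence."""
    out, i = [], 0
    while i < len(word):
        two, nxt = word[i:i + 2], word[i + 2:i + 3]
        if two in ("qu", "gu") and nxt in ("e", "i"):  # que -> /ke/, gui -> /gi/
            out.append("k" if two == "qu" else "g")
            i += 2
        elif two in DIGRAPHS:
            out.append(DIGRAPHS[two])
            i += 2
        elif word[i] == "c":        # soft before e/i, hard otherwise
            out.append("s" if word[i + 1:i + 2] in ("e", "i") else "k")
            i += 1
        elif word[i] == "g":        # ge/gi -> /x/
            out.append("x" if word[i + 1:i + 2] in ("e", "i") else "g")
            i += 1
        elif word[i] == "h":        # silent
            i += 1
        else:
            out.append(SIMPLE.get(word[i], word[i]))
            i += 1
    return " ".join(out)

print(g2p_es("españa"))   # e s p a ɲ a
</pre>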
			<wfw:commentRss>https://googledata.org/google-research/google-search-by-voice-now-available-in-france-italy-germany-and-spain/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Fusion Tables celebrates one year of data management</title>
		<link>https://googledata.org/google-research/google-fusion-tables-celebrates-one-year-of-data-management/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=google-fusion-tables-celebrates-one-year-of-data-management</link>
		<comments>https://googledata.org/google-research/google-fusion-tables-celebrates-one-year-of-data-management/#comments</comments>
		<pubDate>Wed, 09 Jun 2010 20:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Admin]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Posted by Alon Halevy, Google Research and Rebecca Shapley, User ExperienceA year ago we launched Google Fusion Tables, an easy way to integrate, visualize and collaborate on data tables in the Google cloud. You used it and saw the potential, and told ...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Alon Halevy, Google Research and Rebecca Shapley, User Experience</span><br /><br />A year ago we <a href="http://googleresearch.blogspot.com/2009/06/google-fusion-tables.html">launched Google Fusion Tables</a>, an easy way to integrate, visualize and collaborate on data tables in the Google cloud. You used it and saw the potential, and told us what else you wanted. Since then, we’ve responded by offering programmatic access through the <a href="http://googlecode.blogspot.com/2009/12/google-fusion-tables-api.html">Fusion Tables API</a>, math across data columns owned by multiple people, and search on the collection of tables that have been made public. We published papers about Fusion Tables at SIGMOD 2010 and at the First Symposium on Cloud Computing.  And since the map visualizations were such a hit, we made them even better by supporting <a href="http://google-latlong.blogspot.com/2010/02/mapping-your-data-with-google-fusion.html">large numbers of points, lines and polygons</a> and <a href="http://earth.google.com/outreach/tutorial_fusion_sample.html#custompopup">custom HTML in map pop-up balloons</a>, complete with <a href="http://earth.google.com/outreach/tutorials.html#tab1">tutorials</a> and <a href="http://googlegeodevelopers.blogspot.com/2010/05/map-your-data-with-maps-api-and-fusion.html">integration with the Google Maps API</a>. We’ve made all this capability available on Google’s cloud and are excited to see examples every day of how our cloud approach to data tables is changing the game and making structured data management, collaboration, and publishing fast, easy, and open. <br /><br />But more exciting than all the features we’ve been releasing are the things that people have been *doing* with Fusion Tables. News agencies have been taking advantage of Fusion Tables to map data that governments make public, and tell a more complete story (see the <a href="http://latimesblogs.latimes.com/lanow/2010/05/the-interactive-map-above-lists-the-439-los-angeles-medical-marijuana-dispensaries-that-must-shut-down-by-june-7-when-the.html">L.A. Times</a>, <a href="http://www.knoxnews.com/data/meth-lab-busts/">Knoxville News</a>, and <a href="http://blog.apps.chicagotribune.com/2010/03/04/quickly-visualize-and-map-a-data-set-using-google-fusion-tables/">Chicago Tribune</a>).  Just this month the State of California kicked off an <a href="http://www.ca.gov/appsforcalifornians/">application development contest</a>, hosting data sets like <a href="http://www.programmableweb.com/api/california-race-and-ethnic-population-projections">this one</a> in Fusion Tables for easy API access for developers.  And the US Department of Health and Human Services held the Community Health Data Forum, where attendees presented data applications such as the <a href="http://googlepublicpolicy.blogspot.com/2010/06/making-us-community-health-data.html">heart-friendly and people-friendly hospital-finder</a>, built with Google Fusion Tables. <br /><br />It continues to astound us how quickly our users are able to pull together these kinds of compelling data applications with Fusion Tables, again showing the power of a cloud approach to data. 
Fusion Tables has been the multimedia extension to Joseph Rossano’s art exhibit on Butterflies and DNA barcodes, an easy way to <a href="http://www.thecalifornian.com/article/99999999/NEWS01/100406036/Mapping+Monterey+County+real+estate+sales&template=theme&theme=">map real estate in Monterey County</a> or <a href="http://otrobache.com/">potholes in Spain</a>, the geo-catalog for <a href="http://windpoleventures.com/wpv/">wind power data</a> and <a href="http://www.ethanolretailer.com/find-an-e85-station">ethanol-selling stations</a>, and even the data backend for a <a href="http://www.gwopa.org/grubs/">geo portal to organize water data for Africa</a>, among many, many other uses. <br /><br />As we head into our second year, we’re looking forward to delivering more tools that make data management easier and more powerful on the web. What’s next for <a href="http://tables.googlelabs.com/">Fusion Tables</a>? Request your favorite features on our <a href="http://productideas.appspot.com/#25/e=1260b5">Feature Request</a> page (a special implementation of Google Moderator), and follow the latest progress of Fusion Tables on our <a href="https://groups.google.com/group/fusion-tables-users-group?pli=1">User Group</a>, <a href="http://www.facebook.com/pages/Google-Fusion-Tables/127016660645629">Facebook</a>, and <a href="http://twitter.com/GoogleFT">Twitter</a>.  We love to hear from you!<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/21224994-5301267421941643386?l=googleresearch.blogspot.com' alt='' /></div>]]></content:encoded>
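A hedged sketch of what the programmatic access mentioned above looked like: the early Fusion Tables API accepted SQL-like statements over plain HTTP and returned CSV. The exact endpoint form and the numeric table id below are assumptions for illustration; see the Fusion Tables API announcement linked above for the authoritative details.

<pre>
import urllib.parse
import urllib.request

def query_fusion_table(sql):
    """Send one SQL-like query; the response body is CSV rows."""
    # Endpoint form is an assumption recalled from the early API, not verified here.
    url = ("https://www.google.com/fusiontables/api/query?"
           + urllib.parse.urlencode({"sql": sql}))
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")

# e.g. query_fusion_table("SELECT * FROM 123456 WHERE type = 'pothole'")
</pre>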
			<wfw:commentRss>https://googledata.org/google-research/google-fusion-tables-celebrates-one-year-of-data-management/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
