- Part 5

About: Research @ Google

Profile:

Website

http://www.blogger.com/profile/12098626514775266161

Contact:

Posts by Research @ Google

Under the hood of Croatian, Filipino, Ukrainian, and Vietnamese in Google Voice Search

Jul 25, 2013

Posted by Eugene Weinstein and Pedro Moreno, Google Speech Team

Although we’ve been working on speech recognition for several years, every new language requires our engineers and scientists to tackle unique challenges. Our most recent additions – Croatian, Filipino, Ukrainian, and Vietnamese – required creative solutions to reflect how each language is used across devices and in everyday conversations.

For example, since Vietnamese is a tonal language, we had to explore how to take tones into consideration. One simple technique is to model the tone and vowel combinations (tonemes) directly in our lexicons. This, however, has the side effect of a larger phonetic inventory. As a result we had to come up with special algorithms to handle the increased complexity. Additionally, Vietnamese is a heavily diacritized language, with tone markers on a majority of syllables. Since Google Search is very good at returning valid results even when diacritics are omitted, our Vietnamese users frequently omit the diacritics when typing their queries. This creates difficulties for the speech recognizer, which selects its vocabulary from typed queries. For this purpose, we created a special diacritic restoration algorithm which enables us to present properly formatted text to our users in the majority of cases.

Filipino also presented interesting challenges. Much like in other multilingual societies such as Hong Kong, India, South Africa, etc., Filipinos often mix several languages in their daily life. This is called code switching. Code switching complicates the design of pronunciation, language, and acoustic models. Speech scientists are effectively faced with a dilemma: should we build one system per language, or should we combine all languages into one?

In such situations we prefer to model the reality of daily language use in our speech recognizer design. If users mix several languages, our recognizers should do their best in modeling this behavior. Hence our Filipino voice search system, while mainly focused on the Filipino language, also allows users to mix in English terms.

The algorithms we’re using to model how speech sounds are spoken in each language make use of our distributed large-scale neural network learning infrastructure (yes, the same one that spontaneously discovered cats on YouTube!). By partitioning the gigantic parameter set of the model, and by evaluating each partition on a separate computation server, we’re able to achieve unprecedented levels of parallelism in training acoustic models.

The more people use Google speech recognition products, the more accurate the technology becomes. These new neural network technologies will help us bring you lots of improvements and many more languages in the future.

Read | No Comments | Tags: Uncategorized

11 Billion Clues in 800 Million Documents: A Web Research Corpus Annotated with Freebase Concepts

Jul 17, 2013

Posted by Dave Orr, Amar Subramanya, Evgeniy Gabrilovich, and Michael Ringgaard, Google Research

“I assume that by knowing the truth you mean knowing things as they really are.”

- Plato

When you type in a search query — perhaps Plato — are you interested in the string of letters you typed? Or the concept or entity represented by that string? But knowing that the string represents something real and meaningful only gets you so far in computational linguistics or information retrieval — you have to know what the string actually refers to. The Knowledge Graph and Freebase are databases of things, not strings, and references to them let you operate in the realm of concepts and entities rather than strings and n-grams.

We’ve previously released data to help with disambiguation and recently awarded $1.2M in research grants to work on related problems. Today we’re taking another step: releasing data consisting of nearly 800 million documents automatically annotated with over 11 billion references to Freebase entities.

These Freebase Annotations of the ClueWeb Corpora (FACC) consist of ClueWeb09 FACC and ClueWeb12 FACC. 11 billion phrases that refer to concepts and entities in Freebase were automatically labeled with their unique identifiers (Freebase MID’s). For example:

Since the annotation process was automatic, it likely made mistakes. We optimized for precision over recall, so the algorithm skipped a phrase if it wasn’t confident enough of the correct MID. If you prefer higher precision, we include confidence levels, so you can filter out lower confidence annotations that we did include.

Based on review of a sample of documents, we believe the precision is about 80-85%, and recall, which is inherently difficult to measure in situations like this, is in the range of 70-85%. Not every ClueWeb document is included in this corpus; documents in which we found no entities were excluded from the set. A document might be excluded because there were no entities to be found, because the entities in question weren’t in Freebase, or because none of the entities were resolved at a confidence level above the threshold.

The ClueWeb data is used in multiple TREC tracks. You may also be interested in our annotations of several TREC query sets, including those from the Million Query Track and Web Track.

If you would prefer a human-annotated set, you might want to look at the Wikilinks Corpus we released last year. Entities there were disambiguated by links to Wikipedia, inserted by the authors of the page, which is effectively a form of human annotation.

You can find more detail and download the data on the pages for the two sets: ClueWeb09 FACC and ClueWeb12 FACC. You can also subscribe to our data release mailing list to learn about releases as they happen.

Special thanks to Jamie Callan and Juan Caicedo Carvajal for their help throughout the annotation project.

Read | No Comments | Tags: Uncategorized

New research from Google shows that 88% of the traffic generated by mobile search ads is not replaced by traffic originating from mobile organic search

Jul 16, 2013

Posted by Shaun Lysen, Statistician at Google

Often times people are presented with two choices after making a search on their devices – they could either click on the organic results for their query, or on the ads that appear on the page. Website owners who want to build a strong online presence often wonder how to balance organic search and paid search ads in driving website traffic. But what happens when ads are paused? Would businesses see an increase in organic traffic that could make up for the loss in paid traffic? To answer these questions, we released a “Search Ads Pause” analysis in 2011 showing that 89% of traffic generated by search ads is not replaced by organic clicks.

As smartphones become increasingly important to consumers, we recently conducted the same studies for mobile devices to understand the role of mobile search ads in driving site traffic. From March 2012 – April 2013, we ran 327 unique studies across US-based mobile advertising accounts from 12 key industries.

We selected AdWords accounts that exhibited sharp changes in advertisers’ spending on mobile search (ad spend) and identified stable periods before the spend change (pre-period) and after the spend change (post-period). We observed the number of organic and paid clicks, and the number of times organic results appear on the first page of search results (impressions) during both the pre-period and post-period. Google then created a proprietary statistical model to predict what the number of organic and paid clicks would have been in the post-period had the ad spend not changed, and compared those figures to the actual number of clicks observed. We then were able to estimate what percentage of paid clicks are incremental, i.e. a visit to the advertiser’s site from an ad click would not have been replaced by a visit to the site from an organic click.

The final results showed that mobile search ads contribute to a very high proportion of incremental traffic to websites. On average, 88% of mobile paid clicks are lost and not recovered when a mobile search campaign is paused. This finding is consistently high across the 12 key industries, including automotive, travel, retail and more. The full study, including details around the methodology and findings, can be found in the paper ‘Incremental Clicks Impact of Mobile Search Advertising’.

Read | No Comments | Tags: Uncategorized

Google Databoard: A new way to explore industry research

Jul 9, 2013

Posted by Adam Grunewald, Mobile Marketing Manager

It’s important for people to stay up to date about the most recent research and insights related to their work or personal lives. But it can be difficult to keep up with all the new studies and updated data that’s out there. To make life a bit easier, we’re introducing a new take on how research can be presented. The Databoard for Research Insights enables people to explore and interact with some of Google’s recent research in a unique and immersive way. The Databoard uses responsive design to to offer an engaging experience across devices. Additionally, the tool is a new venture into data visualization and shareability with bite-sized charts and stats that can be shared with your friends or coworkers. The Databoard is currently home to several of Google’s market research studies for businesses, but we believe that this way of conveying data can work across all forms of research.

Here are some of the things that make the Databoard different from other ways research is released today:

Easy to use
All of the information in the Databoard is presented in a bite-sized way so that you can quickly find relevant information. You can explore an entire study or jump straight to the topics or data points you care about. The Databoard is also optimized for all devices so you can explore the research on your computer, tablet or smartphone.

Meant to be shared
Most people, when they find a compelling piece of data, want to share it! Whether it’s with a colleague, client, or a community on a blog or social network, compelling insights and data are meant to be shared. With the databoard, you can easily share individual charts and insights or collections of data with anyone through email or social networks, just look for the share button at the top of each chart or insight.

Create a cohesive story
Most research studies set out to answer a specific question, like how people use their smartphones in stores, or how a specific type of consumer shops. This means that businesses need to look across multiple pieces of research to craft a comprehensive business or marketing strategy. With this in mind, the Databoard lets you curate a customized infographic out of the charts or data points you find important across multiple Google research studies. Creating an infographic is quick and easy, and you can share the finished product with your friends or colleagues.

The databoard is currently home to six research studies including The New Multi-screen World, Mobile In-store shopper research and Mobile search moments. New studies will be added frequently. To get started creating your own infographic, visit the Databoard now.

Read | No Comments | Tags: Uncategorized

Conference Report: USENIX Annual Technical Conference (ATC) 2013

Jul 3, 2013

Posted by Murray Stokely, Google Storage Analytics Team

This year marks Google’s eleventh consecutive year as a sponsor of the USENIX Annual Technical Conference (ATC), just one of the co-located events at USENIX Federated Conference Week (FCW), which combines numerous conferences and workshops covering fields such as Autonomic Computing, Feedback Computing and much more in an intensive week of research, trends, and community interaction.

ATC provides a broad forum for computing systems research with an emphasis on implementations and experimental results. In addition to the Googlers presenting publications, we had two members on the program committee of ATC and several keynote speakers, invited speakers, panelists, committee members, and participants at the other co-located events at FCW.

In the paper Janus: Optimal Flash Provisioning for Cloud Storage Workloads, Googler Christoph Albrecht and co-authors demonstrated a system that allows users to make informed ﬂash memory provisioning and partitioning decisions in cloud-scale distributed ﬁle systems that include both ﬂash storage and disk tiers. As ﬂash memory is still expensive, it is best to use it only for workloads that can make good use of it. Janus creates long term workload characterizations based on RPC samples and file age metadata. It uses these workload characterizations to formulate and solve an optimization problem that maximizes the reads sent to the flash tier. Based on evaluations from workloads using Janus, in use at Google for the past 6 months, the authors conclude that the recommendation system is quite effective, with ﬂash hit rates using the optimized recommendations 47-76% higher than the option of using the ﬂash as an unpartitioned tier.

In packetdrill: Scriptable Network Stack Testing, from Sockets to Packets, Google’s Neal Cardwell and co-authors showcased a portable, open-source scripting tool that enables testing the correctness and performance of network protocols. Despite their importance in modern computer systems, network protocols often undergo only ad hoc testing before their deployment, in large part due to their complexity. Furthermore, new algorithms have unforeseen interactions with other features, so testing has only become more daunting as TCP has evolved. The packetdrill tool was instrumental in the development of three new features for Linux TCP—Early Retransmit, Fast Open, and Loss Probes—and allowed the authors to ﬁnd and ﬁx 10 bugs in Linux. Furthermore, the team uses packetdrill in all phases of the development process for the kernel used in one of the world’s largest Linux installations. In the hope that sharing packetdrill with the community will make the process of improving Internet protocols an easier one, the source code and test scripts for packetdrill have been made freely available.

There were also additional refereed publications with Google co-authors at some of the co-located events at FCW, notably NicPic: Scalable and Accurate End-Host Rate Limiting, which outlines a system which enables accurate network traffic scheduling in a scalable fashion, and AGILE: Elastic Distributed Resource Scaling for Infrastructure-as-a-Service, a system that efficiently handles dynamic application workloads, reducing both penalties and user dissatisfaction.

Google is proud to support the academic community through conference participation and sponsorship. In particular, we are happy to mention one of the other interesting papers from this year’s USENIX FCW, co-authored by former Google PhD fellowship recipient Ashok Anand, MiG: Efficient Migration of Desktop VM Using Semantic Compression.

USENIX is a supporter of open access, so the papers and videos from the talks are available on the conference website.

Read | No Comments | Tags: Uncategorized

Natural Language Understanding-focused awards announced

Jul 2, 2013

Posted by Massimiliano Ciaramita, Research Scientist and David Harper, Head University Relations (EMEA)

Some of the biggest challenges for the scientific community today involve understanding the principles and mechanisms that underlie natural language use on the Web. An example of long-standing problem is language ambiguity; when somebody types the word “Rio” in a query do they mean the city, a movie, a casino, or something else? Understanding the difference can be crucial to help users get the answer they are looking for. In the past few years, a significant effort in industry and academia has focused on disambiguating language with respect to Web-scale knowledge repositories such as Wikipedia and Freebase. These resources are used primarily as canonical, although incomplete, collections of “entities”. As entities are often connected in multiple ways, e.g., explicitly via hyperlinks and implicitly via factual information, such resources can be naturally thought of as (knowledge) graphs. This work has provided the first breakthroughs towards anchoring language in the Web to interpretable, albeit initially shallow, semantic representations. Google has brought the vision of semantic search directly to millions of users via the adoption of the Knowledge Graph. This massive change to search technology has also been called a shift “from strings to things”.

Understanding natural language is at the core of Google’s work to help people get the information they need as quickly and easily as possible. At Google we work hard to advance the state of the art in natural language processing, to improve the understanding of fundamental principles, and to solve the algorithmic and engineering challenges to make these technologies part of everyday life. Language is inherently productive; an infinite number of meaningful new expressions can be formed by combining the meaning of their components systematically. The logical next step is the semantic modeling of structured meaningful expressions — in other words, “what is said” about entities. We envision that knowledge graphs will support the next leap forward in language understanding towards scalable compositional analyses, by providing a universe of entities, facts and relations upon which semantic composition operations can be designed and implemented.

So we’ve just awarded over $1.2 million to support several natural language understanding research awards given to university research groups doing work in this area. Research topics range from semantic parsing to statistical models of life stories and novel compositional inference and representation approaches to modeling relations and events in the Knowledge Graph.

These awards went to researchers in nine universities and institutions worldwide, selected after a rigorous internal review:

Mark Johnson and Lan Du (Macquarie University) and Wray Buntine (NICTA) for “Generative models of Life Stories”

Percy Liang and Christopher Manning (Stanford University) for “Tensor Factorizing Knowledge Graphs”

Sebastian Riedel (University College London) and Andrew McCallum (University of Massachusetts, Amherst) for “Populating a Knowledge Base of Compositional Universal Schema”

Ivan Titov (University of Amsterdam) for “Learning to Reason by Exploiting Grounded Text Collections”

Hans Uszkoreit (Saarland University and DFKI), Feiyu Xu (DFKI and Saarland University) and Roberto Navigli (Sapienza University of Rome) for “Language Understanding cum Knowledge Yield”

Luke Zettlemoyer (University of Washington) for “Weakly Supervised Learning for Semantic Parsing with Knowledge Graphs”

We believe the results will be broadly useful to product development and will further scientific research. We look forward to working with these researchers, and we hope we will jointly push the frontier of natural language understanding research to the next level.

Read | No Comments | Tags: Uncategorized

Fast, Accurate Detection of 100,000 Object Classes on a Single Machine

Jun 27, 2013

Posted by Tom Dean, Google Research

Humans can distinguish among approximately 10,000 relatively high-level visual categories, but we can discriminate among a much larger set of visual stimuli referred to as features. These features might correspond to object parts, animal limbs, architectural details, landmarks, and other visual patterns we don’t have names for, and it is this larger collection of features we use as a basis with which to reconstruct and explain our day-to-day visual experience. Such features provide the components for more complicated visual stimuli and establish a context essential for us to resolve ambiguous scenes.

Contrary to current practice in computer vision, the explanatory context required to resolve a visual detail may not be entirely local. A flash of red bobbing along the ground might be a child’s toy in the context of a playground or a rooster in the context of a farmyard. It would be useful to have a large number of feature detectors capable of signaling the presence of such features, including detectors for sandboxes, swings, slides, cows, chickens, sheep and farm machinery necessary to establish the context for distinguishing between these two possibilities.

This year’s winner of the CVPR Best Paper Award, co-authored by Googlers Tom Dean, Mark Ruzon, Mark Segal, Jonathon Shlens, Sudheendra Vijayanarasimhan and Jay Yagnik, describes technology that will enable computer vision systems to extract the sort of semantically rich contextual information required to recognize visual categories even when a close examination of the pixels spanning the object in question might not be sufficient for identification in the absence of such contextual clues. Specifically, we consider a basic operation in computer vision that involves determining for each location in an image the degree to which a particular feature is likely to be present in the image at that particular location.

This so-called convolution operator is one of the key operations used in computer vision and, more broadly, all of signal processing. Unfortunately, it is computationally expensive and hence researchers use it sparingly or employ exotic SIMD hardware like GPUs and FPGAs to mitigate the computational cost. We turn things on their head by showing how one can use fast table lookup — a method called hashing — to trade time for space, replacing the computationally-expensive inner loop of the convolution operator — a sequence of multiplications and additions — required for performing millions of convolutions with a single table lookup.

We demonstrate the advantages of our approach by scaling object detection from the current state of the art involving several hundred or at most a few thousand of object categories to 100,000 categories requiring what would amount to more than a million convolutions. Moreover, our demonstration was carried out on a single commodity computer requiring only a few seconds for each image. The basic technology is used in several pieces of Google infrastructure and can be applied to problems outside of computer vision such as auditory signal processing.

On Wednesday, June 26, the Google engineers responsible for the research were awarded Best Paper at a ceremony at the IEEE Conference on Computer Vision and Pattern Recognition held in Portland Oregon. The full paper can be found here.

Read | No Comments | Tags: Uncategorized

Some Innovative MOOCs

Jun 18, 2013

Posted by Maggie Johnson, Director of Education and University Relations

Last summer, we jumped into the world of MOOCs (Massive Open Online Courses) with our own course on search skills, Power Searching. Soon after, we open sourced the platform that we developed to present the course — Course Builder. A large number of courses have been hosted on Course Builder since, with many more coming soon. As can be seen from the list of courses, our goal is to provide the capability for anyone to create a MOOC. We’ve been pleasantly surprised by the variety of courses and the creativity of the instructors building on Course Builder.

For example, GivingWithPurpose is an innovative MOOC presented by Learning By Giving, one of Doris Buffett’s foundations for “giving it all away.” Instructor Rebecca Riccio, who teaches philanthropy at both Northeastern and Brandeis Universities, feels that reaching thousands of people in a discussion about how to allocate scarce resources to address the needs of our communities has huge potential. “For all the social, cultural, and economic value we derive from the nonprofit sector, we do shockingly little to educate people about why it is so important and what we can do to help it thrive. So while I believe GivingWithPurpose will be successful in its primary goal of teaching students how to give more effectively, in ways that both satisfy their own motivations for giving and support high-performing nonprofit organizations, my second, perhaps more ambitious goal is to promote more informed civic engagement.”

We’ve also hosted MOOCs on evaluating and selecting soccer players, how to search for a job, and how to develop digital learning opportunities for students in public schools. We have many university courses such as Game Theory from Stanford and Information Visualization from Indiana University.

Course Builder’s support of both traditional and non-traditional education opportunities is core to its mission. We’ll continue to build features that help university professors, K12 teachers and anyone else who has something important to teach.

Read | No Comments | Tags: Uncategorized

Excellent Papers for 2012

Jun 13, 2013

Posted by Corinna Cortes and Alfred Spector, Google Research

Googlers across the company actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and presentations, participating in ongoing technical debates, and much more. Our publications offer technical and algorithmic advances, feature aspects we learn as we develop novel products and services, and shed light on some of the technical challenges we face at Google.

In an effort to highlight some of our work, we periodically select a number of publications to be featured on this blog. We first posted a set of papers on this blog in mid-2010 and subsequently discussed them in more detail in the following blog postings. In a second round, we highlighted new noteworthy papers from the later half of 2010 and again in 2011. This time we honor the influential papers authored or co-authored by Googlers covering all of 2012 — covering roughly 6% of our total publications. It’s tough choosing, so we may have left out some important papers. So, do see the publications list to review the complete group.

In the coming weeks we will be offering a more in-depth look at some of these publications, but here are the summaries:

Algorithms and Theory

Online Matching with Stochastic Rewards
Aranyak Mehta*, Debmalya Panigrahi [FOCS'12]
Online advertising is inherently stochastic: value is realized only if the user clicks on the ad, while the ad platform knows only the probability of the click. This paper is the first to introduce the stochastic nature of the rewards to the rich algorithmic field of online allocations. The core algorithmic problem it formulates is online bipartite matching with stochastic rewards, with known click probabilities. The main result is an online algorithm which obtains a large fraction of the optimal value. The paper also shows the difficulty introduced by the stochastic nature, by showing how it behaves very differently from the classic (non-stochastic) online matching problem.

Matching with our Eyes Closed
Gagan Goel*, Pushkar Tripathi* [FOCS'12]
In this paper we propose a simple randomized algorithm for finding a matching in a large graph. Unlike most solutions to this problem, our approach does not rely on building large combinatorial structures (like blossoms) but works completely locally. We analyze the performance of our algorithm and show that it does significantly better than the greedy algorithm. In doing so we improve a celebrated 18 year old result by Aronson et. al.

Simultaneous Approximations for Adversarial and Stochastic Online Budgeted Allocation
Vahab Mirrokni*, Shayan Oveis Gharan, Morteza Zadimoghaddam, [SODA'12]
In this paper, we study online algorithms that simultaneously perform well in worst-case and average-case instances, or equivalently algorithms that perform well in both stochastic and adversarial models at the same time. This is motivated by online allocation of queries to advertisers with budget constraints. Stochastic models are not robust enough to deal with traffic spikes and adversarial models are too pessimistic. While several algorithms have been proposed for these problems, each algorithm was known to perform well in one model and not both, and we present new results for a single algorithm that works well in both models.

Economics and EC

Polyhedral Clinching Auctions and the Adwords Polytope
Gagan Goel*, Vahab Mirrokni*, Renato Paes Leme [STOC'12]
Budgets play a major role in ad auctions where advertisers explicitly declare budget constraints. Very little is known in auctions about satisfying such budget constraints while keeping incentive compatibility and efficiency. The problem becomes even harder in the presence of complex combinatorial constraints over the set of feasible allocations. We present a class of ascending-price auctions addressing this problem for a very general class of (polymatroid) allocation constraints including the AdWords problem with multiple keywords and multiple slots.

HCI

Backtracking Events as Indicators of Usability Problems in Creation-Oriented Applications
David Akers*, Robin Jeffries*, Matthew Simpson*, Terry Winograd [TOCHI '12]
Backtracking events such as undo can be useful automatic indicators of usability problems for creation-oriented applications such as word processors and photo editors. Our paper presents a new cost-effective usability evaluation method based on this insight.

Talking in Circles: Selective Sharing in Google+
Sanjay Kairam, Michael J. Brzozowski*, David Huffaker*, Ed H. Chi*, [CHI'12]
This paper explores why so many people share selectively on Google+: to protect their privacy but also to focus and target their audience. People use Circles to support these goals, organizing contacts by life facet, tie strength, and interest.

Information Retrieval

Online selection of diverse results
Debmalya Panigrahi, Atish Das Sarma, Gagan Aggarwal*, and Andrew Tomkins*, [WSDM'12]
We consider the problem of selecting subsets of items that are simultaneously diverse in multiple dimensions, which arises in the context of recommending interesting content to users. We formally model this optimization problem, identify its key structural characteristics, and use these observations to design an extremely scalable and efﬁcient algorithm. We prove that the algorithm always produces a nearly optimal solution and also perform experiments on real-world data that indicate that the algorithm performs even better in practice than the analytical guarantees.

Machine Learning

Large Scale Distributed Deep Networks
Jeffrey Dean, Greg S. Corrado*, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc’Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, Andrew Y. Ng, NIPS 2012;
In this paper, we examine several techniques to improve the time to convergence for neural networks and other models trained by gradient-based methods. The paper describes a system we have built that exploits both model-level parallelism (by partitioning the nodes of a large model across multiple machines) and data-level parallelism (by having multiple replicas of a model process different training data and coordinating the application of updates to the model state through a centralized-but-partitioned parameter server system). Our results show that very large neural networks can be trained effectively and quickly on large clusters of machines.

Open Problem: Better Bounds for Online Logistic Regression
Brendan McMahan* and Matthew Streeter*, COLT/ICML’12 Joint Open Problem Session, JMLR: Workshop and Conference Proceedings.
One of the goals of research at Google is to help point out important open problems–precise questions that are interesting academically but also have important practical ramifications. This open problem is about logistic regression, a widely used algorithm for predicting probabilities (what is the probability an email message is spam, or that a search ad will be clicked). We show that in the simple one-dimensional case, much better results are possible than current theoretical analysis suggests, and we ask whether our results can be generalized to arbitrary logistic regression problems.

Spectral Learning of General Weighted Automata via Constrained Matrix Completion
Borja Balle and Mehryar Mohri*, NIPS 2012.
Learning weighted automata from finite samples drawn from an unknown distribution is a central problem in machine learning and computer science in general, with a variety of applications in text and speech processing, bioinformatics, and other areas. This paper presents a new family of algorithms for tackling this problem for which it proves learning guarantees. The algorithms introduced combine ideas from two different domains: matrix completion and spectral methods.

Machine Translation

Improved Domain Adaptation for Statistical Machine Translation
Wei Wang*, Klaus Macherey*, Wolfgang Macherey*, Franz Och* and Peng Xu*, [AMTA'12]
Research in domain adaptation for machine translation has been mostly focusing on one domain. We present a simple and effective domain adaptation infrastructure that makes an MT system with a single translation model capable of providing adapted, close-to-upper-bound domain-specific accuracy while preserving the generic translation accuracy. Large-scale experiments on 20 language pairs for patent and generic domains show the viability of our approach.

Multimedia and Computer Vision

Reconstructing the World’s Museums
Jianxiong Xiao and Yasutaka Furukawa*, [ECCV '12]
Virtual navigation and exploration of large indoor environments (e.g., museums) have been so far limited to either blueprint-style 2D maps that lack photo-realistic views of scenes, or ground-level image-to-image transitions, which are immersive but ill-suited for navigation. This paper presents a novel vision-based 3D reconstruction and visualization system to automatically produce clean and well-regularized texture-mapped 3D models for large indoor scenes, from ground-level photographs and 3D laser points. For the first time, we enable users to easily browse a large scale indoor environment from a bird’s-eye view, locate specific room interiors, fly into a place of interest, view immersive ground-level panoramas, and zoom out again, all with seamless 3D transitions.

The intervalgram: An audio feature for large-scale melody recognition
Thomas C. Walters*, David Ross*, Richard F. Lyon*, [CMMR'12]
Intervalgrams are small images that summarize the structure of short segments of music by looking at the musical intervals between the notes present in the music. We use them for finding cover songs – different pieces of music that share the same underlying composition. Wedo this by comparing ‘heatmaps’ which look at the similarity between intervalgrams from different pieces of music over time. If we see a strong diagonal line in the heatmap, it’s good evidence that the songs are musically similar.

General and Nested Wiberg Minimization
Dennis Strelow*, [CVPR'12]
Eriksson and van den Hengel’s CVPR 2010 paper showed that Wiberg’s least squares matrix factorization, which effectively eliminates one matrix from the factorization problem, could be applied to the harder case of L1 factorization. Our paper generalizes their approach beyond factorization to general nonlinear problems in two sets of variables, like perspective structure-from-motion. We also show that with our generalized method, one Wiberg minimization can also be nested inside another, effectively eliminating two of three sets of unknowns, and we demonstrated this idea using projective struture-from-motion

Calibration-Free Rolling Shutter Removal
Matthias Grundmann*, Vivek Kwatra*, Daniel Castro, Irfan Essa*, International Conference on Computational Photography ’12. Best paper.
Mobile phones and current generation DSLR’s, contain an electronic rolling shutter, capturing each frame one row of pixels at a time. Consequently, if the camera moves during capture, it will cause image distortions ranging from shear to wobbly distortions. We propose a calibration-free solution based on a novel parametric mixture model to correct these rolling shutter distortions in videos that enables real-time rolling shutter rectification as part of YouTube’s video stabilizer.

Natural Language Processing

Vine Pruning for Efficient Multi-Pass Dependency Parsing
Alexander Rush, Slav Petrov*, The 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL ’12), Best Paper Award.
Being able to accurately analyze the grammatical structure of sentences is crucial for language understanding applications such as machine translation or question answering. In this paper we present a method that is up to 200 times faster than existing methods and enables the grammatical analysis of text in large-scale applications. The key idea is to perform the analysis in multiple coarse-to-fine passes, resolving easy ambiguities first and tackling the harder ones later on.

Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure
Oscar Tackstrom, Ryan McDonald*, Jakob Uszkoreit*, North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL ’12), Best Student Paper Award.
This paper studies how to build meaningful cross-lingual word clusters, i.e., clusters containing lexical items from two languages that are coherent along some abstract dimension. This is done by coupling distributional statistics learned from huge amounts of language specific data coupled with constraints generated from parallel corpora. The resulting clusters are used to improve the accuracy of multi-lingual syntactic parsing for languages without any training resources.

Networks

How to Split a Flow
Tzvika Hartman*, Avinatan Hassidim*, Haim Kaplan*, Danny Raz*, Michal Segalov*, [INFOCOM '12]
Decomposing a ﬂow into a small number of paths is a very important task arises in various network optimization mechanisms. In this paper we develop an an approximation algorithm for this problem that has both provable worst case performance grantees as well as good practical behavior.

Deadline-Aware Datacenter TCP (D2TCP)
Balajee Vamanan, Jahangir Hasan*, T. N. Vijaykumar, [SIGCOMM '12]
Some of our most important products like search and ads operate under soft-real-time constraints. They are architected and fine-tuned to return results to users within a few hundred milliseconds. Deadline-Aware Datacenter TCP is a research effort into making the datacenter networks deadline aware, thus improving the performance of such key applications.

Trickle: Rate Limiting YouTube Video Streaming
Monia Ghobadi, Yuchung Cheng*, Ankur Jain*, Matt Mathis* [USENIX '12]
Trickle is a server-side mechanism to stream YouTube video smoothly to reduce burst and buffer-bloat. It paces the video stream by placing an upper bound on TCP’s congestion window based on the streaming rate and the round-trip time. In initial evaluation Trickle reduces the TCP loss rate by up to 43% and the RTT by up to 28%. Given the promising results we are deploying Trickle to all YouTube servers.

Social Systems

Look Who I Found: Understanding the Effects of Sharing Curated Friend Groups
Lujun Fang*, Alex Fabrikant*, Kristen LeFevre*, [Web Science '12]. Best Student Paper award.
In this paper, we studied the impact of the Google+ circle-sharing feature, which allows individual users to share (publicly and privately) pre-curated groups of friends and contacts. We specifically investigated the impact on the growth and structure of the Google+ social network. In the course of the analysis, we identified two natural categories of shared circles (“communities” and “celebrities”). We also observed that the circle-sharing feature is associated with the accelerated densification of community-type circles.

Software Engineering

AddressSanitizer: A Fast Address Sanity Checker
Konstantin Serebryany*, Derek Bruening*, Alexander Potapenko*, Dmitry Vyukov*, [USENIX ATC '12].
The paper “AddressSanitizer: A Fast Address Sanity Checker” describes a dynamic tool that finds memory corruption bugs in C or C++ programs with only a 2x slowdown. The major feature of AddressSanitizer is simplicity — this is why the tool is very fast.

Speech

Japanese and Korean Voice Search
Mike Schuster*, Kaisuke Nakajima*, IEEE International Conference on Acoustics, Speech, and Signal Processing [ICASSP'12].
“Japanese and Korean voice search” explains in detail how the Android voice search systems for these difficult languages were developed. We describe how to segment statistically to be able to handle infinite vocabularies without out-of-vocabulary words, how to handle the lack of spaces between words for language modeling and dictionary generation, and how to deal best with multiple ambiguities during evaluation scoring of reference transcriptions against hypotheses. The combination of techniques presented led to high quality speech recognition systems–as of 6/2013 Japanese and Korean are #2 and #3 in terms of traffic after the US.

Google’s Cross-Dialect Arabic Voice Search
Fadi Biadsy*, Pedro J. Moreno*, Martin Jansche*, IEEE International Conference on Acoustics, Speech, and Signal Processing [ICASSP 2012].
This paper describes Google’s automatic speech recognition systems for recognizing several Arabic dialects spoken in the Middle East, with the potential to reach more than 125 million users. We suggest solutions for challenges specific to Arabic, such as the diacritization problem, where short vowels are not written in Arabic text. We conduct experiments to identify the optimal manner in which acoustic data should be clustered among dialects.

Deep Neural Networks for Acoustic Modeling in Speech Recognition
Geoffrey Hinton*, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew W. Senior*, Vincent Vanhoucke*, Patrick Nguyen, Tara Sainath, Brian Kingsbury, Signal Processing Magazine (2012)”
Survey paper on the DNN breakthrough in automatic speech recognition accuracy.

Statistics

Empowering Online Advertisements by Empowering Viewers with the Right to Choose
Max Pashkevich*, Sundar Dorai-Raj*, Melanie Kellar*, Dan Zigmond*, Journal of Advertising Research, vol. 52 (2012).
YouTube’s TrueView in-stream video advertising format (a form of skippable in-stream ads) can improve the online video viewing experience for users without sacrificing advertising value for advertisers or content owners.

Structured Data

Efficient Spatial Sampling of Large Geographical Tables
Anish Das Sarma*, Hongrae Lee*, Hector Gonzalez*, Jayant Madhavan*, Alon Halevy*, [SIGMOD '12].
This paper presents fundamental results for the “thinning problem”: determining appropriate samples of data to be shown on specific geographical regions and zoom levels. This problem is widely applicable for a number of cloud-based geographic visualization systems such as Google Maps, Fusion Tables, and the developed algorithms are part of the Fusion Tables backend. The SIGMOD 2012 paper was selected among the best papers of the conference, and invited to a special best-papers issue of TODS.

Systems

Spanner: Google’s Globally-Distributed Database
James C. Corbett*, Jeffrey Dean*, Michael Epstein*, Andrew Fikes*, Christopher Frost*, JJ Furman*, Sanjay Ghemawat*, Andrey Gubarev*, Christopher Heiser*, Peter Hochschild*, Wilson Hsieh*, Sebastian Kanthak*, Eugene Kogan*, Hongyi Li*, Alexander Lloyd*, Sergey Melnik*, David Mwaura*, David Nagle*, Sean Quinlan*, Rajesh Rao*, Lindsay Rolig*, Dale Woodford*, Yasushi Saito*, Christopher Taylor*, Michal Szymaniak*, Ruth Wang*, [OSDI '12]
This paper shows how a new time API and its implementation can provide the abstraction of tightly synchronized clocks, even on a global scale. We describe how we used this technology to build a globally-distributed database that supports a variety of powerful features: non-blocking reads in the past, lock-free snapshot transactions, and atomic schema changes.

Read | No Comments | Tags: Uncategorized

Improving Photo Search: A Step Across the Semantic Gap

Jun 12, 2013

Posted by Chuck Rosenberg, Image Search Team

Last month at Google I/O, we showed a major upgrade to the photos experience: you can now easily search your own photos without having to manually label each and every one of them. This is powered by computer vision and machine learning technology, which uses the visual content of an image to generate searchable tags for photos combined with other sources like text tags and EXIF metadata to enable search across thousands of concepts like a flower, food, car, jet ski, or turtle.

For many years Google has offered Image Search over web images; however, searching across photos represents a difficult new challenge. In Image Search there are many pieces of information which can be used for ranking images, for example text from the web or the image filename. However, in the case of photos, there is typically little or no information beyond the pixels in the images themselves. This makes it harder for a computer to identify and categorize what is in a photo. There are some things a computer can do well, like recognize rigid objects and handwritten digits. For other classes of objects, this is a daunting task, because the average toddler is better at understanding what is in a photo than the world’s most powerful computers running state of the art algorithms.

This past October the state of the art seemed to move things a bit closer to toddler performance. A system which used deep learning and convolutional neural networks easily beat out more traditional approaches in the ImageNet computer vision competition designed to test image understanding. The winning team was from Professor Geoffrey Hinton’s group at the University of Toronto.

We built and trained models similar to those from the winning team using software infrastructure for training large-scale neural networks developed at Google in a group started by Jeff Dean and Andrew Ng. When we evaluated these models, we were impressed; on our test set we saw double the average precision when compared to other approaches we had tried. We knew we had found what we needed to make photo searching easier for people using Google. We acquired the rights to the technology and went full speed ahead adapting it to run at large scale on Google’s computers. We took cutting edge research straight out of an academic research lab and launched it, in just a little over six months. You can try it out at photos.google.com.

Why the success now? What is new? Some things are unchanged: we still use convolutional neural networks — originally developed in the late 1990s by Professor Yann LeCun in the context of software for reading handwritten letters and digits. What is different is that both computers and algorithms have improved significantly. First, bigger and faster computers have made it feasible to train larger neural networks with much larger data. Ten years ago, running neural networks of this complexity would have been a momentous task even on a single image — now we are able to run them on billions of images. Second, new training techniques have made it possible to train the large deep neural networks necessary for successful image recognition.

We feel it would be interesting to the research community to discuss some of the unique aspects of the system we built and some qualitative observations we had while testing the system.

The first is our label and training set and how it compares to that used in the ImageNet Large Scale Visual Recognition competition. Since we were working on search across photos, we needed an appropriate label set. We came up with a set of about 2000 visual classes based on the most popular labels on Google+ Photos and which also seemed to have a visual component, that a human could recognize visually. In contrast, the ImageNet competition has 1000 classes. As in ImageNet, the classes were not text strings, but are entities, in our case we use Freebase entities which form the basis of the Knowledge Graph used in Google search. An entity is a way to uniquely identify something in a language-independent way. In English when we encounter the word “jaguar”, it is hard to determine if it represents the animal or the car manufacturer. Entities assign a unique ID to each, removing that ambiguity, in this case “/m/0449p” for the former and “/m/012×34” for the latter. In order to train better classifiers we used more training images per class than ImageNet, 5000 versus 1000. Since we wanted to provide only high precision labels, we also refined the classes from our initial set of 2000 to the most precise 1100 classes for our launch.

During our development process we had many more qualitative observations we felt are worth mentioning:

1) Generalization performance. Even though there was a significant difference in visual appearance between the training and test sets, the network appeared to generalize quite well. To train the system, we used images mined from the web which did not match the typical appearance of personal photos. Images on the web are often used to illustrate a single concept and are carefully composed, so an image of a flower might only be a close up of a single flower. But personal photos are unstaged and impromptu, a photo of a flower might contain many other things in it and may not be very carefully composed. So our training set image distribution was not necessarily a good match for the distribution of images we wanted to run the system on, as the examples below illustrate. However, we found that our system trained on web images was able to generalize and perform well on photos.

A typical photo of a flower found on the web.

A typical photo of a flower found in an impromptu photo.

2) Handling of classes with multi-modal appearance. The network seemed to be able to handle classes with multimodal appearance quite well, for example the “car” class contains both exterior and interior views of the car. This was surprising because the final layer is effectively a linear classifier which creates a single dividing plane in a high dimensional space. Since it is a single plane, this type of classifier is often not very good at representing multiple very different concepts.

3) Handling abstract and generic visual concepts. The system was able to do reasonably well on classes that one would think are somewhat abstract and generic. These include “dance”, “kiss”, and “meal”, to name a few. This was interesting because for each of these classes it did not seem that there would be any simple visual clues in the image that would make it easy to recognize this class. It would be difficult to describe them in terms of simple basic visual features like color, texture, and shape.

Photos recognized as containing a meal.

4) Reasonable errors. Unlike other systems we experimented with, the errors which we observed often seemed quite reasonable to people. The mistakes were the type that a person might make – confusing things that look similar. Some people have already noticed this, for example, mistaking a goat for a dog or a millipede for a snake. This is in contrast to other systems which often make errors which seem nonsensical to people, like mistaking a tree for a dog.

Photo of a banana slug mistaken for a snake.

Photo of a donkey mistaken for a dog.

5) Handling very specific visual classes. Some of the classes we have are very specific, like specific types of flowers, for example “hibiscus” or “dhalia”. We were surprised that the system could do well on those. To recognize specific subclasses very fine detail is often needed to differentiate between the classes. So it was surprising that a system that could do well on a full image concept like “sunsets” could also do well on very specific classes.

Photo recognized as containing a hibiscus flower.

Photo recognized as containing a dahlia flower.

Photo recognized as containing a polar bear.

Photo recognized as containing a grizzly bear.

The resulting computer vision system worked well enough to launch to people as a useful tool to help improve personal photo search, which was a big step forward. So, is computer vision solved? Not by a long shot. Have we gotten computers to see the world as well as people do? The answer is not yet, there’s still a lot of work to do, but we’re closer.

Read | No Comments | Tags: Uncategorized

2013 Google PhD Fellowships: 5 Years of Supporting the Future of Computer Science

Jun 11, 2013

Posted by Michael Rennaker, Google University Relations

We are extremely excited to announce the 2013 Global Google PhD Fellows. From all around the globe, these 39 PhD students represent the fifth class in the program’s history, a select group recognized by Google researchers and their institutions as some of the most promising young academics in the world. As we welcome the newest class of PhD Fellows, we take a look back at the program’s roots and hear from two past recipients.

In 2009, Google launched its PhD Fellowship Program, created to recognize and support outstanding graduate students pursuing work in computer science, related disciplines or promising research areas. In its inaugural year, 13 United States PhD students were awarded fellowships, drawn from an extremely competitive pool of applicants. The global program now covers Europe, China, India and Australia and continues to draw some of the best young researchers, reflecting Google’s commitment to building strong relations with the academic community.

Among those first recipients of the fellowship award are 2009 PhD Fellow Roxana Geambasu, Visiting Professor in the Computer Science Department at Columbia University, and 2010 European Doctoral Fellow Roland Angst, Visiting Assistant Professor at Stanford University and affiliated with the Max Planck Center for Visual Computing and Communication. As early recipients of the award, Roxana and Roland reflect on the impact that the Google Fellowship program had on their careers.

For Roxana, the fellowship provided the tools and connections that helped lay the foundation for her academic career. She believes industrial fellowship programs are very important, as they give students an opportunity to interact more closely with industry.

“Beyond the financial support, I think that the fellowship impacted my career in many important ways. First, the Google fellowships are regarded as highly competitive, so receiving the award was probably a big plus on my resume when I was interviewing for faculty positions.”

“Second, the award yielded a mentor within Google, Brad Chen, with whom I’ve kept in touch ever since, as well as opportunities to visit the campus, deliver talks and meet Google engineers. Brad and I continue to meet at conferences and discuss my work, his work and (of late) the work of my students; it’s through that relationship I’m exposed to new people from Google and gain valuable advice about faculty award opportunities.”

Roland Angst credits the award with the ability to lighten his teaching load and instead focus on his research, which ultimately prepared him for his future academic career. Like Roxanna, Roland states that the fellowship also gave him the opportunity to establish connections with people working in related topics in industry.

“In my view, programs such as the Google Fellowship Awards represent an important and integral link between industry and universities. Firstly, such programs increase the awareness in the academic world for relevant problems in industry. Secondly, these programs allow the IT industry to express their gratitude to the educational services provided by the universities on which the IT industry heavily relies on.”

We welcome the latest recipients of the Global Google PhD Fellowships for 2013 with great excitement and high expectations. Recognized for their incredible innovation, creativity and leadership, we are very happy to support these excellent PhD students and offer our sincere congratulations.

Read | No Comments | Tags: Uncategorized

Building A Visual Planetary Time Machine

Jun 10, 2013

Posted by Randy Sargent, Google/Carnegie Mellon University; Matt Hancher and Eric Nguyen, Google; and Illah Nourbakhsh, Carnegie Mellon University

When a societal or scientific issue is highly contested, visual evidence can cut to the core of the debate in a way that words alone cannot — communicating complicated ideas that can be understood by experts and non-experts alike. After all, it took the invention of the optical telescope to overturn the idea that the heavens revolved around the earth.

Last month, Google announced a zoomable and explorable time-lapse view of our planet. This time-lapse Earth enables you explore the last 29 years of our planet’s history — from the global scale to the local scale, all across the planet. We hope this new visual dataset will ground debates, encourage discovery, and shift perspectives about some of today’s pressing global issues.

This project is a collaboration between Google’s Earth Engine team, Carnegie Mellon University’s CREATE Lab, and TIME Magazine — using nearly a petabyte of historical record from USGS’s and NASA’s Landsat satellites. And in this post, we’d like to give a little insight into the process required to build this time-lapse view of our planet.

Previews of the phenomena visible in these time-lapses.

First we’ll describe Google’s Earth Engine system for deriving the time-series imagery. Second, we’ll tell you more about CMU’s open-source “Time Machine” software for creating and streaming large, explorable time-series imagery.

Annual Composites: Distilling a Massive Dataset

Google Earth Engine brings together the world’s scientific satellite imagery — over a petabyte of multispectral imagery recording over 40 years of history — and makes it available online with tools that scientists, independent researchers, and nations can use to mine this massive warehouse of data to detect changes, map trends and quantify differences on the Earth’s surface using Google’s computational infrastructure. Today, the platform is used to monitor the Amazon and estimate forest carbon in Tanzania, among hundreds of other partners developing new uses for the technology.

Using Earth Engine, we first built annual global mosaics at a resolution of 30 meters per pixel for each year from 1984 through 2012. We started with a total of 2,068,467 scenes from the Landsat 4, 5, and 7 satellites, comprising 909 terabytes of data. The Earth’s atmosphere is a constantly-shifting sea of clouds, so in order to assemble a seamless cloud-free view of each year we analyzed all the images available at each location and used a simple cloud model to separate out the clouds from the ground. To help correct for atmospheric and seasonal effects, we used an additional 20TB of data from the MODIS MCD43A4 product to build a cloud-free low-resolution model of the Earth over time. We combined all this to produce a statistical estimate of the color of each pixel for every year for which data was available. Producing the final 29 global mosaics took a bit less than a day and consumed approximately 260,000 core-hours of CPU.

Some areas of the planet are almost perpetually cloudy, obscuring satellite views. In addition, before the more capable Landsat 7 began operating in 1999, coverage in some areas of the world was sparse, particularly in Asia, for various operational and technological reasons. We wrestled with how best to visualize areas with missing or cloud-obscured images from each year. In the end, after much experimentation, we chose to simply interpolate between valid image years. Other techniques, such as greying out invalid data, created distractingly large artifacts, visually drowning out the valid information. However, the downside with the approach we have taken is that it can be difficult to tell which data is original and which is interpolated. We are exploring the possibility of including a view that allows drilling down into the non-interpolated, original mosaics.

“Time Machine”: An HTML5 Time-Series Exploration Tool

Once we had produced the final global images, we adapted the Carnegie Mellon CREATE Lab’s open-source “Time Machine” software, which enables authoring, streaming, and exploring very-high-resolution videos. Time Machine videos take advantage of the power of HTML5 and modern web browsers: they are streamed as multiresolution, overlapping video tiles and displayed in a web page by manipulating the HTML5

Examples of zoomable timelapses with hundreds of millions or billions of pixels per frame include documenting plant growth, bee colony collapse, and very-large-scale simulations of the universe. Time-lapse Earth, however, sets a new record for giant videos: each frame of the video is a global Mercator-projected map with a resolution of 30 meters per pixel at the equator, for a total of 1.78 trillion pixels per frame. That’s about a million times larger than a standard HD video stream. In order to scale to such large videos, we needed to integrate Time Machine’s data production pipeline into Earth Engine and the rest of Google’s infrastructure. Encoding the final video tiles consumed approximately 1.4 million core-hours of CPU in Google’s data centers over the course of about a day. For CMU’s researchers, this would have been impossible without Google’s resources.

Combining all three phases of product generation:

Total processing time: 3 days
Total CPU usage: 1.8 million core-hours
Peak CPU usage: 66,000 simultaneous cores

Destination locations of top 1500 share links, weighted by number of visits.

Time-lapse Earth is powerful because it helps us to access and construct the story of our planet. That story will become richer with each release, as we continue to improve fidelity and add data. The story-teller is everyone — scientists and citizens alike provide the real value by interacting, exploring, layering their knowledge upon the globe, and sharing their insights so that we can all better understand our world.

We are especially proud of the collaboration that made time-lapse Earth possible, and believe it to be an exemplar of how industry, academia, government, and the press can benefit from working together deeply over a period of years. By drawing on the strengths of each member of the collaborative community, Google strives to integrate the world’s technical expertise and knowledge in order to tackle innovative and groundbreaking projects. In doing so, it is our goal to deliver an impactful service, one that can put a focus on the dramatic effect we are having on our planet.

Read | No Comments | Tags: Uncategorized

The Story Behind Course Builder

Jun 3, 2013

Posted by Maggie Johnson, Director of Education and University Relations

Last summer, we ran a MOOC (Massive Open Online Course) on Power Searching. Soon after, we open sourced Course Builder, the platform that we developed on Google technologies to present the course. Since then, we have released four versions of Course Builder adding features such as user-friendly content development, administrative support, dashboards on student performance and behavior, new assessment types including peer review, accessibility, internationalization, etc. A large number of courses have been hosted on Course Builder, with many more in the pipeline.

This work started with the observation that we have all the component technology one needs to create a platform for delivering a learning experience similar to other MOOCs that were being offered on Coursera and Udacity. So we set about wiring together these components (YouTube, App Engine, Groups, Apps, Google+ and Hangouts, etc.) to create the first version of Course Builder.

As we talked with faculty and others who wanted to create online learning experiences, we saw an opportunity for Course Builder to play an important role in the MOOC space. Our goal is to provide the capability for anyone to create a MOOC or even an “OOC”. We believe that an online environment can be used for a wide variety of education-related activities beyond just the standard university course. We have implemented a feature set that supports this goal.

Our users include not only colleges and universities, but also non-profits and K12 organizations. We host academic courses such as Information Visualization and Game Theory, as well as short courses including Mapping with Google, Digital Learning in K12, YouTube Creator Academy, and Giving with Purpose. Supporting this diversity in users, content and format is why we created Course Builder.

Hosting the platform on App Engine has provided additional capabilities that are essential for our users, particularly colleges and universities. It’s possible to brand a MOOC anyway the user wants. The user also owns the relationship with the student directly, and owns any data that they collect to use anyway they like. Given Course Builder is open source, it is possible to easily add customized features. Add to that App Engine’s scalability, self-managed hosting and the extensible component architecture built into Course Builder, and you have a powerful, flexible platform that can support any number of students and any type of content.

We will continue to support this diverse user base, and work to get even more great teachers and innovative learning designers involved in experimenting in this brave new world of online learning. The potential for positive disruption and change is enormous.

Read | No Comments | Tags: Uncategorized

Distributing the Edit History of Wikipedia Infoboxes

May 30, 2013

Posted by Enrique Alfonseca, Google Research

Aside from its value as a general-purpose encyclopedia, Wikipedia is also one of the most widely used resources to acquire, either automatically or semi-automatically, knowledge bases of structured data. Much research has been devoted to automatically building disambiguation resources, parallel corpora and structured knowledge from Wikipedia. Still, most of those projects have been based on single snapshots of Wikipedia, extracting the attribute values that were valid at a particular point in time. So about a year ago we compiled and released a data set that allows researchers to see how data attributes can change over time.

Figure 1. Infobox for the Republic of Palau in 2006 and 2013 showing the capital change.

Many attributes vary over time. These include the presidents of countries, the spouses of people, the populations of cities and the number of employees of companies. Every Wikipedia page has an associated history from which the users can view and compare past versions. Having the historical values of Infobox entries available would provide a historical overview of change affecting each entry, to understand which attributes are more likely to change over time or have a regularity in their changes, and which ones attract more user interest and are actually updated in a timely fashion. We believe that such a resource will also be useful in training systems to learn to extract data from documents, as it will allow us to collect more training examples by matching old values of an attribute inside old pages.

For this reason, we released, in collaboration with Wikimedia Deutschland e.V., a resource containing all the edit history of infoboxes in Wikipedia pages. While this was already available indirectly in Wikimedia’s full history dumps, the smaller size of the released dataset will make it easier to download and process this data. The released dataset contains 38,979,871 infobox attribute updates for 1,845,172 different entities, and it is available for download both from Google and from Wikimedia Deutschland’s Toolserver page. A description of the dataset can be found in our paper WHAD: Wikipedia Historical Attributes Data, accepted for publication at the Language Resources and Evaluation journal.

What kind of information can be learned from this data? Some examples from preliminary analyses include the following:

Every country in the world has a population in its Wikipedia attribute, which is updated at least yearly for more than 90% of them. The average error rate with respect to the yearly World Bank estimates is between two and three percent, mostly due to rounding.
50% of deaths are updated into Wikipedia infoboxes within a couple of days… but for scientists it takes 31 days to reach 50% coverage!
For the last episode of TV shows, the airing date is updated for 50% of them within 9 days; for for the first episode of TV shows, it takes 106 days.

While infobox attribute updates will be much easier to process as they transition into the Wikidata project, we are not there yet and we believe that the availability of this dataset will facilitate the study of changing attribute values. We are looking forward to the results of those studies.

Thanks to Googler Jean-Yves Delort and Guillermo Garrido and Anselmo Peñas from UNED for putting this dataset together, and to Angelika Mühlbauer and Kai Nissen from Wikipedia Deutschland for their support. Thanks also to Thomas Hofmann and Fernando Pereira for making this data release possible.

Read | No Comments | Tags: Uncategorized

Open Access for Publications

May 29, 2013

Posted by Alfred Spector, Vice President, Engineering

The Association for Computing Machinery (ACM) recently announced a new option for publication rights management, wherein researchers can choose to pay for the public to have perpetual open access to the publication. Google applauds this new option, and today we are announcing that we will pay the open access fees for all articles by Google researchers that are published in ACM journals. IEEE also has an open access option for some of its publications, and we also pay the open access fee for them and for publications in like organizations.

Google has always believed that by improving access to the world’s knowledge, we can help improve everyone’s lives. When it comes to scientific research, we have consistently said that open access to publications speeds up research, accelerates innovation, and helps grow the global economy.

Policies like ACM’s continue to demonstrate the sustainability of open access publishing. It will also provide better access to the papers that we write at Google. We encourage researchers everywhere to pursue open access options whenever publishing articles, and to continue to make publications available as widely as possible, within your rights.

Read | No Comments | Tags: Uncategorized

previous page · next page

About: Research @ Google

Profile:

Website

Contact:

Posts by Research @ Google

Google Databoard: A new way to explore industry research

Improving Photo Search: A Step Across the Semantic Gap

Categories

Tags