June, 2013 | Google Data

Archive for June, 2013

Chromebooks: coming to more stores near you

June 17th, 2013 | by Google Chrome Blog | published in Uncategorized

[Cross-posted from the Official Google Blog]

In Northern California where I live, summer is here, which means family vacations, kids’ camps, BBQs and hopefully some relaxation. But it also means back-to-school shopping is just around the corner. So in case you’re on the hunt for a laptop in addition to pens, paper, and stylish new outfits, your search just got a whole lot easier. Chromebooks—a fast, simple, secure laptop that won’t break the bank—will now be carried in over 3 times more stores than before, or more than 6,600 stores around the world.

In addition to Best Buy and Amazon.com, we’re excited to welcome several new retailers to the family. Starting today, Walmart will be making the newest Acer Chromebook, which has a 16GB Solid State Drive (SSD), available in approximately 2,800 stores across the U.S., for just $199. Look for Chromebooks coming to the laptop sections of a Walmart near you this summer.

And beginning this weekend, Staples will bring a mix of Chromebooks from Acer, HP and Samsung to every store in the U.S., more than 1,500 in total. You can also purchase via Staples online, while businesses can purchase through the Staples Advantage B2B program. In the coming months, select Office Depot and OfficeMax locations, along with regional chains Fry’s and TigerDirect, will begin selling Chromebooks.

In the 10 other markets worldwide where Chromebooks are sold, availability in national retailers continues to expand. In addition to Dixons in the UK, now 116 Tesco stores are selling Chromebooks, as well as all Media Markt and Saturn stores in the Netherlands, FNAC stores in France and Elgiganten stores in Sweden. In Australia, all JB Hi-Fi and Harvey Norman stores will be carrying Chromebooks for their customers as well. With our partners, we’re working hard to bring Chromebooks to even more countries later this year.

Chromebooks make great computers for everyone in the family—and now you shouldn’t have to look very far to find one. Happy summer!

Posted by David Shapiro, Director of Chromebook Marketing, long-time reader, first-time poster

ddd

Google Summer of Code coding starts today!

June 17th, 2013 | by Stephanie Taylor | published in Uncategorized

Today is the first day of coding for our 9th year of the Google Summer of Code program. This year 1,192 students will spend the next 12 weeks writing code for 177 different open source organizations.

We are excited to see the contributions this year’s students will make to the open source community.

For more information on important dates for the program please visit our timeline. Stay tuned as we will highlight some of the new mentoring organizations over the next few months.

Have a great summer!

By Carol Smith, Open Source Programs

ddd

Happy Small Business Week.

June 17th, 2013 | by Jane Smith | published in Uncategorized

Posted by Lisa Gevelber, VP Marketing, Americas

(Cross-posted on the Official Google Blog)

Our first AdWords customer was a small business selling live mail-order lobsters. It’s been a long time since then, but a majority of our customers are still small businesses, who play a vital role not only for Google, but for the American economy. More than 60 percent of new jobs each year come from small businesses.

This Small Business Week, we want to celebrate you. We’re grateful to you for everything you do for us and our communities. Whether you fix people’s cars, offer music lessons to aspiring musicians, or make the world’s best homemade ice cream – when you do what you love, our lives get better.

As part of the celebration, we’ll be highlighting some amazing small businesses across the country, so keep an eye on the Google+ Your Business page. And in the meantime, check out some of the Google tools that are designed to help you take care of business.

Happy Small Business Week.

ddd

Celebrating 10 years of shared success

June 17th, 2013 | by Becky C. | published in Uncategorized

Ten years ago we launched AdSense to help publishers earn money by placing relevant ads on their websites. I can still remember the excitement and anticipation as AdSense went live that first day. Our small team huddled together in a cramped conference room, and right away we saw that publishers were as excited about AdSense as we were.

Fast-forward ten years, and AdSense has become a core part of Google’s advertising business. The AdSense community has grown to include over two million publishers, and last year alone, publishers earned over $7 billion from AdSense. AdSense is a community that thrives because of all the content creators we are so fortunate to partner with. Their stories inspire us to do our part to make AdSense great.

On this occasion, it’s especially inspiring to hear the stories of partners who have been with us since the very beginning. Like a retiree in New Zealand who was able to pursue her dream of writing about her garden, a tech support expert in Colorado who can spend more time with his kids, and a theme park reviewer who now sends employees around the world to test and review rides — all thanks to money earned from AdSense.

As part of our 10th anniversary celebration, we hope you’ll tune into our live Hangout on Air today at 10am PDT (5pm GMT) on the AdSense +page. I look forward to joining several of our partners to share stories from the early days of AdSense, talk about how we’ve all grown since then, and discuss the future for publishers and online advertising. And if you want even more 10th anniversary celebration, just visit our AdSense 10th anniversary page at any time.

Posted by Susan Wojcicki

ddd

Our continued commitment to combating child exploitation online

June 16th, 2013 | by Emily Wood | published in Uncategorized

The Internet has been a tremendous force for good—increasing access to information, improving people’s ability to communicate and driving economic growth. But like the physical world, there are dark corners on the web where criminal behavior exists.

In 2011, the National Center for Missing & Exploited Children’s (NCMEC’s) CyberTipline Child Victim Identification Program reviewed 17.3 million images and videos of suspected child sexual abuse. This is four times more than what their Exploited Children’s Division (ECD) saw in 2007. And the number is still growing. Behind these images are real, vulnerable kids who are sexually victimized and victimized further through the distribution of their images.

It is critical that we take action as a community—as concerned parents, guardians, teachers and companies—to help combat this problem.

Child sexual exploitation is a global problem that needs a global solution. More than half of the images and videos sent to NCMEC for analysis are found to have been uploaded to U.S. servers from outside the country. With this in mind, we need to sustain and encourage borderless communication between organizations fighting this problem on the ground. For example, NCMEC’s CyberTipline is able to refer reports regarding online child sexual exploitation to 66 countries, helping local law enforcement agencies effectively execute their investigations.

Google has been working on fighting child exploitation since as early as 2006 when we joined the Technology Coalition, teaming up with other tech industry companies to develop technical solutions. Since then, we’ve been providing software and hardware to help organizations around the world fight child abuse images on the web and locate missing children.

There is much more that can be done, and Google is taking our commitment another step further through a $5 million effort to eradicate child abuse imagery online. Part of this commitment will go to global child protection partners like the National Center for Missing & Exploited Children and the Internet Watch Foundation. We’re providing additional support to similar heroic organizations in the U.S., Canada, Europe, Australia and Latin America.

Since 2008, we’ve used “hashing” technology to tag known child sexual abuse images, allowing us to identify duplicate images which may exist elsewhere. Each offending image in effect gets a unique ID that our computers can recognize without humans having to view them again. Recently, we’ve started working to incorporate encrypted “fingerprints” of child sexual abuse images into a cross-industry database. This will enable companies, law enforcement and charities to better collaborate on detecting and removing these images, and to take action against the criminals. Today we’ve also announced a $2 million Child Protection Technology Fund to encourage the development of ever more effective tools.

We’re in the business of making information widely available, but there’s certain “information” that should never be created or found. We can do a lot to ensure it’s not available online—and that when people try to share this disgusting content they are caught and prosecuted.

Update June 17: Clarified language around NCMEC’s Child Victim Identification Program and CyberTipline.

Posted by Jacquelline Fuller, Director, Google Giving

ddd

Introducing Project Loon: Balloon-powered Internet access

June 14th, 2013 | by Emily Wood | published in Uncategorized

The Internet is one of the most transformative technologies of our lifetimes. But for 2 out of every 3 people on earth, a fast, affordable Internet connection is still out of reach. And this is far from being a solved problem.

There are many terrestrial challenges to Internet connectivity—jungles, archipelagos, mountains. There are also major cost challenges. Right now, for example, in most of the countries in the southern hemisphere, the cost of an Internet connection is more than a month’s income.

Solving these problems isn’t simply a question of time: it requires looking at the problem of access from new angles. So today we’re unveiling our latest moonshot from Google[x]: balloon-powered Internet access.

We believe that it might actually be possible to build a ring of balloons, flying around the globe on the stratospheric winds, that provides Internet access to the earth below. It’s very early days, but we’ve built a system that uses balloons, carried by the wind at altitudes twice as high as commercial planes, to beam Internet access to the ground at speeds similar to today’s 3G networks or faster. As a result, we hope balloons could become an option for connecting rural, remote, and underserved areas, and for helping with communications after natural disasters. The idea may sound a bit crazy—and that’s part of the reason we’re calling it Project Loon—but there’s solid science behind it.

Balloons, with all their effortless elegance, present some challenges. Many projects have looked at high-altitude platforms to provide Internet access to fixed areas on the ground, but trying to stay in one place like this requires a system with major cost and complexity. So the idea we pursued was based on freeing the balloons and letting them sail freely on the winds. All we had to do was figure out how to control their path through the sky. We’ve now found a way to do that, using just wind and solar power: we can move the balloons up or down to catch the winds we want them to travel in. That solution then led us to a new problem: how to manage a fleet of balloons sailing around the world so that each balloon is in the area you want it right when you need it. We’re solving this with some complex algorithms and lots of computing power.

Now we need some help—this experiment is going to take way more than our team alone. This week we started a pilot program in the Canterbury area of New Zealand with 50 testers trying to connect to our balloons. This is the first time we’ve launched this many balloons (30 this week, in fact) and tried to connect to this many receivers on the ground, and we’re going to learn a lot that will help us improve our technology and balloon design.

Over time, we’d like to set up pilots in countries at the same latitude as New Zealand. We also want to find partners for the next phase of our project—we can’t wait to hear feedback and ideas from people who’ve been working for far longer than we have on this enormous problem of providing Internet access to rural and remote areas. We imagine someday you’ll be able to use your cell phone with your existing service provider to connect to the balloons and get connectivity where there is none today.

This is still highly experimental technology and we have a long way to go—we’d love your support as we keep trying and keep flying! Follow our Google+ page to keep up with Project Loon’s progress.

Onward and upward.

Posted by Mike Cassidy, Project Lead

ddd

Darting around Fab Friday

June 14th, 2013 | by Mano Marks | published in Uncategorized

It’s Friday again, you made it through another week! OK, we made it through another week.

I’m actually pretty excited because next week I’m going to Israel. If you’re in Tel Aviv, I’ll be speaking at the GDG on Tuesday the 19th, come and say hi. Pieter Greyling and Kasia Derc-Fenske will be speaking with me, so you’ll get a three-in-one.

On Tuesday, I hosted another Google Maps Shortcut episode, this time on Tiling in the Google Maps SDK for iOS. Check it out.

Next week Brett Morgan will be hosting a Maps Shortcut on using Google Maps with Dart. Be sure to check that out.

That’s all I’ve got this week. Have a great weekend and, as always, happy mapping!

Posted by Mano Marks, Maps Developer Relations Team

ddd

Optimal Logging

June 14th, 2013 | by Google Testing Bloggers | published in Google Testing

by Anthony Vallone

How long does it take to find the root cause of a failure in your system? Five minutes? Five days? If you answered close to five minutes, it’s very likely that your production system and tests have great logging. All too often, seemingly unessential features like logging, exception handling, and (dare I say it) testing are an implementation afterthought. Like exception handling and testing, you really need to have a strategy for logging in both your systems and your tests. Never underestimate the power of logging. With optimal logging, you can even eliminate the necessity for debuggers. Below are some guidelines that have been useful to me over the years.

Channeling Goldilocks

Never log too much. Massive, disk-quota burning logs are a clear indicator that little thought was put into logging. If you log too much, you’ll need to devise complex approaches to minimize disk access, maintain log history, archive large quantities of data, and query these large sets of data. More importantly, you’ll make it very difficult to find valuable information in all the chatter.

The only thing worse than logging too much is logging too little. There are normally two main goals of logging: help with bug investigation and event confirmation. If your log can’t explain the cause of a bug or whether a certain transaction took place, you are logging too little.

Good things to log:

Important startup configuration
Errors
Warnings
Changes to persistent data
Requests and responses between major system components
Significant state changes
User interactions
Calls with a known risk of failure
Waits on conditions that could take measurable time to satisfy
Periodic progress during long-running tasks
Significant branch points of logic and conditions that led to the branch
Summaries of processing steps or events from high level functions – Avoid logging every step of a complex process in low-level functions.

Bad things to log:

Function entry – Don’t log a function entry unless it is significant or logged at the debug level.
Data within a loop – Avoid logging from many iterations of a loop. It is OK to log from iterations of small loops or to log periodically from large loops.
Content of large messages or files – Truncate or summarize the data in some way that will be useful to debugging.
Benign errors – Errors that are not really errors can confuse the log reader. This sometimes happens when exception handling is part of successful execution flow.
Repetitive errors – Do not repetitively log the same or similar error. This can quickly fill a log and hide the actual cause. Frequency of error types is best handled by monitoring. Logs only need to capture detail for some of those errors.
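
Two of the items above, periodic logging from large loops and truncating large messages, can be sketched in a few lines. This is only an illustration using Python's standard logging module; the names summarize, process_all, and report_every are hypothetical and not from any particular system.

```python
import logging

logger = logging.getLogger("batch")


def summarize(payload: str, limit: int = 80) -> str:
    """Truncate a large message and note how much was left out."""
    if len(payload) <= limit:
        return payload
    return f"{payload[:limit]}... ({len(payload) - limit} more chars)"


def process_all(items, report_every: int = 1000) -> int:
    """Process items, logging progress periodically instead of on every iteration."""
    done = 0
    for done, item in enumerate(items, start=1):
        # Real per-item work would go here (hypothetical placeholder).
        if done % report_every == 0:
            logger.info("processed %d items", done)
    logger.info("finished: %d items", done)
    return done
```

With report_every set to 1000, a run over 2,500 items produces three log records rather than 2,500.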

There is More Than One Level

Don’t log everything at the same log level. Most logging libraries offer several log levels, and you can enable certain levels at system startup. This provides a convenient control for log verbosity.

The classic levels are:

Debug – verbose and only useful while developing and/or debugging.
Info – the most popular level.
Warning – strange or unexpected states that are acceptable.
Error – something went wrong, but the process can recover.
Critical – the process cannot recover, and it will shut down or restart.

Practically speaking, only two log configurations are needed:

Production – Every level is enabled except debug. If something goes wrong in production, the logs should reveal the cause.
Development & Debug – While developing new code or trying to reproduce a production issue, enable all levels.
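
As a rough sketch of these two configurations, here is how they might look with Python's standard logging module. The function name configure_logging and the APP_ENV environment variable are assumptions made for illustration; your own startup code will likely pick the environment differently.

```python
import logging
import os


def configure_logging(environment: str) -> int:
    """Enable every level except DEBUG in production; enable all levels otherwise."""
    level = logging.INFO if environment == "production" else logging.DEBUG
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace handlers left over from any earlier configuration
    )
    return level


if __name__ == "__main__":
    # APP_ENV is a hypothetical variable name used only for this example.
    configure_logging(os.environ.get("APP_ENV", "development"))
    log = logging.getLogger("example")
    log.debug("verbose detail, hidden in production")
    log.info("service started")
```

Choosing the level once at startup keeps verbosity a deployment decision rather than something scattered through the code.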

Test Logs Are Important Too

Log quality is equally important in test and production code. When a test fails, the log should clearly show whether the failure was a problem with the test or production system. If it doesn’t, then test logging is broken.

Test logs should always contain:

Test execution environment
Initial state
Setup steps
Test case steps
Interactions with the system
Expected results
Actual results
Teardown steps

Conditional Verbosity With Temporary Log Queues

When errors occur, the log should contain a lot of detail. Unfortunately, detail that led to an error is often unavailable once the error is encountered. Also, if you’ve followed advice about not logging too much, your log records prior to the error record may not provide adequate detail. A good way to solve this problem is to create temporary, in-memory log queues. Throughout processing of a transaction, append verbose details about each step to the queue. If the transaction completes successfully, discard the queue and log a summary. If an error is encountered, log the content of the entire queue and the error. This technique is especially useful for test logging of system interactions.
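
One way to sketch a temporary log queue is a small context manager that buffers detail in memory and decides what to write when the transaction ends. The class name TransactionLog and its detail method are hypothetical, and this is a minimal illustration rather than a production-ready implementation.

```python
import logging

logger = logging.getLogger("transactions")


class TransactionLog:
    """Buffer verbose detail in memory; write it out only if the transaction fails."""

    def __init__(self, name):
        self.name = name
        self._queue = []

    def detail(self, message):
        # Nothing is written to the log yet; the message just joins the queue.
        self._queue.append(message)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Success: discard the detail and log a one-line summary.
            logger.info("%s succeeded (%d steps)", self.name, len(self._queue))
        else:
            # Failure: log the entire queue, then the error itself.
            for message in self._queue:
                logger.error("%s detail: %s", self.name, message)
            logger.error("%s failed: %r", self.name, exc)
        self._queue.clear()
        return False  # never swallow the exception
```

Used as `with TransactionLog("checkout") as txn:`, each step calls `txn.detail(...)`, and the verbose records reach the log only when an exception escapes the block.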

Failures and Flakiness Are Opportunities

When production problems occur, you’ll obviously be focused on finding and correcting the problem, but you should also think about the logs. If you have a hard time determining the cause of an error, it’s a great opportunity to improve your logging. Before fixing the problem, fix your logging so that the logs clearly show the cause. If this problem ever happens again, it’ll be much easier to identify.

If you cannot reproduce the problem, or you have a flaky test, enhance the logs so that the problem can be tracked down when it happens again.

Use failures to improve your logging throughout the development process. While writing new code, try to refrain from using debuggers and only use the logs. Do the logs describe what is going on? If not, the logging is insufficient.

Might As Well Log Performance Data

Logged timing data can help debug performance issues. For example, it can be very difficult to determine the cause of a timeout in a large system, unless you can trace the time spent on every significant processing step. This can be easily accomplished by logging the start and finish times of calls that can take measurable time:

Significant system calls
Network requests
CPU intensive operations
Connected device interactions
Transactions
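
A small helper can make start and finish logging nearly free to add. The sketch below uses Python's standard library; the name timed is an assumption for illustration, and the block it wraps stands in for any of the call types listed above.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("timing")


@contextmanager
def timed(operation: str):
    """Log when an operation starts and when it finishes, with elapsed time."""
    start = time.perf_counter()
    logger.info("%s started", operation)
    try:
        yield
    finally:
        # Runs whether the wrapped block succeeds or raises.
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s finished in %.1f ms", operation, elapsed_ms)
```

Wrapping a call as `with timed("inventory lookup"):` yields a pair of records that bracket the call, which is usually enough to see where a timeout budget went.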

Following the Trail Through Many Threads and Processes

You should create unique identifiers for transactions that involve processing across many threads and/or processes. The initiator of the transaction should create the ID, and it should be passed to every component that performs work for the transaction. This ID should be logged by each component when logging information about the transaction. This makes it much easier to trace a specific transaction when many transactions are being processed concurrently.
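
Here is a minimal sketch of that idea using Python's standard logging.LoggerAdapter. The helper names new_transaction_id and transaction_logger, the txn field, and the component names are all hypothetical; how the ID travels between processes (a header, a message field, and so on) is left out.

```python
import logging
import uuid


def new_transaction_id() -> str:
    """Called once by the initiator of the transaction."""
    return uuid.uuid4().hex


def transaction_logger(component: str, transaction_id: str) -> logging.LoggerAdapter:
    """Return a logger that attaches the transaction ID to every record it emits."""
    return logging.LoggerAdapter(logging.getLogger(component), {"txn": transaction_id})


if __name__ == "__main__":
    # The format assumes every record carries a txn field, as the adapters above ensure.
    logging.basicConfig(level=logging.INFO, format="%(name)s [%(txn)s] %(message)s")
    txn = new_transaction_id()
    transaction_logger("frontend", txn).info("request received")
    transaction_logger("billing", txn).info("card charged")
```

Because both components log the same ID, filtering the combined logs on that value reconstructs one transaction even when many are interleaved.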

Monitoring and Logging Complement Each Other

A production service should have both logging and monitoring. Monitoring provides a real-time statistical summary of the system state. It can alert you if a percentage of certain request types are failing, the system is experiencing unusual traffic patterns, performance is degrading, or other anomalies occur. In some cases, this information alone will clue you in to the cause of a problem. However, in most cases, a monitoring alert is simply a trigger for you to start an investigation. Monitoring shows the symptoms of problems. Logs provide details and state on individual transactions, so you can fully understand the cause of problems.

ddd

Map of the Week: Citymapper Android App

June 13th, 2013 | by Mano Marks | published in Uncategorized

Why we like it: Citymapper is a great example of combining Google’s data and basemap with an app developer’s own data and making a slick, useful interface. Citymapper helps Londoners get around by showing them locations of tube stations, bus routes, taxi fares, the status of transit lines, and much more. It’s built on top of the Google Maps Android API v2 and our Directions and Geocoding services.

It all starts with figuring out what you want to do.

From there you can get walking, biking, transit, or taxi directions. It’ll even tell you how many calories you’ll burn, or how much the taxi should cost.

You can also get information on Tube closures.

And play with a rampaging Android.

You can save your favorite lines and stations in the app as well, allowing you to customize your experience. All around, this is a great combination of our maps with highly localized data.

Posted by Mano Marks, Maps Developer Relations Team

ddd

Retiring Chrome Frame

June 13th, 2013 | by Jane Smith | published in Uncategorized

Posted by Robert Shield, Google Chrome Engineer

(Cross-posted on the Chromium Blog)

The main goal of the Chromium project has always been to help unlock the potential of the open web. We work closely with the industry to standardize, implement and evangelize web technologies that help enable completely new types of experiences, and push the leading edge of the web platform forward.

But in 2009, many people were using browsers that lagged behind the leading edge. In order to reach the broadest base of users, developers often had to either build multiple versions of their applications or not use the new capabilities at all. We created Chrome Frame — a secure plug-in that brings a modern engine to old versions of Internet Explorer — to allow developers to bring better experiences to more users, even those who were unable to move to a more capable browser.

Today, most people are using modern browsers that support the majority of the latest web technologies. Better yet, the usage of legacy browsers is declining significantly and newer browsers stay up to date automatically, which means the leading edge has become mainstream.

Given these factors we’ve decided to retire Chrome Frame, and will cease support and updates for the product in January 2014. If you are a developer with an app that points users to Chrome Frame, please prompt visitors to upgrade to a modern browser. You can learn more about these changes in our FAQ.

If you’re an IT administrator you can give your employees the full capabilities of a modern browser today, even if you depend on older technology to run certain web apps. Check out Chrome for Business coupled with Legacy Browser Support, which allows employees to switch seamlessly between Chrome and another browser. Chrome is secure, stable and speedy, and runs on all major desktop and mobile OSs. IT admins can also configure 100+ policies to make Chrome fit their needs.

It’s unusual to build something and hope it eventually makes itself obsolete, but in this case we see the retirement of Chrome Frame as evidence of just how far the web has come.

ddd

Live at 10:30PT/ 1:30ET – Hangout with Mark Walker, SVP, Disney.com

June 13th, 2013 | by Yamini Gupta | published in Uncategorized

How is Disney.com leveraging online video to deliver engaging user experiences? That’s the question Xavier Kochhar, CEO, Structured Data Intelligence, is going to explore in our hangout on air with Mark Walker, SVP, Disney.com. You can view the conversat…

ddd

Experience 1,001 New Destinations with Street View

June 13th, 2013 | by Lat Long | published in Uncategorized

Today we’re adding more than 1,000 locations around the world to Google Maps, making it more comprehensive and useful for you. From historical landmarks to sports stadiums, these panoramic photos available via Street View can help you ease into vacation mode with just a few simple clicks. Below are highlights from Asia, Europe, Latin America, the U.S. and Canada that you can use to preview a vacation spot, to plot your next hiking route or just to become an armchair explorer from wherever you may be:

Go from city life to wildlife park in Singapore:

Planning on stopping by Singapore this summer? You can now explore more of the island’s diverse landscapes right from Google Maps. To get a taste of modern city life in Singapore, simply search for Marina Bay Waterfront Promenade and Fullerton Heritage Promenade and use Street View to explore the city’s popular bay front and bay skyline. If you’re planning to travel with family or are simply an animal lover at heart, you can also now go on a virtual adventure to the Singapore Zoo.

Fullerton Heritage Promenade (View Larger Map)

Discover some of Europe’s riches:
While you’re basking in Seville’s sun and sampling its famous oranges, check out the stunning Seville Cathedral against the bright blue Spanish sky. It’s the largest Gothic cathedral (and third largest church) in the world and served as a trading hub and bastion of the city’s wealth in the years following the Reconquista in the 13th century. Or maybe take a virtual sightseeing trip down the serene canals of Copenhagen, Denmark this summer. From the boat you can see cultural landmarks like the Royal Opera and Theater Houses and even the sculpture of The Little Mermaid from Hans Christian Andersen’s famous fairytale.

The canals in Copenhagen (View Larger Map)

Take a pilgrimage to Latin America:
Take a virtual journey to Brazil’s Basilica of the National Shrine of Our Lady Aparecida, the most visited Marian shrine in the world. You can also visit Brazil’s Vila Belmiro stadium, home to Santos Soccer Club and to past and present phenoms Pelé and Neymar. Experience Bosque de Chapultepec (Chapultepec Park), a natural oasis in the middle of Mexico City and one of the largest city parks in the Western Hemisphere. Or get ready for the slopes with a preview of Valle Nevado Resort, one of Chile’s hottest ski resorts just a few miles outside of Santiago.

Basilica of the National Shrine of Our Lady Aparecida (View Larger Map)

Take a trip down memory lane in the US:
Visit some of the nation’s historic landmarks with a road trip down the East Coast. Stops include The Mark Twain House & Museum in Hartford, Connecticut where one of America’s greatest authors and his family lived from 1874 to 1891; the Isaac Bell House in Newport, Rhode Island built in 1883 for the famous cotton broker and investor; and the Cape Henry Lighthouse in Virginia, which has guarded the Chesapeake Bay since 1792. Finally, explore the historic Vermont State House where for over 150 years citizen legislators have gathered every winter to debate the laws of Vermont.

The Vermont State House (View Larger Map)

Canadian stages taking centre stage:
Just in time for summer theatre season, Street View users can virtually visit The Shaw Festival Theatre, Edmonton’s Citadel Theatre, Manitoba Centennial Concert Hall, Roy Thomson Hall and the Four Seasons Centre for the Performing Arts, home of the Canadian Opera Company.

Roy Thomson Hall (View Larger Map)

Whether you’re hitting the slopes in South America or soaking up the summer sunshine north of the equator, we hope you enjoy exploring the world! To see this imagery and experience it through Street View, download the Google Maps app for Android or iPhone today.

Posted by Deanna Yick, Street View Program Manager

ddd

Excellent Papers for 2012

June 13th, 2013 | by Research @ Google | published in Uncategorized

Posted by Corinna Cortes and Alfred Spector, Google Research

Googlers across the company actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and presentations, participating in ongoing technical debates, and much more. Our publications offer technical and algorithmic advances, feature aspects we learn as we develop novel products and services, and shed light on some of the technical challenges we face at Google.

In an effort to highlight some of our work, we periodically select a number of publications to be featured on this blog. We first posted a set of papers on this blog in mid-2010 and subsequently discussed them in more detail in the following blog postings. In a second round, we highlighted new noteworthy papers from the latter half of 2010 and again in 2011. This time we honor the influential papers authored or co-authored by Googlers covering all of 2012, roughly 6% of our total publications. It’s tough choosing, so we may have left out some important papers; do see the publications list to review the complete group.

In the coming weeks we will be offering a more in-depth look at some of these publications, but here are the summaries:

Algorithms and Theory

Online Matching with Stochastic Rewards
Aranyak Mehta*, Debmalya Panigrahi [FOCS'12]
Online advertising is inherently stochastic: value is realized only if the user clicks on the ad, while the ad platform knows only the probability of the click. This paper is the first to introduce the stochastic nature of the rewards to the rich algorithmic field of online allocations. The core algorithmic problem it formulates is online bipartite matching with stochastic rewards, with known click probabilities. The main result is an online algorithm which obtains a large fraction of the optimal value. The paper also shows the difficulty introduced by the stochastic nature, by showing how it behaves very differently from the classic (non-stochastic) online matching problem.

Matching with our Eyes Closed
Gagan Goel*, Pushkar Tripathi* [FOCS'12]
In this paper we propose a simple randomized algorithm for finding a matching in a large graph. Unlike most solutions to this problem, our approach does not rely on building large combinatorial structures (like blossoms) but works completely locally. We analyze the performance of our algorithm and show that it does significantly better than the greedy algorithm. In doing so we improve a celebrated 18-year-old result by Aronson et al.
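To make the idea of a purely local, randomized matching rule concrete, here is a minimal Python sketch. It illustrates randomized greedy matching in general and is not the algorithm analyzed in the paper; the function name and the adjacency-dict input format are assumptions made for this example.

```python
import random


def randomized_greedy_matching(adj, rng=None):
    """Build a maximal matching by visiting vertices in random order.

    adj: dict mapping each vertex to a list of its neighbours (undirected,
    so every edge appears in both lists). Hypothetical input format.
    Returns a set of frozenset({u, v}) edges.
    """
    rng = rng or random.Random()
    partner = {}
    order = list(adj)
    rng.shuffle(order)
    for u in order:
        if u in partner:
            continue
        # Only local information is used: the still-unmatched neighbours of u.
        free = [v for v in adj[u] if v != u and v not in partner]
        if free:
            v = rng.choice(free)
            partner[u] = v
            partner[v] = u
    return {frozenset((u, v)) for u, v in partner.items()}


# A path a-b-c-d: the result has one or two edges depending on the random choices.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
matching = randomized_greedy_matching(path_graph, random.Random(1))
```

No global structure is ever built; each decision looks only at one vertex and its neighbours, which is what makes this style of algorithm attractive on very large graphs.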

Simultaneous Approximations for Adversarial and Stochastic Online Budgeted Allocation
Vahab Mirrokni*, Shayan Oveis Gharan, Morteza Zadimoghaddam, [SODA'12]
In this paper, we study online algorithms that simultaneously perform well in worst-case and average-case instances, or equivalently algorithms that perform well in both stochastic and adversarial models at the same time. This is motivated by online allocation of queries to advertisers with budget constraints. Stochastic models are not robust enough to deal with traffic spikes and adversarial models are too pessimistic. While several algorithms have been proposed for these problems, each was known to perform well in one model but not both. We present new results for a single algorithm that works well in both models.

Economics and EC

Polyhedral Clinching Auctions and the Adwords Polytope
Gagan Goel*, Vahab Mirrokni*, Renato Paes Leme [STOC'12]
Budgets play a major role in ad auctions where advertisers explicitly declare budget constraints. Very little is known in auctions about satisfying such budget constraints while keeping incentive compatibility and efficiency. The problem becomes even harder in the presence of complex combinatorial constraints over the set of feasible allocations. We present a class of ascending-price auctions addressing this problem for a very general class of (polymatroid) allocation constraints including the AdWords problem with multiple keywords and multiple slots.

HCI

Backtracking Events as Indicators of Usability Problems in Creation-Oriented Applications
David Akers*, Robin Jeffries*, Matthew Simpson*, Terry Winograd [TOCHI '12]
Backtracking events such as undo can be useful automatic indicators of usability problems for creation-oriented applications such as word processors and photo editors. Our paper presents a new cost-effective usability evaluation method based on this insight.

Talking in Circles: Selective Sharing in Google+
Sanjay Kairam, Michael J. Brzozowski*, David Huffaker*, Ed H. Chi*, [CHI'12]
This paper explores why so many people share selectively on Google+: to protect their privacy but also to focus and target their audience. People use Circles to support these goals, organizing contacts by life facet, tie strength, and interest.

Information Retrieval

Online selection of diverse results
Debmalya Panigrahi, Atish Das Sarma, Gagan Aggarwal*, and Andrew Tomkins*, [WSDM'12]
We consider the problem of selecting subsets of items that are simultaneously diverse in multiple dimensions, which arises in the context of recommending interesting content to users. We formally model this optimization problem, identify its key structural characteristics, and use these observations to design an extremely scalable and efficient algorithm. We prove that the algorithm always produces a nearly optimal solution and also perform experiments on real-world data that indicate that the algorithm performs even better in practice than the analytical guarantees.

Machine Learning

Large Scale Distributed Deep Networks
Jeffrey Dean, Greg S. Corrado*, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc’Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, Andrew Y. Ng, NIPS 2012.
In this paper, we examine several techniques to improve the time to convergence for neural networks and other models trained by gradient-based methods. The paper describes a system we have built that exploits both model-level parallelism (by partitioning the nodes of a large model across multiple machines) and data-level parallelism (by having multiple replicas of a model process different training data and coordinating the application of updates to the model state through a centralized-but-partitioned parameter server system). Our results show that very large neural networks can be trained effectively and quickly on large clusters of machines.
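The data-parallel half of this design can be imitated in a few lines. The sketch below is a single-process toy, not the system described in the paper: the class and function names, the linear model, the learning rate and the synthetic data are all assumptions chosen for illustration. The weights are split across two parameter shards, and three model replicas compute gradients on their own data partitions from a shared, slightly stale snapshot before pushing updates to the shard that owns each slice.

```python
import random


class ParameterShard:
    """One partition of the parameter server: owns a slice of the weights."""

    def __init__(self, values, learning_rate):
        self.values = list(values)
        self.learning_rate = learning_rate

    def pull(self):
        return list(self.values)

    def push(self, gradient_slice):
        # Apply a gradient slice sent by any replica.
        for i, g in enumerate(gradient_slice):
            self.values[i] -= self.learning_rate * g


def gradient(weights, batch):
    """Mean-squared-error gradient for a linear model y = w . x."""
    grad = [0.0] * len(weights)
    for x, y in batch:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(batch)
    return grad


def train(shards, replica_data, rounds):
    for _ in range(rounds):
        # Every replica works from the same snapshot, so all but the first
        # push in a round is computed from stale parameters.
        snapshot = [w for shard in shards for w in shard.pull()]
        grads = [gradient(snapshot, data) for data in replica_data]
        for grad in grads:
            offset = 0
            for shard in shards:
                n = len(shard.values)
                shard.push(grad[offset:offset + n])
                offset += n
    return [w for shard in shards for w in shard.pull()]


# Synthetic, noise-free data from a known weight vector (illustrative only).
rng = random.Random(0)
true_w = [2.0, -3.0, 0.5, 1.0]
samples = []
for _ in range(60):
    x = [rng.uniform(-1.0, 1.0) for _ in true_w]
    samples.append((x, sum(w * xi for w, xi in zip(true_w, x))))

shards = [ParameterShard([0.0, 0.0], 0.05), ParameterShard([0.0, 0.0], 0.05)]
replica_data = [samples[0:20], samples[20:40], samples[40:60]]
learned = train(shards, replica_data, rounds=500)
```

Model-level parallelism, where one large model is itself partitioned across machines, is not shown here.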

Open Problem: Better Bounds for Online Logistic Regression
Brendan McMahan* and Matthew Streeter*, COLT/ICML’12 Joint Open Problem Session, JMLR: Workshop and Conference Proceedings.
One of the goals of research at Google is to help point out important open problems–precise questions that are interesting academically but also have important practical ramifications. This open problem is about logistic regression, a widely used algorithm for predicting probabilities (what is the probability an email message is spam, or that a search ad will be clicked). We show that in the simple one-dimensional case, much better results are possible than current theoretical analysis suggests, and we ask whether our results can be generalized to arbitrary logistic regression problems.

Spectral Learning of General Weighted Automata via Constrained Matrix Completion
Borja Balle and Mehryar Mohri*, NIPS 2012.
Learning weighted automata from finite samples drawn from an unknown distribution is a central problem in machine learning and computer science in general, with a variety of applications in text and speech processing, bioinformatics, and other areas. This paper presents a new family of algorithms for tackling this problem for which it proves learning guarantees. The algorithms introduced combine ideas from two different domains: matrix completion and spectral methods.

Machine Translation

Improved Domain Adaptation for Statistical Machine Translation
Wei Wang*, Klaus Macherey*, Wolfgang Macherey*, Franz Och* and Peng Xu*, [AMTA'12]
Research in domain adaptation for machine translation has mostly focused on a single domain. We present a simple and effective domain adaptation infrastructure that makes an MT system with a single translation model capable of providing adapted, close-to-upper-bound domain-specific accuracy while preserving the generic translation accuracy. Large-scale experiments on 20 language pairs for patent and generic domains show the viability of our approach.

Multimedia and Computer Vision

Reconstructing the World’s Museums
Jianxiong Xiao and Yasutaka Furukawa*, [ECCV '12]
Virtual navigation and exploration of large indoor environments (e.g., museums) have been so far limited to either blueprint-style 2D maps that lack photo-realistic views of scenes, or ground-level image-to-image transitions, which are immersive but ill-suited for navigation. This paper presents a novel vision-based 3D reconstruction and visualization system to automatically produce clean and well-regularized texture-mapped 3D models for large indoor scenes, from ground-level photographs and 3D laser points. For the first time, we enable users to easily browse a large scale indoor environment from a bird’s-eye view, locate specific room interiors, fly into a place of interest, view immersive ground-level panoramas, and zoom out again, all with seamless 3D transitions.

The intervalgram: An audio feature for large-scale melody recognition
Thomas C. Walters*, David Ross*, Richard F. Lyon*, [CMMR'12]
Intervalgrams are small images that summarize the structure of short segments of music by looking at the musical intervals between the notes present in the music. We use them for finding cover songs – different pieces of music that share the same underlying composition. We do this by comparing ‘heatmaps’ which look at the similarity between intervalgrams from different pieces of music over time. If we see a strong diagonal line in the heatmap, it’s good evidence that the songs are musically similar.

General and Nested Wiberg Minimization
Dennis Strelow*, [CVPR'12]
Eriksson and van den Hengel’s CVPR 2010 paper showed that Wiberg’s least squares matrix factorization, which effectively eliminates one matrix from the factorization problem, could be applied to the harder case of L1 factorization. Our paper generalizes their approach beyond factorization to general nonlinear problems in two sets of variables, like perspective structure-from-motion. We also show that with our generalized method, one Wiberg minimization can be nested inside another, effectively eliminating two of three sets of unknowns, and we demonstrate this idea using projective structure-from-motion.

Calibration-Free Rolling Shutter Removal
Matthias Grundmann*, Vivek Kwatra*, Daniel Castro, Irfan Essa*, International Conference on Computational Photography ’12. Best paper.
Mobile phones and current-generation DSLRs contain an electronic rolling shutter, capturing each frame one row of pixels at a time. Consequently, if the camera moves during capture, it will cause image distortions ranging from shear to wobble. We propose a calibration-free solution based on a novel parametric mixture model to correct these rolling shutter distortions in videos, enabling real-time rolling shutter rectification as part of YouTube’s video stabilizer.

Natural Language Processing

Vine Pruning for Efficient Multi-Pass Dependency Parsing
Alexander Rush, Slav Petrov*, The 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL ’12), Best Paper Award.
Being able to accurately analyze the grammatical structure of sentences is crucial for language understanding applications such as machine translation or question answering. In this paper we present a method that is up to 200 times faster than existing methods and enables the grammatical analysis of text in large-scale applications. The key idea is to perform the analysis in multiple coarse-to-fine passes, resolving easy ambiguities first and tackling the harder ones later on.

Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure
Oscar Tackstrom, Ryan McDonald*, Jakob Uszkoreit*, North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL ’12), Best Student Paper Award.
This paper studies how to build meaningful cross-lingual word clusters, i.e., clusters containing lexical items from two languages that are coherent along some abstract dimension. This is done by coupling distributional statistics learned from huge amounts of language-specific data with constraints generated from parallel corpora. The resulting clusters are used to improve the accuracy of multi-lingual syntactic parsing for languages without any training resources.

Networks

How to Split a Flow
Tzvika Hartman*, Avinatan Hassidim*, Haim Kaplan*, Danny Raz*, Michal Segalov*, [INFOCOM '12]
Decomposing a flow into a small number of paths is an important task that arises in various network optimization mechanisms. In this paper we develop an approximation algorithm for this problem that has both provable worst-case performance guarantees and good practical behavior.
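For readers unfamiliar with the task, the following Python sketch shows what a flow decomposition looks like. It uses a plain greedy heuristic (repeatedly peel off a path by following the edge with the most remaining flow) and is not the approximation algorithm from the paper; the function name, the edge-dict input and the example network are assumptions for illustration. The input is assumed to be an acyclic flow that satisfies conservation at every intermediate node.

```python
from collections import defaultdict


def decompose_flow(flow, source, sink):
    """Split an s-t flow into (path, amount) pairs.

    flow: dict mapping (u, v) -> positive integer flow on that edge.
    Assumes flow conservation and no cycles; neither is checked.
    """
    residual = dict(flow)
    paths = []
    while True:
        # Rebuild adjacency from edges that still carry flow.
        adj = defaultdict(list)
        for (u, v), amount in residual.items():
            if amount > 0:
                adj[u].append(v)
        if not adj[source]:
            return paths
        # Walk to the sink, always taking the edge with the most flow left.
        path = [source]
        while path[-1] != sink:
            here = path[-1]
            path.append(max(adj[here], key=lambda v: residual[(here, v)]))
        edges = list(zip(path, path[1:]))
        bottleneck = min(residual[e] for e in edges)
        for e in edges:
            residual[e] -= bottleneck
        paths.append((path, bottleneck))


example_flow = {
    ("s", "a"): 3, ("s", "b"): 2,
    ("a", "b"): 1, ("a", "t"): 2,
    ("b", "t"): 3,
}
paths = decompose_flow(example_flow, "s", "t")
# paths == [(['s', 'a', 't'], 2), (['s', 'b', 't'], 2), (['s', 'a', 'b', 't'], 1)]
```

Each iteration removes all flow from at least one edge, so the loop ends after at most as many paths as there are edges. Finding the smallest possible number of paths is the hard part that the paper addresses.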

Deadline-Aware Datacenter TCP (D2TCP)
Balajee Vamanan, Jahangir Hasan*, T. N. Vijaykumar, [SIGCOMM '12]
Some of our most important products like search and ads operate under soft-real-time constraints. They are architected and fine-tuned to return results to users within a few hundred milliseconds. Deadline-Aware Datacenter TCP is a research effort into making the datacenter networks deadline aware, thus improving the performance of such key applications.

Trickle: Rate Limiting YouTube Video Streaming
Monia Ghobadi, Yuchung Cheng*, Ankur Jain*, Matt Mathis* [USENIX '12]
Trickle is a server-side mechanism that streams YouTube video smoothly to reduce bursts and buffer bloat. It paces the video stream by placing an upper bound on TCP’s congestion window based on the streaming rate and the round-trip time. In initial evaluation Trickle reduces the TCP loss rate by up to 43% and the RTT by up to 28%. Given the promising results we are deploying Trickle to all YouTube servers.
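The arithmetic behind such a bound is the bandwidth-delay product: a sender that transmits at most one congestion window per round trip cannot exceed the target rate if the window is capped at rate × RTT. A minimal sketch follows; the function name and the 1460-byte segment size are assumptions, and any extra headroom Trickle applies on top of this product is not modeled.

```python
import math


def cwnd_clamp_segments(rate_bps, rtt_s, mss_bytes=1460):
    """Upper bound on the congestion window, in segments.

    rate_bps: target streaming rate in bits per second.
    rtt_s: round-trip time in seconds.
    mss_bytes: assumed maximum segment size.
    """
    bdp_bytes = rate_bps / 8 * rtt_s  # bytes that fit in one round trip
    return max(1, math.ceil(bdp_bytes / mss_bytes))


# A 2 Mbit/s stream over a 100 ms path needs at most 18 segments in flight.
clamp = cwnd_clamp_segments(2_000_000, 0.1)
```

Because the bound scales with the round-trip time, it has to be recomputed as RTT estimates change.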

Social Systems

Look Who I Found: Understanding the Effects of Sharing Curated Friend Groups
Lujun Fang*, Alex Fabrikant*, Kristen LeFevre*, [Web Science '12]. Best Student Paper award.
In this paper, we studied the impact of the Google+ circle-sharing feature, which allows individual users to share (publicly and privately) pre-curated groups of friends and contacts. We specifically investigated the impact on the growth and structure of the Google+ social network. In the course of the analysis, we identified two natural categories of shared circles (“communities” and “celebrities”). We also observed that the circle-sharing feature is associated with the accelerated densification of community-type circles.

Software Engineering

AddressSanitizer: A Fast Address Sanity Checker
Konstantin Serebryany*, Derek Bruening*, Alexander Potapenko*, Dmitry Vyukov*, [USENIX ATC '12].
The paper “AddressSanitizer: A Fast Address Sanity Checker” describes a dynamic tool that finds memory corruption bugs in C or C++ programs with only a 2x slowdown. The major feature of AddressSanitizer is simplicity — this is why the tool is very fast.

Speech

Japanese and Korean Voice Search
Mike Schuster*, Kaisuke Nakajima*, IEEE International Conference on Acoustics, Speech, and Signal Processing [ICASSP'12].
“Japanese and Korean voice search” explains in detail how the Android voice search systems for these difficult languages were developed. We describe how to segment statistically to be able to handle infinite vocabularies without out-of-vocabulary words, how to handle the lack of spaces between words for language modeling and dictionary generation, and how to deal best with multiple ambiguities during evaluation scoring of reference transcriptions against hypotheses. The combination of techniques presented led to high quality speech recognition systems–as of 6/2013 Japanese and Korean are #2 and #3 in terms of traffic after the US.

Google’s Cross-Dialect Arabic Voice Search
Fadi Biadsy*, Pedro J. Moreno*, Martin Jansche*, IEEE International Conference on Acoustics, Speech, and Signal Processing [ICASSP 2012].
This paper describes Google’s automatic speech recognition systems for recognizing several Arabic dialects spoken in the Middle East, with the potential to reach more than 125 million users. We suggest solutions for challenges specific to Arabic, such as the diacritization problem, where short vowels are not written in Arabic text. We conduct experiments to identify the optimal manner in which acoustic data should be clustered among dialects.

Deep Neural Networks for Acoustic Modeling in Speech Recognition
Geoffrey Hinton*, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew W. Senior*, Vincent Vanhoucke*, Patrick Nguyen, Tara Sainath, Brian Kingsbury, Signal Processing Magazine (2012).
Survey paper on the DNN breakthrough in automatic speech recognition accuracy.

Statistics

Empowering Online Advertisements by Empowering Viewers with the Right to Choose
Max Pashkevich*, Sundar Dorai-Raj*, Melanie Kellar*, Dan Zigmond*, Journal of Advertising Research, vol. 52 (2012).
YouTube’s TrueView in-stream video advertising format (a form of skippable in-stream ads) can improve the online video viewing experience for users without sacrificing advertising value for advertisers or content owners.

Structured Data

Efficient Spatial Sampling of Large Geographical Tables
Anish Das Sarma*, Hongrae Lee*, Hector Gonzalez*, Jayant Madhavan*, Alon Halevy*, [SIGMOD '12].
This paper presents fundamental results for the “thinning problem”: determining appropriate samples of data to be shown on specific geographical regions and zoom levels. This problem is widely applicable to cloud-based geographic visualization systems such as Google Maps and Fusion Tables, and the developed algorithms are part of the Fusion Tables backend. The SIGMOD 2012 paper was selected among the best papers of the conference, and invited to a special best-papers issue of TODS.

Systems

Spanner: Google’s Globally-Distributed Database
James C. Corbett*, Jeffrey Dean*, Michael Epstein*, Andrew Fikes*, Christopher Frost*, JJ Furman*, Sanjay Ghemawat*, Andrey Gubarev*, Christopher Heiser*, Peter Hochschild*, Wilson Hsieh*, Sebastian Kanthak*, Eugene Kogan*, Hongyi Li*, Alexander Lloyd*, Sergey Melnik*, David Mwaura*, David Nagle*, Sean Quinlan*, Rajesh Rao*, Lindsay Rolig*, Dale Woodford*, Yasushi Saito*, Christopher Taylor*, Michal Szymaniak*, Ruth Wang*, [OSDI '12]
This paper shows how a new time API and its implementation can provide the abstraction of tightly synchronized clocks, even on a global scale. We describe how we used this technology to build a globally-distributed database that supports a variety of powerful features: non-blocking reads in the past, lock-free snapshot transactions, and atomic schema changes.
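One way to picture a time API of this kind is a clock that returns an interval rather than a single instant, together with a commit rule that waits out the uncertainty. The Python sketch below is a simplified illustration under stated assumptions: the class and method names are hypothetical, the uncertainty bound is a fixed constant, and a local monotonic clock stands in for real synchronized time sources.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeInterval:
    earliest: float
    latest: float


class UncertainClock:
    """Clock whose readings are intervals assumed to contain the true time."""

    def __init__(self, epsilon, read=time.monotonic, sleep=time.sleep):
        self.epsilon = epsilon  # assumed bound on clock error, in seconds
        self._read = read
        self._sleep = sleep

    def now(self):
        t = self._read()
        return TimeInterval(t - self.epsilon, t + self.epsilon)

    def definitely_after(self, timestamp):
        # True only once the timestamp has passed on every possible clock.
        return self.now().earliest > timestamp

    def wait_until_after(self, timestamp):
        while not self.definitely_after(timestamp):
            self._sleep(self.epsilon / 10)


def commit(clock):
    """Pick a commit timestamp, then wait until it is safely in the past."""
    timestamp = clock.now().latest
    clock.wait_until_after(timestamp)
    return timestamp
```

With this rule a commit waits roughly twice the uncertainty bound, so tighter clock synchronization translates directly into lower commit latency. Once the wait completes, any transaction that starts afterwards will pick a larger timestamp.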
