August, 2016 | Google Data

Archive for August, 2016

Meet Parsey’s Cousins: Syntax for 40 languages, plus new SyntaxNet capabilities

August 8th, 2016 | by Research Blog | published in Google Research

Posted by Chris Alberti, Dave Orr & Slav Petrov, Google Natural Language Understanding Team

Just in time for ACL 2016, we are pleased to announce that Parsey McParseface, released in May as part of SyntaxNet and the basis for the Cloud Natural Language API, now has 40 cousins! Parsey’s Cousins is a collection of pretrained syntactic models for 40 languages, capable of analyzing the native language of more than half of the world’s population at often unprecedented accuracy. To better address the linguistic phenomena occurring in these languages we have endowed SyntaxNet with new abilities for Text Segmentation and Morphological Analysis.

When we released Parsey, we were already planning to expand to more languages, and it soon became clear that this was both urgent and important, because researchers were having trouble creating top-notch SyntaxNet models for other languages.

The reason for that is a little bit subtle. SyntaxNet, like other TensorFlow models, has a lot of knobs to turn, which affect accuracy and speed. These knobs are called hyperparameters, and control things like the learning rate and its decay, momentum, and random initialization. Because neural networks are more sensitive to the choice of these hyperparameters than many other machine learning algorithms, picking the right hyperparameter setting is very important. Unfortunately there is no tested and proven way of doing this and picking good hyperparameters is mostly an empirical science — we try a bunch of settings and see what works best.

An additional challenge is that training these models can take a long time, several days on very fast hardware. Our solution is to train many models in parallel via MapReduce, and when one looks promising, train a bunch more models with similar settings to fine-tune the results. This can really add up — on average, we train more than 70 models per language. The plot below shows how the accuracy varies depending on the hyperparameters as training progresses. The best models are up to 4% absolute more accurate than ones trained without hyperparameter tuning.

Held-out set accuracy for various English parsing models with different hyperparameters (each line corresponds to one training run with specific hyperparameters). In some cases training is a lot slower and in many cases a suboptimal choice of hyperparameters leads to significantly lower accuracy. We are releasing the best model that we were able to train for each language.
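
The search strategy described above (try many settings, then train more models near the promising ones) can be sketched in a few lines of Ruby. This is only an illustration under loud assumptions: `toy_accuracy` is an invented stand-in for training a model and measuring held-out accuracy, the parameter ranges and trial counts are made up, and everything runs sequentially rather than in parallel.

```ruby
# Invented stand-in for "train a model and measure held-out accuracy".
# It simply peaks near learning_rate = 0.08 and momentum = 0.9.
def toy_accuracy(config)
  lr_penalty = (Math.log10(config[:learning_rate]) - Math.log10(0.08))**2
  momentum_penalty = (config[:momentum] - 0.9)**2
  0.93 - 0.05 * lr_penalty - 0.5 * momentum_penalty
end

# Broad phase: log-uniform learning rate, uniform momentum.
def random_config(rng)
  { learning_rate: 10**rng.rand(-4.0..0.0), momentum: rng.rand(0.5..0.99) }
end

# Refinement phase: small perturbation of a promising configuration.
def nearby_config(config, rng)
  {
    learning_rate: config[:learning_rate] * 10**rng.rand(-0.2..0.2),
    momentum: (config[:momentum] + rng.rand(-0.03..0.03)).clamp(0.0, 0.99)
  }
end

def tune(rng, broad_trials: 50, refine_trials: 20)
  broad = Array.new(broad_trials) { random_config(rng) }
  best_broad = broad.max_by { |config| toy_accuracy(config) }
  refined = Array.new(refine_trials) { nearby_config(best_broad, rng) }
  # Keep the broad winner in the pool so refinement can never make things worse.
  best = (refined + [best_broad]).max_by { |config| toy_accuracy(config) }
  { best_broad: best_broad, best: best }
end

result = tune(Random.new(42))
puts "broad search:     #{toy_accuracy(result[:best_broad]).round(4)}"
puts "after refinement: #{toy_accuracy(result[:best]).round(4)}"
```

In a real setup each call to the accuracy function would be a full training run, which is why running the trials in parallel matters so much.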

In order to do a good job at analyzing the grammar of other languages, it was not sufficient to just fine-tune our English setup. We also had to expand the capabilities of SyntaxNet. The first extension is a model for text segmentation, which is the task of identifying word boundaries. In languages like English, this isn’t very hard — you can mostly look for spaces and punctuation. In Chinese, however, this can be very challenging, because words are not separated by spaces. To correctly analyze dependencies between Chinese words, SyntaxNet needs to understand text segmentation — and now it does.

Analysis of a Chinese string into a parse tree showing dependency labels, word tokens, and parts of speech (read top to bottom for each word token).
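
To get a feel for why segmentation is hard without spaces, here is a toy segmenter using forward maximum matching, a classic dictionary-based baseline. It is not the learned model that SyntaxNet uses; the sentence and the tiny dictionaries are invented for illustration.

```ruby
require "set"

# Greedy forward maximum matching: at each position take the longest
# dictionary word, falling back to a single character.
def max_match(text, dictionary, max_word_length: 4)
  tokens = []
  i = 0
  while i < text.length
    length = [max_word_length, text.length - i].min
    length -= 1 while length > 1 && !dictionary.include?(text[i, length])
    tokens << text[i, length]
    i += length
  end
  tokens
end

sentence = "我喜欢北京烤鸭" # roughly "I like Peking duck"

with_compound = Set.new(%w[我 喜欢 北京 烤鸭 北京烤鸭])
without_compound = Set.new(%w[我 喜欢 北京 烤鸭])

tokens_a = max_match(sentence, with_compound)
tokens_b = max_match(sentence, without_compound)
puts tokens_a.join(" / ")
puts tokens_b.join(" / ")
```

The same string segments differently depending on what the dictionary knows, which is one reason a model that learns segmentation from context tends to do better than a fixed word list.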

The second extension is a model for morphological analysis. Morphology is a language feature that is poorly represented in English. It describes inflection: i.e., how the grammatical function and meaning of the word changes as its spelling changes. In English, we add an -s to a word to indicate plurality. In Russian, a heavily inflected language, morphology can indicate number, gender, whether the word is the subject or object of a sentence, possessives, prepositional phrases, and more. To understand the syntax of a sentence in Russian, SyntaxNet needs to understand morphology — and now it does.

Parse trees showing dependency labels, parts of speech, and morphology.

As you might have noticed, the parse trees for all of the sentences above look very similar. This is because we follow the content-head principle, under which dependencies are drawn between content words, with function words becoming leaves in the parse tree. This idea was developed by the Universal Dependencies project in order to increase parallelism between languages. Parsey’s Cousins are trained on treebanks provided by this project and are designed to be cross-linguistically consistent and thus easier to use in multi-lingual language understanding applications.

Using the same set of labels across languages can help us understand how sentences in different languages, or variations in the same language, convey the same meaning. In all of the above examples, the root indicates the main verb of the sentence and there is a passive nominal subject (indicated by the arc labeled with ‘nsubjpass’) and a passive auxiliary (‘auxpass’). If you look closely, you will also notice some differences because the grammar of each language differs. For example, English uses the preposition ‘by,’ where Russian uses morphology to mark that the phrase ‘the publisher (издателем)’ is in instrumental case — the meaning is the same, it is just expressed differently.
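
As a rough illustration of the content-head principle, the sketch below encodes one passive English sentence as a list of tokens with head pointers and checks that the function words end up as leaves. The sentence is hypothetical (it is not taken from the figures), and the `nmod` and `case` labels for the by-phrase are an assumption about how such a phrase would be annotated.

```ruby
# Each token points at its head; head 0 marks the root of the sentence.
tokens = [
  { id: 1, form: "The",       head: 2, label: "det" },
  { id: 2, form: "book",      head: 4, label: "nsubjpass" },
  { id: 3, form: "was",       head: 4, label: "auxpass" },
  { id: 4, form: "published", head: 0, label: "root" },
  { id: 5, form: "by",        head: 7, label: "case" },
  { id: 6, form: "the",       head: 7, label: "det" },
  { id: 7, form: "publisher", head: 4, label: "nmod" }
]

def root_of(tokens)
  tokens.find { |token| token[:head].zero? }
end

# A leaf is a token that no other token names as its head.
def leaves_of(tokens)
  heads = tokens.map { |token| token[:head] }
  tokens.reject { |token| heads.include?(token[:id]) }
end

root = root_of(tokens)
leaf_forms = leaves_of(tokens).map { |token| token[:form] }
puts "root: #{root[:form]}"
puts "leaves: #{leaf_forms.join(', ')}"
```

The content words (book, published, publisher) carry the structure, while the determiners, the auxiliary and the preposition hang off them as leaves.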

Google has been involved in the Universal Dependencies project since its inception and we are very excited to be able to bring together our efforts on datasets and modeling. We hope that this release will facilitate research progress in building computer systems that can understand all of the world’s languages.

Parsey’s Cousins can be found on GitHub, along with Parsey McParseface and SyntaxNet.


Learn how Google’s research tools can enhance your content

August 8th, 2016 | by John A. Smith | published in Google Adsense

When you know what the world’s talking about, you can participate in the conversation. But the online world moves quickly, so if you want to keep the crowds coming back to your site, your content needs to move with it.

Google’s News Lab is Google’s effort to empower innovation at the intersection of technology and media. Its mission is to collaborate with journalists and entrepreneurs to build the future of media. An important part of that is ensuring that Google tools are available and easy-to-use for journalists around the world.

Google News Lab offers lessons on how to use Google tools relevant to publishers’ needs. Say something newsworthy at a sports event is grabbing headlines: Google tools can ensure that you’re informed. It’s then over to you to draw on this story and incorporate it into your content.

A great way to keep your finger on the pulse is Google Alerts, a tool that allows you to follow developing stories from your inbox. Simply select the topics you want to follow and have emails sent directly to your inbox any time that Google finds new results for this topic.

Google Alerts removes the need for you to keep checking back on a topic and simplifies the journalistic process by having all your information come from one reliable source. Once you’re using Google Alerts to stay informed about an event, you can ensure that the content on your site stays current and keeps users coming back for more.

If you want to take a more proactive approach, Google Trends gives you access to global data to power insightful storytelling. One way to use this data is to look at what users are searching for globally. You can select topic areas and drill down into regions for those topics, so that you can use the data relevant to you and your users to create the most timely and engaging content.

Google tools are designed to help you create great content. The more topical your content, the more likely you are to keep drawing the crowds.

Get started with Google News Lab today.

New to AdSense? Sign up now and turn your passion into profit

Posted by Jay Castro,
From the AdSense team
@jayciro


Data Exploration with Google Data Studio

August 8th, 2016 | by Adam Singer | published in Google Analytics

If you analyze and visualize data often enough, chances are good that at some point you have felt “analyst’s block” (a less famous version of writer’s block). We thought you might feel that way at times, so here are some ideas to help you explore your data and build great Reports in Data Studio.

In this post we will use a sample dataset from the U.S. Census Bureau. The data is about annual operating expenses of U.S. Retail, Accommodation, and Food Services between 2006 and 2014. The dataset is not complex, just 10 types of businesses and their expenses in that time period. Here is the Google Sheet data that was connected to Data Studio.

Now the important question: what should you do first when opening a blank canvas? Below is a set of three charts that will often give you some insight into the nature of the data. They will help you explore the data and build an insightful report. You will probably also have requests coming from your audience, but these charts can help both with your own understanding and with refining those requests.

Below is a quick explanation of each chart and how they can bring insights into your data:

Line chart: this is extremely useful if you have time series data, as it helps you quickly identify trends over time. It is recommended that you use not only the time dimension (which would aggregate all other dimensions), but also segment the data by a second dimension, to see how different groups behave over time. In this case we are using the Business Type to segment the main trend. Once you do that, you will see one line per value (see legend above the chart). As you can see, 10 lines is a bit crowded, so you might want to use only 6-8 lines.
Table: it is hard to find a better way to get a feeling for the data than a table! To help visualize the stats, you can also use bar charts and heatmaps inside tables (see blue bars in the second column and red heatmap in the third column). They are helpful visual cues, especially in tables with lots of data.
Scatter chart: scatter charts are great for understanding how two metrics correlate. In the screenshot above you will also note a trend line (green) in the chart; it shows that as the expense grows, the YoY Change tends to have a lower value, that is, the two metrics are negatively correlated.
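
If you want to sanity-check a YoY Change metric outside of Data Studio, the calculation itself is short. The sketch below is a minimal version in Ruby; the expense figures are invented for illustration and are not taken from the Census dataset.

```ruby
# Year-over-year change in percent, keyed by the later year of each pair.
def yoy_changes(values_by_year)
  values_by_year.sort.each_cons(2).map do |(_, previous), (year, current)|
    [year, ((current - previous) / previous.to_f * 100).round(1)]
  end.to_h
end

# Hypothetical annual expenses for one business type.
expenses = { 2012 => 200.0, 2013 => 210.0, 2014 => 231.0 }

changes = yoy_changes(expenses)
changes.each { |year, percent| puts "#{year}: #{percent}%" }
```

The first year has no entry because there is no earlier year to compare it with.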

Hopefully those three charts will help you get a feeling for the data. You can also take a look at the Report at https://goo.gl/QqNFWn

Happy visualizing!

Posted by Daniel Waisberg, Analytics Advocate


What’s the deal with programmatic deals?

August 8th, 2016 | by Stephen Kliff | published in Google DoubleClick

A few weeks back, Paul Muret, Google’s VP of Display, Video and Analytics, made several announcements about enhancements to the DoubleClick platform to support Programmatic Direct deals. Paul also shared that the number of Programmatic Direct deals transacted on DoubleClick Ad Exchange tripled in 2015¹ alone.

Everyone knows that programmatic is growing and is increasingly becoming the way we transact digital advertising. But what’s the deal with programmatic deals, or as we say, Programmatic Direct?

To answer that question, we dug into the data to analyze the key drivers of Programmatic Direct growth on our platforms. You can explore some of the data for yourself, with our interactive report: The State of Programmatic Direct.

Looking through Ad Exchange data from October 2014 to December 2015, one thing became incredibly clear: In every region, across every platform and publisher category, there are fascinating trends of adoption to be found.

Programmatic Direct has gone mainstream…

Depending on whom you ask, the history of programmatic exchanges can be traced back nearly a decade. However, it was only a few years ago that some of the world’s largest global spenders started making big commitments to programmatic and “programmatic” became ANA’s Marketing Word of the Year.

It hasn’t taken as long for advertisers and publishers to use programmatic technologies to transact the deals they’d traditionally buy and sell directly. In reviewing the data from DoubleClick Ad Exchange, we found that:

90+ marketers on the Ad Age Top 100 Global Marketers list made Programmatic Direct deals in 2015².
More than half of the publishers in the US comScore top 50 list from December 2015 offered their inventory through Programmatic Direct deals³.

… On every screen

Programmatic may have been born on the desktop, but Programmatic Direct is taking off on mobile — probably not a surprise if you’re reading this article on your phone. Programmatic Direct impressions served on mobile and tablet grew 4x faster than desktop in the period surveyed⁴.

… In every region

The growth of Programmatic Direct isn’t limited to any specific country or region. So, where is it well adopted and where is it growing fast?

Programmatic Direct impressions in Ukraine, Turkey and Spain each more than doubled in just 12 months⁵.
Japan was the strongest adopter in APAC but Taiwan and Indonesia saw Programmatic Direct impressions grow more than 20% monthly⁶.
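
To put the two growth figures on the same scale, it helps to convert between monthly and yearly rates. The sketch below does that arithmetic, assuming (which the data above does not state) that growth compounds at a constant monthly rate for a full 12 months.

```ruby
# Growth multiple after compounding a monthly rate for a number of months.
def multiple_after(monthly_rate, months)
  (1 + monthly_rate)**months
end

# Constant monthly rate needed to reach a given multiple in that many months.
def monthly_rate_for(multiple, months)
  multiple**(1.0 / months) - 1
end

yearly_multiple = multiple_after(0.20, 12)  # 20% monthly compounds to roughly 8.9x
doubling_rate = monthly_rate_for(2.0, 12)   # doubling in a year needs roughly 5.9% monthly

puts "20% monthly for 12 months: #{yearly_multiple.round(2)}x"
puts "monthly rate to double in 12 months: #{(doubling_rate * 100).round(1)}%"
```

Under that assumption, sustained 20% monthly growth is a much steeper curve than doubling in a year.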

Stay tuned over the next few weeks as we dig through the data to share more insights. You can also explore our interactive research report to find additional trends, like how quickly Programmatic Direct impressions on mobile apps grew for game publishers in Japan. To get started, take a look at the infographic below.

Posted by Carlo Acenas, Associate Product Marketing Manager, and Yamini Gupta, Sr. Product Marketing Manager

¹ DoubleClick Ad Exchange data, year end 2014 to year end 2015.
² DoubleClick Ad Exchange data, Oct 2014 – Dec 2015. Minimum $1K spend.
³ DoubleClick Ad Exchange data, Oct 2014 – Dec 2015. Cross-referenced with comScore 50 US list, December 2015.
⁴ DoubleClick Ad Exchange data, Oct 2014 – Dec 2015.
⁵ DoubleClick Ad Exchange data, Oct 2014 – Dec 2015.
⁶ DoubleClick Ad Exchange data, Oct 2014 – Dec 2015.



ACL 2016 & Research at Google

August 7th, 2016 | by Research Blog | published in Google Research

Posted by Slav Petrov, Research Scientist

This week, Berlin hosts the 2016 Annual Meeting of the Association for Computational Linguistics (ACL 2016), the premier conference of the field of computational linguistics, covering a broad spectrum of diverse research areas that are concerned with computational approaches to natural language. As a leader in Natural Language Processing (NLP) and a Platinum Sponsor of the conference, Google will be on hand to showcase research interests that include syntax, semantics, discourse, conversation, multilingual modeling, sentiment analysis, question answering, summarization, and generally building better learners using labeled and unlabeled data, state-of-the-art modeling, and learning from indirect supervision.

Our systems are used in numerous ways across Google, impacting user experience in search, mobile, apps, ads, translate and more. Our work spans the range of traditional NLP tasks, with general-purpose syntax and semantic algorithms underpinning more specialized systems.
Our researchers are experts in natural language processing and machine learning, and combine methodological research with applied science, and our engineers are equally involved in long-term research efforts and driving immediate applications of our technology.

If you’re attending ACL 2016, we hope that you’ll stop by the booth to check out some demos, meet our researchers and discuss projects and opportunities at Google that go into solving interesting problems for billions of people. Learn more about Google research being presented at ACL 2016 below, and visit the Natural Language Understanding Team page at g.co/NLUTeam.

Papers
Generalized Transition-based Dependency Parsing via Control Parameters
Bernd Bohnet, Ryan McDonald, Emily Pitler, Ji Ma

Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning
Yulia Tsvetkov, Manaal Faruqui, Wang Ling (Google DeepMind), Chris Dyer (Google DeepMind)

Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning (TACL)
Manaal Faruqui, Ryan McDonald, Radu Soricut

Many Languages, One Parser (TACL)
Waleed Ammar, George Mulcaire, Miguel Ballesteros, Chris Dyer (Google DeepMind)*, Noah A. Smith

Latent Predictor Networks for Code Generation
Wang Ling (Google DeepMind), Phil Blunsom (Google DeepMind), Edward Grefenstette (Google DeepMind), Karl Moritz Hermann (Google DeepMind), Tomáš Kočiský (Google DeepMind), Fumin Wang (Google DeepMind), Andrew Senior (Google DeepMind)

Collective Entity Resolution with Multi-Focal Attention
Amir Globerson, Nevena Lazic, Soumen Chakrabarti, Amarnag Subramanya, Michael Ringgaard, Fernando Pereira

Plato: A Selective Context Model for Entity Resolution (TACL)
Nevena Lazic, Amarnag Subramanya, Michael Ringgaard, Fernando Pereira

WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia
Daniel Hewlett, Alexandre Lacoste, Llion Jones, Illia Polosukhin, Andrew Fandrianto, Jay Han, Matthew Kelcey, David Berthelot

Stack-propagation: Improved Representation Learning for Syntax
Yuan Zhang, David Weiss

Cross-lingual Models of Word Embeddings: An Empirical Comparison
Shyam Upadhyay, Manaal Faruqui, Chris Dyer (Google DeepMind), Dan Roth

Globally Normalized Transition-Based Neural Networks (Outstanding Papers Session)
Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, Michael Collins

Posters
Cross-lingual projection for class-based language models
Beat Gfeller, Vlad Schogol, Keith Hall

Synthesizing Compound Words for Machine Translation
Austin Matthews, Eva Schlinger*, Alon Lavie, Chris Dyer (Google DeepMind)*

Cross-Lingual Morphological Tagging for Low-Resource Languages
Jan Buys, Jan A. Botha

Workshops
1st Workshop on Representation Learning for NLP
Keynote Speakers include: Raia Hadsell (Google DeepMind)
Workshop Organizers include: Edward Grefenstette (Google DeepMind), Phil Blunsom (Google DeepMind), Karl Moritz Hermann (Google DeepMind)
Program Committee members include: Tomáš Kočiský (Google DeepMind), Wang Ling (Google DeepMind), Ankur Parikh (Google), John Platt (Google), Oriol Vinyals (Google DeepMind)

1st Workshop on Evaluating Vector-Space Representations for NLP
Contributed Papers:
Problems With Evaluation of Word Embeddings Using Word Similarity Tasks
Manaal Faruqui, Yulia Tsvetkov, Pushpendre Rastogi, Chris Dyer (Google DeepMind)*

Correlation-based Intrinsic Evaluation of Word Vector Representations
Yulia Tsvetkov, Manaal Faruqui, Chris Dyer (Google DeepMind)

SIGFSM Workshop on Statistical NLP and Weighted Automata
Contributed Papers:
Distributed representation and estimation of WFST-based n-gram models
Cyril Allauzen, Michael Riley, Brian Roark

Pynini: A Python library for weighted finite-state grammar compilation
Kyle Gorman

* Work completed at CMU


Making Rubyists more comfortable on Google Cloud Platform

August 5th, 2016 | by Open Source Programs Office | published in Google Open Source

One of the many open source efforts at Google is the set of Google Cloud Platform (GCP) native libraries for our most popular languages. One of these libraries is the gcloud-ruby project on GitHub, which is released as the gcloud gem on rubygems.org. There are several gems for accessing Google Cloud Platform resources from Ruby, but this gem is different. It is hand-coded by Rubyists for Rubyists, and that has some distinct advantages.

Many of us have had experience working with libraries that are clearly ported from another language. I usually talk about them as Ruby with a Java accent or Python with a Perl accent. Generally they work just fine but you can run into some low level friction — sometimes things just don’t feel right. Native gems written by members of the community solve this problem. In the case of gcloud-ruby there are some really concrete examples.

First, gcloud-ruby uses syntax that is similar to other popular Ruby libraries. For example, the syntax for specifying a table schema in BigQuery (Google Cloud Platform’s very large scale data warehouse) looks like this:

table = dataset.create_table "baby_names" do |schema|
  schema.string "name"
  schema.string "sex"
  schema.integer "number"
end

Creating the same table in the popular Ruby on Rails framework looks like this:

create_table "baby_names" do |schema|
  schema.string "name"
  schema.string "sex"
  schema.integer "number"
end

The two are nearly identical. That makes getting up to speed on BigQuery easier and quicker than it would be if the Ruby library didn’t use patterns that are already known to the majority of Rubyists.

Another way the gcloud-ruby library meets the Ruby community on its own terms is by embracing the community’s fondness for doing things several different ways. In Ruby there are often several correct ways to do a given task.

The gcloud-ruby library is no exception. There are a few different ways to authenticate and create the objects you use to interact with the API. Ruby also has many common methods that have aliases. In the standard library, for example, Enumerable#map and Enumerable#collect run the same code path. In gcloud-ruby the Vision API uses aliases. Google Cloud Vision provides a single endpoint: annotate. gcloud-ruby has an annotate method but also aliases this method as mark and detect if those make more sense to you (detect is the method that makes the most sense to my brain, so that’s the one I use). Providing a couple of different aliases means the first thing you try is more likely to work. This speeds up development and makes learning the library easier.
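
For readers less familiar with Ruby aliases, here is a minimal sketch of the pattern. `ToyAnnotator` is a hypothetical class written for this example; it only mimics the annotate/mark/detect naming and is not gcloud-ruby's actual implementation.

```ruby
# Hypothetical class: one real method, two aliases that run the same code.
class ToyAnnotator
  def annotate(image_path)
    "annotated #{image_path}"
  end

  alias_method :mark, :annotate
  alias_method :detect, :annotate
end

annotator = ToyAnnotator.new
puts annotator.annotate("cat.png")
puts annotator.detect("cat.png")

# The standard library does the same thing with map and collect.
doubled_with_map = [1, 2, 3].map { |n| n * 2 }
doubled_with_collect = [1, 2, 3].collect { |n| n * 2 }
```

Whichever name you reach for first, you end up in the same method body.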

The last way the gcloud-ruby gem makes Rubyists feel at home is by having comprehensive tests, a common value and popular discussion topic for the Ruby community. gcloud-ruby uses minitest-spec for testing, a popular choice that most Rubyists can easily read. When I was learning the storage API I looked at the tests for storage to learn how to use the library. There is outstanding documentation as well for those who prefer learning that way but I’m so used to looking at tests that I really appreciated that gcloud-ruby has well written and easily accessible tests.
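
If you have not read minitest specs before, this is roughly what the style looks like. The class under test, `NameTally`, is hypothetical and exists only for this example; these are not gcloud-ruby's own tests.

```ruby
require "minitest/autorun"

# Hypothetical class under test; it is not part of gcloud-ruby.
class NameTally
  def initialize
    @counts = Hash.new(0)
  end

  def add(name, number)
    @counts[name] += number
    self
  end

  def total_for(name)
    @counts[name]
  end
end

describe NameTally do
  before do
    @tally = NameTally.new
  end

  it "starts every name at zero" do
    _(@tally.total_for("Ada")).must_equal 0
  end

  it "accumulates counts for the same name" do
    @tally.add("Ada", 3).add("Ada", 4)
    _(@tally.total_for("Ada")).must_equal 7
  end
end
```

Each `it` block reads almost like a sentence about the behavior, which is what makes specs in this style useful as informal documentation.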

Above are three examples of how hand-coded libraries from within the community can improve the user experience when learning to use tools. Of course, doing all the development on GitHub in the open also helps. Users can easily see what bugs people have run into and what features are next up in the production queue. And if a user has a feature request (like the previously mentioned Cloud Vision support) they can create a GitHub issue.

If you’re a Rubyist, give gcloud-ruby a shot and let us know what you think!

By Aja Hammerly, Developer Advocate


Guided in-process fuzzing of Chrome components

August 5th, 2016 | by Google Security PR | published in Google Online Security

Posted by Max Moroz, Chrome Security Engineer and Kostya Serebryany, Sanitizer Tsar

In the past, we’ve posted about innovations in fuzzing, a software testing technique used to discover coding errors and security vulnerabilities. The topics have included AddressSanitizer, ClusterFuzz, SyzyASAN, ThreadSanitizer and others.

Today we’d like to talk about libFuzzer (part of the LLVM project), an engine for in-process, coverage-guided, white-box fuzzing:

By in-process, we mean that we don’t launch a new process for every test case, and that we mutate inputs directly in memory.
By coverage-guided, we mean that we measure code coverage for every input, and accumulate test cases that increase overall coverage.
By white-box, we mean that we use compile-time instrumentation of the source code.
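
The first two properties can be illustrated with a toy loop. The Ruby sketch below is not libFuzzer and does not use its API: `toy_target` is an invented function that reports its own "coverage" (real engines get this from compile-time instrumentation), and the two mutations are deliberately simplistic.

```ruby
require "set"

# Invented target: returns the set of "branches" an input reaches.
def toy_target(input)
  covered = Set.new([:entry])
  return covered unless input.start_with?("F")
  covered << :saw_f
  return covered unless input[1] == "U"
  covered << :saw_u
  return covered unless input[2] == "Z"
  covered << :saw_z
  covered
end

# In-memory mutation: overwrite one byte or append one, using letters A-Z.
def mutate(input, rng)
  bytes = input.bytes
  letter = rng.rand(65..90)
  if rng.rand < 0.5
    bytes[rng.rand(bytes.size)] = letter
  else
    bytes << letter
  end
  bytes.pack("C*")
end

# Coverage-guided loop: keep only the inputs that reach something new.
def fuzz(rng, iterations: 5_000)
  corpus = ["A"]
  total_coverage = toy_target(corpus.first)
  iterations.times do
    candidate = mutate(corpus.sample(random: rng), rng)
    coverage = toy_target(candidate)
    next if coverage.subset?(total_coverage)
    corpus << candidate
    total_coverage |= coverage
  end
  [corpus, total_coverage]
end

corpus, total_coverage = fuzz(Random.new(1))
puts "corpus: #{corpus.inspect}"
puts "coverage: #{total_coverage.to_a.inspect}"
```

Because every kept input unlocks one more branch, the loop climbs toward the deepest branch step by step instead of having to guess the whole prefix at once, and it never leaves the process to do so.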

LibFuzzer makes it possible to fuzz individual components of Chrome. This means you don’t need to generate an HTML page or network payload and launch the whole browser, which adds overhead and flakiness to testing. Instead, you can fuzz any function or internal API directly. Based on our experience, libFuzzer-based fuzzing is extremely efficient, more reliable, and usually thousands of times faster than traditional out-of-process fuzzing.

Our goal is to have fuzz testing for every component of Chrome where fuzzing is applicable, and we hope all Chromium developers and external security researchers will contribute to this effort.

How to write a fuzz target

With libFuzzer, you need to write only one function, which we call a target function or a fuzz target. It accepts a data buffer and length as input and then feeds it into the code we want to test. And… that’s it!

The fuzz targets are not specific to libFuzzer. Currently, we also run them with AFL, and we expect to use other fuzzing engines in the future.
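
Sample Fuzzer

#include <stddef.h>
#include <stdint.h>
#include <string>
// Also include the woff2 decoder header (path depends on your checkout).

// Entry point: libFuzzer calls this once per generated input.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  std::string buf;
  woff2::WOFF2StringOut out(&buf);
  out.SetMaxSize(30 * 1024 * 1024);
  woff2::ConvertWOFF2ToTTF(data, size, &out);
  return 0;
}

See also the build rule.

Sample Bug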

==9896==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x62e000022836 at pc 0x000000499c51 bp 0x7fffa0dc1450 sp 0x7fffa0dc0c00
WRITE of size 41994 at 0x62e000022836 thread T0
SCARINESS: 45 (multi-byte-write-heap-buffer-overflow)
    #0 0x499c50 in __asan_memcpy
    #1 0x4e6b50 in Read third_party/woff2/src/buffer.h:86:7
    #2 0x4e6b50 in ReconstructGlyf third_party/woff2/src/woff2_dec.cc:500
    #3 0x4e6b50 in ReconstructFont third_party/woff2/src/woff2_dec.cc:917
    #4 0x4e6b50 in woff2::ConvertWOFF2ToTTF(unsigned char const*, unsigned long, woff2::WOFF2Out*) third_party/woff2/src/woff2_dec.cc:1282
    #5 0x4dbfd6 in LLVMFuzzerTestOneInput testing/libfuzzer/fuzzers/convert_woff2ttf_fuzzer.cc:15:3

Check out our documentation for additional information.

Integrating LibFuzzer with ClusterFuzz

ClusterFuzz is Chromium’s infrastructure for large scale fuzzing. It automates crash detection, report deduplication, test minimization, and other tasks. Once you commit a fuzz target into the Chromium codebase (examples), ClusterFuzz will automatically pick it up and fuzz it with libFuzzer and AFL.

ClusterFuzz supports most of the libFuzzer features like dictionaries, seed corpus and custom options for different fuzzers. Check out our Efficient Fuzzer Guide to learn how to use them.

Besides the initial seed corpus, we store, minimize, and synchronize the corpora for every fuzzer and across all bots. This allows us to continuously increase code coverage over time and find interesting bugs along the way.
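
One simple way to think about corpus minimization is as a greedy pass that keeps an input only if it reaches coverage no smaller input has already reached. The sketch below shows that idea in Ruby; it is an illustration, not ClusterFuzz's implementation, and the inputs and coverage sets are invented.

```ruby
require "set"

# Greedy minimization: visit inputs from smallest to largest and keep
# those that add coverage not seen so far.
def minimize_corpus(coverage_by_input)
  seen = Set.new
  coverage_by_input
    .sort_by { |input, _| [input.bytesize, input] }
    .each_with_object([]) do |(input, coverage), kept|
      next if coverage.subset?(seen)
      kept << input
      seen.merge(coverage)
    end
end

# Hypothetical corpus: each input mapped to the branches it covers.
coverage_by_input = {
  "FUZZ-long-input" => Set[:a, :b, :c],
  "FU" => Set[:a, :b],
  "F" => Set[:a],
  "XY" => Set[:d],
  "XYZ" => Set[:d]
}

minimized = minimize_corpus(coverage_by_input)
puts minimized.inspect
```

The greedy pass is not optimal (here "F" is kept even though "FU" later covers everything it does), but it is cheap and preserves total coverage, which is the property that matters when corpora are synchronized across many bots.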

ClusterFuzz uses the following memory debugging tools with libFuzzer-based fuzzers:

AddressSanitizer (ASan): 500 GCE VMs
MemorySanitizer (MSan): 100 GCE VMs
UndefinedBehaviorSanitizer (UBSan): 100 GCE VMs

Sample Fuzzer Statistics

It’s important to track and analyze the performance of fuzzers, so we have a dashboard of fuzzer statistics that is accessible to all Chromium developers:

Overall statistics for the last 30 days:

120 fuzzers
112 bugs filed
Aaaaaand…. 14,366,371,459,772 unique test inputs!

Analysis of the bugs found so far

Looking at the 324 bugs found so far, we can say that ASan and MSan have been very effective memory tools for finding security vulnerabilities. They give us comparable numbers of crashes, though ASan crashes usually are more severe than MSan ones. LSan (part of ASan) and UBSan have a great impact for Stability – another one of our 4 core principles.

Extending Chrome’s Vulnerability Reward Program

Under Chrome’s Trusted Researcher Program, we invite submission of fuzzers. We run them for you on ClusterFuzz and automatically nominate bugs they find for reward payments.

Today we’re pleased to announce that the invite-only Trusted Researcher Program is being replaced with the Chrome Fuzzer Program which encourages fuzzer submissions from all, and also covers libFuzzer-based fuzzers! Full guidelines are listed on Chrome’s Vulnerability Reward Program page.


Google Maps goes for the win with Rio updates

August 5th, 2016 | by Lat Long | published in Google Earth

Mapping a sprawling, densely populated city of 6 million people like Rio de Janeiro is a tough task. With an extra 10,000 athletes, half a million travelers, and tens of thousands of volunteers heading to the city this month, you can expect additional friction caused by road closures, traffic, and jam-packed attractions. Google Maps is putting the finishing touches on some first prize-worthy updates to help tourists and Rio residents alike get around “the Marvelous City” with ease. We even threw in a couple changes for those enjoying the events from home to feel like they’re in the middle of the action.

Getting around Rio without a hitch
For folks on the ground in Rio, Maps can be your real-world assistant, helping you get where you’re going via whichever mode of transportation you prefer. In April, we launched real-time transit for 1,300 bus lines in the Rio metro area, as well as bike routes throughout Rio and the rest of Brazil.

Construction, security and crowds during large-scale events can put a damper on a driver’s day. We’re working with the City of Rio to make sure Google Maps has the most up-to-date info on traffic, road closures and detours and help get you where you’re going faster.

Breezing through traffic and beating the crowds is reason for celebration. With the Explore feature on Google Maps for Android and iOS in Brazil, you can uncover local gems wherever you go by simply tapping “Explore food & drinks near you” at the bottom of the app. From there you can swipe through the best breakfast, lunch, coffee, dinner, and drinks spots around you.

Getting all of Rio on the Map
The favelas of Rio aren’t well-known to many outsiders, partly because there’s limited information about these areas to include on maps. We partnered with the local Brazilian nonprofit Grupo Cultural AfroReggae on a project called “Tá No Mapa” (“It’s On the Map” in English). Together with AfroReggae we trained 150 favela residents on digital mapping skills and in just two years they’ve mapped 26 favelas and gotten more than 3,000 businesses on the map. Not only does this allow locals to find businesses like Bar do David—an award-winning restaurant in the favela Chapeu Mangueira—it’s helped some local residents get a mailing address for the first time.

Getting in on the action from home
For those of you (*raises hand*) who can’t make it to Rio this summer, you can still get in on the excitement from the comfort of your home. We refreshed our Google Street View imagery to give virtual travelers an insider’s look at the stadiums. You can almost taste the caipirinhas!

For those who really want to feel like they’re in the game, we also launched indoor maps of all 25 official indoor venues and added more details to the maps of the 12 outdoor venues – like the custom-made golf course where you can now practically see all 18 holes.

No matter what city you find yourself in this summer, these very same features can help you find the perfect spot to watch the action and get there with ease.

Posted by Marcus Leal, Product Manager, Google Maps

Go bananas for the 2016 Doodle Fruit Games

August 4th, 2016 | by Google Blogs | published in Google Blog

The summer just got sweeter. Today marks the season opener of the 2016 Doodle Fruit Games. For the next couple of weeks in the latest Google app for Android and iOS, journey to an otherwise unassuming fruit stand in Rio, where produce from all over the market is ripe to compete for the title of freshest fruit.

The name of today’s featured game is to see who’s the fastest fruit on the track in this berry special race. Don’t be MELONcholy if your sprint turns into more of a smoothie.

If you like the taste of that, be sure to weave your way through the ice cooler to see if you’re the chillest lemon around. Remember: No one likes sour losers!

We hope you find these fruits as apPEELing as we do. And don’t forget to share your cherry impressive results with friends to see who claims the top seed. These two games are just a taste of what’s in store, so come back to the Google app throughout the week to catch the featured game of the day.

Posted by Matt Cruickshank, Google Doodler

Improvements to PDF, Microsoft Office, image file previewing in Google Drive on web

August 4th, 2016 | by Jane Smith | published in Google Apps

The Google Drive preview feature is a way to quickly preview files you’d typically open in Microsoft Office, Adobe Acrobat, or photo editors. Available in Gmail, Inbox, and Google Drive, previewing a file is a useful and fast experience and works across a wide range of files. Starting this week, we’re rolling out some improvements to the preview feature to make it simpler and easier to use:

Cleaner interface: Buttons and toolbars stay out of the way when you’re not using them.

Spreadsheet zooming support: If you preview Microsoft Excel, OpenOffice, or other spreadsheet files in Drive, you can now zoom in and out to inspect specific cells. This complements earlier launched improvements where we added support for frozen rows and columns, and the ability to switch between sheets.
Simpler zooming for Microsoft and OpenOffice document files: Zoom buttons are now easier to find and make images and documents larger than you could before.

Launch Details
Release track:
Launching to Rapid release, with Scheduled release coming in 2 weeks

Rollout pace:
Full rollout (1-3 days for feature visibility)

Impact:
All end users

Action:
Change management suggested/FYI

More Information
Help Center

Note: all launches are applicable to all Google Apps editions unless otherwise noted

Launch release calendar
Launch detail categories
Get these product update alerts by email
Subscribe to the RSS feed of these updates

New research: Zeroing in on deceptive software installations

August 4th, 2016 | by Google Security PR | published in Google Online Security

Posted by Kurt Thomas, Research Scientist and Juan A. Elices Crespo, Software Engineer

As part of Google’s ongoing effort to protect users from unwanted software, we have been zeroing in on the deceptive installation tactics and actors that play a role in unwanted software delivery. This software includes unwanted ad injectors that insert unintended ads into webpages and browser settings hijackers that change search settings without user consent.

Every week, Google Safe Browsing generates over 60 million warnings to help users avoid installing unwanted software–that’s more than 3x the number of warnings we show for malware. Many of these warnings appear when users unwittingly download software bundles laden with several additional applications, a business model known as pay-per-install that earns up to $1.50 for each successful install. Recently, we finished the first in-depth investigation with NYU Tandon School of Engineering into multiple pay-per-install networks and the unwanted software families purchasing installs. The full report, which you can read here, will be presented next week at the USENIX Security Symposium.

Over a year-long period, we found four of the largest pay-per-install networks routinely distributed unwanted ad injectors, browser settings hijackers, and scareware flagged by over 30 anti-virus engines. These bundles were deceptively promoted through fake software updates, phony content lockers, and spoofed brands–techniques openly discussed on underground forums as ways to trick users into unintentionally downloading software and accepting the installation terms. While not all software bundles lead to unwanted software, critically, it takes only one deceptive party in a chain of web advertisements, pay-per-install networks, and application developers for abuse to manifest.
Behind the scenes of unwanted software distribution

Software bundle installation dialogue. Accepting the express install option will cause eight other programs to be installed with no indication of each program’s functionality.

If you have ever encountered an installation dialog like the one above, then you are already familiar with the pay-per-install distribution model. Behind the scenes there are a few different players:

Advertisers: In pay-per-install lingo, advertisers are software developers, including unwanted software developers, paying for installs via bundling. In our example above, these advertisers include Plus-HD and Vuupc among others. The cost per install ranges anywhere from $0.10 in South America to $1.50 in the United States. Unwanted software developers will recoup this loss via ad injection, selling search traffic, or levying subscription fees. During our investigation, we identified 1,211 advertisers paying for installs.

Affiliate networks: Affiliate networks serve as middlemen between advertisers looking to buy installs and popular software packages willing to bundle additional applications in return for a fee. These affiliate networks provide the core technology for tracking successful installs and billing. Additionally, they provide tools that attempt to thwart Google Safe Browsing or anti-virus detection. We spotted at least 50 affiliate networks fueling this business.

Publishers: Finally, popular software applications re-package their binaries to include several advertiser offers. Publishers are then responsible for getting users to download and install their software through whatever means possible: download portals, organic page traffic, or oftentimes deceptive ads. Our study uncovered 2,518 publishers distributing through 191,372 webpages.

This decentralized model encourages advertisers to focus solely on monetizing users upon installation and publishers to maximize conversion, irrespective of the final user experience. It takes only one bad actor anywhere in the distribution chain for unwanted installs to manifest.

What gets bundled?

We monitored the offers bundled by four of the largest pay-per-install affiliate networks on a daily basis for over a year. In total, we collected 446K offers related to 843 unique software packages. The most commonly bundled software included unwanted ad injectors, browser settings hijackers, and scareware purporting to fix urgent issues with a victim’s machine for $30-40. Here’s an example of an ad injector impersonating an anti-virus alert to scam users into fixing non-existent system issues:

Deceptive practices

Taken as a whole, we found 59% of weekly offers bundled by pay-per-install affiliate networks were flagged by at least one anti-virus engine as potentially unwanted. In response, software bundles will first fingerprint a user’s machine prior to installation to detect the presence of “hostile” anti-virus engines. Furthermore, in response to protections provided by Google Safe Browsing, publishers have resorted to increasingly convoluted tactics to try to avoid detection, like the defunct technique shown below of password protecting compressed binaries:

Paired with deceptive promotional tools like fake video codecs, software updates, or misrepresented brands, a multitude of deceptive behaviors are currently pervasive in software bundling.

Cleaning up the ecosystem

We are constantly improving Google Safe Browsing defenses and the Chrome Cleanup Tool to protect users from unwanted software installs. When it comes to our ads policy, we take quick action to block and remove advertisers who misrepresent downloads or distribute software that violates Google’s unwanted software policy.

Additionally, Google is pushing for real change from businesses involved in the pay-per-install market to address the deceptive practices of some participants. As part of this, Google recently hosted a Clean Software Summit bringing together members of the anti-virus industry, bundling platforms, and the Clean Software Alliance. Together, we laid the groundwork for an industry-wide initiative to provide users with clear choices when installing software and to block deceptive actors pushing unwanted installs. We continue to advocate on behalf of users to ensure they remain safe while downloading software online.

Introducing the Google Analytics Demo Account

August 3rd, 2016 | by Adam Singer | published in Google Analytics

In theory, theory and practice are the same. In practice, they are not – Albert Einstein

There are many resources available to learn Google Analytics, from the courses and training we offer, to advice from the community, to the many books, guides, and articles written about Google Analytics. However, we’ve heard many of you would also like a resource for learning through practical experience and applying your theoretical analytics knowledge. It can be difficult to gain practical experience since not everyone has access to a fully-implemented Google Analytics account. To fix this we’re introducing a fully functional Google Analytics Demo Account, available to everyone (get access here).

The Demo Account includes data from the Google Merchandise Store, an active Ecommerce site that sells Google branded merchandise. The ongoing Google Analytics implementation, which will be completed this month, already includes all the major features you would typically implement, like AdWords linking, Goals and Enhanced Ecommerce. The result is a fully functional account, with real business data.

Demo Account: Checkout Behavior Analysis Report

“Have you wondered why you’ve always gravitated towards people with real-world experience rather than on-paper experiences? The real-world part So while it hurts my feelings a bit to say that my best selling analytics books are not enough, I’m massively excited that the GA team has figured out a solution for the entire universe to get real-world experience. Get the access, download my awesome bundle of segments, dashboards and custom reports, and really start your learning experience!”
- Avinash Kaushik, Author – Web Analytics 2.0 and Web Analytics : An Hour a Day

Self-Learning

You can use the Demo Account to learn about Google Analytics features and functionality, for example:

Access all the Standard reports to see which ones are useful to you
Get inspiration from predefined dashboards and segments imported from the Solutions Gallery to create your own
Alter reports by adding table filters and secondary dimensions, and by changing the report type
Learn how to compare audience, acquisition, behavior and conversion performance to a previous date range period
Create your own personal assets such as custom reports, annotations, shortcuts and custom alerts
Become familiar with the predefined attribution models and even create your own
Determine whether features you haven’t implemented could be beneficial to you, e.g. AdWords and Search Console integrations
Use it as a companion when following a training course

Education Programs

If you’re an educator trying to teach others to use Google Analytics then we encourage you to use the Demo Account as a tool. You can use it to create tests, quizzes, and other learning materials for your students. In fact, we’re excited to announce that some organizations are already starting to integrate it into their learning materials.

General Assembly offers courses both online and at their campuses around the world that will help you master new skills in design, marketing, technology, and data. Their Digital Marketing course includes a unit covering Marketing Analytics that utilizes the Demo Account.

Google Analytics Partners, including E-Nor and Loves Data, use the Demo Account to provide online and classroom style trainings to cater to beginners and advanced analytics users. Their specialists will provide actionable training to create and improve your analytics configuration, implementation and marketing performance.

Access the Demo account

You can get access to the Demo Account and learn more about it from this help article. If you need some help, please let us know within the FAQs post and share any feature requests or ideas to make the Demo Account more useful within the Feature Requests post. We hope the Demo Account gives you a practical way to try new features and learn about Google Analytics.

Happy analyzing!

Posted by Deepak Aujla, Program Manager, Google Analytics

Computational Thinking for All Students

August 3rd, 2016 | by Research Blog | published in Google Research

Posted by Maggie Johnson, Director of Education and University Relations, Google

(Crossposted on the Google for Education Blog, and the Huffington Post)

Last year, I wrote about the importance of teaching computational thinking to all K-12 students. Given the growing use of computing, algorithms and data in all fields from the humanities to medicine to business, it’s becoming increasingly important for students to understand the basics of computer science (CS). One lesson we have learned through Google’s CS education outreach efforts is that these skills can be accessible to all students, if we introduce them early in K-5. These are truly 21st century skills which can, over time, produce a workforce ready for a technology-enabled and driven economy.

How can teachers start introducing computational thinking in early school curriculum? It is already present in many topic areas – algorithms for solving math problems, for example. However, what is often missing in current examples of computational thinking is the explicit connection between what students are learning and its application in computing. For example, once a student has mastered adding multi-digit numbers, the following algorithm could be presented:

Add together the digits in the ones place. If the result is 10 or greater, the ones digit of the result becomes the ones digit of the answer, and you add 1 to the next column.
Add together the digits in the tens place, plus the 1 carried over from the ones place, if necessary. If the answer is 10 or greater, the ones digit becomes the tens digit of the answer and 1 is added to the next column.
Repeat this process for any additional columns until they are all added.

This allows a teacher to present the concept of an algorithm and its use in computing, as well as the most important elements of any computer program: conditional branching (“if the result is less than 10…”) and iteration (“repeat this process…”). Going a step farther, a teacher translating the algorithm into a running program can have a compelling effect. When something that students have used to solve an instance of a problem can automatically solve all instances of that problem, it’s quite a powerful moment for them even if they don’t do the coding themselves.
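As one illustration of what such a running program might look like, here is a minimal Python sketch of the column-by-column addition steps listed above. It is not taken from any Google course material; the function name `add_by_columns` and its variable names are made up for this example, and it assumes both inputs are non-negative whole numbers.

```python
def add_by_columns(a: int, b: int) -> int:
    """Add two non-negative whole numbers one column at a time, as on paper."""
    # Split each number into its digits, ones place first.
    digits_a = [int(ch) for ch in reversed(str(a))]
    digits_b = [int(ch) for ch in reversed(str(b))]

    answer_digits = []
    carry = 0
    # Iteration: repeat the same step for every column.
    for column in range(max(len(digits_a), len(digits_b))):
        da = digits_a[column] if column < len(digits_a) else 0
        db = digits_b[column] if column < len(digits_b) else 0
        total = da + db + carry
        # Conditional branching: carry 1 when the column sum is 10 or greater.
        if total >= 10:
            answer_digits.append(total - 10)
            carry = 1
        else:
            answer_digits.append(total)
            carry = 0

    # A carry left over after the last column becomes a new leading digit.
    if carry:
        answer_digits.append(carry)
    return int("".join(str(d) for d in reversed(answer_digits)))


print(add_by_columns(478, 256))  # 734
```

The `if` statement is the conditional branch and the `for` loop is the iteration, so students can point to the exact lines that correspond to each step they already perform by hand.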

Google has created an online course for K-12 teachers to learn about computational thinking and how to make these explicit connections for their students. We also have a large repository of lessons, explorations and programs to support teachers and students. Our videos illustrate real-world examples of the application of computational thinking in Google’s products and services, and we have compiled a set of great resources showing how to integrate computational thinking into existing curriculum. We also recently announced Project Bloks to engage younger children in computational thinking. Finally, code.org, for whom Google is a primary sponsor, has curriculum and materials for K-5 teachers and students.

We feel that computational thinking is a core skill for all students. If we can make these explicit connections for students, they will see how the devices and apps that they use everyday are powered by algorithms and programs. They will learn the importance of data in making decisions. They will learn skills that will prepare them for a workforce that will be doing vastly different tasks than the workforce of today. We owe it to all students to give them every possible opportunity to be productive and successful members of society.

Learn from Experts in the AdWords Community

August 3rd, 2016 | by Rob Newton | published in Google Adwords

The AdWords Community exists to help advertisers like you improve performance and share best practices. If you haven’t visited the AdWords Community in a while, you might be surprised by what you find. Over the last six months, we’ve made a number of exciting changes. From a new design to expanded content areas, the Google Advertiser Community continues to evolve and help you connect with our experts and improve your performance.

Ask an Expert

One of the most valuable resources you’ll find on the Community is the experts who frequent the forum on a daily (or even hourly) basis. These experts, part of Google’s Top Contributor program, have years of professional experience with AdWords and a passion for helping fellow Community members succeed.

Most of our Top Contributors have an area of expertise. If you ask a question on the AdWords Community about account automation, you will probably meet Jon Gritton. Jon hails from the UK and runs his own AdWords agency. He is also one of our most tenured AdWords Top Contributors (he’s been answering questions on the forum since 2006!) and our resident AdWords Scripts expert. He has solved well over 800 questions in his time on the AdWords Community.

An Improved Look and Feel

We also wanted to improve the look and feel of the Community.

Using Material Design, the Community now offers the same modern and intuitive experience that’s at the core of our favorite Google apps like Maps, Search, and Gmail.
Managing your advertising isn’t something you only do at your desk, which is why we’re re-designing AdWords for marketing in a mobile-first world. The redesigned Community is also responsive and mobile-friendly—perfect for browsing and resolving questions on-the-go.

Go Beyond AdWords

Finally, we know that AdWords isn’t the only way to promote your business with Google. You can get information about other important Google products on the Community: Google Analytics, Google My Business, Google Partners and Google Small Business.

Get Involved

Anyone can join the Advertiser Community, post questions and find answers. Visit today to connect with other advertisers.

Posted by Julian Chu, Director of Customer Engagement
