March, 2015 | Google Data

Archive for March, 2015

Mentoring Organizations for Google Summer of Code 2015

March 2nd, 2015 | by Open Source Programs Office | published in Google Open Source

We are excited to announce the mentoring organizations that have been accepted for this year’s Google Summer of Code program. As always, we had many more great projects than we could accept. After reviewing 416 applications, we have chosen 137 open source projects, 37 of which are new to Google Summer of Code. You can visit our Google Summer of Code 2015 program website for a complete list of the accepted orgs.

Over the next two weeks, students interested in applying for the Google Summer of Code 2015 program can learn more about the 137 accepted open source projects. The student application period begins on Monday, March 16, 2015 at 19:00 UTC.

Interested? Start by reviewing the Ideas Page from each organization to learn about the project and how you might contribute. Some of the most successful proposals have been completely new ideas submitted by students, so if you don’t see a project on an Ideas Page that appeals to you, don’t be afraid to suggest a new idea to the organization! There are points of contact listed for each organization on their Ideas Page – students can contact the organization directly to discuss a new proposal. All organizations list their preferred method of communication on the organization homepage, available on the Google Summer of Code program website. We strongly encourage students to reach out to the organizations before they apply. Please see our Frequently Asked Questions page for more information.

Congratulations to all of our mentoring organizations! We look forward to working with all of you during this next Google Summer of Code!

By Carol Smith, Open Source Team


Large-Scale Machine Learning for Drug Discovery

March 2nd, 2015 | by Research Blog | published in Google Research

Posted by Patrick Riley and Dale Webster, Google Research, and Bharath Ramsundar, Google Research Intern and Stanford Ph.D. candidate

Discovering new treatments for human diseases is an immensely complicated challenge. Even after extensive research to develop a biological understanding of a disease, an effective therapeutic that can improve the quality of life must still be found. This process often takes years of research, requiring the creation and testing of millions of drug-like compounds in an effort to find just a few viable drug treatment candidates. These high-throughput screens are often automated in sophisticated labs and are expensive to perform.

Recently, deep learning with neural networks has been applied in virtual drug screening [1, 2, 3], which attempts to replace or augment the high-throughput screening process with computational methods in order to improve its speed and success rate [4]. Traditionally, virtual drug screening has used only the experimental data from the particular disease being studied. However, as the volume of experimental drug screening data across many diseases continues to grow, several research groups have demonstrated that data from multiple diseases can be leveraged with multitask neural networks to improve virtual screening effectiveness.
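To make the idea of a multitask network concrete, here is a minimal sketch of the general shape: one hidden layer shared by all tasks, plus a small output head per task. It is an untrained forward pass in plain Python, not the architecture from the paper. The class name `MultitaskNet`, the task names `assay_a` and `assay_b`, the layer sizes and the four-element input vector are all made up for illustration.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Create an n_out x n_in weight matrix with small random values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def dense(weights, x):
    """Multiply a weight matrix by an input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


class MultitaskNet:
    """Hypothetical multitask model: shared hidden layer, one head per task."""

    def __init__(self, n_features, n_hidden, task_names, seed=0):
        rng = random.Random(seed)
        # The shared layer is used by every task, so data from any task
        # would shape the same internal representation during training.
        self.shared = make_layer(n_features, n_hidden, rng)
        # Each task (for example, one screening assay) gets its own output.
        self.heads = {name: make_layer(n_hidden, 1, rng) for name in task_names}

    def predict(self, x):
        hidden = relu(dense(self.shared, x))
        return {
            name: sigmoid(dense(head, hidden)[0])
            for name, head in self.heads.items()
        }


# Illustrative input: a made-up 4-bit compound descriptor.
net = MultitaskNet(n_features=4, n_hidden=3, task_names=["assay_a", "assay_b"])
predictions = net.predict([1.0, 0.0, 1.0, 1.0])
print(predictions)
```

Each call returns one activity score per task for the same compound. In a trained model of this kind, the shared layer is where data from one disease could help predictions for another.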

In collaboration with the Pande Lab at Stanford University, we’ve released a paper titled “Massively Multitask Networks for Drug Discovery”, investigating how data from a variety of sources can be used to improve the accuracy of determining which chemical compounds would be effective drug treatments for a range of diseases. In particular, we carefully quantified how the amount and diversity of screening data from diseases with very different biological processes can be used to improve virtual drug screening predictions.

Using our large-scale neural network training system, we trained at a scale 18x larger than previous work with a total of 37.8M data points across more than 200 distinct biological processes. Because of our large scale, we were able to carefully probe the sensitivity of these models to a variety of changes in model structure and input data. In the paper, we examine not just the performance of the model but why it performs well and what we can expect for similar models in the future. The data in the paper represents more than 50M total CPU hours.

This graph shows a measure of prediction accuracy (ROC AUC is the area under the receiver operating characteristic curve) for virtual screening on a fixed set of 10 biological processes as more datasets are added.
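For readers unfamiliar with the metric, ROC AUC can be read as the probability that a randomly chosen active compound is scored higher than a randomly chosen inactive one. The sketch below computes it directly from that definition by comparing every active/inactive pair. The function name `roc_auc` and the labels and scores are invented for illustration and are not data from the paper.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    labels: 1 for active compounds, 0 for inactive ones.
    scores: model scores, higher meaning "more likely active".
    Ties between an active and an inactive score count as half a win.
    """
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")

    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Made-up example: two actives and three inactives.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 5 of the 6 pairs are ordered correctly -> 0.833
```

A value of 1.0 means every active is ranked above every inactive, while 0.5 is what random scoring would give on average. The pairwise loop is quadratic, so real evaluations usually use a sort-based implementation instead.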

One encouraging conclusion from this work is that our models are able to utilize data from many different experiments to increase prediction accuracy across many diseases. To our knowledge, this is the first time the effect of adding data has been quantified in this domain, and our results suggest that more data could improve performance further.

Machine learning at scale has significant potential to accelerate drug discovery and improve human health. We look forward to continued improvement in virtual drug screening and its increasing impact in the discovery process for future drugs.

Thank you to our other collaborators David Konerding (Google), Steven Kearnes (Stanford), and Vijay Pande (Stanford).

References:

1. Thomas Unterthiner, Andreas Mayr, Günter Klambauer, Marvin Steijaert, Jörg Kurt Wegner, Hugo Ceulemans, and Sepp Hochreiter. Deep Learning as an Opportunity in Virtual Screening. Deep Learning and Representation Learning Workshop: NIPS 2014.

2. George E. Dahl, Navdeep Jaitly, and Ruslan Salakhutdinov. Multi-task neural networks for QSAR predictions. arXiv preprint arXiv:1406.1231, 2014.

3. Junshui Ma, Robert P. Sheridan, Andy Liaw, George Dahl, and Vladimir Svetnik. Deep neural nets as a method for quantitative structure-activity relationships. Journal of Chemical Information and Modeling, 2015.

4. Peter Ripphausen, Britta Nisius, Lisa Peltason, and Jürgen Bajorath. Quo Vadis, Virtual Screening? A Comprehensive Survey of Prospective Applications. Journal of Medicinal Chemistry, 2010, 53 (24), 8461-8467.


Audience Insights Series: What the future holds

March 2nd, 2015 | by Ioannis Koutrakos | published in Google DoubleClick

This is the second post in our series to explore the convergence of audience data and search marketing. In our last post, we heard from industry leaders on the opportunity and how audience data helps them deliver even more relevant and resonant messages.

This week, we explore what the future holds. iProspect’s Ben Wood, Havas Media’s Paul Frampton and the IAB’s Steve Chester share perspectives on the continued convergence of audience data and search marketing, implications for digital marketing teams and how they work together, as well as how audience data in search will help bridge the gap between branding and direct response.

Look for our next post in the series, where we will explore best practices for advertisers who are looking to embrace audience data as part of their search marketing efforts.


Zipline through the Amazon Forest with Street View

March 2nd, 2015 | by Google Blogs | published in Google Earth

Home to millions of plant, animal and insect species, the Amazon rainforest is one of the most diverse ecosystems in the world. Undiscovered species thrive in the canopies of the primary forests, atop trees that have stood for centuries. Starting today, with the help of our partners at the Amazonas Sustainable Foundation (FAS), you can begin to unlock some of the wonders of the forest, by traveling from the upper canopy to the forest floor with Google Maps’ first zipline Street View collection.

Trekker on a zipline in the Amazon Rainforest

High up in the canopy, you can see thick moss on the trunks, miles of hanging vine, and some of the many plants and insects that call this place home.

Top of the Amazon canopy zipline

Now zip back down to the forest floor, and wind through a maze of towering old-growth trees. Looking up, the canopies are so thick, the sun barely peeks through.

Towering trees in the Amazon Forest

You can also come out from the shade and take a virtual float down the dreamy waters of the Rio Aripuanã or the Rio Mariepauá and come out to the Rio Madeira, one of the largest tributaries of the Amazon.

Float down the Rio Mariepauá

And don’t forget to stop by one of the 17 communities of local people who live along the river and in the forest. These people are the devoted stewards of the river and forests, and protect them by living with them, preventing the destruction of the trees and the life that depends on them.

Community of Abelha, along the Rio Mariepauá

This project is the next step in our partnership with FAS, who first invited us to the Rio Negro Sustainable Development Reserve just three years ago. Their hope is that sharing the imagery of their local communities, rainforests and rivers with the world will raise awareness and support for their efforts to conserve these areas. Collected through the Trekker Loan Program, this new imagery is the result of boating down 500 km of rivers, walking 20 km of forest trails and ziplining through forest canopies. We hope it inspires you to embark on your own virtual expedition of the Amazon (you can leave the bug repellent at home!).

Posted by Karin Tuxen-Bettman, Program Manager, Google Earth Outreach

