Mentoring Organizations for Google Summer of Code 2015
March 2nd, 2015 | by Open Source Programs Office | published in Google Open Source
By Carol Smith, Open Source Team
March 2nd, 2015 | by Research Blog | published in Google Research
Posted by Patrick Riley and Dale Webster, Google Research and Bharath Ramsundar, Google Research Intern and Stanford Ph.D. candidate
Discovering new treatments for human diseases is an immensely complicated challenge. Even after extensive research to develop a biological understanding of a disease, an effective therapeutic that can improve the quality of life must still be found. This process often takes years of research, requiring the creation and testing of millions of drug-like compounds in an effort to find just a few viable drug treatment candidates. These high-throughput screens are often automated in sophisticated labs and are expensive to perform.
Recently, deep learning with neural networks has been applied in virtual drug screening [1, 2, 3], which attempts to replace or augment the high-throughput screening process with computational methods in order to improve its speed and success rate [4]. Traditionally, virtual drug screening has used only the experimental data from the particular disease being studied. However, as the volume of experimental drug screening data across many diseases continues to grow, several research groups have demonstrated that data from multiple diseases can be leveraged with multitask neural networks to improve virtual screening effectiveness.
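To make the multitask idea concrete, here is a minimal sketch in Python with NumPy. It is not the architecture from the paper: the layer sizes, the ReLU hidden layer, the random fingerprint-like inputs, and the `mask` array for unmeasured compound/assay pairs are all illustrative assumptions. It only shows the general pattern of one shared representation feeding a separate output per biological target.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only.
n_compounds, n_features, n_hidden, n_tasks = 6, 8, 4, 3

# Assumed inputs: binary fingerprint-like feature vectors, one row per compound.
X = rng.integers(0, 2, size=(n_compounds, n_features)).astype(float)
# Labels: 1 = active, 0 = inactive, one column per task (assay).
Y = rng.integers(0, 2, size=(n_compounds, n_tasks)).astype(float)
# mask[i, t] = 1 if compound i was actually measured in task t (assumption:
# most compounds are only tested in some assays).
mask = rng.integers(0, 2, size=(n_compounds, n_tasks)).astype(float)

# Shared hidden layer: its weights are used by every task.
W_shared = rng.normal(scale=0.1, size=(n_features, n_hidden))
b_shared = np.zeros(n_hidden)
# Task-specific output heads: one column of weights per task.
W_heads = rng.normal(scale=0.1, size=(n_hidden, n_tasks))
b_heads = np.zeros(n_tasks)


def forward(X):
    """Return per-task activity probabilities, shape (n_compounds, n_tasks)."""
    hidden = np.maximum(0.0, X @ W_shared + b_shared)  # shared ReLU layer
    logits = hidden @ W_heads + b_heads                # one logit per task
    return 1.0 / (1.0 + np.exp(-logits))               # sigmoid


def masked_loss(probs, Y, mask):
    """Mean binary cross-entropy over measured (compound, task) pairs only."""
    eps = 1e-12
    ce = -(Y * np.log(probs + eps) + (1.0 - Y) * np.log(1.0 - probs + eps))
    return (ce * mask).sum() / max(mask.sum(), 1.0)


probs = forward(X)
loss = masked_loss(probs, Y, mask)
print(probs.shape, float(loss))
```

Because every task's loss passes gradients through `W_shared` during training, data from one assay can shape the representation used by all the others, which is the mechanism the multitask approach relies on.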
In collaboration with the Pande Lab at Stanford University, we’ve released a paper titled “Massively Multitask Networks for Drug Discovery”, investigating how data from many sources can be used to improve the accuracy of determining which chemical compounds would be effective drug treatments for a range of diseases. In particular, we carefully quantified how the amount and diversity of screening data from diseases with very different biological processes can be used to improve virtual drug screening predictions.
Using our large-scale neural network training system, we trained at a scale 18x larger than previous work with a total of 37.8M data points across more than 200 distinct biological processes. Because of our large scale, we were able to carefully probe the sensitivity of these models to a variety of changes in model structure and input data. In the paper, we examine not just the performance of the model but why it performs well and what we can expect for similar models in the future. The data in the paper represents more than 50M total CPU hours.
[Figure: a measure of prediction accuracy (ROC AUC, the area under the receiver operating characteristic curve) for virtual screening on a fixed set of 10 biological processes as more datasets are added.]
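For readers unfamiliar with the metric, ROC AUC can be read as the probability that a randomly chosen active compound is scored higher than a randomly chosen inactive one, with ties counting as half. The sketch below computes it directly from that definition in plain Python. The function name `roc_auc` and the toy labels and scores are illustrative and do not come from the paper.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 1 (active) or 0 (inactive)
    scores: iterable of predicted scores, higher = more likely active
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one active and one inactive example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # active ranked above inactive
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: two actives, three inactives.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of actives above inactives. This quadratic-time version is meant for clarity; for real datasets a sort-based implementation is the usual choice.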
One encouraging conclusion from this work is that our models are able to utilize data from many different experiments to increase prediction accuracy across many diseases. To our knowledge, this is the first time the effect of adding data has been quantified in this domain, and our results suggest that still more data could improve performance further.
Machine learning at scale has significant potential to accelerate drug discovery and improve human health. We look forward to continued improvement in virtual drug screening and its increasing impact in the discovery process for future drugs.
Thank you to our other collaborators David Konerding (Google), Steven Kearnes (Stanford), and Vijay Pande (Stanford).
References:
1. Thomas Unterthiner, Andreas Mayr, Günter Klambauer, Marvin Steijaert, Jörg Kurt Wegner, Hugo Ceulemans, and Sepp Hochreiter. Deep Learning as an Opportunity in Virtual Screening. Deep Learning and Representation Learning Workshop, NIPS 2014.
2. George E. Dahl, Navdeep Jaitly, and Ruslan Salakhutdinov. Multi-task neural networks for QSAR predictions. arXiv preprint arXiv:1406.1231, 2014.
3. Junshui Ma, Robert P. Sheridan, Andy Liaw, George Dahl, and Vladimir Svetnik. Deep neural nets as a method for quantitative structure-activity relationships. Journal of Chemical Information and Modeling, 2015.
4. Peter Ripphausen, Britta Nisius, Lisa Peltason, and Jürgen Bajorath. Quo Vadis, Virtual Screening? A Comprehensive Survey of Prospective Applications. Journal of Medicinal Chemistry, 2010, 53 (24), 8461-8467.
March 2nd, 2015 | by Ioannis Koutrakos | published in Google DoubleClick
March 2nd, 2015 | by Google Blogs | published in Google Earth
Home to millions of plant, animal and insect species, the Amazon rainforest is one of the most diverse ecosystems in the world. Undiscovered species thrive in the canopies of the primary forests, atop trees that have stood for centuries. Starting today, with the help of our partners at the Amazonas Sustainable Foundation (FAS), you can begin to unlock some of the wonders of the forest, by traveling from the upper canopy to the forest floor with Google Maps’ first zipline Street View collection.
High up in the canopy, you can see thick moss on the trunks, miles of hanging vine, and some of the many plants and insects that call this place home.
Now zip back down to the forest floor, and wind through a maze of towering old-growth trees. Look up, and you'll find the canopies are so thick that the sun barely peeks through.
You can also come out from the shade and take a virtual float down the dreamy waters of the Rio Aripuanã or the Rio Mariepauá and come out to the Rio Madeira, one of the largest tributaries of the Amazon.
And don’t forget to stop by one of the 17 communities of local people who live along the river and in the forest. These people are the devoted stewards of the river and forests, and protect them by living with them, preventing the destruction of the trees and the life that depends on them.
This project is the next step in our partnership with FAS, who first invited us to the Rio Negro Sustainable Development Reserve just three years ago. Their hope is that sharing the imagery of their local communities, rainforests and rivers with the world will raise awareness and support for their efforts to conserve these areas. Collected through the Trekker Loan Program, this new imagery is the result of boating down 500 km of rivers, walking 20 km of forest trails and ziplining through forest canopies. We hope it inspires you to embark on your own virtual expedition of the Amazon (you can leave the bug repellent at home!).
Posted by Karin Tuxen-Bettman, Program Manager, Google Earth Outreach