<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Google Data &#187; Research Blog</title>
	<atom:link href="https://googledata.org/author/research-blog/feed/" rel="self" type="application/rss+xml" />
	<link>https://googledata.org</link>
	<description>Everything Google: News, Products, Services, Content, Culture</description>
	<lastBuildDate>Wed, 28 Dec 2016 21:09:26 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=4.1.13</generator>
	<item>
		<title>Get moving with the new Motion Stills</title>
		<link>https://googledata.org/google-research/get-moving-with-the-new-motion-stills/</link>
		<comments>https://googledata.org/google-research/get-moving-with-the-new-motion-stills/#comments</comments>
		<pubDate>Thu, 15 Dec 2016 17:48:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=05befdd157c4afe90f05882b237f719a</guid>
		<description><![CDATA[<span>Posted by Matthias Grundmann and Ken Conley, Machine Perception</span><br /><br />Last June, we <a href="https://research.googleblog.com/2016/06/motion-stills-create-beautiful-gifs.html">released Motion Stills</a>, an <a href="https://itunes.apple.com/us/app/motion-stills-create-live/id1086172168?ls=1&#38;mt=8">iOS app</a> that uses our video stabilization technology to create easily shareable GIFs from Apple Live Photos. Since then, we <a href="http://goo.gl/rvbUXP">integrated Motion Stills into Google Photos for iOS</a> and thought of ways to improve it, taking into account your ideas for new features.<br /><br />Today, we are happy to announce a major new update to the <a href="https://itunes.apple.com/us/app/motion-stills-create-live/id1086172168?ls=1&#38;mt=8">Motion Stills app</a> that will help you create even more beautiful videos and fun GIFs using motion-tracked text overlays, super-resolution videos, and automatic <a href="https://en.wikipedia.org/wiki/Cinemagraph">cinemagraphs</a>.<br /><br /><b>Motion Text</b><br /><br />We&#8217;ve added motion text so you can create moving text effects, similar to what you might see in movies and TV shows, directly on your phone. With Motion Text, you can easily position text anywhere over your video to get the exact result you want. 
It only takes a second to initialize while you type, and it tracks at 1000 FPS throughout the whole Live Photo, so the process feels instantaneous.<br /><div><a href="https://1.bp.blogspot.com/-LJ8-y6cVzoI/WFLWZCY_qYI/AAAAAAAABdk/dmNsFzdlXqUmfpaex8BZN55XMfO_V0kdwCLcB/s1600/image00.gif"><img border="0" height="480" src="https://1.bp.blogspot.com/-LJ8-y6cVzoI/WFLWZCY_qYI/AAAAAAAABdk/dmNsFzdlXqUmfpaex8BZN55XMfO_V0kdwCLcB/s640/image00.gif" width="640"></a></div>To make this possible, we took the motion tracking technology that we run on YouTube servers for <a href="https://youtube-creators.googleblog.com/2016/02/blur-moving-objects-in-your-video-with.html">&#8220;Privacy Blur&#8221;</a>, and made it run even faster on your device. How? We first create motion metadata for your video by leveraging machine learning to classify foreground/background features as well as to model temporally coherent camera motion. We then take this metadata, and use it as input to an algorithm that can track individual objects while discriminating them from others. The algorithm models each object&#8217;s state, which includes its motion in space, an implicit appearance model (described as a set of its moving parts), and its centroid and extent, as shown in the figure below.<br /><div><a href="https://3.bp.blogspot.com/-gl64-qycaMo/WFLWibp3XRI/AAAAAAAABdo/goLh3izJ2PY4LmuGJlbHC869bh7qYQbpgCLcB/s1600/image01.gif"><img border="0" height="640" src="https://3.bp.blogspot.com/-gl64-qycaMo/WFLWibp3XRI/AAAAAAAABdo/goLh3izJ2PY4LmuGJlbHC869bh7qYQbpgCLcB/s640/image01.gif" width="480"></a></div><b>Enhance! your videos with better detail and loops</b><br /><br />Last month, <a href="https://research.googleblog.com/2016/11/enhance-raisr-sharp-images-with-machine.html">we published the details of our state-of-the-art RAISR technology</a>, which employs machine learning to create super-resolution detail in images. This technology is now available in Motion Stills, automatically sharpening every video you export. 
<br /><br />We are also going beyond stabilization to bring you fully automatic cinemagraphs. After freezing the background into a still photo, we analyze our result to optimize for the perfect loop transition. By considering a range of start and end frames, we build a matrix of transition scores between frame pairs. A significant minimum in this matrix reflects the perfect transition, resulting in an endless loop of motion stillness.<br /><div><a href="https://4.bp.blogspot.com/-Ijg0TobMGCA/WFLWodTqqgI/AAAAAAAABds/MVb0ZUTg_3EmQvTditRJCYsDU_NfkG6iwCLcB/s1600/image02.gif"><img border="0" height="480" src="https://4.bp.blogspot.com/-Ijg0TobMGCA/WFLWodTqqgI/AAAAAAAABds/MVb0ZUTg_3EmQvTditRJCYsDU_NfkG6iwCLcB/s640/image02.gif" width="640"></a></div><b>Continuing to improve the experience</b><br /><br />Thanks to your feedback, we&#8217;ve additionally rebuilt our navigation and added more tutorials. We&#8217;ve also added Apple&#8217;s 3D Touch to let you &#8220;peek and pop&#8221; clips in your stream and movie tray. Lots more is coming to address your top requests, so please <a href="https://itunes.apple.com/us/app/motion-stills-create-live/id1086172168?ls=1&#38;mt=8">download the new release of Motion Stills</a> and keep sending us feedback with #motionstills on your favorite social media.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Matthias Grundmann and Ken Conley, Machine Perception</span><br /><br />Last June, we <a href="https://research.googleblog.com/2016/06/motion-stills-create-beautiful-gifs.html">released Motion Stills</a>, an <a href="https://itunes.apple.com/us/app/motion-stills-create-live/id1086172168?ls=1&amp;mt=8">iOS app</a> that uses our video stabilization technology to create easily shareable GIFs from Apple Live Photos. Since then, we <a href="http://goo.gl/rvbUXP">integrated Motion Stills into Google Photos for iOS</a> and thought of ways to improve it, taking into account your ideas for new features.<br /><br />Today, we are happy to announce a major new update to the <a href="https://itunes.apple.com/us/app/motion-stills-create-live/id1086172168?ls=1&amp;mt=8">Motion Stills app</a> that will help you create even more beautiful videos and fun GIFs using motion-tracked text overlays, super-resolution videos, and automatic <a href="https://en.wikipedia.org/wiki/Cinemagraph">cinemagraphs</a>.<br /><br /><b>Motion Text</b><br /><br />We’ve added motion text so you can create moving text effects, similar to what you might see in movies and TV shows, directly on your phone. With Motion Text, you can easily position text anywhere over your video to get the exact result you want. 
It only takes a second to initialize while you type, and it tracks at 1000 FPS throughout the whole Live Photo, so the process feels instantaneous.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-LJ8-y6cVzoI/WFLWZCY_qYI/AAAAAAAABdk/dmNsFzdlXqUmfpaex8BZN55XMfO_V0kdwCLcB/s1600/image00.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://1.bp.blogspot.com/-LJ8-y6cVzoI/WFLWZCY_qYI/AAAAAAAABdk/dmNsFzdlXqUmfpaex8BZN55XMfO_V0kdwCLcB/s640/image00.gif" width="640" /></a></div>To make this possible, we took the motion tracking technology that we run on YouTube servers for <a href="https://youtube-creators.googleblog.com/2016/02/blur-moving-objects-in-your-video-with.html">“Privacy Blur”</a>, and made it run even faster on your device. How? We first create motion metadata for your video by leveraging machine learning to classify foreground/background features as well as to model temporally coherent camera motion. We then take this metadata, and use it as input to an algorithm that can track individual objects while discriminating them from others. The algorithm models each object’s state, which includes its motion in space, an implicit appearance model (described as a set of its moving parts), and its centroid and extent, as shown in the figure below.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-gl64-qycaMo/WFLWibp3XRI/AAAAAAAABdo/goLh3izJ2PY4LmuGJlbHC869bh7qYQbpgCLcB/s1600/image01.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://3.bp.blogspot.com/-gl64-qycaMo/WFLWibp3XRI/AAAAAAAABdo/goLh3izJ2PY4LmuGJlbHC869bh7qYQbpgCLcB/s640/image01.gif" width="480" /></a></div><b>Enhance! 
your videos with better detail and loops</b><br /><br />Last month, <a href="https://research.googleblog.com/2016/11/enhance-raisr-sharp-images-with-machine.html">we published the details of our state-of-the-art RAISR technology</a>, which employs machine learning to create super-resolution detail in images. This technology is now available in Motion Stills, automatically sharpening every video you export. <br /><br />We are also going beyond stabilization to bring you fully automatic cinemagraphs. After freezing the background into a still photo, we analyze our result to optimize for the perfect loop transition. By considering a range of start and end frames, we build a matrix of transition scores between frame pairs. A significant minimum in this matrix reflects the perfect transition, resulting in an endless loop of motion stillness.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-Ijg0TobMGCA/WFLWodTqqgI/AAAAAAAABds/MVb0ZUTg_3EmQvTditRJCYsDU_NfkG6iwCLcB/s1600/image02.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://4.bp.blogspot.com/-Ijg0TobMGCA/WFLWodTqqgI/AAAAAAAABds/MVb0ZUTg_3EmQvTditRJCYsDU_NfkG6iwCLcB/s640/image02.gif" width="640" /></a></div><b>Continuing to improve the experience</b><br /><br />Thanks to your feedback, we’ve additionally rebuilt our navigation and added more tutorials. We’ve also added Apple’s 3D Touch to let you “peek and pop” clips in your stream and movie tray. Lots more is coming to address your top requests, so please <a href="https://itunes.apple.com/us/app/motion-stills-create-live/id1086172168?ls=1&amp;mt=8">download the new release of Motion Stills</a> and keep sending us feedback with #motionstills on your favorite social media.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/get-moving-with-the-new-motion-stills/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>App Discovery With Google Play, Part 2: Personalized Recommendations with Related Apps</title>
		<link>https://googledata.org/google-research/app-discovery-with-google-play-part-2-personalized-recommendations-with-related-apps/</link>
		<comments>https://googledata.org/google-research/app-discovery-with-google-play-part-2-personalized-recommendations-with-related-apps/#comments</comments>
		<pubDate>Wed, 14 Dec 2016 19:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=0ac823a8a9eab3e5ecdf53179c178e8b</guid>
		<description><![CDATA[<span>Posted by Ananth Balashankar &#38; Levent Koc, Software Engineers, and Norberto Guimaraes, Product Manager</span><br /><br /><i>In <a href="https://research.googleblog.com/2016/11/app-discovery-with-google-play-part-1.html">Part 1 of this series</a> on app discovery, we discussed using machine learning to gain a deeper understanding of the topics associated with an app, in order to provide a better search and discovery experience on the <a href="https://play.google.com/store/apps?hl=en">Google Play Apps Store</a>. In this post, we discuss a deep learning framework to provide personalized recommendations to users based on their previous app downloads and the context in which they are used. </i><br /><br />Providing useful and relevant app recommendations to visitors of the <a href="https://play.google.com/store/apps?hl=en">Google Play Apps Store</a> is a key goal of our apps discovery team. An <a href="https://research.googleblog.com/2016/11/app-discovery-with-google-play-part-1.html">understanding of the topics associated with an app</a>, however, is only one part of creating a system that best serves the user. In order to create a better overall experience, one must also take into account the tastes of the user and provide personalized recommendations. If one didn&#8217;t, the &#8220;You might also like&#8221; recommendation would look the same for everyone! <br /><br />Discovering these nuances requires both an understanding of what an app does and the context of the app with respect to the user. For example, to an avid sci-fi gamer, similar game recommendations may be of interest, but if a user installs a fitness app, recommending a health recipe app may be more relevant than five more fitness apps. 
As users may be more interested in downloading an app or game that complements one they already have installed, we provide recommendations based on app relatedness with each other (&#8220;You might also like&#8221;), in addition to providing recommendations based on the topic associated with an app (&#8220;Similar apps&#8221;). <br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-YezY_DnbD04/WFGFJKsVUVI/AAAAAAAABdA/SxGmMuwun6s1wcFujOtTvsFQ11y7iBzCACLcB/s1600/Fig.1.png"><img border="0" height="530" src="https://3.bp.blogspot.com/-YezY_DnbD04/WFGFJKsVUVI/AAAAAAAABdA/SxGmMuwun6s1wcFujOtTvsFQ11y7iBzCACLcB/s640/Fig.1.png" width="640"></a></td></tr><tr><td>Suggestions of similar apps and apps that you also might like shown both before making an install decision (left) and while the current install is in progress (right).</td></tr></tbody></table><br />One particularly strong contextual signal is app relatedness, based on previous installs and search query clicks. As an example, a user who has searched for and plays a lot of graphics-heavy games likely has a preference for apps which are also graphically intense rather than apps with simpler graphics. So, when this user installs a car racing game, the &#8220;You might also like&#8221; suggestions include apps which relate to the &#8220;seed&#8221; app (because they are graphically intense racing games) ranked higher than racing apps with simpler graphics. This allows for a finer level of personalization where the characteristics of the apps are matched with the preferences of the user.<br /><br />To incorporate this app relatedness in our recommendations, we take a two-pronged approach: (a) offline candidate generation, i.e. 
the generation of the potential related apps that other users have downloaded, in addition to the app in question, and (b) online personalized re-ranking, where we re-rank these candidates using a personalized ML model.<br /><br /><b>Offline Candidate Generation</b><br />The problem of finding related apps can be formulated as a <a href="https://en.wikipedia.org/wiki/Nearest_neighbor_search">nearest neighbor search</a> problem. Given an app X, we want to find the k nearest apps. In the case of &#8220;You might also like&#8221;, a naive approach would be one based on counting, where if many people installed apps X and Y, then the app Y would be used as a candidate for seed app X. However, this approach is intractable as it is difficult to learn and generalize effectively in the huge problem space. Given that there are over a million apps on Google Play, the total number of possible app pairs is over 10<sup>12</sup>. <br /><br />To solve this, we trained a deep neural network to predict the next app installed by the user given their previous installs. Output <a href="https://en.wikipedia.org/wiki/Embedding">embeddings</a> at the final layer of this deep neural network generally represent the types of apps a given user has installed. We then apply the nearest neighbor algorithm to find related apps for a given seed app in the trained embedding space. Thus, we perform dimensionality reduction by representing apps using embeddings to help prune the space of potential candidates.<br /><br /><b>Online Personalized Re-ranking</b><br />The candidates generated in the previous step represent relatedness along multiple dimensions. The objective is to assign scores to the candidates so they can be re-ranked in a personalized way, in order to provide an experience that is crafted to the user&#8217;s overall interests and yet maintain relevance for the user installing a given app. 
In order to do this, we take the characteristics of the app candidates as input to a separate deep neural network, which is then trained in real time with user-specific context features (region, language, app store search queries, etc.) to predict the likelihood of a related app being specifically relevant to the user.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-MxUhl0cs0F0/WFGG_XIEdVI/AAAAAAAABdI/U7oEdWud80kdp258pAI8kN85niKE1SX4ACLcB/s1600/image00.png"><img border="0" height="282" src="https://4.bp.blogspot.com/-MxUhl0cs0F0/WFGG_XIEdVI/AAAAAAAABdI/U7oEdWud80kdp258pAI8kN85niKE1SX4ACLcB/s640/image00.png" width="640"></a></td></tr><tr><td>Architecture for personalized related apps</td></tr></tbody></table><br />One of the takeaways from this work is that re-ranking content, like related apps, is one of the critical avenues for app discovery in the store, and can bring great value to the user without impacting perceived relevance. Compared to the control (where no re-ranking was done), we saw a 20% increase in the app install rate from the &#8220;You might also like&#8221; suggestions. This had no user-perceivable change in latency.<br /><br />In Part 3 of this series, we will discuss how we employ machine learning to keep bad actors who try to manipulate the signals we use for search and personalization at bay.<br /><br /><b>Acknowledgements</b><br />This work was done within the Google Play team in collaboration with Halit Erdogan, Mark Taylor, Michael Watson, Huazhong Ning, Stan Bileschi, John Kraemer, and Chuan Yu Foo.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Ananth Balashankar &amp; Levent Koc, Software Engineers, and Norberto Guimaraes, Product Manager</span><br /><br /><i>In <a href="https://research.googleblog.com/2016/11/app-discovery-with-google-play-part-1.html">Part 1 of this series</a> on app discovery, we discussed using machine learning to gain a deeper understanding of the topics associated with an app, in order to provide a better search and discovery experience on the <a href="https://play.google.com/store/apps?hl=en">Google Play Apps Store</a>. In this post, we discuss a deep learning framework to provide personalized recommendations to users based on their previous app downloads and the context in which they are used. </i><br /><br />Providing useful and relevant app recommendations to visitors of the <a href="https://play.google.com/store/apps?hl=en">Google Play Apps Store</a> is a key goal of our apps discovery team. An <a href="https://research.googleblog.com/2016/11/app-discovery-with-google-play-part-1.html">understanding of the topics associated with an app</a>, however, is only one part of creating a system that best serves the user. In order to create a better overall experience, one must also take into account the tastes of the user and provide personalized recommendations. If one didn’t, the “You might also like” recommendation would look the same for everyone! <br /><br />Discovering these nuances requires both an understanding of what an app does and the context of the app with respect to the user. For example, to an avid sci-fi gamer, similar game recommendations may be of interest, but if a user installs a fitness app, recommending a health recipe app may be more relevant than five more fitness apps. 
As users may be more interested in downloading an app or game that complements one they already have installed, we provide recommendations based on app relatedness with each other (“You might also like”), in addition to providing recommendations based on the topic associated with an app (“Similar apps”). <br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-YezY_DnbD04/WFGFJKsVUVI/AAAAAAAABdA/SxGmMuwun6s1wcFujOtTvsFQ11y7iBzCACLcB/s1600/Fig.1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="530" src="https://3.bp.blogspot.com/-YezY_DnbD04/WFGFJKsVUVI/AAAAAAAABdA/SxGmMuwun6s1wcFujOtTvsFQ11y7iBzCACLcB/s640/Fig.1.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Suggestions of similar apps and apps that you also might like shown both before making an install decision (left) and while the current install is in progress (right).</td></tr></tbody></table><br />One particularly strong contextual signal is app relatedness, based on previous installs and search query clicks. As an example, a user who has searched for and plays a lot of graphics-heavy games likely has a preference for apps which are also graphically intense rather than apps with simpler graphics. So, when this user installs a car racing game, the “You might also like” suggestions include apps which relate to the “seed” app (because they are graphically intense racing games) ranked higher than racing apps with simpler graphics. This allows for a finer level of personalization where the characteristics of the apps are matched with the preferences of the user.<br /><br />To incorporate this app relatedness in our recommendations, we take a two-pronged approach: (a) offline candidate generation, i.e. 
the generation of the potential related apps that other users have downloaded, in addition to the app in question, and (b) online personalized re-ranking, where we re-rank these candidates using a personalized ML model.<br /><br /><b>Offline Candidate Generation</b><br />The problem of finding related apps can be formulated as a <a href="https://en.wikipedia.org/wiki/Nearest_neighbor_search">nearest neighbor search</a> problem. Given an app X, we want to find the k nearest apps. In the case of “You might also like”, a naive approach would be one based on counting, where if many people installed apps X and Y, then the app Y would be used as a candidate for seed app X. However, this approach is intractable as it is difficult to learn and generalize effectively in the huge problem space. Given that there are over a million apps on Google Play, the total number of possible app pairs is over 10<sup>12</sup>. <br /><br />To solve this, we trained a deep neural network to predict the next app installed by the user given their previous installs. Output <a href="https://en.wikipedia.org/wiki/Embedding">embeddings</a> at the final layer of this deep neural network generally represent the types of apps a given user has installed. We then apply the nearest neighbor algorithm to find related apps for a given seed app in the trained embedding space. Thus, we perform dimensionality reduction by representing apps using embeddings to help prune the space of potential candidates.<br /><br /><b>Online Personalized Re-ranking</b><br />The candidates generated in the previous step represent relatedness along multiple dimensions. The objective is to assign scores to the candidates so they can be re-ranked in a personalized way, in order to provide an experience that is crafted to the user’s overall interests and yet maintain relevance for the user installing a given app. 
In order to do this, we take the characteristics of the app candidates as input to a separate deep neural network, which is then trained in real time with user-specific context features (region, language, app store search queries, etc.) to predict the likelihood of a related app being specifically relevant to the user.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-MxUhl0cs0F0/WFGG_XIEdVI/AAAAAAAABdI/U7oEdWud80kdp258pAI8kN85niKE1SX4ACLcB/s1600/image00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="282" src="https://4.bp.blogspot.com/-MxUhl0cs0F0/WFGG_XIEdVI/AAAAAAAABdI/U7oEdWud80kdp258pAI8kN85niKE1SX4ACLcB/s640/image00.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Architecture for personalized related apps</td></tr></tbody></table><br />One of the takeaways from this work is that re-ranking content, like related apps, is one of the critical avenues for app discovery in the store, and can bring great value to the user without impacting perceived relevance. Compared to the control (where no re-ranking was done), we saw a 20% increase in the app install rate from the “You might also like” suggestions. This had no user-perceivable change in latency.<br /><br />In Part 3 of this series, we will discuss how we employ machine learning to keep bad actors who try to manipulate the signals we use for search and personalization at bay.<br /><br /><b>Acknowledgements</b><br />This work was done within the Google Play team in collaboration with Halit Erdogan, Mark Taylor, Michael Watson, Huazhong Ning, Stan Bileschi, John Kraemer, and Chuan Yu Foo.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/app-discovery-with-google-play-part-2-personalized-recommendations-with-related-apps/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Open sourcing the Embedding Projector: a tool for visualizing high dimensional data</title>
		<link>https://googledata.org/google-research/open-sourcing-the-embedding-projector-a-tool-for-visualizing-high-dimensional-data/</link>
		<comments>https://googledata.org/google-research/open-sourcing-the-embedding-projector-a-tool-for-visualizing-high-dimensional-data/#comments</comments>
		<pubDate>Wed, 07 Dec 2016 09:15:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=021b4e29eb805e90b44dfc4724562ecc</guid>
		<description><![CDATA[<span>Posted by Daniel Smilkov and the Big Picture group</span> <br /><br />Recent advances in Machine Learning (ML) have shown impressive results, with applications including <a href="https://research.googleblog.com/2016/08/improving-inception-and-image.html">image recognition</a>, <a href="https://research.googleblog.com/2016/11/zero-shot-translation-with-googles.html">language translation</a>, <a href="https://research.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html">medical diagnosis</a>, and more. With the widespread adoption of ML systems, it is increasingly important for research scientists to be able to explore how the data is being interpreted by the models. However, one of the main challenges in exploring this data is that it often has hundreds or even thousands of dimensions, requiring special tools to investigate the space. <br /><br />To enable a more intuitive exploration process, we are <a href="https://www.tensorflow.org/versions/master/how_tos/embedding_viz/index.html">open-sourcing the Embedding Projector</a>, a web application for interactive visualization and analysis of high-dimensional data recently shown as an <a href="https://aiexperiments.withgoogle.com/visualizing-high-dimensional-space">A.I. Experiment</a>, as part of <a href="https://www.tensorflow.org/">TensorFlow</a>.  
We are also releasing a standalone version at <a href="http://projector.tensorflow.org/">projector.tensorflow.org</a>, where users can visualize their high-dimensional data without the need to install and run TensorFlow.<br /><br /><div><a href="https://2.bp.blogspot.com/-yL_425HS2ck/WEDZLk5cq0I/AAAAAAAABcI/kwy4F4Cmfi4jyG_InIiYu6F7y2-BKTXWQCLcB/s1600/embedding-mnist.gif"><img border="0" height="324" src="https://2.bp.blogspot.com/-yL_425HS2ck/WEDZLk5cq0I/AAAAAAAABcI/kwy4F4Cmfi4jyG_InIiYu6F7y2-BKTXWQCLcB/s640/embedding-mnist.gif" width="640"></a></div><br /><b>Exploring Embeddings</b><br /><br />The data needed to train machine learning systems comes in a form that computers don't immediately understand. To translate the things we understand naturally (e.g. words, sounds, or videos) to a form that the algorithms can process, we use <i><a href="https://en.wikipedia.org/wiki/Embedding">embeddings</a></i>, a mathematical vector representation that captures different facets (dimensions) of the data. For example, in <a href="https://opensource.googleblog.com/2013/08/learning-meaning-behind-words.html">this language embedding</a>, similar words are mapped to points that are close to each other.<br /><br />With the Embedding Projector, you can navigate through views of data in either a 2D or a 3D mode, zooming, rotating, and panning using natural click-and-drag gestures. Below is a figure showing the nearest points to the embedding for the word &#8220;important&#8221; after training a TensorFlow model using the <a href="https://www.tensorflow.org/versions/r0.12/tutorials/word2vec/index.html">word2vec tutorial</a>. Clicking on any point (which represents the learned embedding for a given word) in this visualization brings up a list of nearest points and distances, which shows which words the algorithm has learned to be semantically related. 
This type of interaction represents an important way in which one can explore how an algorithm is performing.<br /><br /><div><a href="https://2.bp.blogspot.com/-Uql7bl2KEYM/WEfQ4Kl_0YI/AAAAAAAABck/GkktuPM8KoMcMl2Tot6GzH3-NgwPNETMgCLcB/s1600/image03.png"><img border="0" height="424" src="https://2.bp.blogspot.com/-Uql7bl2KEYM/WEfQ4Kl_0YI/AAAAAAAABck/GkktuPM8KoMcMl2Tot6GzH3-NgwPNETMgCLcB/s640/image03.png" width="640"></a></div><b>Methods of Dimensionality Reduction</b><br /><br />The Embedding Projector offers three commonly used methods of data dimensionality reduction, which allow easier visualization of complex data: <a href="https://en.wikipedia.org/wiki/Principal_component_analysis">PCA</a>, <a href="https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding">t-SNE</a> and custom linear projections. <a href="https://en.wikipedia.org/wiki/Principal_component_analysis">PCA</a> is often effective at exploring the internal structure of the embeddings, revealing the most influential dimensions in the data. <a href="https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding">t-SNE</a>, on the other hand, is useful for exploring local neighborhoods and finding clusters, allowing developers to make sure that an embedding preserves the meaning in the data (e.g. in the <a href="https://en.wikipedia.org/wiki/MNIST_database">MNIST dataset</a>, seeing that the same digits are clustered together). 
Finally, custom linear projections can help discover meaningful "directions" in data sets - such as the distinction between a formal and casual tone in a language generation model - which would allow the design of more adaptable ML systems.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-5vEgY1mh1cA/WEfRAmER3iI/AAAAAAAABco/beMK-6LNq2M37QOUGQVwXMT1B6FIMLAxgCLcB/s1600/image00.png"><img border="0" height="468" src="https://1.bp.blogspot.com/-5vEgY1mh1cA/WEfRAmER3iI/AAAAAAAABco/beMK-6LNq2M37QOUGQVwXMT1B6FIMLAxgCLcB/s640/image00.png" width="640"></a></td></tr><tr><td>A custom linear projection of the 100 nearest points of "See attachments." onto the "yes" - "yeah" vector (&#8220;yes&#8221; is right, &#8220;yeah&#8221; is left) of a corpus of <a href="https://research.googleblog.com/2015/11/computer-respond-to-this-email.html">35k frequently used phrases in emails</a></td></tr></tbody></table>The Embedding Projector <a href="http://projector.tensorflow.org/">website</a> includes a few datasets to play with. We&#8217;ve also made it easy for users to publish and share their embeddings with others (just click on the &#8220;Publish&#8221; button on the left pane). It is our hope that the <a href="https://www.tensorflow.org/versions/master/how_tos/embedding_viz/index.html">Embedding Projector</a> will be a useful tool to help the research community explore and refine their ML applications, as well as enable anyone to better understand how ML algorithms interpret data. If you'd like to get the full details on the Embedding Projector, you can read the paper <a href="https://arxiv.org/pdf/1611.05469v1.pdf">here</a>. Have fun exploring the world of embeddings!<br /><br /><br />]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Daniel Smilkov and the Big Picture group</span> <br /><br />Recent advances in Machine Learning (ML) have shown impressive results, with applications including <a href="https://research.googleblog.com/2016/08/improving-inception-and-image.html">image recognition</a>, <a href="https://research.googleblog.com/2016/11/zero-shot-translation-with-googles.html">language translation</a>, <a href="https://research.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html">medical diagnosis</a>, and more. With the widespread adoption of ML systems, it is increasingly important for research scientists to be able to explore how the data is being interpreted by the models. However, one of the main challenges in exploring this data is that it often has hundreds or even thousands of dimensions, requiring special tools to investigate the space. <br /><br />To enable a more intuitive exploration process, we are <a href="https://www.tensorflow.org/versions/master/how_tos/embedding_viz/index.html">open-sourcing the Embedding Projector</a>, a web application for interactive visualization and analysis of high-dimensional data recently shown as an <a href="https://aiexperiments.withgoogle.com/visualizing-high-dimensional-space">A.I. Experiment</a>, as part of <a href="https://www.tensorflow.org/">TensorFlow</a>.  
We are also releasing a standalone version at <a href="http://projector.tensorflow.org/">projector.tensorflow.org</a>, where users can visualize their high-dimensional data without the need to install and run TensorFlow.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-yL_425HS2ck/WEDZLk5cq0I/AAAAAAAABcI/kwy4F4Cmfi4jyG_InIiYu6F7y2-BKTXWQCLcB/s1600/embedding-mnist.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="324" src="https://2.bp.blogspot.com/-yL_425HS2ck/WEDZLk5cq0I/AAAAAAAABcI/kwy4F4Cmfi4jyG_InIiYu6F7y2-BKTXWQCLcB/s640/embedding-mnist.gif" width="640" /></a></div><br /><b>Exploring Embeddings</b><br /><br />The data needed to train machine learning systems comes in a form that computers don't immediately understand. To translate the things we understand naturally (e.g. words, sounds, or videos) to a form that the algorithms can process, we use <i><a href="https://en.wikipedia.org/wiki/Embedding">embeddings</a></i>, mathematical vector representations that capture different facets (dimensions) of the data. For example, in <a href="https://opensource.googleblog.com/2013/08/learning-meaning-behind-words.html">this language embedding</a>, similar words are mapped to points that are close to each other.<br /><br />With the Embedding Projector, you can navigate through views of data in either a 2D or a 3D mode, zooming, rotating, and panning using natural click-and-drag gestures. Below is a figure showing the nearest points to the embedding for the word “important” after training a TensorFlow model using the <a href="https://www.tensorflow.org/versions/r0.12/tutorials/word2vec/index.html">word2vec tutorial</a>. Clicking on any point (which represents the learned embedding for a given word) in this visualization brings up a list of nearest points and distances, which shows which words the algorithm has learned to be semantically related. 
This type of interaction represents an important way in which one can explore how an algorithm is performing.<br /><b><br /></b> <br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-Uql7bl2KEYM/WEfQ4Kl_0YI/AAAAAAAABck/GkktuPM8KoMcMl2Tot6GzH3-NgwPNETMgCLcB/s1600/image03.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="424" src="https://2.bp.blogspot.com/-Uql7bl2KEYM/WEfQ4Kl_0YI/AAAAAAAABck/GkktuPM8KoMcMl2Tot6GzH3-NgwPNETMgCLcB/s640/image03.png" width="640" /></a></div><b>Methods of Dimensionality Reduction</b><br /><br />The Embedding Projector offers three commonly used methods of data dimensionality reduction, which allow easier visualization of complex data: <a href="https://en.wikipedia.org/wiki/Principal_component_analysis">PCA</a>, <a href="https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding">t-SNE</a> and custom linear projections. <a href="https://en.wikipedia.org/wiki/Principal_component_analysis">PCA</a> is often effective at exploring the internal structure of the embeddings, revealing the most influential dimensions in the data. <a href="https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding">t-SNE</a>, on the other hand, is useful for exploring local neighborhoods and finding clusters, allowing developers to make sure that an embedding preserves the meaning in the data (e.g. in the <a href="https://en.wikipedia.org/wiki/MNIST_database">MNIST dataset</a>, seeing that the same digits are clustered together). 
Finally, custom linear projections can help discover meaningful "directions" in data sets - such as the distinction between a formal and casual tone in a language generation model - which would allow the design of more adaptable ML systems.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-5vEgY1mh1cA/WEfRAmER3iI/AAAAAAAABco/beMK-6LNq2M37QOUGQVwXMT1B6FIMLAxgCLcB/s1600/image00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="468" src="https://1.bp.blogspot.com/-5vEgY1mh1cA/WEfRAmER3iI/AAAAAAAABco/beMK-6LNq2M37QOUGQVwXMT1B6FIMLAxgCLcB/s640/image00.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">A custom linear projection of the 100 nearest points of "See attachments." onto the "yes" - "yeah" vector (“yes” is right, “yeah” is left) of a corpus of <a href="https://research.googleblog.com/2015/11/computer-respond-to-this-email.html">35k frequently used phrases in emails</a></td></tr></tbody></table>The Embedding Projector <a href="http://projector.tensorflow.org/">website</a> includes a few datasets to play with. We’ve also made it easy for users to publish and share their embeddings with others (just click on the “Publish” button on the left pane). It is our hope that the <a href="https://www.tensorflow.org/versions/master/how_tos/embedding_viz/index.html">Embedding Projector</a> will be a useful tool to help the research community explore and refine their ML applications, as well as enable anyone to better understand how ML algorithms interpret data. If you'd like to get the full details on the Embedding Projector, you can read the paper <a href="https://arxiv.org/pdf/1611.05469v1.pdf">here</a>. Have fun exploring the world of embeddings!<br /><br /><br />]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/open-sourcing-the-embedding-projector-a-tool-for-visualizing-high-dimensional-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NIPS 2016 &amp; Research at Google</title>
		<link>https://googledata.org/google-research/nips-2016-research-at-google/</link>
		<comments>https://googledata.org/google-research/nips-2016-research-at-google/#comments</comments>
		<pubDate>Mon, 05 Dec 2016 07:34:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=75dd52252a6cf125a25c0e3b1545e647</guid>
		<description><![CDATA[<span>Posted by Doug Eck, Research Scientist, Google Brain Team</span><br /><br />This week, Barcelona hosts the <a href="https://nips.cc/Conferences/2016">30<sup>th</sup> Annual Conference on Neural Information Processing Systems</a> (NIPS 2016), a machine learning and computational neuroscience conference that includes invited talks, demonstrations and oral and poster presentations of some of the latest in machine learning research. Google will have a strong presence at NIPS 2016, with over 280 Googlers attending in order to contribute to and learn from the broader academic research community by presenting technical talks and posters, in addition to hosting workshops and tutorials.<br /><br />Research at Google is at the forefront of innovation in <a href="http://research.google.com/pubs/MachineIntelligence.html">Machine Intelligence</a>, actively exploring virtually all aspects of machine learning including classical algorithms as well as cutting-edge techniques such as <a href="http://g.co/brain">deep learning</a>. Focusing on both theory as well as application, much of our work on language understanding, speech, translation, visual processing, ranking, and prediction relies on Machine Intelligence. In all of those tasks and many others, we gather large volumes of direct or indirect evidence of relationships of interest, and develop learning approaches to understand and generalize. <br /><br />If you are attending NIPS 2016, we hope you&#8217;ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for billions of people, and to see demonstrations of some of the exciting research we pursue. 
You can also learn more about our work being presented at NIPS 2016 in the list below (Googlers highlighted in <span><span>blue</span></span>).<br /><br />Google is a Platinum Sponsor of NIPS 2016.<br /><br /><b><u>Organizing Committee</u></b><br />Executive Board includes: <i><span>Corinna Cortes, Fernando Pereira</span></i><br />Advisory Board includes: <i><span>John C. Platt</span></i><br />Area Chairs include: <i><span>John Shlens</span></i><i>,</i><i><span>&#160;Moritz Hardt</span></i><i>,</i><i><span>&#160;Navdeep Jaitly</span></i><i>,&#160;</i><i><span>Hugo Larochelle</span></i><i>,</i><i><span>&#160;Honglak Lee</span></i><i>,</i><i><span>&#160;Sanjiv Kumar</span></i><i>,</i><i><span>&#160;Gal Chechik</span></i><br /><br /><b><u>Invited Talk</u></b><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=6194">Dynamic Legged Robots</a><br /><i><span>Marc Raibert</span></i><br /><br /><b><u>Accepted Papers:</u></b><br /><a href="http://papers.nips.cc/paper/6336-boosting-with-abstention">Boosting with Abstention</a><br /><i><span>Corinna Cortes</span>, Giulia DeSalvo, <span>Mehryar Mohri</span></i><br /><br /><a href="http://papers.nips.cc/paper/6173-community-detection-on-evolving-graphs">Community Detection on Evolving Graphs</a><br /><i>Stefano Leonardi, Aris Anagnostopoulos, Jakub &#321;&#261;cki, <span>Silvio Lattanzi</span>, <span>Mohammad Mahdian</span></i><br /><br /><a href="http://papers.nips.cc/paper/6500-linear-relaxations-for-finding-diverse-elements-in-metric-spaces">Linear Relaxations for Finding Diverse Elements in Metric Spaces</a><br /><i>Aditya Bhaskara, Mehrdad Ghadiri, <span>Vahab Mirrokni</span>, Ola Svensson</i><br /><br /><a href="http://papers.nips.cc/paper/6535-nearly-isometric-embedding-by-relaxation">Nearly Isometric Embedding by Relaxation</a><br /><i>James McQueen, Marina Meila, <span>Dominique Joncas</span></i><br /><a href="http://papers.nips.cc/paper/6429-optimistic-bandit-convex-optimization"><br /></a> <a 
href="http://papers.nips.cc/paper/6429-optimistic-bandit-convex-optimization">Optimistic Bandit Convex Optimization</a><br /><i><span>Mehryar Mohri</span>, Scott Yang</i><br /><br /><a href="http://papers.nips.cc/paper/6547-reward-augmented-maximum-likelihood-for-neural-structured-prediction">Reward Augmented Maximum Likelihood for Neural Structured Prediction</a><br /><i><span>Mohammad Norouzi</span></i><i>,</i><i><span>&#160;Samy Bengio</span></i><i>,</i><i><span>&#160;Zhifeng Chen</span></i><i>,</i><i><span>&#160;Navdeep Jaitly</span></i><i>,</i><i><span>&#160;Mike Schuster</span></i><i>,</i><i><span>&#160;Yonghui Wu</span></i><i>,</i><i><span>&#160;Dale Schuurmans</span></i><br /><a href="http://papers.nips.cc/paper/6359-stochastic-gradient-mcmc-with-stale-gradients"><br /></a> <a href="http://papers.nips.cc/paper/6359-stochastic-gradient-mcmc-with-stale-gradients">Stochastic Gradient MCMC with Stale Gradients</a><br /><i>Changyou Chen, <span>Nan Ding</span>, Chunyuan Li, Yizhe Zhang, Lawrence Carin</i><br /><br /><a href="http://papers.nips.cc/paper/6161-unsupervised-learning-for-physical-interaction-through-video-prediction">Unsupervised Learning for Physical Interaction through Video Prediction</a><br /><i><span>Chelsea Finn</span><a href="http://research.googleblog.com/#1" name="top1"><sup>*</sup></a>, Ian Goodfellow, <span>Sergey Levine</span></i><br /><br /><a href="http://papers.nips.cc/paper/6057-using-fast-weights-to-attend-to-the-recent-past">Using Fast Weights to Attend to the Recent Past</a><br /><i>Jimmy Ba, <span>Geoffrey Hinton</span>, Volodymyr Mnih, Joel Leibo, Catalin Ionescu</i><br /><br /><a href="http://papers.nips.cc/paper/6256-a-credit-assignment-compiler-for-joint-prediction">A Credit Assignment Compiler for Joint Prediction</a><br /><i>Kai-Wei Chang, He He, <span>Stephane Ross</span>, Hal Daum&#233; III</i><br /><br /><a href="http://papers.nips.cc/paper/6594-a-neural-transducer">A Neural Transducer</a><br /><i><span>Navdeep Jaitly</span>, 
<span>Quoc Le</span>, Oriol Vinyals, Ilya Sutskever, <span>David Sussillo</span>, <span>Samy Bengio</span></i><br /><br /><a href="http://papers.nips.cc/paper/6230-attend-infer-repeat-fast-scene-understanding-with-generative-models">Attend, Infer, Repeat: Fast Scene Understanding with Generative Models</a><br /><i>S. M. Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, David Szepesvari, Koray Kavukcuoglu, <span>Geoffrey Hinton</span></i><br /><br /><a href="http://papers.nips.cc/paper/6085-bi-objective-online-matching-and-submodular-allocations">Bi-Objective Online Matching and Submodular Allocations</a><br /><i>Hossein Esfandiari, <span>Nitish Korula</span>, <span>Vahab Mirrokni</span></i><br /><br /><a href="http://papers.nips.cc/paper/6595-combinatorial-energy-learning-for-image-segmentation">Combinatorial Energy Learning for Image Segmentation</a><br /><i><span>Jeremy Maitin-Shepard</span></i><i>,</i><i><span>&#160;Viren Jain</span></i><i>,</i><i><span>&#160;Michal Januszewski</span></i><i>,</i><i><span>&#160;Peter Li</span>, Pieter Abbeel</i><br /><br /><a href="http://papers.nips.cc/paper/6315-deep-learning-games">Deep Learning Games</a><br /><i><span>Dale Schuurmans</span>, <span>Martin Zinkevich</span></i><br /><br /><a href="http://papers.nips.cc/paper/6280-deepmath-deep-sequence-models-for-premise-selection">DeepMath - Deep Sequence Models for Premise Selection</a><br /><i><span>Geoffrey Irving</span></i><i>,</i><i><span>&#160;Christian Szegedy</span></i><i>,</i><i><span>&#160;Niklas Een</span></i><i>,</i><i><span>&#160;Alexander Alemi</span></i><i>,</i><i><span>&#160;Fran&#231;ois Chollet</span>, Josef Urban</i><br /><br /><a href="http://papers.nips.cc/paper/6217-density-estimation-via-discrepancy-based-adaptive-sequential-partition">Density Estimation via Discrepancy Based Adaptive Sequential Partition</a><br /><i>Dangna Li, <span>Kun Yang</span>, Wing Wong</i><br /><br /><a 
href="http://papers.nips.cc/paper/6254-domain-separation-networks">Domain Separation Networks</a><br /><i><span>Konstantinos Bousmalis</span>, George Trigeorgis, <span>Nathan Silberman</span></i><i>,&#160;</i><i><span>&#160;Dilip Krishnan</span></i><i>,&#160;</i><i><span>Dumitru Erhan</span></i><br /><a href="http://papers.nips.cc/paper/6540-fast-distributed-submodular-cover-public-private-data-summarization"><br /></a> <a href="http://papers.nips.cc/paper/6540-fast-distributed-submodular-cover-public-private-data-summarization">Fast Distributed Submodular Cover: Public-Private Data Summarization </a><br /><i>Baharan Mirzasoleiman, <span>Morteza Zadimoghaddam</span>, Amin Karbasi</i><br /><br /><a href="http://papers.nips.cc/paper/6316-satisfying-real-world-goals-with-dataset-constraints">Satisfying Real-world Goals with Dataset Constraints</a><br /><i>Gabriel Goh, <span>Andrew Cotter</span>, <span>Maya Gupta</span>, Michael P Friedlander</i><br /><br /><a href="http://papers.nips.cc/paper/6295-can-active-memory-replace-attention">Can Active Memory Replace Attention?</a><br /><i><span>&#321;ukasz Kaiser</span>, <span>Samy Bengio</span></i><br /><br /><a href="http://papers.nips.cc/paper/6377-fast-and-flexible-monotonic-functions-with-ensembles-of-lattices">Fast and Flexible Monotonic Functions with Ensembles of Lattices</a><br /><i><span>Kevin Canini</span></i><i>,&#160;</i><i><span>&#160;Andy Cotter</span></i><i>,&#160;</i><i><span>&#160;Maya Gupta</span></i><i>,&#160;</i><i><span>&#160;Mahdi Fard</span></i><i>,&#160;</i><i><span>&#160;Jan Pfeifer</span></i> <br /><br /><a href="http://papers.nips.cc/paper/6053-launch-and-iterate-reducing-prediction-churn">Launch and Iterate: Reducing Prediction Churn</a><br /><i>Quentin Cormier, <span>Mahdi Fard, Kevin Canini, Maya Gupta</span></i><br /><a href="http://papers.nips.cc/paper/6078-on-mixtures-of-markov-chains"><br /></a> <a href="http://papers.nips.cc/paper/6078-on-mixtures-of-markov-chains">On Mixtures of Markov 
Chains</a><br /><i>Rishi Gupta, <span>Ravi Kumar</span>, <span>Sergei Vassilvitskii</span></i><br /><br /><a href="http://papers.nips.cc/paper/6246-orthogonal-random-features">Orthogonal Random Features</a><br /><i><span>Felix Xinnan Yu</span></i><i>,&#160;</i><i><span>&#160;Ananda Theertha Suresh</span></i><i>,&#160;</i><i><span>&#160;Krzysztof Choromanski</span></i><i>,&#160;</i><i><span>&#160;Dan Holtmann-Rice</span></i><i>,&#160;</i><i><span><br />Sanjiv Kumar</span></i><br /><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=7241">Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision</a><br /><i>Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, <span>Honglak Lee</span></i><br /><br /><a href="http://papers.nips.cc/paper/6485-structured-prediction-theory-based-on-factor-graph-complexity">Structured Prediction Theory Based on Factor Graph Complexity</a><br /><i><span>Corinna Cortes</span>, <span>Vitaly Kuznetsov</span>, <span>Mehryar Mohri</span>, Scott Yang</i><br /><br /><a href="http://papers.nips.cc/paper/6427-toward-deeper-understanding-of-neural-networks-the-power-of-initialization-and-a-dual-view-on-expressivity">Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity</a><br /><i><span>Amit Daniely</span></i><i>,</i><i><span>&#160;Roy Frostig</span></i><i>,</i><i><span>&#160;Yoram Singer</span></i><br /><br /><b><u>Demonstrations</u></b><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=6307">Interactive musical improvisation with Magenta </a><br /><i><span>Adam Roberts</span></i><i>,</i><i><span>&#160;Sageev Oore</span></i><i>,</i><i><span>&#160;Curtis Hawthorne</span></i><i>,</i><i><span>&#160;Douglas 
Eck</span></i><br /><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=6312">Content-based Related Video Recommendation </a><br /><i><span>Joonseok Lee</span></i><br /><b><u><br /></u></b> <b><u>Workshops, Tutorials and Symposia</u></b><br /><a href="http://approximateinference.org/">Advances in Approximate Bayesian Inference </a><br />Advisory Committee includes: <i><span>Kevin P. Murphy</span></i> <br />Invited Speakers include: <i><span>Matt Johnson</span></i> <br />Panelists include: <i><span>Ryan Sepassi</span></i><br /><br /><a href="https://sites.google.com/site/nips2016adversarial/">Adversarial Training</a><br />Accepted Authors: <i><span>Luke Metz</span></i><i>,</i><i><span>&#160;Ben Poole</span></i><i>,</i><i><span>&#160;David Pfau</span></i><i>,</i><i><span>&#160;Jascha Sohl-Dickstein</span></i><i>,</i><i><span>&#160;Augustus Odena</span></i><i>,</i><i><span>&#160;Christopher Olah</span></i><i>,</i><i><span>&#160;Jonathon Shlens</span></i><br /><br /><a href="http://bayesiandeeplearning.org/">Bayesian Deep Learning </a><br />Organizers include: <i><span>Kevin P. Murphy</span></i><br />Accepted Authors include: <i><span>Rif A. 
Saurous</span></i><i>,</i><i><span>&#160;Eugene Brevdo</span></i><i>,</i><i><span>&#160;Kevin Murphy</span></i><i>,</i><i><span>&#160;Eric Jang</span></i><i>,</i><i><span>&#160;</span></i><i><span>Shixiang Gu</span></i><i>,</i><i><span>&#160;</span></i><i><span>Ben Poole</span></i><br /><br /><a href="http://www.stat.ucla.edu/~akfletcher/brainsbits.html">Brains &#38; Bits: Neuroscience Meets Machine Learning </a><br />Organizers include: <i><span>Jascha Sohl-Dickstein</span></i> <br /><a href="http://virenjain.org/nips2016connectomics/"><br /></a> <a href="http://virenjain.org/nips2016connectomics/">Connectomics II: Opportunities &#38; Challenges for Machine Learning </a><br />Organizers include: <i><span>Viren Jain</span></i> <br /><br /><a href="http://www.cs.nott.ac.uk/~psztg/cml/2016/">Constructive Machine Learning</a><br />Invited Speakers include: <i><span>Douglas Eck</span></i><br /><br /><a href="https://sites.google.com/site/cldlnips2016/home">Continual Learning &#38; Deep Networks </a><br />Invited Speakers include: <i><span>Honglak Lee</span></i><br /><br /><a href="https://sites.google.com/site/nips16interaction/">Deep Learning for Action &#38; Interaction</a><br />Organizers include: <i><span>Sergey Levine</span></i><br />Invited Speakers include: <i><span>Honglak Lee</span></i><br />Accepted Authors include: <i><span>Pararth Shah</span></i><i>,</i><i><span>&#160;Dilek Hakkani-Tur</span></i><i>,</i><i><span>&#160;Larry Heck</span></i><br /><a href="https://sites.google.com/site/nips2016endtoendspeechaudio/"><br /></a> <a href="https://sites.google.com/site/nips2016endtoendspeechaudio/">End-to-end Learning for Speech and Audio Processing</a><br />Invited Speakers include: <i><span>Tara Sainath</span></i><br />Accepted Authors include: <i><span>Brian Patton</span></i><i>,</i><i><span>&#160;Yannis Agiomyrgiannakis</span></i><i>,</i><i><span>&#160;Michael Terry</span></i><i>,</i><i><span>&#160;Kevin Wilson</span></i><i>,</i><i><span>&#160;Rif A. 
Saurous</span></i><i>,</i><i><span>&#160;D. Sculley</span></i><br /><br /><a href="http://www.manikvarma.org/events/XC16/schedule.html">Extreme Classification: Multi-class &#38; Multi-label Learning in Extremely Large Label Spaces </a><br />Organizers include: <i><span>Samy Bengio</span></i><br /><br /><a href="https://sites.google.com/site/nips2016interpretml/">Interpretable Machine Learning for Complex Systems </a><br />Invited Speaker: <i><span>Honglak Lee</span></i> <br />Accepted Authors include: <i><span>Daniel Smilkov</span></i><i>,</i><i><span>&#160;Nikhil Thorat</span></i><i>,</i><i><span>&#160;Charles Nicholson</span></i><i>,</i><i><span>&#160;Emily Reif</span></i><i>,</i><i><span>&#160;Fernanda Viegas</span></i><i>,</i><i><span>&#160;Martin Wattenberg</span></i><br /><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=6252">Large Scale Computer Vision Systems </a><br />Organizers include: <i><span>Gal Chechik</span></i> <br /><br /><a href="https://sites.google.com/site/mlsysnips2016/home">Machine Learning Systems </a><br />Invited Speakers include: <i><span>Jeff Dean</span></i> <br /><br /><a href="https://sites.google.com/site/nonconvexnips2016/">Nonconvex Optimization for Machine Learning: Theory &#38; Practice </a><br />Organizers include: <i><span>Hossein Mobahi</span></i> <br /><br /><a href="http://www.probabilistic-numerics.org/meetings/NIPS2016/">Optimizing the Optimizers </a><br />Organizers include: <i><span>Alex Davies</span></i> <br /><br /><a href="https://sites.google.com/site/wildml2016nips/?pli=1">Reliable Machine Learning in the Wild</a><br />Accepted Authors: <i><span>Andres Medina</span></i><i>,</i><i><span>&#160;Sergei Vassilvitskii</span></i><br /><a href="https://autodiff-workshop.github.io/"><br /></a> <a href="https://autodiff-workshop.github.io/">The Future of Gradient-Based Machine Learning Software </a><br />Invited Speakers: <i><span>Jeff Dean</span></i><i>,</i><i><span>&#160;Matt Johnson</span></i><br /><a 
href="http://wimlworkshop.org/2016/program/"><br /></a> <a href="https://sites.google.com/site/nipsts2016/">Time Series Workshop</a><br />Organizers include: <i><span>Vitaly Kuznetsov </span></i><br />Invited Speakers include: <i><span>Mehryar Mohri</span></i><br /><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=6206">Theory and Algorithms for Forecasting Non-Stationary Time Series </a><br />Tutorial Organizers: <i>Vitaly Kuznetsov,<span> Mehryar Mohri</span></i><br /><br /><a href="http://wimlworkshop.org/2016/program/">Women in Machine Learning</a><br />Invited Speakers include: <i><span>Maya Gupta</span></i><br /><br /><hr width="100%"><span><br /><a name="1"><b>* </b></a>Work done as part of the Google Brain team <a href="http://research.googleblog.com/#top1"><sup>&#8617;</sup></a><br /></span>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Doug Eck, Research Scientist, Google Brain Team</span><br /><br />This week, Barcelona hosts the <a href="https://nips.cc/Conferences/2016">30<sup>th</sup> Annual Conference on Neural Information Processing Systems</a> (NIPS 2016), a machine learning and computational neuroscience conference that includes invited talks, demonstrations and oral and poster presentations of some of the latest in machine learning research. Google will have a strong presence at NIPS 2016, with over 280 Googlers attending in order to contribute to and learn from the broader academic research community by presenting technical talks and posters, in addition to hosting workshops and tutorials.<br /><br />Research at Google is at the forefront of innovation in <a href="http://research.google.com/pubs/MachineIntelligence.html">Machine Intelligence</a>, actively exploring virtually all aspects of machine learning including classical algorithms as well as cutting-edge techniques such as <a href="http://g.co/brain">deep learning</a>. Focusing on both theory as well as application, much of our work on language understanding, speech, translation, visual processing, ranking, and prediction relies on Machine Intelligence. In all of those tasks and many others, we gather large volumes of direct or indirect evidence of relationships of interest, and develop learning approaches to understand and generalize. <br /><br />If you are attending NIPS 2016, we hope you’ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for billions of people, and to see demonstrations of some of the exciting research we pursue. 
You can also learn more about our work being presented at NIPS 2016 in the list below (Googlers highlighted in <span style="background-color: white;"><span style="color: #3d85c6;">blue</span></span>).<br /><br />Google is a Platinum Sponsor of NIPS 2016.<br /><br /><b><u>Organizing Committee</u></b><br />Executive Board includes: <i><span style="color: #3d85c6;">Corinna Cortes, Fernando Pereira</span></i><br />Advisory Board includes: <i><span style="color: #3d85c6;">John C. Platt</span></i><br />Area Chairs include: <i><span style="color: #3d85c6;">John Shlens</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Moritz Hardt</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Navdeep Jaitly</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">Hugo Larochelle</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Honglak Lee</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Sanjiv Kumar</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Gal Chechik</span></i><br /><br /><b><u>Invited Talk</u></b><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=6194">Dynamic Legged Robots</a><br /><i><span style="color: #3d85c6;">Marc Raibert</span></i><br /><br /><b><u>Accepted Papers:</u></b><br /><a href="http://papers.nips.cc/paper/6336-boosting-with-abstention">Boosting with Abstention</a><br /><i><span style="color: #3d85c6;">Corinna Cortes</span>, Giulia DeSalvo, <span style="color: #3d85c6;">Mehryar Mohri</span></i><br /><br /><a href="http://papers.nips.cc/paper/6173-community-detection-on-evolving-graphs">Community Detection on Evolving Graphs</a><br /><i>Stefano Leonardi, Aris Anagnostopoulos, Jakub Łącki, <span style="color: #3d85c6;">Silvio Lattanzi</span>, <span style="color: #3d85c6;">Mohammad Mahdian</span></i><br /><br /><a href="http://papers.nips.cc/paper/6500-linear-relaxations-for-finding-diverse-elements-in-metric-spaces">Linear Relaxations for Finding Diverse Elements in Metric Spaces</a><br /><i>Aditya Bhaskara, 
Mehrdad Ghadiri, <span style="color: #3d85c6;">Vahab Mirrokni</span>, Ola Svensson</i><br /><br /><a href="http://papers.nips.cc/paper/6535-nearly-isometric-embedding-by-relaxation">Nearly Isometric Embedding by Relaxation</a><br /><i>James McQueen, Marina Meila, <span style="color: #3d85c6;">Dominique Joncas</span></i><br /><a href="http://papers.nips.cc/paper/6429-optimistic-bandit-convex-optimization"><br /></a> <a href="http://papers.nips.cc/paper/6429-optimistic-bandit-convex-optimization">Optimistic Bandit Convex Optimization</a><br /><i><span style="color: #3d85c6;">Mehryar Mohri</span>, Scott Yang</i><br /><br /><a href="http://papers.nips.cc/paper/6547-reward-augmented-maximum-likelihood-for-neural-structured-prediction">Reward Augmented Maximum Likelihood for Neural Structured Prediction</a><br /><i><span style="color: #3d85c6;">Mohammad Norouzi</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Samy Bengio</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Zhifeng Chen</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Navdeep Jaitly</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Mike Schuster</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Yonghui Wu</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Dale Schuurmans</span></i><br /><a href="http://papers.nips.cc/paper/6359-stochastic-gradient-mcmc-with-stale-gradients"><br /></a> <a href="http://papers.nips.cc/paper/6359-stochastic-gradient-mcmc-with-stale-gradients">Stochastic Gradient MCMC with Stale Gradients</a><br /><i>Changyou Chen, <span style="color: #3d85c6;">Nan Ding</span>, Chunyuan Li, Yizhe Zhang, Lawrence Carin</i><br /><br /><a href="http://papers.nips.cc/paper/6161-unsupervised-learning-for-physical-interaction-through-video-prediction">Unsupervised Learning for Physical Interaction through Video Prediction</a><br /><i><span style="color: #3d85c6;">Chelsea Finn</span><a 
href="http://research.googleblog.com/2016/12/nips-2016-research-at-google.html#1" name="top1"><sup>*</sup></a>, Ian Goodfellow, <span style="color: #3d85c6;">Sergey Levine</span></i><br /><br /><a href="http://papers.nips.cc/paper/6057-using-fast-weights-to-attend-to-the-recent-past">Using Fast Weights to Attend to the Recent Past</a><br /><i>Jimmy Ba, <span style="color: #3d85c6;">Geoffrey Hinton</span>, Volodymyr Mnih, Joel Leibo, Catalin Ionescu</i><br /><br /><a href="http://papers.nips.cc/paper/6256-a-credit-assignment-compiler-for-joint-prediction">A Credit Assignment Compiler for Joint Prediction</a><br /><i>Kai-Wei Chang, He He, <span style="color: #3d85c6;">Stephane Ross</span>, Hal Daumé III</i><br /><br /><a href="http://papers.nips.cc/paper/6594-a-neural-transducer">A Neural Transducer</a><br /><i><span style="color: #3d85c6;">Navdeep Jaitly</span>, <span style="color: #3d85c6;">Quoc Le</span>, Oriol Vinyals, Ilya Sutskever, <span style="color: #3d85c6;">David Sussillo</span>, <span style="color: #3d85c6;">Samy Bengio</span></i><br /><br /><a href="http://papers.nips.cc/paper/6230-attend-infer-repeat-fast-scene-understanding-with-generative-models">Attend, Infer, Repeat: Fast Scene Understanding with Generative Models</a><br /><i>S. M.
Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, David Szepesvari, Koray Kavukcuoglu, <span style="color: #3d85c6;">Geoffrey Hinton</span></i><br /><br /><a href="http://papers.nips.cc/paper/6085-bi-objective-online-matching-and-submodular-allocations">Bi-Objective Online Matching and Submodular Allocations</a><br /><i>Hossein Esfandiari, <span style="color: #3d85c6;">Nitish Korula</span>, <span style="color: #3d85c6;">Vahab Mirrokni</span></i><br /><br /><a href="http://papers.nips.cc/paper/6595-combinatorial-energy-learning-for-image-segmentation">Combinatorial Energy Learning for Image Segmentation</a><br /><i><span style="color: #3d85c6;">Jeremy Maitin-Shepard</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Viren Jain</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Michal Januszewski</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Peter Li</span>, Pieter Abbeel</i><br /><br /><a href="http://papers.nips.cc/paper/6315-deep-learning-games">Deep Learning Games</a><br /><i><span style="color: #3d85c6;">Dale Schuurmans</span>, <span style="color: #3d85c6;">Martin Zinkevich</span></i><br /><br /><a href="http://papers.nips.cc/paper/6280-deepmath-deep-sequence-models-for-premise-selection">DeepMath - Deep Sequence Models for Premise Selection</a><br /><i><span style="color: #3d85c6;">Geoffrey Irving</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Christian Szegedy</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Niklas Een</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Alexander Alemi</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;François Chollet</span>, Josef Urban</i><br /><br /><a href="http://papers.nips.cc/paper/6217-density-estimation-via-discrepancy-based-adaptive-sequential-partition">Density Estimation via Discrepancy Based Adaptive Sequential Partition</a><br /><i>Dangna Li, <span style="color: #3d85c6;">Kun Yang</span>, Wing Wong</i><br /><br /><a 
href="http://papers.nips.cc/paper/6254-domain-separation-networks">Domain Separation Networks</a><br /><i><span style="color: #3d85c6;">Konstantinos Bousmalis</span>, George Trigeorgis, <span style="color: #3d85c6;">Nathan Silberman</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">&nbsp;Dilip Krishnan</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">Dumitru Erhan</span></i><br /><a href="http://papers.nips.cc/paper/6540-fast-distributed-submodular-cover-public-private-data-summarization"><br /></a> <a href="http://papers.nips.cc/paper/6540-fast-distributed-submodular-cover-public-private-data-summarization">Fast Distributed Submodular Cover: Public-Private Data Summarization </a><br /><i>Baharan Mirzasoleiman, <span style="color: #3d85c6;">Morteza Zadimoghaddam</span>, Amin Karbasi</i><br /><br /><a href="http://papers.nips.cc/paper/6316-satisfying-real-world-goals-with-dataset-constraints">Satisfying Real-world Goals with Dataset Constraints</a><br /><i>Gabriel Goh, <span style="color: #3d85c6;">Andrew Cotter</span>, <span style="color: #3d85c6;">Maya Gupta</span>, Michael P Friedlander</i><br /><br /><a href="http://papers.nips.cc/paper/6295-can-active-memory-replace-attention">Can Active Memory Replace Attention?</a><br /><i><span style="color: #3d85c6;">Łukasz Kaiser</span>, <span style="color: #3d85c6;">Samy Bengio</span></i><br /><br /><a href="http://papers.nips.cc/paper/6377-fast-and-flexible-monotonic-functions-with-ensembles-of-lattices">Fast and Flexible Monotonic Functions with Ensembles of Lattices</a><br /><i><span style="color: #3d85c6;">Kevin Canini</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">&nbsp;Andy Cotter</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">&nbsp;Maya Gupta</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">&nbsp;Mahdi Fard</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">&nbsp;Jan Pfeifer</span></i> <br /><br /><a 
href="http://papers.nips.cc/paper/6053-launch-and-iterate-reducing-prediction-churn">Launch and Iterate: Reducing Prediction Churn</a><br /><i>Quentin Cormier, <span style="color: #3d85c6;">Mahdi Fard, Kevin Canini, Maya Gupta</span></i><br /><a href="http://papers.nips.cc/paper/6078-on-mixtures-of-markov-chains"><br /></a> <a href="http://papers.nips.cc/paper/6078-on-mixtures-of-markov-chains">On Mixtures of Markov Chains</a><br /><i>Rishi Gupta, <span style="color: #3d85c6;">Ravi Kumar</span>, <span style="color: #3d85c6;">Sergei Vassilvitskii</span></i><br /><br /><a href="http://papers.nips.cc/paper/6246-orthogonal-random-features">Orthogonal Random Features</a><br /><i><span style="color: #3d85c6;">Felix Xinnan Yu</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">&nbsp;Ananda Theertha Suresh</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">&nbsp;Krzysztof Choromanski</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">&nbsp;Dan Holtmann-Rice</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">Sanjiv Kumar</span></i><br /><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=7241">Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision</a><br /><i>Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, <span style="color: #3d85c6;">Honglak Lee</span></i><br /><br /><a href="http://papers.nips.cc/paper/6485-structured-prediction-theory-based-on-factor-graph-complexity">Structured Prediction Theory Based on Factor Graph Complexity</a><br /><i><span style="color: #3d85c6;">Corinna Cortes</span>, <span style="color: #3d85c6;">Vitaly Kuznetsov</span>, <span style="color: #3d85c6;">Mehryar Mohri</span>, Scott Yang</i><br /><br /><a
href="http://papers.nips.cc/paper/6427-toward-deeper-understanding-of-neural-networks-the-power-of-initialization-and-a-dual-view-on-expressivity">Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity</a><br /><i><span style="color: #3d85c6;">Amit Daniely</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Roy Frostig</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Yoram Singer</span></i><br /><br /><b><u>Demonstrations</u></b><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=6307">Interactive musical improvisation with Magenta </a><br /><i><span style="color: #3d85c6;">Adam Roberts</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Sageev Oore</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Curtis Hawthorne</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Douglas Eck</span></i><br /><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=6312">Content-based Related Video Recommendation </a><br /><i><span style="color: #3d85c6;">Joonseok Lee</span></i><br /><b><u><br /></u></b> <b><u>Workshops, Tutorials and Symposia</u></b><br /><a href="http://approximateinference.org/">Advances in Approximate Bayesian Inference </a><br />Advisory Committee includes: <i><span style="color: #3d85c6;">Kevin P. 
Murphy</span></i> <br />Invited Speakers include: <i><span style="color: #3d85c6;">Matt Johnson</span></i> <br />Panelists include: <i><span style="color: #3d85c6;">Ryan Sepassi</span></i><br /><br /><a href="https://sites.google.com/site/nips2016adversarial/">Adversarial Training</a><br />Accepted Authors: <i><span style="color: #3d85c6;">Luke Metz</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Ben Poole</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;David Pfau</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Jascha Sohl-Dickstein</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Augustus Odena</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Christopher Olah</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Jonathon Shlens</span></i><br /><br /><a href="http://bayesiandeeplearning.org/">Bayesian Deep Learning </a><br />Organizers include: <i><span style="color: #3d85c6;">Kevin P. Murphy</span></i><br />Accepted Authors include: <i><span style="color: #3d85c6;">Rif A. 
Saurous</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Eugene Brevdo</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Kevin Murphy</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Eric Jang</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">Shixiang Gu</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">Ben Poole</span></i><br /><br /><a href="http://www.stat.ucla.edu/~akfletcher/brainsbits.html">Brains &amp; Bits: Neuroscience Meets Machine Learning </a><br />Organizers include: <i><span style="color: #3d85c6;">Jascha Sohl-Dickstein</span></i> <br /><a href="http://virenjain.org/nips2016connectomics/"><br /></a> <a href="http://virenjain.org/nips2016connectomics/">Connectomics II: Opportunities &amp; Challenges for Machine Learning </a><br />Organizers include: <i><span style="color: #3d85c6;">Viren Jain</span></i> <br /><br /><a href="http://www.cs.nott.ac.uk/~psztg/cml/2016/">Constructive Machine Learning</a><br />Invited Speakers include: <i><span style="color: #3d85c6;">Douglas Eck</span></i><br /><br /><a href="https://sites.google.com/site/cldlnips2016/home">Continual Learning &amp; Deep Networks </a><br />Invited Speakers include: <i><span style="color: #3d85c6;">Honglak Lee</span></i><br /><br /><a href="https://sites.google.com/site/nips16interaction/">Deep Learning for Action &amp; Interaction</a><br />Organizers include: <i><span style="color: #3d85c6;">Sergey Levine</span></i><br />Invited Speakers include: <i><span style="color: #3d85c6;">Honglak Lee</span></i><br />Accepted Authors include: <i><span style="color: #3d85c6;">Pararth Shah</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Dilek Hakkani-Tur</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Larry Heck</span></i><br /><a href="https://sites.google.com/site/nips2016endtoendspeechaudio/"><br /></a> <a
href="https://sites.google.com/site/nips2016endtoendspeechaudio/">End-to-end Learning for Speech and Audio Processing</a><br />Invited Speakers include: <i><span style="color: #3d85c6;">Tara Sainath</span></i><br />Accepted Authors include: <i><span style="color: #3d85c6;">Brian Patton</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Yannis Agiomyrgiannakis</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Michael Terry</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Kevin Wilson</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Rif A. Saurous</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;D. Sculley</span></i><br /><br /><a href="http://www.manikvarma.org/events/XC16/schedule.html">Extreme Classification: Multi-class &amp; Multi-label Learning in Extremely Large Label Spaces </a><br />Organizers include: <i><span style="color: #3d85c6;">Samy Bengio</span></i><br /><br /><a href="https://sites.google.com/site/nips2016interpretml/">Interpretable Machine Learning for Complex Systems </a><br />Invited Speaker: <i><span style="color: #3d85c6;">Honglak Lee</span></i> <br />Accepted Authors include: <i><span style="color: #3d85c6;">Daniel Smilkov</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Nikhil Thorat</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Charles Nicholson</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Emily Reif</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Fernanda Viegas</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Martin Wattenberg</span></i><br /><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=6252">Large Scale Computer Vision Systems </a><br />Organizers include: <i><span style="color: #3d85c6;">Gal Chechik</span></i> <br /><br /><a href="https://sites.google.com/site/mlsysnips2016/home">Machine Learning Systems </a><br />Invited Speakers include: <i><span style="color: #3d85c6;">Jeff Dean</span></i> <br /><br /><a 
href="https://sites.google.com/site/nonconvexnips2016/">Nonconvex Optimization for Machine Learning: Theory &amp; Practice </a><br />Organizers include: <i><span style="color: #3d85c6;">Hossein Mobahi</span></i> <br /><br /><a href="http://www.probabilistic-numerics.org/meetings/NIPS2016/">Optimizing the Optimizers </a><br />Organizers include: <i><span style="color: #3d85c6;">Alex Davies</span></i> <br /><br /><a href="https://sites.google.com/site/wildml2016nips/?pli=1">Reliable Machine Learning in the Wild</a><br />Accepted Authors: <i><span style="color: #3d85c6;">Andres Medina</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Sergei Vassilvitskii</span></i><br /><a href="https://autodiff-workshop.github.io/"><br /></a> <a href="https://autodiff-workshop.github.io/">The Future of Gradient-Based Machine Learning Software </a><br />Invited Speakers: <i><span style="color: #3d85c6;">Jeff Dean</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Matt Johnson</span></i><br /><a href="http://wimlworkshop.org/2016/program/"><br /></a> <a href="https://sites.google.com/site/nipsts2016/">Time Series Workshop</a><br />Organizers include: <i><span style="color: #3d85c6;">Vitaly Kuznetsov </span></i><br />Invited Speakers include: <i><span style="color: #3d85c6;">Mehryar Mohri</span></i><br /><br /><a href="https://nips.cc/Conferences/2016/Schedule?showEvent=6206">Theory and Algorithms for Forecasting Non-Stationary Time Series </a><br />Tutorial Organizers: <i>Vitaly Kuznetsov,<span style="color: #3d85c6;"> Mehryar Mohri</span></i><br /><br /><a href="http://wimlworkshop.org/2016/program/">Women in Machine Learning</a><br />Invited Speakers include: <i><span style="color: #3d85c6;">Maya Gupta</span></i><br /><br /><hr width="100%" /><span class="Apple-style-span" style="font-size: x-small;"><br /><a name="1"><b>* </b></a>Work done as part of the Google Brain team <a 
href="http://research.googleblog.com/2016/12/nips-2016-research-at-google.html#top1"><sup>↩</sup></a><br /></span>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/nips-2016-research-at-google/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Deep Learning for Detection of Diabetic Eye Disease</title>
		<link>https://googledata.org/google-research/deep-learning-for-detection-of-diabetic-eye-disease/</link>
		<comments>https://googledata.org/google-research/deep-learning-for-detection-of-diabetic-eye-disease/#comments</comments>
		<pubDate>Tue, 29 Nov 2016 16:04:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=27e7a1283f8d8d29a37f5b1e448ae812</guid>
		<description><![CDATA[<span>Posted by Lily Peng MD PhD, Product Manager and Varun Gulshan PhD, Research Engineer</span><br /><br /><a href="https://en.wikipedia.org/wiki/Diabetic_retinopathy">Diabetic retinopathy</a> (DR) is the fastest growing cause of blindness, with nearly <a href="http://www.idf.org/about-diabetes/facts-figures">415 million diabetic patients</a> at risk worldwide. If caught early, the disease can be treated; if not, it can lead to irreversible blindness. Unfortunately, medical specialists capable of detecting the disease are not available in many parts of the world where diabetes is prevalent. We believe that Machine Learning can help doctors identify patients in need, particularly among underserved populations.<br /><br />A few years ago, several of us began wondering if there was a way Google technologies could  improve the DR screening process, specifically by taking advantage of recent advances in Machine Learning and Computer Vision. In "<a href="http://jamanetwork.com/journals/jama/fullarticle/2588763">Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs</a>", published today in <a href="http://jamanetwork.com/journals/jama">JAMA</a>, we present a deep learning algorithm capable of interpreting signs of DR in retinal photographs, potentially helping doctors screen more patients in settings with limited resources. <br /><br />One of the most common ways to detect diabetic eye disease is to have a specialist examine pictures of the back of the eye (Figure 1) and rate them for disease presence and severity. Severity is determined by the type of lesions present (e.g. <a href="http://www.ucdenver.edu/academics/colleges/medicalschool/centers/BarbaraDavis/Clinical/Pages/Ophthalmology.aspx">microaneurysms, hemorrhages, hard exudates, etc</a>), which are indicative of bleeding and fluid leakage in the eye. 
Interpreting these photographs requires specialized training, and in many regions of the world there aren&#8217;t enough qualified graders to screen everyone who is at risk. <br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-3E-DEWAF_fM/WD2ffT8I2RI/AAAAAAAABbo/rvuDPgSGEMs_7pZVhV-wL3svrOeqF05owCLcB/s1600/image01.png"><img border="0" height="258" src="https://4.bp.blogspot.com/-3E-DEWAF_fM/WD2ffT8I2RI/AAAAAAAABbo/rvuDPgSGEMs_7pZVhV-wL3svrOeqF05owCLcB/s640/image01.png" width="640"></a></td></tr><tr><td>Figure 1. Examples of retinal fundus photographs that are taken to screen for DR. The image on the left is of a healthy retina (A), whereas the image on the right is a retina with referable diabetic retinopathy (B) due to a number of hemorrhages (red spots) present.</td></tr></tbody></table>Working closely with doctors both in India and the US, we created a development dataset of 128,000 images which were each evaluated by 3-7 ophthalmologists from a panel of 54 ophthalmologists. This dataset was used to train a deep neural network to detect referable diabetic retinopathy. We then tested the algorithm&#8217;s performance on two separate clinical validation sets totalling ~12,000 images, with the majority decision of a panel of 7 or 8 U.S. board-certified ophthalmologists serving as the reference standard. The ophthalmologists selected for the validation sets were the ones who showed high consistency from the original group of 54 doctors.<br /><br />Performance of both the algorithm and the ophthalmologists on a 9,963-image validation set is shown in Figure 2.
<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-Whx5jQDUtJg/WD2fxqQMQDI/AAAAAAAABbs/snP00Vot-kYUSCvXs-FyaagnWzMdqg4gQCLcB/s1600/image00.png"><img border="0" height="636" src="https://2.bp.blogspot.com/-Whx5jQDUtJg/WD2fxqQMQDI/AAAAAAAABbs/snP00Vot-kYUSCvXs-FyaagnWzMdqg4gQCLcB/s640/image00.png" width="640"></a></td></tr><tr><td>Figure 2. Performance of the algorithm (black curve) and eight ophthalmologists (colored dots) for the presence of referable diabetic retinopathy (moderate or worse diabetic retinopathy or referable diabetic macular edema) on a validation set consisting of 9963 images. The black diamonds on the graph correspond to the sensitivity and specificity of the algorithm at the high sensitivity and high specificity operating points. </td></tr></tbody></table>The results show that our algorithm&#8217;s performance is on par with that of ophthalmologists. For example, on the validation set described in Figure 2, the algorithm has an <a href="https://en.wikipedia.org/wiki/F1_score">F-score</a> (combined <a href="https://en.wikipedia.org/wiki/Sensitivity_and_specificity">sensitivity and specificity</a> metric, with max=1) of 0.95, which is slightly better than the median F-score of the 8 ophthalmologists we consulted (measured at 0.91).<br /><br />These are exciting results, but there is still a lot of work to do. First, while the conventional quality measures we used to assess our algorithm are encouraging, we are working with retinal specialists to define even more robust reference standards that can be used to quantify performance. Furthermore, interpretation of a 2D fundus photograph, which we demonstrate in this paper, is only one part in a multi-step process that leads to a diagnosis for diabetic eye disease. In some cases, doctors use a 3D imaging technology, Optical Coherence Tomography (OCT), to examine various layers of a retina in detail.
Applying machine learning to this 3D imaging modality is already underway, <a href="https://deepmind.com/applied/deepmind-health/research/">led by our colleagues at DeepMind</a>. In the future, these two complementary methods might be used together to assist doctors in the diagnosis of a wide spectrum of eye diseases.<br /><br />Automated DR screening methods with high accuracy have the strong potential to assist doctors in evaluating more patients and quickly routing those who need help to a specialist. We are working with doctors and researchers to study the entire process of screening in settings around the world, in the hopes that we can integrate our methods into clinical workflow in a manner that is maximally beneficial. Finally, we are working with the FDA and other regulatory agencies to further evaluate these technologies in clinical studies.<br /><br />Given the many recent advances in deep learning, we hope our study will be just one of many compelling examples to come demonstrating the ability of machine learning to help solve important problems in medical imaging in healthcare more broadly. <br /><br />Learn more about the <a href="http://g.co/brain/healthcare">Health Research efforts of the Brain team</a> at Google]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Lily Peng MD PhD, Product Manager and Varun Gulshan PhD, Research Engineer</span><br /><br /><a href="https://en.wikipedia.org/wiki/Diabetic_retinopathy">Diabetic retinopathy</a> (DR) is the fastest growing cause of blindness, with nearly <a href="http://www.idf.org/about-diabetes/facts-figures">415 million diabetic patients</a> at risk worldwide. If caught early, the disease can be treated; if not, it can lead to irreversible blindness. Unfortunately, medical specialists capable of detecting the disease are not available in many parts of the world where diabetes is prevalent. We believe that Machine Learning can help doctors identify patients in need, particularly among underserved populations.<br /><br />A few years ago, several of us began wondering if there was a way Google technologies could  improve the DR screening process, specifically by taking advantage of recent advances in Machine Learning and Computer Vision. In "<a href="http://jamanetwork.com/journals/jama/fullarticle/2588763">Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs</a>", published today in <a href="http://jamanetwork.com/journals/jama">JAMA</a>, we present a deep learning algorithm capable of interpreting signs of DR in retinal photographs, potentially helping doctors screen more patients in settings with limited resources. <br /><br />One of the most common ways to detect diabetic eye disease is to have a specialist examine pictures of the back of the eye (Figure 1) and rate them for disease presence and severity. Severity is determined by the type of lesions present (e.g. <a href="http://www.ucdenver.edu/academics/colleges/medicalschool/centers/BarbaraDavis/Clinical/Pages/Ophthalmology.aspx">microaneurysms, hemorrhages, hard exudates, etc</a>), which are indicative of bleeding and fluid leakage in the eye. 
Interpreting these photographs requires specialized training, and in many regions of the world there aren’t enough qualified graders to screen everyone who is at risk. <br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-3E-DEWAF_fM/WD2ffT8I2RI/AAAAAAAABbo/rvuDPgSGEMs_7pZVhV-wL3svrOeqF05owCLcB/s1600/image01.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="258" src="https://4.bp.blogspot.com/-3E-DEWAF_fM/WD2ffT8I2RI/AAAAAAAABbo/rvuDPgSGEMs_7pZVhV-wL3svrOeqF05owCLcB/s640/image01.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Figure 1. Examples of retinal fundus photographs that are taken to screen for DR. The image on the left is of a healthy retina (A), whereas the image on the right is a retina with referable diabetic retinopathy (B) due to a number of hemorrhages (red spots) present.</td></tr></tbody></table>Working closely with doctors both in India and the US, we created a development dataset of 128,000 images which were each evaluated by 3-7 ophthalmologists from a panel of 54 ophthalmologists. This dataset was used to train a deep neural network to detect referable diabetic retinopathy. We then tested the algorithm’s performance on two separate clinical validation sets totalling ~12,000 images, with the majority decision of a panel of 7 or 8 U.S. board-certified ophthalmologists serving as the reference standard. The ophthalmologists selected for the validation sets were the ones who showed high consistency from the original group of 54 doctors.<br /><br />Performance of both the algorithm and the ophthalmologists on a 9,963-image validation set is shown in Figure 2.
<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-Whx5jQDUtJg/WD2fxqQMQDI/AAAAAAAABbs/snP00Vot-kYUSCvXs-FyaagnWzMdqg4gQCLcB/s1600/image00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="636" src="https://2.bp.blogspot.com/-Whx5jQDUtJg/WD2fxqQMQDI/AAAAAAAABbs/snP00Vot-kYUSCvXs-FyaagnWzMdqg4gQCLcB/s640/image00.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Figure 2. Performance of the algorithm (black curve) and eight ophthalmologists (colored dots) for the presence of referable diabetic retinopathy (moderate or worse diabetic retinopathy or referable diabetic macular edema) on a validation set consisting of 9963 images. The black diamonds on the graph correspond to the sensitivity and specificity of the algorithm at the high sensitivity and high specificity operating points. </td></tr></tbody></table>The results show that our algorithm’s performance is on par with that of ophthalmologists. For example, on the validation set described in Figure 2, the algorithm has an <a href="https://en.wikipedia.org/wiki/F1_score">F-score</a> (combined <a href="https://en.wikipedia.org/wiki/Sensitivity_and_specificity">sensitivity and specificity</a> metric, with max=1) of 0.95, which is slightly better than the median F-score of the 8 ophthalmologists we consulted (measured at 0.91).<br /><br />These are exciting results, but there is still a lot of work to do. First, while the conventional quality measures we used to assess our algorithm are encouraging, we are working with retinal specialists to define even more robust reference standards that can be used to quantify performance.
Furthermore, interpretation of a 2D fundus photograph, which we demonstrate in this paper, is only one part in a multi-step process that leads to a diagnosis for diabetic eye disease. In some cases, doctors use a 3D imaging technology, Optical Coherence Tomography (OCT), to examine various layers of a retina in detail. Applying machine learning to this 3D imaging modality is already underway, <a href="https://deepmind.com/applied/deepmind-health/research/">led by our colleagues at DeepMind</a>. In the future, these two complementary methods might be used together to assist doctors in the diagnosis of a wide spectrum of eye diseases.<br /><br />Automated DR screening methods with high accuracy have the strong potential to assist doctors in evaluating more patients and quickly routing those who need help to a specialist. We are working with doctors and researchers to study the entire process of screening in settings around the world, in the hopes that we can integrate our methods into clinical workflow in a manner that is maximally beneficial. Finally, we are working with the FDA and other regulatory agencies to further evaluate these technologies in clinical studies.<br /><br />Given the many recent advances in deep learning, we hope our study will be just one of many compelling examples to come demonstrating the ability of machine learning to help solve important problems in medical imaging in healthcare more broadly. <br /><br />Learn more about the <a href="http://g.co/brain/healthcare">Health Research efforts of the Brain team</a> at Google]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/deep-learning-for-detection-of-diabetic-eye-disease/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Zero-Shot Translation with Google’s Multilingual Neural Machine Translation System</title>
		<link>https://googledata.org/google-translate/zero-shot-translation-with-googles-multilingual-neural-machine-translation-system/</link>
		<comments>https://googledata.org/google-translate/zero-shot-translation-with-googles-multilingual-neural-machine-translation-system/#comments</comments>
		<pubDate>Tue, 22 Nov 2016 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Google Translate]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=d076584268737342532562b70edc4491</guid>
		<description><![CDATA[<span>Posted by Mike Schuster (Google Brain Team), Melvin Johnson (Google Translate) and Nikhil Thorat (Google Brain Team)</span><br /><br />In the last 10 years, <a href="https://translate.google.com/">Google Translate</a> has grown from supporting just a few languages to 103, translating over 140 billion words every day. To make this possible, we needed to build and maintain many different systems in order to translate between any two languages, incurring significant computational cost. With neural networks transforming many fields, we were convinced we could raise the translation quality further, but doing so would mean rethinking the technology behind Google Translate.<br /><br />In September, <a href="https://research.googleblog.com/2016/09/a-neural-network-for-machine.html">we announced</a> that Google Translate is switching to a new system called <a href="https://arxiv.org/abs/1609.08144">Google Neural Machine Translation (GNMT)</a>, an end-to-end learning framework that learns from millions of examples, and provides significant improvements in translation quality. However, while switching to GNMT improved the quality for the languages we tested it on, scaling up to all 103 supported languages presented a significant challenge.<br /><br />In &#8220;<a href="https://arxiv.org/abs/1611.04558">Google&#8217;s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation</a>&#8221;, we address this challenge by extending our previous GNMT system, allowing for a single system to translate between multiple languages. Our proposed architecture requires no change in the base GNMT system, but instead uses an additional &#8220;token&#8221; at the beginning of the input sentence to specify the required target language to translate to.
In addition to improving translation quality, our method also enables &#8220;Zero-Shot Translation&#8221; &#8212; translation between language pairs never seen explicitly by the system.<br /><div><a href="https://1.bp.blogspot.com/-jwgtcgkgG2o/WDSBrwu9jeI/AAAAAAAABbM/2Eobq-N9_nYeAdeH-sB_NZGbhyoSWgReACLcB/s1600/image01.gif"><img border="0" height="338" src="https://1.bp.blogspot.com/-jwgtcgkgG2o/WDSBrwu9jeI/AAAAAAAABbM/2Eobq-N9_nYeAdeH-sB_NZGbhyoSWgReACLcB/s640/image01.gif" width="640"></a></div>Here&#8217;s how it works. Let&#8217;s say we train a multilingual system with Japanese&#8644;English and Korean&#8644;English examples, shown by the solid blue lines in the animation. Our multilingual system, which is the same size as a single GNMT system, shares its parameters to translate between these four different language pairs. This sharing enables the system to transfer the &#8220;translation knowledge&#8221; from one language pair to the others. This transfer learning and the need to translate between multiple languages force the system to better use its modeling power.<br /><br />This inspired us to ask the following question: Can we translate between a language pair which the system has never seen before? An example of this would be translations between Korean and Japanese where Korean&#8644;Japanese examples were not shown to the system. Impressively, the answer is yes &#8212; it can generate reasonable Korean&#8644;Japanese translations, even though it has never been taught to do so. We call this &#8220;zero-shot&#8221; translation, shown by the yellow dotted lines in the animation. To the best of our knowledge, this is the first time this type of transfer learning has worked in Machine Translation. <br /><br />The success of zero-shot translation raises another important question: Is the system learning a common representation in which sentences with the same meaning are represented in similar ways regardless of language &#8212; i.e. 
an &#8220;interlingua&#8221;? Using a 3-dimensional representation of internal network data, we were able to take a peek into the system as it translates a set of sentences between all possible pairs of the Japanese, Korean, and English languages.<br /><br /><div><a href="https://2.bp.blogspot.com/-AmBczBtfi3Q/WDSB0M3InDI/AAAAAAAABbQ/1U_51u5ynl4FK4L0KOEllfRCq0Oauzy5wCEw/s1600/image00.png"><img border="0" height="342" src="https://2.bp.blogspot.com/-AmBczBtfi3Q/WDSB0M3InDI/AAAAAAAABbQ/1U_51u5ynl4FK4L0KOEllfRCq0Oauzy5wCEw/s640/image00.png" width="640"></a></div>Part (a) of the figure above shows an overall geometry of these translations. The points in this view are colored by meaning: a sentence translated from English to Korean and a sentence with the same meaning translated from Japanese to English share the same color. From this view we can see distinct groupings of points, each with its own color. Part (b) zooms in to one of the groups, and part (c) colors the points by source language. Within a single group, we see a sentence with the same meaning, but from three different languages. This means the network must be encoding something about the semantics of the sentence rather than simply memorizing phrase-to-phrase translations. We interpret this as a sign of the existence of an interlingua in the network. <br /><br />We show many more results and analyses in our paper, and hope that its findings are interesting not only to machine learning and machine translation researchers but also to linguists and others who are interested in how multiple languages can be processed by machines using a single system.<br /><br />Finally, the described Multilingual Google Neural Machine Translation system is running in production today for all <a href="https://translate.google.com/">Google Translate</a> users. 
Multilingual systems are currently used to serve 10 of the recently launched 16 language pairs, resulting in improved quality and a simplified production architecture.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Mike Schuster (Google Brain Team), Melvin Johnson (Google Translate) and Nikhil Thorat (Google Brain Team)</span><br /><br />In the last 10 years, <a href="https://translate.google.com/">Google Translate</a> has grown from supporting just a few languages to 103, translating over 140 billion words every day. To make this possible, we needed to build and maintain many different systems in order to translate between any two languages, incurring significant computational cost. With neural networks transforming many fields, we were convinced we could raise the translation quality further, but doing so would mean rethinking the technology behind Google Translate.<br /><br />In September, <a href="https://research.googleblog.com/2016/09/a-neural-network-for-machine.html">we announced</a> that Google Translate is switching to a new system called <a href="https://arxiv.org/abs/1609.08144">Google Neural Machine Translation (GNMT)</a>, an end-to-end learning framework that learns from millions of examples and provides significant improvements in translation quality. However, while switching to GNMT improved the quality for the languages we tested it on, scaling up to all 103 supported languages presented a significant challenge.<br /><br />In “<a href="https://arxiv.org/abs/1611.04558">Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation</a>”, we address this challenge by extending our previous GNMT system, allowing a single system to translate between multiple languages. Our proposed architecture requires no change to the base GNMT system, but instead uses an additional “token” at the beginning of the input sentence to specify the required target language to translate to. 
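The token trick above can be sketched in a few lines. This is an illustrative stand-in, not the production pipeline; the <code>&lt;2xx&gt;</code> spelling here is shorthand for the artificial target-language token prepended to the source sentence.

```python
# Illustrative sketch of the target-language token described above
# (not production code; the "<2xx>" token format is an assumption
# standing in for the paper's artificial token).

def add_target_token(source_tokens, target_lang):
    """Prepend an artificial token naming the desired target language."""
    return ["<2{}>".format(target_lang)] + list(source_tokens)

# The model input is otherwise unchanged, which is why the base GNMT
# architecture itself needs no modification.
print(add_target_token(["Hello", ",", "how", "are", "you", "?"], "ja"))
# prints ['<2ja>', 'Hello', ',', 'how', 'are', 'you', '?']
```

Because every language pair flows through the same parameters, only this one-token convention distinguishes, say, English-to-Japanese from English-to-Korean requests.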
In addition to improving translation quality, our method also enables “Zero-Shot Translation” — translation between language pairs never seen explicitly by the system.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-jwgtcgkgG2o/WDSBrwu9jeI/AAAAAAAABbM/2Eobq-N9_nYeAdeH-sB_NZGbhyoSWgReACLcB/s1600/image01.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="338" src="https://1.bp.blogspot.com/-jwgtcgkgG2o/WDSBrwu9jeI/AAAAAAAABbM/2Eobq-N9_nYeAdeH-sB_NZGbhyoSWgReACLcB/s640/image01.gif" width="640" /></a></div>Here’s how it works. Let’s say we train a multilingual system with Japanese⇄English and Korean⇄English examples, shown by the solid blue lines in the animation. Our multilingual system, which is the same size as a single GNMT system, shares its parameters to translate between these four different language pairs. This sharing enables the system to transfer the “translation knowledge” from one language pair to the others. This transfer learning and the need to translate between multiple languages force the system to better use its modeling power.<br /><br />This inspired us to ask the following question: Can we translate between a language pair which the system has never seen before? An example of this would be translations between Korean and Japanese where Korean⇄Japanese examples were not shown to the system. Impressively, the answer is yes — it can generate reasonable Korean⇄Japanese translations, even though it has never been taught to do so. We call this “zero-shot” translation, shown by the yellow dotted lines in the animation. To the best of our knowledge, this is the first time this type of transfer learning has worked in Machine Translation. 
<br /><br />The success of zero-shot translation raises another important question: Is the system learning a common representation in which sentences with the same meaning are represented in similar ways regardless of language — i.e. an “interlingua”? Using a 3-dimensional representation of internal network data, we were able to take a peek into the system as it translates a set of sentences between all possible pairs of the Japanese, Korean, and English languages.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-AmBczBtfi3Q/WDSB0M3InDI/AAAAAAAABbQ/1U_51u5ynl4FK4L0KOEllfRCq0Oauzy5wCEw/s1600/image00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="342" src="https://2.bp.blogspot.com/-AmBczBtfi3Q/WDSB0M3InDI/AAAAAAAABbQ/1U_51u5ynl4FK4L0KOEllfRCq0Oauzy5wCEw/s640/image00.png" width="640" /></a></div>Part (a) of the figure above shows an overall geometry of these translations. The points in this view are colored by meaning: a sentence translated from English to Korean and a sentence with the same meaning translated from Japanese to English share the same color. From this view we can see distinct groupings of points, each with its own color. Part (b) zooms in to one of the groups, and part (c) colors the points by source language. Within a single group, we see a sentence with the same meaning, but from three different languages. This means the network must be encoding something about the semantics of the sentence rather than simply memorizing phrase-to-phrase translations. We interpret this as a sign of the existence of an interlingua in the network. 
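The kind of "peek" described above can be approximated with a standard dimensionality reduction: collect per-sentence internal vectors and project them to 3-D, for example with PCA. The helper and toy vectors below are hypothetical stand-ins; the actual visualization was built from real network activations.

```python
import numpy as np

def pca_3d(vectors):
    """Project row vectors onto their top three principal components."""
    X = np.asarray(vectors, dtype=float)
    X = X - X.mean(axis=0)                     # center the data
    # SVD of the centered data gives the principal directions
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:3].T                        # 3-D coordinates per sentence

# Toy stand-ins for internal sentence representations; same-meaning
# sentences across languages would be expected to land close together.
points = pca_3d(np.random.default_rng(0).normal(size=(12, 64)))
```

Plotting such points, colored by meaning and by source language, is how one can inspect whether same-meaning sentences cluster together regardless of language.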
<br /><br />We show many more results and analyses in our paper, and hope that its findings are interesting not only to machine learning and machine translation researchers but also to linguists and others who are interested in how multiple languages can be processed by machines using a single system.<br /><br />Finally, the described Multilingual Google Neural Machine Translation system is running in production today for all <a href="https://translate.google.com/">Google Translate</a> users. Multilingual systems are currently used to serve 10 of the recently launched 16 language pairs, resulting in improved quality and a simplified production architecture.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-translate/zero-shot-translation-with-googles-multilingual-neural-machine-translation-system/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Enhance! RAISR Sharp Images with Machine Learning</title>
		<link>https://googledata.org/google-research/enhance-raisr-sharp-images-with-machine-learning/</link>
		<comments>https://googledata.org/google-research/enhance-raisr-sharp-images-with-machine-learning/#comments</comments>
		<pubDate>Mon, 14 Nov 2016 19:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=a40bfe0d663043adb0362ccc8ce2d662</guid>
		<description><![CDATA[<span>Posted by Peyman Milanfar, Research Scientist</span><br /><br />Every day, the web is used to share and store millions of pictures, enabling one to explore the world, research new topics of interest, or even share a vacation with friends and family. However, many of these images are either limited by the resolution of the device used to take the picture, or purposely degraded in order to accommodate the constraints of cell phones, tablets, or the networks to which they are connected. With the ubiquity of high-resolution displays for home and mobile devices, the demand for high-quality versions of low-resolution images, quickly viewable and shareable from a wide variety of devices, has never been greater. <br /><br />With &#8220;<a href="http://arxiv.org/abs/1606.01299">RAISR: Rapid and Accurate Image Super-Resolution</a>&#8221;, we introduce a technique that incorporates machine learning in order to produce high-quality versions of low-resolution images. RAISR produces results that are comparable to or better than the currently available super-resolution methods, and does so roughly 10 to 100 times faster, allowing it to be run on a typical mobile device in real-time. Furthermore, our technique is able to avoid recreating the aliasing artifacts that may exist in the lower resolution image. <br /><br /><a href="https://en.wikipedia.org/wiki/Upsampling">Upsampling</a>, the process of producing an image of larger size with significantly more pixels and higher image quality from a low quality image, has been around for quite a while. Well-known approaches to upsampling are linear methods, which fill in new pixel values using simple, and fixed, combinations of the nearby existing pixel values. These methods are fast because they are fixed linear filters (a constant convolution kernel applied uniformly across the image). 
But what makes these upsampling methods fast also makes them ineffective in bringing out vivid details in the higher resolution results. As you can see in the example below, the upsampled image looks blurry &#8211; one would hesitate to call it enhanced. <br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-CTe4aLwW8IM/WCoHM6AlIrI/AAAAAAAABaA/Plw5vdAva9UMrfXmtronpMVZeAx7u1snQCLcB/s1600/1.png"><img border="0" height="304" src="https://3.bp.blogspot.com/-CTe4aLwW8IM/WCoHM6AlIrI/AAAAAAAABaA/Plw5vdAva9UMrfXmtronpMVZeAx7u1snQCLcB/s640/1.png" width="640"></a></td></tr><tr><td>Left: Low-res original, Right: simple (bicubic) upsampled version (2x). Image Credit: <a href="http://www.telegraph.co.uk/news/2016/05/23/worldturtleday-how-much-do--you-know-about-turtles/">Masa Ushioda/Seapics/Solent News</a></td></tr></tbody></table><br />With RAISR, we instead use machine learning and train on pairs of images, one low quality, one high, to find filters that, when applied selectively to each pixel of the low-res image, will recreate details that are of comparable quality to the original. RAISR can be trained in two ways. The first is the "direct" method, where filters are learned directly from low and high-resolution image pairs. The other method involves first applying a computationally cheap upsampler to the low resolution image (as in the figure above) and then learning the filters from the upsampled and high resolution image pairs. While the direct method is computationally faster, the second method allows for non-integer scale factors and better leveraging of hardware-based upsampling.<br /><br />For either method, RAISR filters are trained according to <a href="https://en.wikipedia.org/wiki/Edge_detection">edge features</a> found in small patches of images (brightness/color gradients, flat/textured regions, etc.), characterized by <i>direction</i> (the angle of an edge), <i>strength</i> (sharp edges have a greater strength) and <i>coherence</i> (a measure of how directional the edge is). Below is a set of RAISR filters, learned from a database of 10,000 high and low resolution image pairs (where the low-res images were first upsampled). The training process takes about an hour. <br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-3CJ0o8NSvKg/WCoHaSG9EkI/AAAAAAAABaE/vasPxAVX3xE1O2q-Qqfn38sQ_YB7ZrZewCLcB/s1600/image00.png"><img border="0" height="238" src="https://1.bp.blogspot.com/-3CJ0o8NSvKg/WCoHaSG9EkI/AAAAAAAABaE/vasPxAVX3xE1O2q-Qqfn38sQ_YB7ZrZewCLcB/s640/image00.png" width="640"></a></td></tr><tr><td>Collection of learned 11x11 filters for 3x super-resolution. Filters can be learned for a range of super-resolution factors, including fractional ones. Note that as the angle of the edge changes, we see the angle of the filter rotate as well. Similarly, as the strength increases, the sharpness of the filters increases, and the anisotropy of the filter increases with rising coherence.</td></tr></tbody></table><br />From left to right, we see that the learned filters correspond selectively to the direction of the underlying edge that is being reconstructed. For example, the filter in the middle of the bottom row is most appropriate for a strong horizontal edge (gradient angle of 90 degrees) with a high degree of coherence (a straight, rather than a curved, edge). If this same horizontal edge is low-contrast, then a different filter is selected, such as the one in the top row. <br /><br />In practice, at run-time RAISR selects and applies the most relevant filter from the list of learned filters to each pixel neighborhood in the low-resolution image. 
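This per-pixel selection can be sketched with a standard structure-tensor computation, which yields exactly the three features named above. The recipe below is a common estimator assumed for illustration; the exact formulas used by RAISR may differ.

```python
import numpy as np

def edge_features(patch):
    """Direction, strength, and coherence of a grayscale patch, from the
    eigen-analysis of its summed 2x2 gradient structure tensor."""
    gy, gx = np.gradient(patch.astype(float))
    gxx, gyy, gxy = (gx * gx).sum(), (gy * gy).sum(), (gx * gy).sum()
    lam, vec = np.linalg.eigh(np.array([[gxx, gxy], [gxy, gyy]]))
    # lam is ascending, so lam[1]/vec[:, 1] describe the dominant direction
    angle = np.degrees(np.arctan2(vec[1, 1], vec[0, 1])) % 180.0
    s1, s2 = np.sqrt(max(lam[1], 0.0)), np.sqrt(max(lam[0], 0.0))
    coherence = (s1 - s2) / (s1 + s2) if (s1 + s2) > 0 else 0.0
    return angle, s1, coherence  # direction (deg), strength, coherence
```

Quantizing these three numbers gives a bucket index, and each bucket holds one learned filter, so filter lookup at run-time is a cheap hash rather than a neural-network evaluation.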
When these filters are applied to the lower quality image, they recreate details that are of comparable quality to the original high resolution, and offer a significant improvement over linear, bicubic, or <a href="https://en.wikipedia.org/wiki/Lanczos_resampling">Lanczos interpolation</a> methods.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-kxbZn2aR73A/WCoHqvuYkcI/AAAAAAAABaI/VIimdpV8-NMIxgOjOQKZjs16R4P-XQT4QCLcB/s1600/3.png"><img border="0" height="256" src="https://2.bp.blogspot.com/-kxbZn2aR73A/WCoHqvuYkcI/AAAAAAAABaI/VIimdpV8-NMIxgOjOQKZjs16R4P-XQT4QCLcB/s640/3.png" width="640"></a></td></tr><tr><td><b>Top:</b> RAISR algorithm at run-time, applied to a cheap upscaler&#8217;s output. <b>Bottom:</b> Low-res original (left), bicubic upsampler 2x (middle), RAISR output (right)</td></tr></tbody></table><br />Some examples of RAISR in action can be seen below:<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-mOlb1GnMAK0/WCoH7V-XB8I/AAAAAAAABaM/1luwC-0T4fgM9V_OZtNFUwKSEsR5qcJzwCLcB/s1600/4.png"><img border="0" height="358" src="https://4.bp.blogspot.com/-mOlb1GnMAK0/WCoH7V-XB8I/AAAAAAAABaM/1luwC-0T4fgM9V_OZtNFUwKSEsR5qcJzwCLcB/s640/4.png" width="640"></a></td></tr><tr><td>Top: Original, Bottom: RAISR super-resolved 2x. <a href="http://andrzejdragan.com/wp-content/uploads/2015/05/4.jpg">Original image</a> from <a href="http://www.andrzejdragan.com/">Andrzej Dragan</a></td></tr></tbody></table><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-g2o2_fwNxHo/WCoIIZne1nI/AAAAAAAABaU/Nc2O5Y73iS4D83EVGG8cRR0sI48dgRBYwCLcB/s1600/5.png"><img border="0" height="626" src="https://1.bp.blogspot.com/-g2o2_fwNxHo/WCoIIZne1nI/AAAAAAAABaU/Nc2O5Y73iS4D83EVGG8cRR0sI48dgRBYwCLcB/s640/5.png" width="640"></a></td></tr><tr><td>Left: Original, Right: RAISR super-resolved 3x. 
Image courtesy of <a href="http://research.google.com/pubs/MarcLevoy.html">Marc Levoy</a></td></tr></tbody></table><br />One of the more complex aspects of super-resolution is getting rid of <a href="https://en.wikipedia.org/wiki/Aliasing">aliasing</a> artifacts such as <a href="https://en.wikipedia.org/wiki/Moir%C3%A9_pattern">Moir&#233; patterns</a> and <a href="https://en.wikipedia.org/wiki/Jaggies">jaggies</a> that arise when high frequency content is rendered in lower resolution (as is the case when images are purposefully degraded). Depending on the shape of the underlying features, these artifacts can be varied and hard to undo. <br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-XDoxWkmOVWg/WCoIS7jt2OI/AAAAAAAABaY/F0OYem8LGU8W_GlKcZxPwWIxKemkwvOeACLcB/s1600/image01.jpg"><img border="0" height="640" src="https://4.bp.blogspot.com/-XDoxWkmOVWg/WCoIS7jt2OI/AAAAAAAABaY/F0OYem8LGU8W_GlKcZxPwWIxKemkwvOeACLcB/s640/image01.jpg" width="524"></a></td></tr><tr><td>Example of aliasing artifacts seen on the lower right (<a href="https://en.wikipedia.org/wiki/Aliasing#/media/File:Moire_pattern_of_bricks_small.jpg">Image source</a>)</td></tr></tbody></table><br />Linear methods simply cannot recover the underlying structure, but RAISR can. Below is an example where the aliased spatial frequencies are apparent under the numbers 3 and 5 in the low-resolution original on the left, while the RAISR image on the right recovers the original structure. Another important advantage of the filter learning approach used by RAISR is that we can specialize it to remove noise, or compression artifacts unique to individual compression algorithms (such as JPEG), as part of the training process. By providing it with examples of such artifacts, RAISR can learn to undo other effects besides resolution enhancement, having them &#8220;baked&#8221; inside the resulting filters. 
<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-FS2c3xXoSvo/WCoIjjawR0I/AAAAAAAABac/zZDqKt0ezKYPaLA7bMFWiHdfVWcH7IusgCLcB/s1600/6.png"><img border="0" height="318" src="https://4.bp.blogspot.com/-FS2c3xXoSvo/WCoIjjawR0I/AAAAAAAABac/zZDqKt0ezKYPaLA7bMFWiHdfVWcH7IusgCLcB/s640/6.png" width="640"></a></td></tr><tr><td>Left: Low res original, with strong aliasing. Right: RAISR output, removing aliasing.</td></tr></tbody></table><br />Super-resolution technology, using one or many frames, has come a long way. Today, the use of machine learning, in tandem with decades of advances in imaging technology, has enabled progress in image processing that yields many potential benefits. For example, in addition to improving digital &#8220;pinch to zoom&#8221; on your phone, one could capture, save, or transmit images at lower resolution and super-resolve on demand without any visible degradation in quality, all while using less of your mobile data and storage plan. <br /><br />To learn more about the details of our research and a comparison to other current architectures, check out <a href="https://arxiv.org/abs/1606.01299">our paper</a>, which will appear soon in the <a href="http://signalprocessingsociety.org/publications-resources/ieee-transactions-computational-imaging">IEEE Transactions on Computational Imaging</a>.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Peyman Milanfar, Research Scientist</span><br /><br />Every day, the web is used to share and store millions of pictures, enabling one to explore the world, research new topics of interest, or even share a vacation with friends and family. However, many of these images are either limited by the resolution of the device used to take the picture, or purposely degraded in order to accommodate the constraints of cell phones, tablets, or the networks to which they are connected. With the ubiquity of high-resolution displays for home and mobile devices, the demand for high-quality versions of low-resolution images, quickly viewable and shareable from a wide variety of devices, has never been greater. <br /><br />With “<a href="http://arxiv.org/abs/1606.01299">RAISR: Rapid and Accurate Image Super-Resolution</a>”, we introduce a technique that incorporates machine learning in order to produce high-quality versions of low-resolution images. RAISR produces results that are comparable to or better than the currently available super-resolution methods, and does so roughly 10 to 100 times faster, allowing it to be run on a typical mobile device in real-time. Furthermore, our technique is able to avoid recreating the aliasing artifacts that may exist in the lower resolution image. <br /><br /><a href="https://en.wikipedia.org/wiki/Upsampling">Upsampling</a>, the process of producing an image of larger size with significantly more pixels and higher image quality from a low quality image, has been around for quite a while. Well-known approaches to upsampling are linear methods, which fill in new pixel values using simple, and fixed, combinations of the nearby existing pixel values. These methods are fast because they are fixed linear filters (a constant convolution kernel applied uniformly across the image). 
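A minimal sketch of such a fixed linear upsampler, using the textbook zero-stuff-and-filter scheme (assumed here for illustration; this is not code from RAISR):

```python
import numpy as np

def upsample2x_linear(img):
    """2x linear upsampling with one fixed kernel: insert zeros between
    pixels, then convolve rows and columns with [0.5, 1.0, 0.5]."""
    h, w = img.shape
    up = np.zeros((2 * h, 2 * w), dtype=float)
    up[::2, ::2] = img                         # zero-stuffing
    k = np.array([0.5, 1.0, 0.5])              # fixed bilinear kernel
    # the same kernel is applied uniformly, first along rows, then columns
    up = np.apply_along_axis(np.convolve, 1, up, k, mode="same")
    up = np.apply_along_axis(np.convolve, 0, up, k, mode="same")
    return up
```

Because the one kernel is applied identically at every pixel, this is fast but cannot invent detail, which is exactly the blur discussed next.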
But what makes these upsampling methods fast also makes them ineffective in bringing out vivid details in the higher resolution results. As you can see in the example below, the upsampled image looks blurry – one would hesitate to call it enhanced. <br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-CTe4aLwW8IM/WCoHM6AlIrI/AAAAAAAABaA/Plw5vdAva9UMrfXmtronpMVZeAx7u1snQCLcB/s1600/1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="304" src="https://3.bp.blogspot.com/-CTe4aLwW8IM/WCoHM6AlIrI/AAAAAAAABaA/Plw5vdAva9UMrfXmtronpMVZeAx7u1snQCLcB/s640/1.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Left: Low-res original, Right: simple (bicubic) upsampled version (2x). Image Credit: <a href="http://www.telegraph.co.uk/news/2016/05/23/worldturtleday-how-much-do--you-know-about-turtles/">Masa Ushioda/Seapics/Solent News</a></td></tr></tbody></table><br />With RAISR, we instead use machine learning and train on pairs of images, one low quality, one high, to find filters that, when applied selectively to each pixel of the low-res image, will recreate details that are of comparable quality to the original. RAISR can be trained in two ways. The first is the "direct" method, where filters are learned directly from low and high-resolution image pairs. The other method involves first applying a computationally cheap upsampler to the low resolution image (as in the figure above) and then learning the filters from the upsampled and high resolution image pairs. 
While the direct method is computationally faster, the second method allows for non-integer scale factors and better leveraging of hardware-based upsampling.<br /><br />For either method, RAISR filters are trained according to <a href="https://en.wikipedia.org/wiki/Edge_detection">edge features</a> found in small patches of images (brightness/color gradients, flat/textured regions, etc.), characterized by <i>direction</i> (the angle of an edge), <i>strength</i> (sharp edges have a greater strength) and <i>coherence</i> (a measure of how directional the edge is). Below is a set of RAISR filters, learned from a database of 10,000 high and low resolution image pairs (where the low-res images were first upsampled). The training process takes about an hour. <br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-3CJ0o8NSvKg/WCoHaSG9EkI/AAAAAAAABaE/vasPxAVX3xE1O2q-Qqfn38sQ_YB7ZrZewCLcB/s1600/image00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="238" src="https://1.bp.blogspot.com/-3CJ0o8NSvKg/WCoHaSG9EkI/AAAAAAAABaE/vasPxAVX3xE1O2q-Qqfn38sQ_YB7ZrZewCLcB/s640/image00.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Collection of learned 11x11 filters for 3x super-resolution. Filters can be learned for a range of super-resolution factors, including fractional ones. Note that as the angle of the edge changes, we see the angle of the filter rotate as well. Similarly, as the strength increases, the sharpness of the filters increases, and the anisotropy of the filter increases with rising coherence.</td></tr></tbody></table><br />From left to right, we see that the learned filters correspond selectively to the direction of the underlying edge that is being reconstructed. 
For example, the filter in the middle of the bottom row is most appropriate for a strong horizontal edge (gradient angle of 90 degrees) with a high degree of coherence (a straight, rather than a curved, edge). If this same horizontal edge is low-contrast, then a different filter is selected, such as the one in the top row. <br /><br />In practice, at run-time RAISR selects and applies the most relevant filter from the list of learned filters to each pixel neighborhood in the low-resolution image. When these filters are applied to the lower quality image, they recreate details that are of comparable quality to the original high resolution, and offer a significant improvement over linear, bicubic, or <a href="https://en.wikipedia.org/wiki/Lanczos_resampling">Lanczos interpolation</a> methods.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-kxbZn2aR73A/WCoHqvuYkcI/AAAAAAAABaI/VIimdpV8-NMIxgOjOQKZjs16R4P-XQT4QCLcB/s1600/3.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="256" src="https://2.bp.blogspot.com/-kxbZn2aR73A/WCoHqvuYkcI/AAAAAAAABaI/VIimdpV8-NMIxgOjOQKZjs16R4P-XQT4QCLcB/s640/3.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><b>Top:</b> RAISR algorithm at run-time, applied to a cheap upscaler’s output. 
<b>Bottom:</b> Low-res original (left), bicubic upsampler 2x (middle),  RAISR output (right)</td></tr></tbody></table><br />Some examples of RAISR in action can be seen below:<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-mOlb1GnMAK0/WCoH7V-XB8I/AAAAAAAABaM/1luwC-0T4fgM9V_OZtNFUwKSEsR5qcJzwCLcB/s1600/4.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="358" src="https://4.bp.blogspot.com/-mOlb1GnMAK0/WCoH7V-XB8I/AAAAAAAABaM/1luwC-0T4fgM9V_OZtNFUwKSEsR5qcJzwCLcB/s640/4.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Top: Original, Bottom: RAISR super-resolved 2x. <a href="http://andrzejdragan.com/wp-content/uploads/2015/05/4.jpg">Original image</a> from <a href="http://www.andrzejdragan.com/">Andrzej Dragan</a></td></tr></tbody></table><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-g2o2_fwNxHo/WCoIIZne1nI/AAAAAAAABaU/Nc2O5Y73iS4D83EVGG8cRR0sI48dgRBYwCLcB/s1600/5.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="626" src="https://1.bp.blogspot.com/-g2o2_fwNxHo/WCoIIZne1nI/AAAAAAAABaU/Nc2O5Y73iS4D83EVGG8cRR0sI48dgRBYwCLcB/s640/5.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Left: Original, Right: RAISR super-resolved 3x. 
Image courtesy of <a href="http://research.google.com/pubs/MarcLevoy.html">Marc Levoy</a></td></tr></tbody></table><br />One of the more complex aspects of super-resolution is getting rid of <a href="https://en.wikipedia.org/wiki/Aliasing">aliasing</a> artifacts such as <a href="https://en.wikipedia.org/wiki/Moir%C3%A9_pattern">Moiré patterns</a> and <a href="https://en.wikipedia.org/wiki/Jaggies">jaggies</a> that arise when high frequency content is rendered in lower resolution (as is the case when images are purposefully degraded). Depending on the shape of the underlying features, these artifacts can be varied and hard to undo. <br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-XDoxWkmOVWg/WCoIS7jt2OI/AAAAAAAABaY/F0OYem8LGU8W_GlKcZxPwWIxKemkwvOeACLcB/s1600/image01.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="640" src="https://4.bp.blogspot.com/-XDoxWkmOVWg/WCoIS7jt2OI/AAAAAAAABaY/F0OYem8LGU8W_GlKcZxPwWIxKemkwvOeACLcB/s640/image01.jpg" width="524" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Example of aliasing artifacts seen on the lower right (<a href="https://en.wikipedia.org/wiki/Aliasing#/media/File:Moire_pattern_of_bricks_small.jpg">Image source</a>)</td></tr></tbody></table><br />Linear methods simply cannot recover the underlying structure, but RAISR can. Below is an example where the aliased spatial frequencies are apparent under the numbers 3 and 5 in the low-resolution original on the left, while the RAISR image on the right recovers the original structure. Another important advantage of the filter learning approach used by RAISR is that we can specialize it to remove noise, or compression artifacts unique to individual compression algorithms (such as JPEG), as part of the training process. 
By providing it with examples of such artifacts, RAISR can learn to undo other effects besides resolution enhancement, having them “baked” into the resulting filters. <br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-FS2c3xXoSvo/WCoIjjawR0I/AAAAAAAABac/zZDqKt0ezKYPaLA7bMFWiHdfVWcH7IusgCLcB/s1600/6.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="318" src="https://4.bp.blogspot.com/-FS2c3xXoSvo/WCoIjjawR0I/AAAAAAAABac/zZDqKt0ezKYPaLA7bMFWiHdfVWcH7IusgCLcB/s640/6.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Left: Low-res original, with strong aliasing. Right: RAISR output, removing aliasing.</td></tr></tbody></table><br />Super-resolution technology, using one or many frames, has come a long way. Today, the use of machine learning, in tandem with decades of advances in imaging technology, has enabled progress in image processing that yields many potential benefits. For example, in addition to improving digital “pinch to zoom” on your phone, one could capture, save, or transmit images at lower resolution and super-resolve on demand without any visible degradation in quality, all while using less mobile data and storage. <br /><br />To learn more about the details of our research and a comparison to other current architectures, check out <a href="https://arxiv.org/abs/1606.01299">our paper</a>, which will appear soon in the <a href="http://signalprocessingsociety.org/publications-resources/ieee-transactions-computational-imaging">IEEE Transactions on Computational Imaging</a>. <br /><br /><br />]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/enhance-raisr-sharp-images-with-machine-learning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Open Source Visualization of GPS Displacements for Earthquake Cycle Physics</title>
		<link>https://googledata.org/google-research/open-source-visualization-of-gps-displacements-for-earthquake-cycle-physics-2/</link>
		<comments>https://googledata.org/google-research/open-source-visualization-of-gps-displacements-for-earthquake-cycle-physics-2/#comments</comments>
		<pubDate>Thu, 10 Nov 2016 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=44f095089cd122a0e2924bee37979949</guid>
		<description><![CDATA[<span>Posted by Jimbo Wilson, Software Engineer, Google Big Picture Team and Brendan Meade, Professor, Harvard Department of Earth and Planetary Sciences</span><br /><br />The <a href="https://en.wikipedia.org/wiki/Plate_tectonics">Earth&#8217;s surface is moving</a>, ever so slightly, all the time. This slow, small, but persistent movement of the Earth's crust is responsible for the formation of mountain ranges, sudden earthquakes, and even the positions of the continents. Scientists around the world measure these almost imperceptible movements using arrays of <a href="https://en.wikipedia.org/wiki/Satellite_navigation">Global Navigation Satellite System</a> (GNSS) receivers to better understand all phases of an earthquake cycle&#8212;both how the surface responds after an earthquake, and the storage of <a href="https://en.wikipedia.org/wiki/Elastic-rebound_theory">strain energy</a> between earthquakes.<br /><br />To help researchers explore this data and better understand the earthquake cycle, we are releasing a new, interactive data visualization which draws geodetic velocity lines on top of a relief map by amplifying position estimates relative to their true positions. Unlike existing approaches, which focus on small time slices or individual stations, our visualization can show all the data for a whole array of stations at once. 
Open sourced under an <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache 2 license</a>, and <a href="https://github.com/google/geovelo">available on GitHub</a>, this visualization technique is a collaboration between Harvard&#8217;s <a href="http://eps.harvard.edu/">Department of Earth and Planetary Sciences</a> and Google's <a href="http://research.google.com/pubs/MachinePerception.html">Machine Perception</a> and <a href="https://research.google.com/bigpicture/">Big Picture</a> teams.<br /><br />Our approach helps scientists quickly assess deformations across all phases of the earthquake cycle&#8212;both during earthquakes (coseismic) and the time between (interseismic). For example, we can see azimuth (direction) reversals of stations as they relate to topographic structures and active faults. Digging into these movements will help scientists vet their models and their data, both of which are crucial for developing accurate computer representations that may help predict future earthquakes.<br /><br />Classical approaches to visualizing these data have fallen into two general categories: 1) a map view of velocity/displacement vectors over a fixed time interval and 2) time versus position plots of each GNSS component (longitude, latitude and altitude).<br /><br /><table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td><img border="0" src="https://1.bp.blogspot.com/-OHj0BM-qZa0/WCOXSgq6xkI/AAAAAAAABY8/8iSLahPg3VcGpO-w3bjeG4bW_J9rgPmiACEw/s320/image02.png" width="100%"></td> <td><img border="0" src="https://3.bp.blogspot.com/-o60yf_0kA_0/WCOXWgg5xJI/AAAAAAAABY8/jXSbIXw37ucqWicIbc9uMnf7XAHGjCCTwCEw/s320/image01.png" width="100%"></td> </tr></tbody></table><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td>Examples of classical approaches. On the left is a map view showing average velocity vectors over the period from 1997 to 2001[1]. 
On the right you can see a time versus eastward (longitudinal) position plot for a single station.</td></tr></tbody></table><br />Each of these approaches has proved to be an informative way to understand the spatial distribution of crustal movements and the time evolution of solid earth deformation. However, because geodetic shifts happen in almost imperceptible distances (mm) and over long timescales, both approaches can only show a small subset of the data at any time&#8212;a condensed average velocity per station, or a detailed view of a single station, respectively. Our visualization enables a scientist to see all the data at once, then interactively drill down to a specific subset of interest.<br /><br />Our visualization approach is straightforward; by magnifying the daily longitude and latitude position changes, we show tracks of the evolution of the position of each station. These magnified position tracks are shown as trails on top of a shaded relief topography to provide a sense of position evolution in geographic context.<br /><br />To see how it works in practice, let&#8217;s step through an example. 
Consider this tiny set of longitude/latitude pairs for a single GNSS station, with the differing digits shown in bold:<br /><br /><table border="1" cellpadding="1" cellspacing="0"><tbody><tr><td><div><b>Day Index</b></div></td> <td><div><b>Longitude</b></div></td> <td><div><b>Latitude</b></div></td> </tr><tr><td><div><b>0</b></div></td> <td><div>139.069904<b>07</b></div></td> <td><div>34.949757<b>897</b></div></td> </tr><tr><td><div><b>1</b></div></td> <td><div>139.069904<b>00</b></div></td> <td><div>34.949757<b>882</b></div></td> </tr><tr><td><div><b>2</b></div></td> <td><div>139.069904<b>13</b></div></td> <td><div>34.949757<b>941</b></div></td> </tr><tr><td><div><b>3</b></div></td> <td><div>139.069904<b>09</b></div></td> <td><div>34.949757<b>921</b></div></td> </tr><tr><td><div><b>4</b></div></td> <td><div>139.069904<b>13</b></div></td> <td><div>34.949757<b>904</b></div></td> </tr></tbody></table><br />If we were to draw line segments between these points directly on a map, they&#8217;d be much too small to see at any reasonable scale. So we take these minute differences and multiply them by a user-controlled scaling factor. By default this factor is 10<sup>5.5</sup> (about 316,000x).<br /><div><a href="https://3.bp.blogspot.com/-EXyid2JWpCk/WCOadpaIyfI/AAAAAAAABZM/ej47D-05YdUmfadsLfBAbmNhcArkzGqtACLcB/s1600/image05.png"><img border="0" height="454" src="https://3.bp.blogspot.com/-EXyid2JWpCk/WCOadpaIyfI/AAAAAAAABZM/ej47D-05YdUmfadsLfBAbmNhcArkzGqtACLcB/s640/image05.png" width="640"></a></div><br />To help the user identify which end is the start of the line, we give the start and end points different colors and interpolate between them. Blue and red are the default colors, but they&#8217;re user-configurable. 
Although day-to-day movement of stations may seem erratic, by using this method, one can make out a general trend in the relative motion of a station.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-StX-6aPSUQg/WCOaj1VK1RI/AAAAAAAABZQ/qCYNzJQZEXU6JJZ57LqiEiYYNvwJw3klQCLcB/s1600/image00.png"><img border="0" height="448" src="https://3.bp.blogspot.com/-StX-6aPSUQg/WCOaj1VK1RI/AAAAAAAABZQ/qCYNzJQZEXU6JJZ57LqiEiYYNvwJw3klQCLcB/s640/image00.png" width="640"></a></td></tr><tr><td>Close-up of a single station&#8217;s movement during the three year period from 2003 to 2006.</td></tr></tbody></table><br />However, static renderings of this sort suffer from the same problem that velocity vector images do; in regions with a high density of GNSS stations, tracks overlap significantly with one another, obscuring details. To solve this problem, our visualization lets the user interactively control the time range of interest, the amount of amplification and other settings. In addition, by animating the lines from start to finish, the user gets a real sense of motion that&#8217;s difficult to achieve in a static image.<br /><br />We&#8217;ve applied our new visualization to the ~20 years of data from the <a href="https://www.geospatialworld.net/article/geonet-nationwide-gps-array-of-japan/">GEONET array in Japan</a>. 
Through it, we can see small but coherent changes in direction before and after the great 2011 Tohoku earthquake.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-3kjrFwhpWLk/WCOav1jxyDI/AAAAAAAABZU/8gFrgCl6e7M1nBwwzppAFr6YbtXPy1XkwCLcB/s1600/image03.gif"><img border="0" height="348" src="https://4.bp.blogspot.com/-3kjrFwhpWLk/WCOav1jxyDI/AAAAAAAABZU/8gFrgCl6e7M1nBwwzppAFr6YbtXPy1XkwCLcB/s640/image03.gif" width="640"></a></td></tr><tr><td>GPS data sets (in .json format) for both the GEONET data in Japan and the Plate Boundary Observatory (PBO) data in the western US are available at <a href="http://earthquake.rc.fas.harvard.edu/">earthquake.rc.fas.harvard.edu</a>.</td></tr></tbody></table><br />This short animation shows many of the visualization&#8217;s interactive features. In order:<br /><ol><li>Modifying the multiplier adjusts how significantly the movements are magnified.</li><li>We can adjust the time slider nubs to select a particular time range of interest.</li><li>Using the map controls provided by the <a href="https://developers.google.com/maps/documentation/javascript/">Google Maps JavaScript API</a>, we can zoom into a tiny region of the map.</li><li>By enabling map markers, we can see information about individual GNSS stations.</li></ol>By focusing on a station of interest, we can even see curvature changes in the time periods before and after the event.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-Hv4xcTXwBf0/WCOa_FyI4YI/AAAAAAAABZY/eLPJoBAL87IIr03GVhU3KqKYBfl4bEQUACLcB/s1600/image04.png"><img border="0" height="346" src="https://4.bp.blogspot.com/-Hv4xcTXwBf0/WCOa_FyI4YI/AAAAAAAABZY/eLPJoBAL87IIr03GVhU3KqKYBfl4bEQUACLcB/s640/image04.png" width="640"></a></td></tr><tr><td>Station designated 960601 of Japan&#8217;s GEONET array is located on the island of Mikura-jima. 
Here we see the period from 2006 to 2012, with movement magnified 10<sup>5.1</sup> times (126,000x).</td></tr></tbody></table><br />To achieve fast rendering of the line segments, we created a custom overlay using <a href="https://threejs.org/">THREE.js</a> to render the lines in WebGL. Data for the GNSS stations is passed to the GPU in a data texture, which allows our vertex shader to position each point on-screen dynamically based on user settings and animation.<br /><br />We&#8217;re excited to continue this productive collaboration between Harvard and Google as we explore opportunities for groundbreaking, new earthquake visualizations. If you&#8217;d like to try out the visualization yourself, follow the instructions at <a href="http://earthquake.rc.fas.harvard.edu/">earthquake.rc.fas.harvard.edu</a>. It will walk you through the setup steps, including how to download the available data sets. If you&#8217;d like to report issues, great! Please submit them through the GitHub project page.<br /><br /><b>Acknowledgments</b><br /><br />We wish to thank Bill Freeman, a researcher on <a href="http://research.google.com/pubs/MachinePerception.html">Machine Perception</a>, who hatched the idea and developed the initial prototypes, and Fernanda Vi&#233;gas and Martin Wattenberg of the <a href="https://research.google.com/bigpicture/">Big Picture Team</a> for their visualization design guidance.<br /><br /><b>References</b><br /><br />[1] Loveless, J. P., and Meade, B. J. (2010). <a href="http://onlinelibrary.wiley.com/doi/10.1029/2008JB006248/abstract">Geodetic imaging of plate motions, slip rates, and partitioning of deformation in Japan</a>, <i>Journal of Geophysical Research.</i><br /><br /><br />]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Jimbo Wilson, Software Engineer, Google Big Picture Team and Brendan Meade, Professor, Harvard Department of Earth and Planetary Sciences</span><br /><br />The <a href="https://en.wikipedia.org/wiki/Plate_tectonics">Earth’s surface is moving</a>, ever so slightly, all the time. This slow, small, but persistent movement of the Earth's crust is responsible for the formation of mountain ranges, sudden earthquakes, and even the positions of the continents. Scientists around the world measure these almost imperceptible movements using arrays of <a href="https://en.wikipedia.org/wiki/Satellite_navigation">Global Navigation Satellite System</a> (GNSS) receivers to better understand all phases of an earthquake cycle—both how the surface responds after an earthquake, and the storage of <a href="https://en.wikipedia.org/wiki/Elastic-rebound_theory">strain energy</a> between earthquakes.<br /><br />To help researchers explore this data and better understand the earthquake cycle, we are releasing a new, interactive data visualization which draws geodetic velocity lines on top of a relief map by amplifying position estimates relative to their true positions. Unlike existing approaches, which focus on small time slices or individual stations, our visualization can show all the data for a whole array of stations at once. 
Open sourced under an <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache 2 license</a>, and <a href="https://github.com/google/geovelo">available on GitHub</a>, this visualization technique is a collaboration between Harvard’s <a href="http://eps.harvard.edu/">Department of Earth and Planetary Sciences</a> and Google's <a href="http://research.google.com/pubs/MachinePerception.html">Machine Perception</a> and <a href="https://research.google.com/bigpicture/">Big Picture</a> teams.<br /><br />Our approach helps scientists quickly assess deformations across all phases of the earthquake cycle—both during earthquakes (coseismic) and the time between (interseismic). For example, we can see azimuth (direction) reversals of stations as they relate to topographic structures and active faults. Digging into these movements will help scientists vet their models and their data, both of which are crucial for developing accurate computer representations that may help predict future earthquakes.<br /><br />Classical approaches to visualizing these data have fallen into two general categories: 1) a map view of velocity/displacement vectors over a fixed time interval and 2) time versus position plots of each GNSS component (longitude, latitude and altitude).<br /><br /><table border="0" cellpadding="0" cellspacing="0" style="width: 100%;"><tbody><tr> <td><img border="0" src="https://1.bp.blogspot.com/-OHj0BM-qZa0/WCOXSgq6xkI/AAAAAAAABY8/8iSLahPg3VcGpO-w3bjeG4bW_J9rgPmiACEw/s320/image02.png" width="100%" /></td> <td><img border="0" src="https://3.bp.blogspot.com/-o60yf_0kA_0/WCOXWgg5xJI/AAAAAAAABY8/jXSbIXw37ucqWicIbc9uMnf7XAHGjCCTwCEw/s320/image01.png" width="100%" /></td> </tr></tbody></table><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td class="tr-caption" style="text-align: center;">Examples of classical approaches. 
On the left is a map view showing average velocity vectors over the period from 1997 to 2001[1]. On the right you can see a time versus eastward (longitudinal) position plot for a single station.</td></tr></tbody></table><br />Each of these approaches has proved to be an informative way to understand the spatial distribution of crustal movements and the time evolution of solid earth deformation. However, because geodetic shifts happen in almost imperceptible distances (mm) and over long timescales, both approaches can only show a small subset of the data at any time—a condensed average velocity per station, or a detailed view of a single station, respectively. Our visualization enables a scientist to see all the data at once, then interactively drill down to a specific subset of interest.<br /><br />Our visualization approach is straightforward; by magnifying the daily longitude and latitude position changes, we show tracks of the evolution of the position of each station. These magnified position tracks are shown as trails on top of a shaded relief topography to provide a sense of position evolution in geographic context.<br /><br />To see how it works in practice, let’s step through an example. 
Consider this tiny set of longitude/latitude pairs for a single GNSS station, with the differing digits shown in bold:<br /><br /><table border="1" cellpadding="1" cellspacing="0" style="width: 100%;"><tbody><tr> <td><div style="text-align: center;"><b>Day Index</b></div></td> <td><div style="text-align: center;"><b>Longitude</b></div></td> <td><div style="text-align: center;"><b>Latitude</b></div></td> </tr><tr> <td><div style="text-align: center;"><b>0</b></div></td> <td><div style="text-align: center;">139.069904<b>07</b></div></td> <td><div style="text-align: center;">34.949757<b>897</b></div></td> </tr><tr> <td><div style="text-align: center;"><b>1</b></div></td> <td><div style="text-align: center;">139.069904<b>00</b></div></td> <td><div style="text-align: center;">34.949757<b>882</b></div></td> </tr><tr> <td><div style="text-align: center;"><b>2</b></div></td> <td><div style="text-align: center;">139.069904<b>13</b></div></td> <td><div style="text-align: center;">34.949757<b>941</b></div></td> </tr><tr> <td><div style="text-align: center;"><b>3</b></div></td> <td><div style="text-align: center;">139.069904<b>09</b></div></td> <td><div style="text-align: center;">34.949757<b>921</b></div></td> </tr><tr> <td><div style="text-align: center;"><b>4</b></div></td> <td><div style="text-align: center;">139.069904<b>13</b></div></td> <td><div style="text-align: center;">34.949757<b>904</b></div></td> </tr></tbody></table><br />If we were to draw line segments between these points directly on a map, they’d be much too small to see at any reasonable scale. So we take these minute differences and multiply them by a user-controlled scaling factor. 
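As a rough sketch of that amplification step (an illustration using the sample coordinates from the table above, not the project's actual code):

```python
# Daily (longitude, latitude) positions for one station, from the table above.
track = [
    (139.06990407, 34.949757897),
    (139.06990400, 34.949757882),
    (139.06990413, 34.949757941),
    (139.06990409, 34.949757921),
    (139.06990413, 34.949757904),
]

def magnify(track, factor=10**5.5):
    """Amplify each day's offset from the first day's position by `factor`,
    keeping the magnified track anchored at the station's true location."""
    lon0, lat0 = track[0]
    return [(lon0 + (lon - lon0) * factor,
             lat0 + (lat - lat0) * factor)
            for lon, lat in track]

magnified = magnify(track)
# Day-to-day offsets of roughly 1e-7 degrees become offsets of roughly 0.01
# degrees: large enough to draw as visible line segments on the map.
```

Anchoring at the first day keeps each trail drawn at the station's true map location while making its day-to-day drift visible.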
By default this factor is 10<sup>5.5</sup> (about 316,000x).<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-EXyid2JWpCk/WCOadpaIyfI/AAAAAAAABZM/ej47D-05YdUmfadsLfBAbmNhcArkzGqtACLcB/s1600/image05.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="454" src="https://3.bp.blogspot.com/-EXyid2JWpCk/WCOadpaIyfI/AAAAAAAABZM/ej47D-05YdUmfadsLfBAbmNhcArkzGqtACLcB/s640/image05.png" width="640" /></a></div><br />To help the user identify which end is the start of the line, we give the start and end points different colors and interpolate between them. Blue and red are the default colors, but they’re user-configurable. Although day-to-day movement of stations may seem erratic, by using this method, one can make out a general trend in the relative motion of a station.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-StX-6aPSUQg/WCOaj1VK1RI/AAAAAAAABZQ/qCYNzJQZEXU6JJZ57LqiEiYYNvwJw3klQCLcB/s1600/image00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="448" src="https://3.bp.blogspot.com/-StX-6aPSUQg/WCOaj1VK1RI/AAAAAAAABZQ/qCYNzJQZEXU6JJZ57LqiEiYYNvwJw3klQCLcB/s640/image00.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Close-up of a single station’s movement during the three year period from 2003 to 2006.</td></tr></tbody></table><br />However, static renderings of this sort suffer from the same problem that velocity vector images do; in regions with a high density of GNSS stations, tracks overlap significantly with one another, obscuring details. To solve this problem, our visualization lets the user interactively control the time range of interest, the amount of amplification and other settings. 
In addition, by animating the lines from start to finish, the user gets a real sense of motion that’s difficult to achieve in a static image.<br /><br />We’ve applied our new visualization to the ~20 years of data from the <a href="https://www.geospatialworld.net/article/geonet-nationwide-gps-array-of-japan/">GEONET array in Japan</a>. Through it, we can see small but coherent changes in direction before and after the great 2011 Tohoku earthquake.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-3kjrFwhpWLk/WCOav1jxyDI/AAAAAAAABZU/8gFrgCl6e7M1nBwwzppAFr6YbtXPy1XkwCLcB/s1600/image03.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="348" src="https://4.bp.blogspot.com/-3kjrFwhpWLk/WCOav1jxyDI/AAAAAAAABZU/8gFrgCl6e7M1nBwwzppAFr6YbtXPy1XkwCLcB/s640/image03.gif" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">GPS data sets (in .json format) for both the GEONET data in Japan and the Plate Boundary Observatory (PBO) data in the western US are available at <a href="http://earthquake.rc.fas.harvard.edu/">earthquake.rc.fas.harvard.edu</a>.</td></tr></tbody></table><br />This short animation shows many of the visualization’s interactive features. 
In order:<br /><ol><li>Modifying the multiplier adjusts how significantly the movements are magnified.</li><li>We can adjust the time slider nubs to select a particular time range of interest.</li><li>Using the map controls provided by the <a href="https://developers.google.com/maps/documentation/javascript/">Google Maps JavaScript API</a>, we can zoom into a tiny region of the map.</li><li>By enabling map markers, we can see information about individual GNSS stations.</li></ol>By focusing on a station of interest, we can even see curvature changes in the time periods before and after the event.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-Hv4xcTXwBf0/WCOa_FyI4YI/AAAAAAAABZY/eLPJoBAL87IIr03GVhU3KqKYBfl4bEQUACLcB/s1600/image04.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="346" src="https://4.bp.blogspot.com/-Hv4xcTXwBf0/WCOa_FyI4YI/AAAAAAAABZY/eLPJoBAL87IIr03GVhU3KqKYBfl4bEQUACLcB/s640/image04.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Station designated 960601 of Japan’s GEONET array is located on the island of Mikura-jima. Here we see the period from 2006 to 2012, with movement magnified 10<sup>5.1</sup> times (126,000x).</td></tr></tbody></table><br />To achieve fast rendering of the line segments, we created a custom overlay using <a href="https://threejs.org/">THREE.js</a> to render the lines in WebGL. Data for the GNSS stations is passed to the GPU in a data texture, which allows our vertex shader to position each point on-screen dynamically based on user settings and animation.<br /><br />We’re excited to continue this productive collaboration between Harvard and Google as we explore opportunities for groundbreaking, new earthquake visualizations. 
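One way to picture the data-texture packing mentioned above is one row of texels per station and one texel per day, so a vertex shader can fetch any (station, day) position with a single texture lookup. The sketch below uses a hypothetical layout and random offsets; the project's actual shader code may organize the texture differently:

```python
import numpy as np

# Hypothetical example: 3 stations x 4 days of (d_lon, d_lat) offsets.
n_stations, n_days = 3, 4
offsets = np.random.default_rng(0).normal(scale=1e-7, size=(n_stations, n_days, 2))

# Pack into an RGBA float texture: one row per station, one texel per day.
# R = longitude offset, G = latitude offset; B and A are left unused here.
texture = np.zeros((n_stations, n_days, 4), dtype=np.float32)
texture[..., 0] = offsets[..., 0]
texture[..., 1] = offsets[..., 1]

# A vertex shader would sample this with normalized (day, station) coordinates;
# here we emulate that lookup on the CPU:
def lookup(texture, station, day):
    return texture[station, day, :2]

assert np.allclose(lookup(texture, 1, 2), offsets[1, 2])
```

Storing positions in a texture rather than per-vertex attributes lets the animation and amplification settings change on the GPU without re-uploading geometry each frame.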
If you’d like to try out the visualization yourself, follow the instructions at <a href="http://earthquake.rc.fas.harvard.edu/">earthquake.rc.fas.harvard.edu</a>. It will walk you through the setup steps, including how to download the available data sets. If you’d like to report issues, great! Please submit them through the GitHub project page.<br /><br /><b>Acknowledgments</b><br /><br />We wish to thank Bill Freeman, a researcher on <a href="http://research.google.com/pubs/MachinePerception.html">Machine Perception</a>, who hatched the idea and developed the initial prototypes, and Fernanda Viégas and Martin Wattenberg of the <a href="https://research.google.com/bigpicture/">Big Picture Team</a> for their visualization design guidance.<br /><br /><b>References</b><br /><br />[1] Loveless, J. P., and Meade, B. J. (2010). <a href="http://onlinelibrary.wiley.com/doi/10.1029/2008JB006248/abstract">Geodetic imaging of plate motions, slip rates, and partitioning of deformation in Japan</a>, <i>Journal of Geophysical Research.</i><br /><br /><br />]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/open-source-visualization-of-gps-displacements-for-earthquake-cycle-physics-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Celebrating TensorFlow’s First Year</title>
		<link>https://googledata.org/google-research/celebrating-tensorflows-first-year/</link>
		<comments>https://googledata.org/google-research/celebrating-tensorflows-first-year/#comments</comments>
		<pubDate>Wed, 09 Nov 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=040ee977d4b134b04cf4a56d0fa60a5b</guid>
		<description><![CDATA[<span>Posted by Zak Stone, Product Manager for TensorFlow, on behalf of the TensorFlow team</span><br /><i><br /></i> <i>(Cross-posted on the <a href="https://opensource.googleblog.com/2016/11/celebrating-tensorflows-first-year.html">Google Open Source Blog</a> &#38; <a href="https://developers.googleblog.com/2016/11/celebrating-tensorflows-first-year.html">Google Developers Blog</a>)</i><br /><br />It has been an eventful year since the <a href="http://g.co/brain">Google Brain Team</a> <a href="https://research.googleblog.com/2015/11/tensorflow-googles-latest-machine_9.html">open-sourced TensorFlow</a> to accelerate machine learning research and <a href="https://blog.google/topics/machine-learning/tensorflow-smarter-machine-learning-for/">make technology work better for everyone</a>. There has been an amazing amount of activity around the project: more than 480 people have contributed directly to <a href="https://www.tensorflow.org/">TensorFlow</a>, including Googlers, external researchers, independent programmers, students, and senior developers at other large companies. 
TensorFlow is now <a href="https://github.com/tensorflow/tensorflow">the most popular</a> machine learning project on GitHub.<br /><div><a href="https://2.bp.blogspot.com/-KPtDpnhqRX4/WCNMJAxbg6I/AAAAAAAABYU/2ILORqwfaoQteN-zDsS_9zH-RxJxdu9egCLcB/s1600/TF_logo_no_shadow_1.png"><img border="0" height="255" src="https://2.bp.blogspot.com/-KPtDpnhqRX4/WCNMJAxbg6I/AAAAAAAABYU/2ILORqwfaoQteN-zDsS_9zH-RxJxdu9egCLcB/s400/TF_logo_no_shadow_1.png" width="400"></a></div><br />With more than 10,000 commits in just twelve months, we&#8217;ve made <a href="https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md">numerous performance improvements</a>, <a href="https://research.googleblog.com/2016/04/announcing-tensorflow-08-now-with.html">added support for distributed training</a>, <a href="https://petewarden.com/2016/09/27/tensorflow-for-mobile-poets/">brought TensorFlow to iOS</a> and <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/pi_examples/">Raspberry Pi</a>, and integrated TensorFlow with widely-used <a href="https://github.com/tensorflow/ecosystem">big data infrastructure</a>. 
We&#8217;ve also made TensorFlow accessible from <a href="https://github.com/tensorflow/tensorflow/blob/master/tensorflow/go/README.md">Go</a>, <a href="https://github.com/tensorflow/rust/blob/master/README.md">Rust</a> and <a href="https://github.com/tensorflow/haskell/blob/master/README.md">Haskell</a>, <a href="https://research.googleblog.com/2016/08/improving-inception-and-image.html">released state-of-the-art image classification models</a>, and answered thousands of questions on <a href="https://github.com/tensorflow/tensorflow/issues">GitHub</a>, <a href="http://stackoverflow.com/questions/tagged/tensorflow">StackOverflow</a>&#160;and the <a href="https://groups.google.com/a/tensorflow.org/forum/#!forum/discuss">TensorFlow mailing list</a> along the way.<br /><br />At Google, TensorFlow supports everything from large-scale product features to exploratory research. We recently launched <a href="https://research.googleblog.com/2016/09/a-neural-network-for-machine.html">major improvements to Google Translate</a> using TensorFlow (and <a href="https://cloudplatform.googleblog.com/2016/05/Google-supercharges-machine-learning-tasks-with-custom-chip.html">Tensor Processing Units</a>, which are special hardware accelerators for TensorFlow). <a href="https://magenta.tensorflow.org/welcome-to-magenta">Project Magenta</a> is working on new reinforcement learning-based models that can <a href="https://magenta.tensorflow.org/2016/11/09/tuning-recurrent-networks-with-reinforcement-learning/">produce melodies</a>, and a visiting PhD student recently worked with the Google Brain team to build a TensorFlow model that can <a href="https://research.googleblog.com/2016/10/supercharging-style-transfer.html">automatically interpolate between artistic styles</a>. 
<a href="https://deepmind.com/">DeepMind</a> has also <a href="https://research.googleblog.com/2016/04/deepmind-moves-to-tensorflow.html">decided to use TensorFlow</a> to power all of their research &#8211; for example, they recently produced <a href="https://deepmind.com/blog/wavenet-generative-model-raw-audio/">fascinating generative models</a> of speech and music based on raw audio.<br /><br />We&#8217;re especially excited to see how people all over the world are using TensorFlow. For example:<br /><br /><ul><li>Australian marine biologists are using TensorFlow to <a href="https://blog.google/topics/machine-learning/could-machine-learning-save-sea-cow">find sea cows</a> in tens of thousands of hi-res photos to better understand their populations, which are under threat of extinction.</li><li>An enterprising Japanese cucumber farmer trained a model with TensorFlow to <a href="https://cloud.google.com/blog/big-data/2016/08/how-a-japanese-cucumber-farmer-is-using-deep-learning-and-tensorflow">sort cucumbers</a> by size, shape, and other characteristics.</li><li>Radiologists have adapted TensorFlow to identify <a href="https://www.ncbi.nlm.nih.gov/pubmed/27730415">signs of Parkinson&#8217;s disease</a> in medical scans.</li><li>Data scientists in the Bay Area have rigged up TensorFlow and the Raspberry Pi to <a href="https://svds.com/introduction-to-trainspotting/">keep track of the Caltrain</a>.</li></ul><br />We&#8217;re committed to making sure TensorFlow scales all the way from research to production and from the tiniest Raspberry Pi all the way up to server farms filled with GPUs or TPUs. 
But TensorFlow is more than a single open-source project &#8211; we&#8217;re doing our best to foster an open-source ecosystem of related software and machine learning models around it:<br /><br /><ul><li>The <a href="https://research.googleblog.com/2016/02/running-your-models-in-production-with.html">TensorFlow Serving</a> project simplifies the process of serving TensorFlow models in production.</li><li>TensorFlow &#8220;<a href="https://research.googleblog.com/2016/06/wide-deep-learning-better-together-with.html">Wide and Deep</a>&#8221; models combine the strengths of traditional linear models and modern deep neural networks.</li><li>For those who are interested in working with TensorFlow in the cloud, <a href="https://cloud.google.com/">Google Cloud Platform</a> recently launched <a href="https://cloud.google.com/ml/">Cloud Machine Learning</a>, which offers TensorFlow as a managed service.</li></ul><br />Furthermore, <a href="https://github.com/tensorflow/models">TensorFlow&#8217;s repository of models</a> continues to grow with contributions from the community, with <a href="https://github.com/search?q=tensorflow">more than 3000 TensorFlow-related repositories</a> listed on GitHub alone! To participate in the TensorFlow community, you can follow our new Twitter account (<a href="https://twitter.com/tensorflow">@tensorflow</a>), <a href="https://github.com/tensorflow/tensorflow">find us on GitHub</a>, <a href="http://stackoverflow.com/questions/tagged/tensorflow">ask and answer questions on StackOverflow</a>, and join the <a href="https://groups.google.com/a/tensorflow.org/forum/#!forum/discuss">community discussion list</a>.<br /><br />Thanks very much to all of you who have already adopted TensorFlow in your cutting-edge products, your ambitious research, your fast-growing startups, and your school projects; special thanks to everyone who has <a href="https://github.com/tensorflow/tensorflow">contributed directly</a> to the codebase. 
In collaboration with the global machine learning community, we look forward to making TensorFlow even better in the years to come!]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Zak Stone, Product Manager for TensorFlow, on behalf of the TensorFlow team</span><br /><i><br /></i> <i>(Cross-posted on the <a href="https://opensource.googleblog.com/2016/11/celebrating-tensorflows-first-year.html">Google Open Source Blog</a> &amp; <a href="https://developers.googleblog.com/2016/11/celebrating-tensorflows-first-year.html">Google Developers Blog</a>)</i><br /><br />It has been an eventful year since the <a href="http://g.co/brain">Google Brain Team</a> <a href="https://research.googleblog.com/2015/11/tensorflow-googles-latest-machine_9.html">open-sourced TensorFlow</a> to accelerate machine learning research and <a href="https://blog.google/topics/machine-learning/tensorflow-smarter-machine-learning-for/">make technology work better for everyone</a>. There has been an amazing amount of activity around the project: more than 480 people have contributed directly to <a href="https://www.tensorflow.org/">TensorFlow</a>, including Googlers, external researchers, independent programmers, students, and senior developers at other large companies. 
TensorFlow is now <a href="https://github.com/tensorflow/tensorflow">the most popular</a> machine learning project on GitHub.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-KPtDpnhqRX4/WCNMJAxbg6I/AAAAAAAABYU/2ILORqwfaoQteN-zDsS_9zH-RxJxdu9egCLcB/s1600/TF_logo_no_shadow_1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="255" src="https://2.bp.blogspot.com/-KPtDpnhqRX4/WCNMJAxbg6I/AAAAAAAABYU/2ILORqwfaoQteN-zDsS_9zH-RxJxdu9egCLcB/s400/TF_logo_no_shadow_1.png" width="400" /></a></div><br />With more than 10,000 commits in just twelve months, we’ve made <a href="https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md">numerous performance improvements</a>, <a href="https://research.googleblog.com/2016/04/announcing-tensorflow-08-now-with.html">added support for distributed training</a>, <a href="https://petewarden.com/2016/09/27/tensorflow-for-mobile-poets/">brought TensorFlow to iOS</a> and <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/pi_examples/">Raspberry Pi</a>, and integrated TensorFlow with widely-used <a href="https://github.com/tensorflow/ecosystem">big data infrastructure</a>. 
We’ve also made TensorFlow accessible from <a href="https://github.com/tensorflow/tensorflow/blob/master/tensorflow/go/README.md">Go</a>, <a href="https://github.com/tensorflow/rust/blob/master/README.md">Rust</a> and <a href="https://github.com/tensorflow/haskell/blob/master/README.md">Haskell</a>, <a href="https://research.googleblog.com/2016/08/improving-inception-and-image.html">released state-of-the-art image classification models</a>, and answered thousands of questions on <a href="https://github.com/tensorflow/tensorflow/issues">GitHub</a>, <a href="http://stackoverflow.com/questions/tagged/tensorflow">StackOverflow</a>&nbsp;and the <a href="https://groups.google.com/a/tensorflow.org/forum/#!forum/discuss">TensorFlow mailing list</a> along the way.<br /><br />At Google, TensorFlow supports everything from large-scale product features to exploratory research. We recently launched <a href="https://research.googleblog.com/2016/09/a-neural-network-for-machine.html">major improvements to Google Translate</a> using TensorFlow (and <a href="https://cloudplatform.googleblog.com/2016/05/Google-supercharges-machine-learning-tasks-with-custom-chip.html">Tensor Processing Units</a>, which are special hardware accelerators for TensorFlow). <a href="https://magenta.tensorflow.org/welcome-to-magenta">Project Magenta</a> is working on new reinforcement learning-based models that can <a href="https://magenta.tensorflow.org/2016/11/09/tuning-recurrent-networks-with-reinforcement-learning/">produce melodies</a>, and a visiting PhD student recently worked with the Google Brain team to build a TensorFlow model that can <a href="https://research.googleblog.com/2016/10/supercharging-style-transfer.html">automatically interpolate between artistic styles</a>. 
<a href="https://deepmind.com/">DeepMind</a> has also <a href="https://research.googleblog.com/2016/04/deepmind-moves-to-tensorflow.html">decided to use TensorFlow</a> to power all of their research – for example, they recently produced <a href="https://deepmind.com/blog/wavenet-generative-model-raw-audio/">fascinating generative models</a> of speech and music based on raw audio.<br /><br />We’re especially excited to see how people all over the world are using TensorFlow. For example:<br /><br /><ul><li>Australian marine biologists are using TensorFlow to <a href="https://blog.google/topics/machine-learning/could-machine-learning-save-sea-cow">find sea cows</a> in tens of thousands of hi-res photos to better understand their populations, which are under threat of extinction.</li><li>An enterprising Japanese cucumber farmer trained a model with TensorFlow to <a href="https://cloud.google.com/blog/big-data/2016/08/how-a-japanese-cucumber-farmer-is-using-deep-learning-and-tensorflow">sort cucumbers</a> by size, shape, and other characteristics.</li><li>Radiologists have adapted TensorFlow to identify <a href="https://www.ncbi.nlm.nih.gov/pubmed/27730415">signs of Parkinson’s disease</a> in medical scans.</li><li>Data scientists in the Bay Area have rigged up TensorFlow and the Raspberry Pi to <a href="https://svds.com/introduction-to-trainspotting/">keep track of the Caltrain</a>.</li></ul><br />We’re committed to making sure TensorFlow scales all the way from research to production and from the tiniest Raspberry Pi all the way up to server farms filled with GPUs or TPUs. 
But TensorFlow is more than a single open-source project – we’re doing our best to foster an open-source ecosystem of related software and machine learning models around it:<br /><br /><ul><li>The <a href="https://research.googleblog.com/2016/02/running-your-models-in-production-with.html">TensorFlow Serving</a> project simplifies the process of serving TensorFlow models in production.</li><li>TensorFlow “<a href="https://research.googleblog.com/2016/06/wide-deep-learning-better-together-with.html">Wide and Deep</a>” models combine the strengths of traditional linear models and modern deep neural networks.</li><li>For those who are interested in working with TensorFlow in the cloud, <a href="https://cloud.google.com/">Google Cloud Platform</a> recently launched <a href="https://cloud.google.com/ml/">Cloud Machine Learning</a>, which offers TensorFlow as a managed service.</li></ul><br />Furthermore, <a href="https://github.com/tensorflow/models">TensorFlow’s repository of models</a> continues to grow with contributions from the community, with <a href="https://github.com/search?q=tensorflow">more than 3000 TensorFlow-related repositories</a> listed on GitHub alone! To participate in the TensorFlow community, you can follow our new Twitter account (<a href="https://twitter.com/tensorflow">@tensorflow</a>), <a href="https://github.com/tensorflow/tensorflow">find us on GitHub</a>, <a href="http://stackoverflow.com/questions/tagged/tensorflow">ask and answer questions on StackOverflow</a>, and join the <a href="https://groups.google.com/a/tensorflow.org/forum/#!forum/discuss">community discussion list</a>.<br /><br />Thanks very much to all of you who have already adopted TensorFlow in your cutting-edge products, your ambitious research, your fast-growing startups, and your school projects; special thanks to everyone who has <a href="https://github.com/tensorflow/tensorflow">contributed directly</a> to the codebase. 
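As a rough illustration of the "Wide and Deep" idea listed above, combining a linear model over sparse features with a small deep network over dense features, here is a minimal sketch in plain Python. All feature names, weights, and layer sizes below are hypothetical; the real implementation is the TensorFlow one described in the linked post.

```python
import math

def wide_logit(sparse_features, wide_weights):
    """'Wide' part: a linear model over sparse (e.g. crossed) features."""
    return sum(wide_weights.get(f, 0.0) for f in sparse_features)

def deep_logit(dense_features, layers):
    """'Deep' part: a tiny feed-forward network with ReLU activations."""
    h = dense_features
    for weights, biases in layers:
        h = [max(0.0, sum(w * x for w, x in zip(row, h)) + b)
             for row, b in zip(weights, biases)]
    return sum(h)  # collapse the final hidden layer into a single logit

def wide_and_deep(sparse_features, dense_features, wide_weights, layers):
    """Add both logits and squash with a sigmoid, as in joint training."""
    z = wide_logit(sparse_features, wide_weights) + deep_logit(dense_features, layers)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical example: one crossed sparse feature plus two dense inputs.
wide_weights = {"app=games AND query=horror": 1.2}
layers = [([[0.5, -0.3], [0.1, 0.8]], [0.0, 0.1])]  # one hidden layer of size 2
p = wide_and_deep(["app=games AND query=horror"], [1.0, 2.0], wide_weights, layers)
print(round(p, 3))
```

The point of the joint formulation is that the wide part can memorize specific feature crosses while the deep part generalizes from dense representations; both contribute to a single logit.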
In collaboration with the global machine learning community, we look forward to making TensorFlow even better in the years to come!]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/celebrating-tensorflows-first-year/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>App Discovery With Google Play, Part 1: Understanding Topics</title>
		<link>https://googledata.org/google-research/app-discovery-with-google-play-part-1-understanding-topics/</link>
		<comments>https://googledata.org/google-research/app-discovery-with-google-play-part-1-understanding-topics/#comments</comments>
		<pubDate>Tue, 08 Nov 2016 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=f234b99c8ed7965e5a00d5057b075423</guid>
		<description><![CDATA[<span>Posted by Malay Haldar, Matt MacMahon, Neha Jha and Raj Arasu, Software Engineers</span><br /><br />Every month, more than a billion users come to Google Play to download apps for their mobile devices. While some are looking for specific apps, like Snapchat, others come with only a broad notion of what they are interested in, like &#8220;<i>horror games</i>&#8221; or &#8220;<i>selfie apps</i>&#8221;. These broad searches by topic represent nearly half of the queries in Play Store, so it&#8217;s critical to find the most relevant apps. <br /><br />Searches by topic require more than simply indexing apps by query terms; they require an <i>understanding</i> of the topics associated with an app. Machine learning approaches have been applied to similar problems, but success heavily depends on the number of training examples to learn about a topic. While for some popular topics such as &#8220;<i>social networking</i>&#8221; we had many labeled apps to learn from, the majority of topics had only a handful of examples. Our challenge was to learn from a very limited number of training examples and scale to millions of apps across thousands of topics, forcing us to adapt our machine learning techniques. <br /><br />Our initial attempt was to build a <a href="https://en.wikipedia.org/wiki/Deep_learning">deep neural network</a> (DNN) trained to predict topics for an app based on words and phrases from the app title and description. For example, if the app description mentioned &#8220;<i>frightening</i>&#8221;, &#8220;<i>very scary</i>&#8221;, and &#8220;<i>fear</i>&#8221; then associate the &#8220;<i>horror game</i>&#8221; topic with it. 
However, given the learning capacity of DNNs, it completely &#8220;memorized&#8221; the topics for the apps in our small training data and failed to generalize to new apps it hadn&#8217;t seen before.<br /><br />To generalize effectively, we needed a much larger dataset to train on, so we turned to how people learn as inspiration. In contrast to DNNs, human beings need much less training data.  For example, you would likely need to see very few &#8220;<i>horror game</i>&#8221; app descriptions before learning how to generalize and associate new apps to that genre. Just by knowing the language describing the apps, people can correctly infer topics from even a few examples.<br /><br />To emulate this, we tried a very rough approximation of this language-centric learning. We trained a neural network to learn how language was used to describe apps. We built a <a href="https://www.tensorflow.org/versions/r0.8/tutorials/word2vec/index.html#the-skip-gram-model">Skip-gram model</a>, where the neural network attempts to predict the words around a given word, for example &#8220;<i>share</i>&#8221; given &#8220;<i>photo</i>&#8221;. The neural network encodes its knowledge as vectors of floating point numbers, referred to as <i>embeddings</i>. These embeddings were used to train another model called a <i>classifier</i>, capable of distinguishing which topics applied to an app. We now needed much less training data to learn about app topics, due to the large amount of learning already done with Skip-gram.<br /><br />While this architecture generalized well for popular topics like &#8220;<i>social networking</i>&#8221;, we ran into a new problem for more niche topics like &#8220;<i>selfie</i>&#8221;. The single classifier built to predict all the topics together focused most of its learning on the popular topics, ignoring the errors it made on the less common ones. 
To solve this problem we built a separate classifier for each topic and tuned them in isolation.<br /><br />This architecture produced reasonable results, but would still sometimes overgeneralize. For instance, it might associate <i>Facebook</i> with &#8220;<i>dating</i>&#8221; or <i>Plants vs Zombies</i> with &#8220;<i>educational games</i>&#8221;. To produce more precise classifiers, we needed higher volume and quality of training data. We treated the system described above as a coarse classifier that pruned down every possible {app, topic} pair, numbering in billions, to a more manageable list of {app, topic} pairs of interest. We built a pipeline to have human raters evaluate the classifier output and fed consensus results back as training data. This process allowed us to bootstrap from our existing system, giving us a path to steadily improve classifier performance.<br /><br /><div><a href="https://4.bp.blogspot.com/-ysKil48TeE8/WCIPiiLzrzI/AAAAAAAABX8/uYqpeRJ-4fMV40A29eQ-lKdpGlOU1EQgQCLcB/s1600/image00.png"><img border="0" height="450" src="https://4.bp.blogspot.com/-ysKil48TeE8/WCIPiiLzrzI/AAAAAAAABX8/uYqpeRJ-4fMV40A29eQ-lKdpGlOU1EQgQCLcB/s640/image00.png" width="640"></a></div><br />To evaluate {app, topic} pairs by human raters, we asked them questions of the form, &#8220;<i>To what extent is topic X related to app Y?</i>&#8221; Multiple raters received the same question and independently selected answers on a rating scale to indicate if the topic was &#8220;important&#8221; for the app, &#8220;somewhat related&#8221;, or completely &#8220;off-topic&#8221;. Our initial evaluations showed a high level of disagreement amongst the raters. Diving deeper, we identified several causes of disagreement: vague guidelines for answer selection, insufficient rater training, evaluating broad topics like &#8220;<i>computer files</i>&#8221; and &#8220;<i>game physics</i>&#8221; that applied to most apps or games. 
Tackling these issues led to significant gains in rater agreement. Asking raters to choose an explicit reason for their answer from a curated list further improved reliability. Despite the improvements, we sometimes still have to &#8220;agree to disagree&#8221; and currently discard answers where raters fail to reach consensus.<br /><br />These app topic classifiers enable search and discovery features in the <a href="https://play.google.com/store/apps?hl=en">Google Play Apps store</a>. The current system helps provide relevant results to our users, but we are constantly exploring new ways to improve the system, through additional signals, architectural improvements and new algorithms. In Part 2 of this series, we will discuss how to personalize the app discovery experience for users.<br /><br /><b>Acknowledgments</b><br />This work was done within the Google Play team in close collaboration with Liadan O'Callaghan, Yuhua Zhu, Mark Taylor and Michael Watson.<br /><br /><br />]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Malay Haldar, Matt MacMahon, Neha Jha and Raj Arasu, Software Engineers</span><br /><br />Every month, more than a billion users come to Google Play to download apps for their mobile devices. While some are looking for specific apps, like Snapchat, others come with only a broad notion of what they are interested in, like “<i>horror games</i>” or “<i>selfie apps</i>”. These broad searches by topic represent nearly half of the queries in Play Store, so it’s critical to find the most relevant apps. <br /><br />Searches by topic require more than simply indexing apps by query terms; they require an <i>understanding</i> of the topics associated with an app. Machine learning approaches have been applied to similar problems, but success heavily depends on the number of training examples to learn about a topic. While for some popular topics such as “<i>social networking</i>” we had many labeled apps to learn from, the majority of topics had only a handful of examples. Our challenge was to learn from a very limited number of training examples and scale to millions of apps across thousands of topics, forcing us to adapt our machine learning techniques. <br /><br />Our initial attempt was to build a <a href="https://en.wikipedia.org/wiki/Deep_learning">deep neural network</a> (DNN) trained to predict topics for an app based on words and phrases from the app title and description. For example, if the app description mentioned “<i>frightening</i>”, “<i>very scary</i>”, and “<i>fear</i>” then associate the “<i>horror game</i>” topic with it. However, given the learning capacity of DNNs, it completely “memorized” the topics for the apps in our small training data and failed to generalize to new apps it hadn’t seen before.<br /><br />To generalize effectively, we needed a much larger dataset to train on, so we turned to how people learn as inspiration. 
In contrast to DNNs, human beings need much less training data.  For example, you would likely need to see very few “<i>horror game</i>” app descriptions before learning how to generalize and associate new apps to that genre. Just by knowing the language describing the apps, people can correctly infer topics from even a few examples.<br /><br />To emulate this, we tried a very rough approximation of this language-centric learning. We trained a neural network to learn how language was used to describe apps. We built a <a href="https://www.tensorflow.org/versions/r0.8/tutorials/word2vec/index.html#the-skip-gram-model">Skip-gram model</a>, where the neural network attempts to predict the words around a given word, for example “<i>share</i>” given “<i>photo</i>”. The neural network encodes its knowledge as vectors of floating point numbers, referred to as <i>embeddings</i>. These embeddings were used to train another model called a <i>classifier</i>, capable of distinguishing which topics applied to an app. We now needed much less training data to learn about app topics, due to the large amount of learning already done with Skip-gram.<br /><br />While this architecture generalized well for popular topics like “<i>social networking</i>”, we ran into a new problem for more niche topics like “<i>selfie</i>”. The single classifier built to predict all the topics together focused most of its learning on the popular topics, ignoring the errors it made on the less common ones. To solve this problem we built a separate classifier for each topic and tuned them in isolation.<br /><br />This architecture produced reasonable results, but would still sometimes overgeneralize. For instance, it might associate <i>Facebook</i> with “<i>dating</i>” or <i>Plants vs Zombies</i> with “<i>educational games</i>”. To produce more precise classifiers, we needed higher volume and quality of training data. 
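The Skip-gram setup described above, predicting the words around a given word, can be illustrated with a short sketch. This is pure Python over a toy sentence; the production model was of course trained on vastly more text, and only the pair-generation step is shown here.

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs for a Skip-gram model:
    for each word, every other word within `window` positions is a target."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

# Toy example in the spirit of the post: predicting "share" given "photo".
pairs = skipgram_pairs("share your photo with friends".split(), window=2)
print(("photo", "share") in pairs)
```

A neural network trained on such pairs ends up encoding each word as an embedding vector, which is what the downstream topic classifiers consume.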
We treated the system described above as a coarse classifier that pruned down every possible {app, topic} pair, numbering in billions, to a more manageable list of {app, topic} pairs of interest. We built a pipeline to have human raters evaluate the classifier output and fed consensus results back as training data. This process allowed us to bootstrap from our existing system, giving us a path to steadily improve classifier performance.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-ysKil48TeE8/WCIPiiLzrzI/AAAAAAAABX8/uYqpeRJ-4fMV40A29eQ-lKdpGlOU1EQgQCLcB/s1600/image00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="450" src="https://4.bp.blogspot.com/-ysKil48TeE8/WCIPiiLzrzI/AAAAAAAABX8/uYqpeRJ-4fMV40A29eQ-lKdpGlOU1EQgQCLcB/s640/image00.png" width="640" /></a></div><br />To evaluate {app, topic} pairs by human raters, we asked them questions of the form, “<i>To what extent is topic X related to app Y?</i>” Multiple raters received the same question and independently selected answers on a rating scale to indicate if the topic was “important” for the app, “somewhat related”, or completely “off-topic”. Our initial evaluations showed a high level of disagreement amongst the raters. Diving deeper, we identified several causes of disagreement: vague guidelines for answer selection, insufficient rater training, evaluating broad topics like “<i>computer files</i>” and “<i>game physics</i>” that applied to most apps or games. Tackling these issues led to significant gains in rater agreement. Asking raters to choose an explicit reason for their answer from a curated list further improved reliability. 
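Feeding consensus results back as training data, as in the rater pipeline described above, amounts to simple majority voting over independent answers. The sketch below uses the answer labels quoted in the post; the agreement threshold is a hypothetical choice, not the one used in production.

```python
from collections import Counter

def consensus(answers, min_agreement=0.75):
    """Return the majority answer if enough raters agree, else None,
    mirroring the behavior of discarding answers without consensus."""
    if not answers:
        return None
    label, count = Counter(answers).most_common(1)[0]
    return label if count / len(answers) >= min_agreement else None

print(consensus(["important", "important", "important", "somewhat related"]))
print(consensus(["important", "somewhat related", "off-topic"]))
```

Only {app, topic} pairs whose answers clear the threshold would be added to the training set; the rest are discarded.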
Despite the improvements, we sometimes still have to “agree to disagree” and currently discard answers where raters fail to reach consensus.<br /><br />These app topic classifiers enable search and discovery features in the <a href="https://play.google.com/store/apps?hl=en">Google Play Apps store</a>. The current system helps provide relevant results to our users, but we are constantly exploring new ways to improve the system, through additional signals, architectural improvements and new algorithms. In Part 2 of this series, we will discuss how to personalize the app discovery experience for users.<br /><br /><b>Acknowledgments</b><br />This work was done within the Google Play team in close collaboration with Liadan O'Callaghan, Yuhua Zhu, Mark Taylor and Michael Watson.<br /><br /><br />]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/app-discovery-with-google-play-part-1-understanding-topics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Research suggestions at your fingertips with Explore in Docs</title>
		<link>https://googledata.org/google-docs/research-suggestions-at-your-fingertips-with-explore-in-docs/</link>
		<comments>https://googledata.org/google-docs/research-suggestions-at-your-fingertips-with-explore-in-docs/#comments</comments>
		<pubDate>Tue, 01 Nov 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Docs]]></category>
		<category><![CDATA[Google Research]]></category>
		<category><![CDATA[education]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=d1c80080568d0da0190341b473c04b71</guid>
		<description><![CDATA[<span>Posted by Kishore Papineni, Research Scientist, Google Research NY</span><br /><br />Enabling easy access to vast amounts of information across multiple languages and modalities (from text to images to video), computers have become highly influential tools for learning, allowing you to use the world&#8217;s information to aid you with your research. However, when researching a topic or writing a term paper, gathering all the information you need from a variety of sources on the Internet can be time-consuming, and at times, a distraction from the writing process. <br /><br />That&#8217;s why we developed algorithms for <a href="https://blog.google/products/docs/explore-docs-sheets-and-slides/">Explore in Docs</a>, a collaboration between the Coauthor and Apps teams that uses powerful Google infrastructure, best-in-class information retrieval, machine learning, and machine translation technologies to assemble the relevant information and sources for a research paper, all within the document. Explore in Docs suggests relevant content&#8212;in the form of topics, images, and snippets &#8212;based on the content of the document, allowing the user to focus on critical thinking and idea development.<br /><br /><b>More than just a Search</b><br /><br />Suggesting material that is relevant to the content in a Google Doc is a difficult problem. A naive approach would be to consider the content of a document as a Search query.  However, search engines are not designed to accept large blocks of text as queries, so they might truncate the query or focus on the wrong words. So the challenge becomes not only identifying relevant search terms based on the <i>overall</i> content of the document, but additionally providing <i>related</i> topics that may be useful.  
<br /><br />To tackle this, the Coauthor team built algorithms that are able to associate external content with topics - entities, abstract concepts - in a document and assign relative importance to each of them. This is accomplished by creating a &#8220;target&#8221; in a topic vector space that incorporates not only the topics you are writing about but also related topics, creating a variety of search terms that include both. Then, each returned search result (piece of text, image, etc) is embedded in the same vector space and the closest items in that vector space are suggested to the user. <br /><br />For example, if you&#8217;re writing about <a href="https://en.wikipedia.org/wiki/Monarch_butterfly">monarch butterflies</a>, our algorithms find that <i>monarch butterfly</i> and <i>milkweed plant</i> are related to each other. This is done by analyzing the statistics of discourse on the web, collected from hundreds of billions of sentences from billions of webpages across dozens of languages. Note that these two are not semantically close (an insect versus a plant). An example of a set of learned relations is below:<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-7s-aI39cyME/WBi0dSTspoI/AAAAAAAABXk/xFosN3T9Ovoa62Iq2mEWDDDhjXCCydI2QCLcB/s1600/image01.png"><img border="0" height="529" src="https://2.bp.blogspot.com/-7s-aI39cyME/WBi0dSTspoI/AAAAAAAABXk/xFosN3T9Ovoa62Iq2mEWDDDhjXCCydI2QCLcB/s640/image01.png" width="640"></a></td></tr><tr><td>The connection between concepts related to "monarch butterfly", with the thickness of the lines representing the strength of connection, as determined by analysis of discourse on the web. 
Because this is a discourse graph and not a concept/classification hierarchy, this analysis indicates that "Butterflies &#38; moth" and "Monarch butterfly" are not discussed together as often as "monarch butterfly" and "milkweed".</td></tr></tbody></table>And because we take the entire document into account while constructing the search request and scoring each candidate piece of text, the resulting suggestions are typically different and more varied than the search snippets users would see if they search the web for each topic individually. By eliminating the need to switch tabs to search, and additionally suggesting new, related topics based on discourse on the web, Explore provides opportunities for learning that users might not discover otherwise - all from the Doc that they&#8217;re currently working in! <br /><div><a href="https://1.bp.blogspot.com/-bUE_Cg1OJKI/WBizW-lwhqI/AAAAAAAABXc/OnLKPNe_8I0ogL4hE0eFopoUULRMfmP1gCLcB/s1600/image02.gif"><img border="0" height="308" src="https://1.bp.blogspot.com/-bUE_Cg1OJKI/WBizW-lwhqI/AAAAAAAABXc/OnLKPNe_8I0ogL4hE0eFopoUULRMfmP1gCLcB/s640/image02.gif" width="640"></a></div><br /><b>The information you need, in multiple languages</b><br /><br />Cross-lingual predictive search is another key aspect of what we have designed and built. If the relevant material is likely to be in foreign languages, Google searches the web in those languages and translates the selected nuggets into the language of the document. <br /><br />In the example pictured below, the user begins to type an essay in Docs about Claudia Neto and clicks on the &#8220;Explore&#8221; button to learn more about her. Explore returns relevant &#8220;Topics&#8221; and &#8220;Images&#8221; as well as &#8220;Related Research&#8221; sourced from multiple websites. 
Also, Explore suggests Dolores Silva as a related topic since she and Claudia have high mutual information in multilingual web text (statistics collected from more than 10 billion webpages).<br /><div><a href="https://2.bp.blogspot.com/-JtRF-_RDge4/WBizpktj1jI/AAAAAAAABXg/JsH_7iFxohI83geWtljb9dAkGCOMjpd9gCLcB/s1600/image00.gif"><img border="0" height="416" src="https://2.bp.blogspot.com/-JtRF-_RDge4/WBizpktj1jI/AAAAAAAABXg/JsH_7iFxohI83geWtljb9dAkGCOMjpd9gCLcB/s640/image00.gif" width="640"></a></div>Because Swedish ranks high among languages that have significant discourse on Claudia Neto, our algorithms search Swedish content on the Internet for any additional information about her that might not be available on English websites. Before returning information obtained from the Swedish websites, we use <a href="https://translate.google.com/">Google Translate</a> to render the nugget in the user&#8217;s preferred language (in this case, English). Related Research is currently available in 10 languages with more to come in the future.<br /><br /><a href="https://blog.google/products/docs/explore-docs-sheets-and-slides/">Explore in Docs</a> is a useful tool that can be used worldwide, in all forms of industry and at all levels of education. Try out the Explore feature the next time you create a document, and check back for more exciting progress from the Coauthor team!]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Kishore Papineni, Research Scientist, Google Research NY</span><br /><br />Enabling easy access to vast amounts of information across multiple languages and modalities (from text to images to video), computers have become highly influential tools for learning, allowing you to use the world’s information to aid you with your research. However, when researching a topic or writing a term paper, gathering all the information you need from a variety of sources on the Internet can be time-consuming, and at times, a distraction from the writing process. <br /><br />That’s why we developed algorithms for <a href="https://blog.google/products/docs/explore-docs-sheets-and-slides/">Explore in Docs</a>, a collaboration between the Coauthor and Apps teams that uses powerful Google infrastructure, best-in-class information retrieval, machine learning, and machine translation technologies to assemble the relevant information and sources for a research paper, all within the document. Explore in Docs suggests relevant content—in the form of topics, images, and snippets —based on the content of the document, allowing the user to focus on critical thinking and idea development.<br /><br /><b>More than just a Search</b><br /><br />Suggesting material that is relevant to the content in a Google Doc is a difficult problem. A naive approach would be to consider the content of a document as a Search query.  However, search engines are not designed to accept large blocks of text as queries, so they might truncate the query or focus on the wrong words. So the challenge becomes not only identifying relevant search terms based on the <i>overall</i> content of the document, but additionally providing <i>related</i> topics that may be useful.  
<br /><br />To tackle this, the Coauthor team built algorithms that are able to associate external content with topics - entities, abstract concepts - in a document and assign relative importance to each of them. This is accomplished by creating a “target” in a topic vector space that incorporates not only the topics you are writing about but also related topics, creating a variety of search terms that include both. Then, each returned search result (piece of text, image, etc) is embedded in the same vector space and the closest items in that vector space are suggested to the user. <br /><br />For example, if you’re writing about <a href="https://en.wikipedia.org/wiki/Monarch_butterfly">monarch butterflies</a>, our algorithms find that <i>monarch butterfly</i> and <i>milkweed plant</i> are related to each other. This is done by analyzing the statistics of discourse on the web, collected from hundreds of billions of sentences from billions of webpages across dozens of languages. Note that these two are not semantically close (an insect versus a plant). An example of a set of learned relations is below:<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-7s-aI39cyME/WBi0dSTspoI/AAAAAAAABXk/xFosN3T9Ovoa62Iq2mEWDDDhjXCCydI2QCLcB/s1600/image01.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="529" src="https://2.bp.blogspot.com/-7s-aI39cyME/WBi0dSTspoI/AAAAAAAABXk/xFosN3T9Ovoa62Iq2mEWDDDhjXCCydI2QCLcB/s640/image01.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The connection between concepts related to "monarch butterfly", with the thickness of the lines representing the strength of connection, as determined by analysis of discourse on the web. 
Because this is a discourse graph and not a concept/classification hierarchy, this analysis indicates that "Butterflies &amp; moth" and "Monarch butterfly" are not discussed together as often as "monarch butterfly" and "milkweed".</td></tr></tbody></table>And because we take the entire document into account while constructing the search request and scoring each candidate piece of text, the resulting suggestions are typically different and more varied than the search snippets users would see if they search the web for each topic individually. By eliminating the need to switch tabs to search, and additionally suggesting new, related topics based on discourse on the web, Explore provides opportunities for learning that users might not discover otherwise - all from the Doc that they’re currently working in! <br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-bUE_Cg1OJKI/WBizW-lwhqI/AAAAAAAABXc/OnLKPNe_8I0ogL4hE0eFopoUULRMfmP1gCLcB/s1600/image02.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="308" src="https://1.bp.blogspot.com/-bUE_Cg1OJKI/WBizW-lwhqI/AAAAAAAABXc/OnLKPNe_8I0ogL4hE0eFopoUULRMfmP1gCLcB/s640/image02.gif" width="640" /></a></div><br /><b>The information you need, in multiple languages</b><br /><br />Cross-lingual predictive search is another key aspect of what we have designed and built. If the relevant material is likely to be in foreign languages, Google searches the web in those languages and translates the selected nuggets into the language of the document. <br /><br />In the example pictured below, the user begins to type an essay in Docs about Claudia Neto and clicks on the “Explore” button to learn more about her. Explore returns relevant “Topics” and “Images” as well as “Related Research” sourced from multiple websites. 
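The vector-space matching described earlier (embed the document's topic "target" and every candidate result in the same space, then surface the closest items) can be sketched in a few lines. The vectors and snippet names below are hand-made stand-ins for the learned embeddings:

```python
import numpy as np

# Hand-made embeddings standing in for the learned topic vector space.
# The document's "target" vector incorporates its main and related topics.
doc_target = np.array([0.9, 0.4, 0.1])
candidates = {
    "snippet on monarch migration": np.array([0.8, 0.5, 0.0]),
    "snippet on garden furniture":  np.array([0.1, 0.2, 0.9]),
}

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Each returned search result is embedded in the same space; the
# candidates closest to the target are the ones suggested to the user.
ranked = sorted(candidates,
                key=lambda k: cosine(doc_target, candidates[k]),
                reverse=True)
print(ranked[0])
```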
Also, Explore suggests Dolores Silva as a related topic since she and Claudia have high mutual information in multilingual web text (statistics collected from more than 10 billion webpages).<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-JtRF-_RDge4/WBizpktj1jI/AAAAAAAABXg/JsH_7iFxohI83geWtljb9dAkGCOMjpd9gCLcB/s1600/image00.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="416" src="https://2.bp.blogspot.com/-JtRF-_RDge4/WBizpktj1jI/AAAAAAAABXg/JsH_7iFxohI83geWtljb9dAkGCOMjpd9gCLcB/s640/image00.gif" width="640" /></a></div>Because Swedish ranks high among languages that have significant discourse on Claudia Neto, our algorithms search Swedish content on the Internet for any additional information about her that might not be available on English websites. Before returning information obtained from the Swedish websites, we use <a href="https://translate.google.com/">Google Translate</a> to render the nugget in the user’s preferred language (in this case, English). Related Research is currently available in 10 languages with more to come in the future.<br /><br /><a href="https://blog.google/products/docs/explore-docs-sheets-and-slides/">Explore in Docs</a> is a useful tool that can be used worldwide, in all forms of industry and at all levels of education. Try out the Explore feature the next time you create a document, and check back for more exciting progress from the Coauthor team!]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-docs/research-suggestions-at-your-fingertips-with-explore-in-docs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Supercharging Style Transfer</title>
		<link>https://googledata.org/google-research/supercharging-style-transfer/</link>
		<comments>https://googledata.org/google-research/supercharging-style-transfer/#comments</comments>
		<pubDate>Wed, 26 Oct 2016 16:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=bb5bb29476adc8bdcf42ecc9d70c68b0</guid>
		<description><![CDATA[<span>Posted by Vincent Dumoulin<a href="http://research.googleblog.com/#1" name="top1"><sup>*</sup></a>, Jonathon Shlens and Manjunath Kudlur, Google Brain Team</span><br /><br /><i><a href="https://en.wikipedia.org/wiki/Pastiche">Pastiche</a></i>. A French word, it designates a work of art that imitates the style of another one (not to be confused with its more humorous Greek cousin, <i>parody</i>). Although it has been used for a long time in visual art, music and literature, pastiche has been getting mass attention lately with <a href="https://www.reddit.com/r/deepstyle">online forums</a> dedicated to images that have been modified to be in the style of famous paintings. Using a technique known as <i>style transfer</i>, these images are generated by phone or web apps that allow a user to render their favorite picture in the style of a well known work of art.<br /><br />Although users have already produced gorgeous pastiches using the current technology, we feel that it could be made even more engaging. Right now, each painting is its own island, so to speak: the user provides a content image, selects an artistic style and gets a pastiche back.  But what if one could combine many different styles, exploring unique mixtures of well known artists to create an entirely unique pastiche?<br /><br /><b>Learning a representation for artistic style</b><br /><br />In our recent paper titled &#8220;<i><a href="https://arxiv.org/abs/1610.07629">A Learned Representation for Artistic Style</a></i>&#8221;, we introduce a simple method to allow a single deep convolutional style transfer network to learn multiple styles at the same time. The network, having learned multiple styles, is able to do <i>style interpolation</i>, where the pastiche varies smoothly from one style to another. 
Our method enables style interpolation in real-time as well, allowing this to be applied not only to static images, but also videos.<br /><div></div><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td>Credit: awesome dog role played by Google Brain team office dog Picabo.</td></tr></tbody></table>In the video above, multiple styles are combined in real-time and the resulting style is applied <i>using a single style transfer network</i>. The user is provided with a set of 13 different painting styles and adjusts their relative strengths in the final style via sliders. In this demonstration, the user is an active participant in producing the pastiche.<br /><br /><b>A Quick History of Style Transfer</b><br /><br />While transferring the style of one image to another has existed for nearly 15 years [1] [2], leveraging neural networks to accomplish it is both very recent and very fascinating. In &#8220;<i><a href="https://arxiv.org/abs/1508.06576">A Neural Algorithm of Artistic Style</a></i>&#8221; [3], researchers Gatys, Ecker &#38; Bethge introduced a method that uses deep convolutional neural network (CNN) classifiers. The pastiche image is found via optimization: the algorithm looks for an image which elicits the same kind of activations in the CNN&#8217;s lower layers - which capture the overall rough aesthetic of the style input (broad brushstrokes, cubist patterns, etc.) - yet produces activations in the higher layers - which capture the things that make the subject recognizable - that are close to those produced by the content image. From some starting point (e.g. 
random noise, or the content image itself), the pastiche image is progressively refined until these requirements are met.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-kV4SKTFlWQk/WA6n82yFFJI/AAAAAAAABWY/9GcePSQZ7qcY95b7zVnCBR4ABWR7K2o4gCLcB/s1600/image04.png"><img border="0" height="190" src="https://2.bp.blogspot.com/-kV4SKTFlWQk/WA6n82yFFJI/AAAAAAAABWY/9GcePSQZ7qcY95b7zVnCBR4ABWR7K2o4gCLcB/s640/image04.png" width="640"></a></td></tr><tr><td>Content image: The <a href="https://commons.wikimedia.org/wiki/File:Tuebingen_Neckarfront.jpg">T&#252;bingen Neckarfront by Andreas Praefcke</a>, Style painting: &#8220;<a href="https://www.google.com/culturalinstitute/beta/u/0/asset/head-of-a-clown/CQHMqKf7DRo76w">Head of a Clown</a>&#8221;, by <a href="https://en.wikipedia.org/wiki/Georges_Rouault">Georges Rouault</a>.</td></tr></tbody></table>The pastiches produced via this algorithm look <i>spectacular</i>:<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-jYGbp0Ow1Cc/WA6oWw63F7I/AAAAAAAABWc/8_E5A1dbPP4xeo1GuIGTsYvG6TuXIfmoQCLcB/s1600/image06.png"><img border="0" height="478" src="https://3.bp.blogspot.com/-jYGbp0Ow1Cc/WA6oWw63F7I/AAAAAAAABWc/8_E5A1dbPP4xeo1GuIGTsYvG6TuXIfmoQCLcB/s640/image06.png" width="640"></a></td></tr><tr><td>Figure adapted from L. Gatys et al. "<a href="https://arxiv.org/abs/1508.06576">A Neural Algorithm of Artistic Style</a>" (2015).&#160;</td></tr></tbody></table>This work is considered a breakthrough in the field of deep learning research because it provided the first proof of concept for neural network-based style transfer. Unfortunately this method for stylizing an individual image is computationally demanding. 
For instance, in the first demos available on the web, one would upload a photo to a server, and then still have plenty of time to go grab a cup of coffee before a result was available.<br /><br />This process was sped up significantly by subsequent research [4, 5] that recognized that this optimization problem may be recast as an image transformation problem, where one wishes to apply a single, fixed painting style to an arbitrary content image (e.g. a photograph). The problem can then be solved by teaching a feed-forward, deep convolutional neural network to alter a corpus of content images to match the style of a painting. The goal of the trained network is two-fold: maintain the content of the original image while matching the visual style of the painting.<br /><br />The end result of this was that what once took a few minutes for a single static image could now be run in real time (e.g. applying style transfer to a live video). However, the increase in speed that allowed real-time style transfer came with a cost - a given style transfer network is tied to the style of a <i>single</i> painting, losing some flexibility of the original algorithm, which was not tied to any one style. This means that to build a style transfer system capable of modeling 100 paintings, one has to train and store <i>100 separate style transfer networks</i>.<br /><br /><b>Our Contribution: Learning and Combining Multiple Styles</b><br /><br />We started from the observation that many artists from the impressionist period employ similar brush stroke techniques and color palettes. 
Furthermore, paintings by, say, Monet are even more visually similar.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-Y0f_lRc2kx4/WA6pgGGr32I/AAAAAAAABWo/EQZ35CY_mUkC2YSrlumcD5rb3xFKRTqPACLcB/s1600/img3.png"><img border="0" height="254" src="https://1.bp.blogspot.com/-Y0f_lRc2kx4/WA6pgGGr32I/AAAAAAAABWo/EQZ35CY_mUkC2YSrlumcD5rb3xFKRTqPACLcB/s640/img3.png" width="640"></a></td></tr><tr><td><a href="https://commons.wikimedia.org/wiki/File:Claude_Monet_037.jpg">Poppy Field</a> (left) and <a href="https://en.wikipedia.org/wiki/Impression,_Sunrise">Impression, Sunrise</a> (right) by <a href="https://en.wikipedia.org/wiki/Claude_Monet">Claude Monet</a>. Images from Wikipedia</td></tr></tbody></table>We leveraged this observation in our training of a machine learning system. That is, we trained a single system that is able to capture and generalize across many Monet paintings or even a diverse array of artists across genres. The pastiches produced are qualitatively comparable to those produced in previous work, while originating from the <i>same style transfer network</i>.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-73F44Q3CaOE/WA6uu5YKZSI/AAAAAAAABXE/XDcKcn6Wa98cWCJI3-LuniItgG7oUNxIwCLcB/s1600/image03.png"><img border="0" height="424" src="https://1.bp.blogspot.com/-73F44Q3CaOE/WA6uu5YKZSI/AAAAAAAABXE/XDcKcn6Wa98cWCJI3-LuniItgG7oUNxIwCLcB/s640/image03.png" width="640"></a></td></tr><tr><td>Pastiches produced by our single network, trained on 32 varied styles. 
These pastiches are qualitatively equivalent to those created by single-style networks: Image Credit: (from top to bottom) content photographs by <a href="https://commons.wikimedia.org/wiki/File:Tuebingen_Neckarfront.jpg">Andreas Praefcke</a>, <a href="https://en.wikipedia.org/wiki/File:GoldenGateBridge-001.jpg#file">Rich Niewiroski Jr.</a> and <a href="https://commons.wikimedia.org/wiki/File:Schultenhof_Mettingen_Bauerngarten_8.jpg">J.-H. Jan&#223;en</a>, (from left to right) style paintings by <a href="https://en.wikipedia.org/wiki/William_Glackens">William Glackens</a>, <a href="https://en.wikipedia.org/wiki/Paul_Signac">Paul Signac</a>, <a href="https://en.wikipedia.org/wiki/Georges_Rouault">Georges Rouault</a>, <a href="https://en.wikipedia.org/wiki/Edvard_Munch">Edvard Munch</a> and <a href="https://en.wikipedia.org/wiki/Vincent_van_Gogh">Vincent van Gogh</a>.</td></tr></tbody></table><div></div>The technique we developed is simple to implement and is not memory intensive. Furthermore, our network, trained on several artistic styles, permits arbitrarily combining multiple painting styles <b>in real-time</b>, as shown in the video above. Here are four styles being combined in different proportions on a photograph of <a href="https://en.wikipedia.org/wiki/T%C3%BCbingen">T&#252;bingen</a>:<br /><div><a href="https://2.bp.blogspot.com/-8xKOzpnsDCs/WA6slX7skUI/AAAAAAAABW4/iZ52N19hNSIpUgAQDXWnGaKt5FD4yGGtQCLcB/s1600/image05.jpg"><img border="0" height="456" src="https://2.bp.blogspot.com/-8xKOzpnsDCs/WA6slX7skUI/AAAAAAAABW4/iZ52N19hNSIpUgAQDXWnGaKt5FD4yGGtQCLcB/s640/image05.jpg" width="640"></a></div>Unlike previous approaches to fast style transfer, we feel that this method of modeling multiple styles at the same time opens the door to exciting new ways for users to interact with style transfer algorithms, not only allowing the freedom to create new styles based on the mixture of several others, but to do it in real-time. 
Stay tuned for a future post on the <a href="https://magenta.tensorflow.org/2016/11/01/multistyle-pastiche-generator/">Magenta blog</a>, in which we will describe the algorithm in more detail and release the <a href="https://www.tensorflow.org/">TensorFlow</a> source code to run this model and demo yourself. We also recommend that you check out <a href="https://www.youtube.com/watch?v=WHmp26bh0tI">Nat &#38; Lo&#8217;s fantastic video explanation</a> on the subject of style transfer.<br /><br /><b>References</b><br /><b><br /></b> [1] Efros, Alexei A., and William T. Freeman. <i><a href="https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/papers/efros-siggraph01.pdf">Image quilting for texture synthesis and transfer</a> </i>(2001).<br /><br />[2] Hertzmann, Aaron, Charles E. Jacobs, Nuria Oliver, Brian Curless, and David H. Salesin. <a href="http://mrl.nyu.edu/publications/image-analogies/"><i>Image analogies </i></a>(2001).<br /><br />[3] Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. <a href="https://arxiv.org/abs/1508.06576"><i>A Neural Algorithm of Artistic Style</i></a> (2015).<br /><br />[4] Ulyanov, Dmitry, Vadim Lebedev, Andrea Vedaldi, and Victor Lempitsky. <a href="https://arxiv.org/abs/1603.03417">Texture Networks: Feed-forward Synthesis of Textures and Stylized Images</a> (2016).<br /><br />[5] Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. <a href="https://arxiv.org/abs/1603.08155">Perceptual Losses for Real-Time Style Transfer and Super-Resolution</a> (2016).<br /><br /><span><br /><a name="1"><b>* </b></a>This work was done during an internship with the <a href="http://g.co/brain">Google Brain Team</a>. <a href="http://vdumoulin.github.io/about">Vincent</a> is currently a Ph.D. candidate at MILA, Universit&#233; de Montr&#233;al.<a href="http://research.googleblog.com/#top1"><sup>&#8617;</sup></a><br /></span><br /><br /><br />]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Vincent Dumoulin<a href="http://research.googleblog.com/2016/10/supercharging-style-transfer.html#1" name="top1"><sup>*</sup></a>, Jonathon Shlens and Manjunath Kudlur, Google Brain Team</span><br /><br /><i><a href="https://en.wikipedia.org/wiki/Pastiche">Pastiche</a></i>. A French word, it designates a work of art that imitates the style of another one (not to be confused with its more humorous Greek cousin, <i>parody</i>). Although it has been used for a long time in visual art, music and literature, pastiche has been getting mass attention lately with <a href="https://www.reddit.com/r/deepstyle">online forums</a> dedicated to images that have been modified to be in the style of famous paintings. Using a technique known as <i>style transfer</i>, these images are generated by phone or web apps that allow a user to render their favorite picture in the style of a well known work of art.<br /><br />Although users have already produced gorgeous pastiches using the current technology, we feel that it could be made even more engaging. Right now, each painting is its own island, so to speak: the user provides a content image, selects an artistic style and gets a pastiche back.  But what if one could combine many different styles, exploring unique mixtures of well known artists to create an entirely unique pastiche?<br /><br /><b>Learning a representation for artistic style</b><br /><br />In our recent paper titled “<i><a href="https://arxiv.org/abs/1610.07629">A Learned Representation for Artistic Style</a></i>”, we introduce a simple method to allow a single deep convolutional style transfer network to learn multiple styles at the same time. The network, having learned multiple styles, is able to do <i>style interpolation</i>, where the pastiche varies smoothly from one style to another. 
Our method enables style interpolation in real-time as well, allowing this to be applied not only to static images, but also videos.<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/6ZHiARZmiUI/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/6ZHiARZmiUI?rel=0&amp;feature=player_embedded" width="640"></iframe></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td class="tr-caption" style="text-align: center;">Credit: awesome dog role played by Google Brain team office dog Picabo.</td></tr></tbody></table>In the video above, multiple styles are combined in real-time and the resulting style is applied <i>using a single style transfer network</i>. The user is provided with a set of 13 different painting styles and adjusts their relative strengths in the final style via sliders. In this demonstration, the user is an active participant in producing the pastiche.<br /><br /><b>A Quick History of Style Transfer</b><br /><br />While transferring the style of one image to another has existed for nearly 15 years [1] [2], leveraging neural networks to accomplish it is both very recent and very fascinating. In “<i><a href="https://arxiv.org/abs/1508.06576">A Neural Algorithm of Artistic Style</a></i>” [3], researchers Gatys, Ecker &amp; Bethge introduced a method that uses deep convolutional neural network (CNN) classifiers. The pastiche image is found via optimization: the algorithm looks for an image which elicits the same kind of activations in the CNN’s lower layers - which capture the overall rough aesthetic of the style input (broad brushstrokes, cubist patterns, etc.) 
- yet produces activations in the higher layers - which capture the things that make the subject recognizable - that are close to those produced by the content image. From some starting point (e.g. random noise, or the content image itself), the pastiche image is progressively refined until these requirements are met.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-kV4SKTFlWQk/WA6n82yFFJI/AAAAAAAABWY/9GcePSQZ7qcY95b7zVnCBR4ABWR7K2o4gCLcB/s1600/image04.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="190" src="https://2.bp.blogspot.com/-kV4SKTFlWQk/WA6n82yFFJI/AAAAAAAABWY/9GcePSQZ7qcY95b7zVnCBR4ABWR7K2o4gCLcB/s640/image04.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Content image: The <a href="https://commons.wikimedia.org/wiki/File:Tuebingen_Neckarfront.jpg">Tübingen Neckarfront by Andreas Praefcke</a>, Style painting: “<a href="https://www.google.com/culturalinstitute/beta/u/0/asset/head-of-a-clown/CQHMqKf7DRo76w">Head of a Clown</a>”, by <a href="https://en.wikipedia.org/wiki/Georges_Rouault">Georges Rouault</a>.</td></tr></tbody></table>The pastiches produced via this algorithm look <i>spectacular</i>:<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-jYGbp0Ow1Cc/WA6oWw63F7I/AAAAAAAABWc/8_E5A1dbPP4xeo1GuIGTsYvG6TuXIfmoQCLcB/s1600/image06.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="478" src="https://3.bp.blogspot.com/-jYGbp0Ow1Cc/WA6oWw63F7I/AAAAAAAABWc/8_E5A1dbPP4xeo1GuIGTsYvG6TuXIfmoQCLcB/s640/image06.png" width="640" /></a></td></tr><tr><td class="tr-caption" 
style="text-align: center;">Figure adapted from L. Gatys et al. "<a href="https://arxiv.org/abs/1508.06576">A Neural Algorithm of Artistic Style</a>" (2015).&nbsp;</td></tr></tbody></table>This work is considered a breakthrough in the field of deep learning research because it provided the first proof of concept for neural network-based style transfer. Unfortunately this method for stylizing an individual image is computationally demanding. For instance, in the first demos available on the web, one would upload a photo to a server, and then still have plenty of time to go grab a cup of coffee before a result was available.<br /><br />This process was sped up significantly by subsequent research [4, 5] that recognized that this optimization problem may be recast as an image transformation problem, where one wishes to apply a single, fixed painting style to an arbitrary content image (e.g. a photograph). The problem can then be solved by teaching a feed-forward, deep convolutional neural network to alter a corpus of content images to match the style of a painting. The goal of the trained network is two-fold: maintain the content of the original image while matching the visual style of the painting.<br /><br />The end result of this was that what once took a few minutes for a single static image could now be run in real time (e.g. applying style transfer to a live video). However, the increase in speed that allowed real-time style transfer came with a cost - a given style transfer network is tied to the style of a <i>single</i> painting, losing some flexibility of the original algorithm, which was not tied to any one style. 
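The Gatys et al. optimization described above minimizes a weighted sum of a content loss (matching the content image's higher-layer activations) and a style loss (matching the Gram matrices of the style image's lower layers). A toy NumPy sketch, using random arrays as stand-ins for real CNN feature maps:

```python
import numpy as np

def gram(f):
    """Gram matrix of a (channels, positions) feature map: the
    channel-to-channel correlations that summarize a layer's 'style'."""
    c, n = f.shape
    return f @ f.T / n

def pastiche_loss(pastiche, content, style, alpha=1.0, beta=1e-3):
    """Combined loss in the spirit of Gatys et al.: match the content
    image's higher-layer activations and the style image's Gram
    matrices in the lower layers. Weights alpha/beta are illustrative."""
    content_loss = np.mean((pastiche["high"] - content["high"]) ** 2)
    style_loss = sum(np.mean((gram(pastiche[l]) - gram(style[l])) ** 2)
                     for l in ("low", "mid"))
    return alpha * content_loss + beta * style_loss

# Random stand-ins for CNN activations; in the real algorithm the
# pastiche image itself is iteratively refined to drive this loss down.
rng = np.random.default_rng(0)
make = lambda: {l: rng.standard_normal((8, 64)) for l in ("low", "mid", "high")}
pastiche, content, style = make(), make(), make()
loss = pastiche_loss(pastiche, content, style)
```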
This means that to build a style transfer system capable of modeling 100 paintings, one has to train and store <i>100 separate style transfer networks</i>.<br /><br /><b>Our Contribution: Learning and Combining Multiple Styles</b><br /><br />We started from the observation that many artists from the impressionist period employ similar brush stroke techniques and color palettes. Furthermore, paintings by, say, Monet are even more visually similar.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-Y0f_lRc2kx4/WA6pgGGr32I/AAAAAAAABWo/EQZ35CY_mUkC2YSrlumcD5rb3xFKRTqPACLcB/s1600/img3.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="254" src="https://1.bp.blogspot.com/-Y0f_lRc2kx4/WA6pgGGr32I/AAAAAAAABWo/EQZ35CY_mUkC2YSrlumcD5rb3xFKRTqPACLcB/s640/img3.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><a href="https://commons.wikimedia.org/wiki/File:Claude_Monet_037.jpg">Poppy Field</a> (left) and <a href="https://en.wikipedia.org/wiki/Impression,_Sunrise">Impression, Sunrise</a> (right) by <a href="https://en.wikipedia.org/wiki/Claude_Monet">Claude Monet</a>. Images from Wikipedia</td></tr></tbody></table>We leveraged this observation in our training of a machine learning system. That is, we trained a single system that is able to capture and generalize across many Monet paintings or even a diverse array of artists across genres. 
The pastiches produced are qualitatively comparable to those produced in previous work, while originating from the <i>same style transfer network</i>.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-73F44Q3CaOE/WA6uu5YKZSI/AAAAAAAABXE/XDcKcn6Wa98cWCJI3-LuniItgG7oUNxIwCLcB/s1600/image03.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="424" src="https://1.bp.blogspot.com/-73F44Q3CaOE/WA6uu5YKZSI/AAAAAAAABXE/XDcKcn6Wa98cWCJI3-LuniItgG7oUNxIwCLcB/s640/image03.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Pastiches produced by our single network, trained on 32 varied styles. These pastiches are qualitatively equivalent to those created by single-style networks: Image Credit: (from top to bottom) content photographs by <a href="https://commons.wikimedia.org/wiki/File:Tuebingen_Neckarfront.jpg">Andreas Praefcke</a>, <a href="https://en.wikipedia.org/wiki/File:GoldenGateBridge-001.jpg#file">Rich Niewiroski Jr.</a> and <a href="https://commons.wikimedia.org/wiki/File:Schultenhof_Mettingen_Bauerngarten_8.jpg">J.-H. Janßen</a>, (from left to right) style paintings by <a href="https://en.wikipedia.org/wiki/William_Glackens">William Glackens</a>, <a href="https://en.wikipedia.org/wiki/Paul_Signac">Paul Signac</a>, <a href="https://en.wikipedia.org/wiki/Georges_Rouault">Georges Rouault</a>, <a href="https://en.wikipedia.org/wiki/Edvard_Munch">Edvard Munch</a> and <a href="https://en.wikipedia.org/wiki/Vincent_van_Gogh">Vincent van Gogh</a>.</td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"></div>The technique we developed is simple to implement and is not memory intensive. 
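One way to see why a multi-style network can stay memory-light: each style only needs its own small set of scale-and-shift parameters applied after normalization, while all convolutional weights are shared, and blending styles amounts to blending those parameters. A toy NumPy sketch of this idea (shapes, slider weights, and style names are illustrative, borrowed from the artists pictured above):

```python
import numpy as np

def conditional_instance_norm(x, gamma, beta, eps=1e-5):
    """Normalize each channel of a (C, H, W) feature map, then apply a
    style-specific scale (gamma) and shift (beta)."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return gamma[:, None, None] * (x - mean) / np.sqrt(var + eps) + beta[:, None, None]

# Each learned style owns only its own gamma/beta vectors, so adding
# styles costs little extra memory. User-facing sliders pick the blend.
rng = np.random.default_rng(1)
styles = {s: (rng.standard_normal(16), rng.standard_normal(16))
          for s in ("glackens", "signac", "rouault", "munch")}
sliders = {"glackens": 0.4, "signac": 0.3, "rouault": 0.2, "munch": 0.1}

gamma = sum(w * styles[s][0] for s, w in sliders.items())
beta = sum(w * styles[s][1] for s, w in sliders.items())

features = rng.standard_normal((16, 32, 32))  # stand-in activations
stylized = conditional_instance_norm(features, gamma, beta)
print(stylized.shape)
```

Because only the normalization parameters change per style, interpolating them smoothly moves the pastiche from one style to another in real time.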
Furthermore, our network, trained on several artistic styles, permits arbitrarily combining multiple painting styles <b>in real-time</b>, as shown in the video above. Here are four styles being combined in different proportions on a photograph of <a href="https://en.wikipedia.org/wiki/T%C3%BCbingen">Tübingen</a>:<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-8xKOzpnsDCs/WA6slX7skUI/AAAAAAAABW4/iZ52N19hNSIpUgAQDXWnGaKt5FD4yGGtQCLcB/s1600/image05.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="456" src="https://2.bp.blogspot.com/-8xKOzpnsDCs/WA6slX7skUI/AAAAAAAABW4/iZ52N19hNSIpUgAQDXWnGaKt5FD4yGGtQCLcB/s640/image05.jpg" width="640" /></a></div>Unlike previous approaches to fast style transfer, we feel that this method of modeling multiple styles at the same time opens the door to exciting new ways for users to interact with style transfer algorithms, not only allowing the freedom to create new styles based on the mixture of several others, but to do it in real-time. Stay tuned for a future post on the <a href="https://magenta.tensorflow.org/2016/11/01/multistyle-pastiche-generator/">Magenta blog</a>, in which we will describe the algorithm in more detail and release the <a href="https://www.tensorflow.org/">TensorFlow</a> source code to run this model and demo yourself. We also recommend that you check out <a href="https://www.youtube.com/watch?v=WHmp26bh0tI">Nat &amp; Lo’s fantastic video explanation</a> on the subject of style transfer.<br /><br /><b>References</b><br /><b><br /></b> [1] Efros, Alexei A., and William T. Freeman. <i><a href="https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/papers/efros-siggraph01.pdf">Image quilting for texture synthesis and transfer</a> </i>(2001).<br /><br />[2] Hertzmann, Aaron, Charles E. Jacobs, Nuria Oliver, Brian Curless, and David H. Salesin. 
<a href="http://mrl.nyu.edu/publications/image-analogies/"><i>Image analogies </i></a>(2001).<br /><br />[3] Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. <a href="https://arxiv.org/abs/1508.06576"><i>A Neural Algorithm of Artistic Style</i></a> (2015).<br /><br />[4] Ulyanov, Dmitry, Vadim Lebedev, Andrea Vedaldi, and Victor Lempitsky. <a href="https://arxiv.org/abs/1603.03417">Texture Networks: Feed-forward Synthesis of Textures and Stylized Images</a> (2016).<br /><br />[5] Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. <a href="https://arxiv.org/abs/1603.08155">Perceptual Losses for Real-Time Style Transfer and Super-Resolution</a> (2016).<br /><br /><span class="Apple-style-span" style="font-size: small;"><br /><a name="1"><b>* </b></a>This work was done during an internship with the <a href="http://g.co/brain">Google Brain Team</a>. <a href="http://vdumoulin.github.io/about">Vincent</a> is currently a Ph.D. candidate at MILA, Université de Montréal.<a href="http://research.googleblog.com/2016/10/supercharging-style-transfer.html#top1"><sup>↩</sup></a><br /></span><br /><br /><br />]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/supercharging-style-transfer/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Course Builder now supports scheduling, easier customization and more</title>
		<link>https://googledata.org/google-research/course-builder-now-supports-scheduling-easier-customization-and-more/</link>
		<comments>https://googledata.org/google-research/course-builder-now-supports-scheduling-easier-customization-and-more/#comments</comments>
		<pubDate>Mon, 17 Oct 2016 17:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[education]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=d1750c60a82838d4794650ae598fa1e4</guid>
		<description><![CDATA[<span>Posted by Adam Feldman, Product Manager and Pavel Simakov, Technical Lead, Course Builder Team</span><br /><br />Over the years, we've learned that there are as many ways to run an online course as there are instructors to run them. Today's release of <a href="https://www.google.com/edu/openonline/course-builder/downloads/index.html">Course Builder v1.11</a> has a focus on improved student access controls, easier visual customization and a new course explorer. Additionally, we've added better support for <a href="https://www.google.com/edu/openonline/course-builder/docs/1.11/set-up-course-builder/deploy-your-app.html">deploying from Windows</a>!<br /><br /><b><u>Improved student access controls</u></b><br />A course's availability is often dynamic - sometimes you want to make a course available to everyone all at once, while other times may call for the course to be available to some students before others. Perhaps registration will be available for a while and then the course later becomes read-only. To support these use cases, we've added Student Groups and Calendar Triggers.<br /><br /><ul><li><b>Student Groups</b> allow you to define which students can see which parts of a course.  Want your morning class to see unit 5 and your afternoon class to see unit 6 -- while letting random Internet visitors only see unit 1?  <a href="https://www.google.com/edu/openonline/course-builder/docs/1.11/prepare-for-students/student-groups.html">Student groups</a> have you covered.</li><li><b>Calendar Triggers</b> can be used to update course or content availability automatically at a specific time.  For instance, if your course goes live at midnight on Sunday night, you don't need to be at a computer to make it happen.  Or, if you want to unlock a new unit every week, you can set up a trigger to automate the process.  
Read more about <a href="https://www.google.com/edu/openonline/course-builder/docs/1.11/publish-a-course/availability.html#calendar-triggers">calendar triggers and availability</a>.</li></ul><br />You can even use these features together.  Say you want to start a new group of students through the course every month, giving each access to one new unit per week.  Using Student Groups and Calendar Triggers together, you can achieve this cohort-like functionality.<br /><br /><b><u>Easier visual customization</u></b><br />In the past, if you wanted to customize Course Builder's student experience beyond a certain point, you needed to be a Python developer.  We heard from many web developers that they would like to be able to create their own student-facing pages, too.  With this release, Course Builder includes a <a href="https://www.google.com/edu/openonline/course-builder/docs/1.11/for-course-builder-developers/specific-sub-tasks/render-with-graphql.html">GraphQL server</a> that allows you to create your own frontend experience, while still letting Course Builder take care of things like user sessions and statefulness.<br /><br /><b><u>New course explorer</u></b><br />Large Course Builder partners such as Google's <a href="https://digitalworkshop-eu.withgoogle.com/">Digital Workshop</a> and <a href="https://onlinecourses.nptel.ac.in/explorer">NPTEL</a> have many courses and students with diverse needs.  To help them, we've completely revamped the Course Explorer page, giving it richer information and interactivity, so your students can find which of your courses they're looking for.  
You can provide categories and start/end dates, in addition to the course title, abstract and instructor information.<br /><div><a href="https://3.bp.blogspot.com/-5VmQAmo2uJc/WAT3LuYI5TI/AAAAAAAABV4/VtGeJTzVYMUCoTbPVrwJWnH0okksZHzUgCLcB/s1600/image00.png"><img border="0" height="402" src="https://3.bp.blogspot.com/-5VmQAmo2uJc/WAT3LuYI5TI/AAAAAAAABV4/VtGeJTzVYMUCoTbPVrwJWnH0okksZHzUgCLcB/s640/image00.png" width="640"></a></div>In v1.11, we've added several new highly requested features. Together, they help make Course Builder easier to use and customize, giving you the flexibility to schedule things in advance.<br /><br />We've come a long way since <a href="https://opensource.googleblog.com/2012/09/helping-world-to-teach.html">releasing our first experimental code</a> over 4 years ago, turning Course Builder into a large open-source Google App Engine application with over 5 million student registrations across all Course Builder users. With these latest additions, we consider Course Builder feature complete and fully capable of delivering online learning at any scale. We will continue to provide support and bug fixes for those using the platform.<br /><br />We hope you&#8217;ll enjoy these new features and share how you&#8217;re using them in the <a href="https://groups.google.com/forum/?fromgroups#!forum/course-builder-forum">forum</a>.  Keep on learning!]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Adam Feldman, Product Manager and Pavel Simakov, Technical Lead, Course Builder Team</span><br /><br />Over the years, we've learned that there are as many ways to run an online course as there are instructors to run them. Today's release of <a href="https://www.google.com/edu/openonline/course-builder/downloads/index.html">Course Builder v1.11</a> has a focus on improved student access controls, easier visual customization and a new course explorer. Additionally, we've added better support for <a href="https://www.google.com/edu/openonline/course-builder/docs/1.11/set-up-course-builder/deploy-your-app.html">deploying from Windows</a>!<br /><br /><b><u>Improved student access controls</u></b><br />A course's availability is often dynamic - sometimes you want to make a course available to everyone all at once, while other times may call for the course to be available to some students before others. Perhaps registration will be available for a while and then the course later becomes read-only. To support these use cases, we've added Student Groups and Calendar Triggers.<br /><br /><ul><li><b>Student Groups</b> allow you to define which students can see which parts of a course.  Want your morning class to see unit 5 and your afternoon class to see unit 6 -- while letting random Internet visitors only see unit 1?  <a href="https://www.google.com/edu/openonline/course-builder/docs/1.11/prepare-for-students/student-groups.html">Student groups</a> have you covered.</li><li><b>Calendar Triggers</b> can be used to update course or content availability automatically at a specific time.  For instance, if your course goes live at midnight on Sunday night, you don't need to be at a computer to make it happen.  Or, if you want to unlock a new unit every week, you can set up a trigger to automate the process.  
Read more about <a href="https://www.google.com/edu/openonline/course-builder/docs/1.11/publish-a-course/availability.html#calendar-triggers">calendar triggers and availability</a>.</li></ul><br />You can even use these features together.  Say you want to start a new group of students through the course every month, giving each access to one new unit per week.  Using Student Groups and Calendar Triggers together, you can achieve this cohort-like functionality.<br /><br /><b><u>Easier visual customization</u></b><br />In the past, if you wanted to customize Course Builder's student experience beyond a certain point, you needed to be a Python developer.  We heard from many web developers that they would like to be able to create their own student-facing pages, too.  With this release, Course Builder includes a <a href="https://www.google.com/edu/openonline/course-builder/docs/1.11/for-course-builder-developers/specific-sub-tasks/render-with-graphql.html">GraphQL server</a> that allows you to create your own frontend experience, while still letting Course Builder take care of things like user sessions and statefulness.<br /><br /><b><u>New course explorer</u></b><br />Large Course Builder partners such as Google's <a href="https://digitalworkshop-eu.withgoogle.com/">Digital Workshop</a> and <a href="https://onlinecourses.nptel.ac.in/explorer">NPTEL</a> have many courses and students with diverse needs.  To help them, we've completely revamped the Course Explorer page, giving it richer information and interactivity, so your students can find which of your courses they're looking for.  
You can provide categories and start/end dates, in addition to the course title, abstract and instructor information.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-5VmQAmo2uJc/WAT3LuYI5TI/AAAAAAAABV4/VtGeJTzVYMUCoTbPVrwJWnH0okksZHzUgCLcB/s1600/image00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="402" src="https://3.bp.blogspot.com/-5VmQAmo2uJc/WAT3LuYI5TI/AAAAAAAABV4/VtGeJTzVYMUCoTbPVrwJWnH0okksZHzUgCLcB/s640/image00.png" width="640" /></a></div>In v1.11, we've added several new highly requested features. Together, they help make Course Builder easier to use and customize, giving you the flexibility to schedule things in advance.<br /><br />We've come a long way since <a href="https://opensource.googleblog.com/2012/09/helping-world-to-teach.html">releasing our first experimental code</a> over 4 years ago, turning Course Builder into a large open-source Google App Engine application with over 5 million student registrations across all Course Builder users. With these latest additions, we consider Course Builder feature complete and fully capable of delivering online learning at any scale. We will continue to provide support and bug fixes for those using the platform.<br /><br />We hope you’ll enjoy these new features and share how you’re using them in the <a href="https://groups.google.com/forum/?fromgroups#!forum/course-builder-forum">forum</a>.  Keep on learning!]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/course-builder-now-supports-scheduling-easier-customization-and-more/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Equality of Opportunity in Machine Learning</title>
		<link>https://googledata.org/google-research/equality-of-opportunity-in-machine-learning/</link>
		<comments>https://googledata.org/google-research/equality-of-opportunity-in-machine-learning/#comments</comments>
		<pubDate>Fri, 07 Oct 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=135fbfdb8d67935c7917186a2a88a1fd</guid>
<description><![CDATA[<span>Posted by Moritz Hardt, Research Scientist, Google Brain Team</span><br /><br />As machine learning technology progresses rapidly, there is much interest in understanding its societal impact. A particularly successful branch of machine learning is <i><a href="https://en.wikipedia.org/wiki/Supervised_learning">supervised learning</a></i>. With enough past data and computational resources, learning algorithms often produce surprisingly effective predictors of future events. To take one hypothetical example: an algorithm could be used to predict with high accuracy who will pay back their loan. Lenders might then use such a predictor as an aid in deciding who should receive a loan in the first place. Decisions based on machine learning can be both incredibly useful and have a profound impact on our lives.<br /><br />Even the best predictors make mistakes. Although machine learning aims to minimize the chance of a mistake, how do we prevent certain groups from experiencing a disproportionate share of these mistakes? Consider the case of a group that we have relatively little data on and whose characteristics differ from those of the general population in ways that are relevant to the prediction task. As prediction accuracy is generally correlated with the amount of data available for training, it is likely that incorrect predictions will be more common in this group. A predictor might, for example, end up flagging too many individuals in this group as &#8216;high risk of default&#8217; even though they pay back their loan. When group membership coincides with a sensitive attribute, such as race, gender, disability, or religion, this situation can lead to unjust or prejudicial outcomes.<br /><br />Despite the need, a vetted methodology in machine learning for preventing this kind of discrimination based on sensitive attributes has been lacking. 
A naive approach might require a set of sensitive attributes to be removed from the data before doing anything else with it. This idea of &#8220;fairness through unawareness,&#8221; however, fails due to the existence of &#8220;redundant encodings.&#8221; Even if a particular attribute is not present in the data, combinations of other attributes can act as a proxy.<br /><br />Another common approach, called <i>demographic parity</i>, asks that the prediction be uncorrelated with the sensitive attribute. This might sound intuitively desirable, but the outcome itself is often correlated with the sensitive attribute. For example, the incidence of heart failure is substantially more common in men than in women. When predicting such a medical condition, it is therefore neither realistic nor desirable to prevent all correlation between the predicted outcome and group membership.<br /><br /><b>Equal Opportunity</b><br /><b><br /></b> Taking these conceptual difficulties into account, we&#8217;ve proposed a methodology for measuring and preventing discrimination based on a set of sensitive attributes. Our framework not only helps to scrutinize predictors for possible concerns, but also shows how to adjust a given predictor so as to strike a better tradeoff between classification accuracy and non-discrimination if need be.<br /><br />At the heart of our approach is the idea that individuals who qualify for a desirable outcome should have an equal chance of being correctly classified for this outcome.  In our fictional loan example, it means the rate of &#8216;low risk&#8217; predictions among people who actually pay back their loan should not depend on a sensitive attribute like race or gender. 
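As a rough sketch, this group-conditional rate (the fraction of people who actually pay back their loan and are predicted &#8216;low risk&#8217;, computed per group) can be measured directly; the toy data and variable names below are invented for illustration and are not from the paper:

```python
def true_positive_rates(y_true, y_pred, group):
    """P(predicted 'low risk' | actually pays back, group = g) for each group g.

    Equality of opportunity asks these per-group rates to be (roughly) equal.
    """
    rates = {}
    for g in set(group):
        # qualified individuals in group g: those who actually pay back
        qualified = [p for t, p, gg in zip(y_true, y_pred, group)
                     if gg == g and t == 1]
        rates[g] = sum(qualified) / len(qualified)
    return rates

# toy data: 1 = pays back (y_true) / predicted 'low risk' (y_pred)
y_true = [1, 1, 1, 1, 0, 0, 1, 1]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]
group  = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(true_positive_rates(y_true, y_pred, group))  # group "a": 0.75, group "b": 0.5
```

A gap between the two rates, as in this toy data, is exactly the kind of disparity the framework flags and then corrects by adjusting the predictor.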
We call this principle <i>equality of opportunity</i> in supervised learning.<br /><br />When implemented, our framework also improves incentives by shifting the cost of poor predictions from the individual to the decision maker, who can respond by investing in improved prediction accuracy. Perfect predictors always satisfy our notion, showing that the central goal of building more accurate predictors is well aligned with the goal of avoiding discrimination.<br /><br /><b>Learn more</b><br /><br />To explore the ideas in this blog post on your own, our <a href="https://research.google.com/bigpicture/">Big Picture team</a> created a beautiful <a href="http://research.google.com/bigpicture/attacking-discrimination-in-ml">interactive visualization</a> of the different concepts and tradeoffs. So, head on over to their page to learn more. <br /><br />Once you&#8217;ve walked through the demo, please check out the <a href="https://arxiv.org/abs/1610.02413">full version of our paper</a>, a joint work with Eric Price (UT Austin) and Nati Srebro (TTI Chicago). We&#8217;ll present the paper at this year&#8217;s Conference on Neural Information Processing Systems (<a href="https://nips.cc/">NIPS</a>) in Barcelona. So, if you&#8217;re around, be sure to stop by and chat with one of us.<br /><br />Our paper is by no means the final word on this important and complex topic. It joins an ongoing, multidisciplinary research conversation. We hope to inspire future research that will sharpen the discussion of the different achievable tradeoffs surrounding discrimination and machine learning, as well as the development of tools that will help practitioners address these challenges.]]></description>
<content:encoded><![CDATA[<span class="byline-author">Posted by Moritz Hardt, Research Scientist, Google Brain Team</span><br /><br />As machine learning technology progresses rapidly, there is much interest in understanding its societal impact. A particularly successful branch of machine learning is <i><a href="https://en.wikipedia.org/wiki/Supervised_learning">supervised learning</a></i>. With enough past data and computational resources, learning algorithms often produce surprisingly effective predictors of future events. To take one hypothetical example: an algorithm could be used to predict with high accuracy who will pay back their loan. Lenders might then use such a predictor as an aid in deciding who should receive a loan in the first place. Decisions based on machine learning can be both incredibly useful and have a profound impact on our lives.<br /><br />Even the best predictors make mistakes. Although machine learning aims to minimize the chance of a mistake, how do we prevent certain groups from experiencing a disproportionate share of these mistakes? Consider the case of a group that we have relatively little data on and whose characteristics differ from those of the general population in ways that are relevant to the prediction task. As prediction accuracy is generally correlated with the amount of data available for training, it is likely that incorrect predictions will be more common in this group. A predictor might, for example, end up flagging too many individuals in this group as ‘high risk of default’ even though they pay back their loan. When group membership coincides with a sensitive attribute, such as race, gender, disability, or religion, this situation can lead to unjust or prejudicial outcomes.<br /><br />Despite the need, a vetted methodology in machine learning for preventing this kind of discrimination based on sensitive attributes has been lacking. 
A naive approach might require a set of sensitive attributes to be removed from the data before doing anything else with it. This idea of “fairness through unawareness,” however, fails due to the existence of “redundant encodings.” Even if a particular attribute is not present in the data, combinations of other attributes can act as a proxy.<br /><br />Another common approach, called <i>demographic parity</i>, asks that the prediction be uncorrelated with the sensitive attribute. This might sound intuitively desirable, but the outcome itself is often correlated with the sensitive attribute. For example, the incidence of heart failure is substantially more common in men than in women. When predicting such a medical condition, it is therefore neither realistic nor desirable to prevent all correlation between the predicted outcome and group membership.<br /><br /><b>Equal Opportunity</b><br /><b><br /></b> Taking these conceptual difficulties into account, we’ve proposed a methodology for measuring and preventing discrimination based on a set of sensitive attributes. Our framework not only helps to scrutinize predictors for possible concerns, but also shows how to adjust a given predictor so as to strike a better tradeoff between classification accuracy and non-discrimination if need be.<br /><br />At the heart of our approach is the idea that individuals who qualify for a desirable outcome should have an equal chance of being correctly classified for this outcome.  In our fictional loan example, it means the rate of ‘low risk’ predictions among people who actually pay back their loan should not depend on a sensitive attribute like race or gender. We call this principle <i>equality of opportunity</i> in supervised learning.<br /><br />When implemented, our framework also improves incentives by shifting the cost of poor predictions from the individual to the decision maker, who can respond by investing in improved prediction accuracy. 
Perfect predictors always satisfy our notion, showing that the central goal of building more accurate predictors is well aligned with the goal of avoiding discrimination.<br /><br /><b>Learn more</b><br /><br />To explore the ideas in this blog post on your own, our <a href="https://research.google.com/bigpicture/">Big Picture team</a> created a beautiful <a href="http://research.google.com/bigpicture/attacking-discrimination-in-ml">interactive visualization</a> of the different concepts and tradeoffs. So, head on over to their page to learn more. <br /><br />Once you’ve walked through the demo, please check out the <a href="https://arxiv.org/abs/1610.02413">full version of our paper</a>, a joint work with Eric Price (UT Austin) and Nati Srebro (TTI Chicago). We’ll present the paper at this year’s Conference on Neural Information Processing Systems (<a href="https://nips.cc/">NIPS</a>) in Barcelona. So, if you’re around, be sure to stop by and chat with one of us.<br /><br />Our paper is by no means the final word on this important and complex topic. It joins an ongoing, multidisciplinary research conversation. We hope to inspire future research that will sharpen the discussion of the different achievable tradeoffs surrounding discrimination and machine learning, as well as the development of tools that will help practitioners address these challenges. ]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/equality-of-opportunity-in-machine-learning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Graph-powered Machine Learning at Google</title>
		<link>https://googledata.org/google-research/graph-powered-machine-learning-at-google/</link>
		<comments>https://googledata.org/google-research/graph-powered-machine-learning-at-google/#comments</comments>
		<pubDate>Thu, 06 Oct 2016 17:03:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=340247eef62a63fc3a7c10989ad6d71a</guid>
<description><![CDATA[<span>Posted by Sujith Ravi, Staff Research Scientist, Google Research</span><br /><br />Recently, there have been significant advances in <a href="https://en.wikipedia.org/wiki/Machine_learning">Machine Learning</a> that enable computer systems to solve complex real-world problems. One of those advances is Google&#8217;s large-scale, <a href="https://en.wikipedia.org/wiki/Graph_theory">graph-based</a> machine learning platform, built by the Expander team in Google Research.  A technology that is behind many of the Google products and features you may use every day, graph-based machine learning is a powerful tool that can be used to power useful features such as <a href="https://gmail.googleblog.com/2014/12/reminders-in-inbox-your-to-dos-on-your.html">reminders in Inbox</a> and <a href="https://googleblog.blogspot.com/2016/09/google-allo-smarter-messaging-app.html">smart messaging in Allo</a>, or used in conjunction with deep neural networks to power the latest image recognition system in <a href="https://googleblog.blogspot.com/2016/05/google-photos-one-year-200-million.html">Google Photos</a>. <br /><div><a href="https://3.bp.blogspot.com/-c8w7GPULXBU/V_aDexkRXII/AAAAAAAABVg/33cVb3rPZ_UgETMTBgi3WP8ZRfdksJ61QCLcB/s1600/image01.png"><img border="0" height="378" src="https://3.bp.blogspot.com/-c8w7GPULXBU/V_aDexkRXII/AAAAAAAABVg/33cVb3rPZ_UgETMTBgi3WP8ZRfdksJ61QCLcB/s640/image01.png" width="640" /></a></div><b>Learning with Minimal Supervision</b><br /><br />Much of the recent success in <a href="https://en.wikipedia.org/wiki/Deep_learning">deep learning</a>, and machine learning in general, can be attributed to models that demonstrate high predictive capacity when trained on large amounts of labeled data -- often millions of training examples. 
This is commonly referred to as &#8220;<a href="https://en.wikipedia.org/wiki/Supervised_learning">supervised learning</a>&#8221; since it requires supervision, in the form of labeled data, to train the machine learning systems. (Conversely, some machine learning methods operate directly on raw data without any supervision, a paradigm referred to as <a href="https://en.wikipedia.org/wiki/Unsupervised_learning">unsupervised learning</a>.)<br /><br />However, the more difficult the task, the harder it is to get sufficient high-quality labeled data. It is often prohibitively labor intensive and time-consuming to collect labeled data for every new problem. This motivated the Expander research team to build new technology for powering machine learning applications at scale and with minimal supervision.  <br /><br />Expander&#8217;s technology draws inspiration from how humans learn to generalize and bridge the gap between what they already know (labeled information) and novel, unfamiliar observations (unlabeled information). Known as &#8220;<a href="https://en.wikipedia.org/wiki/Semi-supervised_learning">semi-supervised</a>&#8221; learning, this powerful technique enables us to build systems that can work in situations where training data may be sparse. One of the key advantages to a graph-based semi-supervised machine learning approach is the fact that (a) one models labeled and unlabeled data <i>jointly</i> during learning, leveraging the underlying structure in the data, (b) one can easily combine multiple types of signals (for example, relational information from <a href="https://en.wikipedia.org/wiki/Knowledge_Graph">Knowledge Graph</a> along with raw features) into a single graph representation and learn over them. 
This is in contrast to other machine learning approaches, such as neural network methods, in which it is typical to <i>first</i> train a system using labeled data with features and <i>then</i> apply the trained system to unlabeled data.<br /><br /><b>Graph Learning: How It Works</b><br /><br />At its core, Expander&#8217;s platform combines semi-supervised machine learning with large-scale graph-based learning by building a multi-graph representation of the data with nodes corresponding to objects or concepts and edges connecting concepts that share similarities. The graph typically contains both labeled data (nodes associated with a known output category or label) and unlabeled data (nodes for which no labels were provided). Expander&#8217;s framework then performs semi-supervised learning to label all nodes jointly by propagating label information across the graph. <br /><br />However, this is easier said than done!  We have to (1) learn efficiently at scale with minimal supervision (i.e., tiny amount of labeled data), (2) operate over multi-modal data (i.e., heterogeneous representations and various sources of data), and (3) solve challenging prediction tasks (i.e., large, complex output spaces) involving high dimensional data that might be noisy.<br /><br />One of the primary ingredients in the entire learning process is the graph and choice of connections. Graphs come in all sizes, shapes and can be combined from multiple sources.  We have observed that it is often beneficial to learn over multi-graphs that combine information from multiple types of data representations (e.g., image pixels, object categories and chat response messages for <a href="https://research.googleblog.com/2016/05/aw-so-cute-allo-helps-you-respond-to.html">PhotoReply in Allo</a>). The Expander team&#8217;s graph learning platform automatically generates graphs directly from data based on the inferred or known relationships between data elements.  
The data can be structured (for example, <a href="https://en.wikipedia.org/wiki/Knowledge_Graph">relational data</a>) or unstructured (for example, <a href="https://en.wikipedia.org/wiki/Sparse_array">sparse</a> or dense feature representations extracted from raw data). <br /><br />To understand how Expander&#8217;s system learns, let us consider an example graph shown below. <br /><div><a href="https://3.bp.blogspot.com/-ZYBHfGDkDP8/V_aDXU1ewxI/AAAAAAAABVc/QPNCCcXi3w0MXOsVVGa9J_07mj0BFBVpQCLcB/s1600/image03.png"><img border="0" height="446" src="https://3.bp.blogspot.com/-ZYBHfGDkDP8/V_aDXU1ewxI/AAAAAAAABVc/QPNCCcXi3w0MXOsVVGa9J_07mj0BFBVpQCLcB/s640/image03.png" width="640" /></a></div><br />There are two types of nodes in the graph: &#8220;grey&#8221; represents unlabeled data whereas the colored nodes represent labeled data. Relationships between node data are represented via edges, and the thickness of each edge indicates the strength of the connection. We can formulate the semi-supervised learning problem on this toy graph as follows: <i>predict a color (&#8220;red&#8221; or &#8220;blue&#8221;) for every node in the graph</i>. Note that the specific choice of graph structure and colors depends on the task. For example, as shown in <a href="https://arxiv.org/abs/1606.04870">this research paper</a> we recently published, a graph that we built for the <a href="https://research.googleblog.com/2015/11/computer-respond-to-this-email.html">Smart Reply feature in Inbox</a> represents email messages as nodes and colors indicate semantic categories of user responses (e.g., &#8220;yes&#8221;, &#8220;awesome&#8221;, &#8220;funny&#8221;).<br /><br />The Expander graph learning framework solves this labeling task by treating it as an optimization problem. At the simplest level, it learns a color label assignment for every node in the graph such that neighboring nodes are assigned similar colors depending on the strength of their connection. 
A naive way to solve this would be to try to learn a label assignment for all nodes at once -- this method does not scale to large graphs. Instead, we can optimize the problem formulation by propagating colors from labeled nodes to their neighbors, and then repeating the process. In each step, an unlabeled node is assigned a label by inspecting color assignments of its neighbors. We can update every node&#8217;s label in this manner and iterate until the whole graph is colored. This process is a far more efficient way to optimize the same problem and the sequence of iterations converges to a unique solution in this case. The solution at the end of the graph propagation looks something like this:<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-aJ-VcuyhELs/V_aDJLa9TPI/AAAAAAAABVY/kdEy3rcl5oo_T1Zp47Lpki0lFPCg59MsQCLcB/s1600/image00.gif"><img border="0" height="446" src="https://2.bp.blogspot.com/-aJ-VcuyhELs/V_aDJLa9TPI/AAAAAAAABVY/kdEy3rcl5oo_T1Zp47Lpki0lFPCg59MsQCLcB/s640/image00.gif" width="640"></a></td></tr><tr><td>Semi-supervised learning on a graph</td></tr></tbody></table>In practice, we use complex optimization functions defined over the graph structure, which incorporate additional information and constraints for semi-supervised graph learning that can lead to hard, <a href="https://en.wikipedia.org/wiki/Convex_optimization">non-convex</a> problems. The <i>real</i> challenge, however, is to scale this efficiently to graphs containing billions of nodes, trillions of edges and for complex tasks involving billions of different label types. <br /><br />To tackle this challenge, we created an approach outlined in <a href="https://arxiv.org/abs/1512.01752"><i>Large Scale Distributed Semi-Supervised Learning Using Streaming Approximation</i></a>, published last year. 
It introduces a <a href="https://en.wikipedia.org/wiki/Streaming_algorithm">streaming algorithm</a> to process information propagated from neighboring nodes in a distributed manner that makes it work on very large graphs. In addition, it addresses other practical concerns; notably, it guarantees that the space complexity (i.e., the memory requirements) of the system stays constant regardless of the difficulty of the task: the overall system uses the same amount of memory whether the number of prediction labels is two (as in the above toy example), a million, or even a billion. This enables wide-ranging applications for natural language understanding, machine perception, user modeling and even joint <a href="https://en.wikipedia.org/wiki/Multimodal_learning">multimodal</a> learning for tasks involving multiple modalities such as text, image and video inputs.   <br /><br /><b>Language Graphs for Learning Humor </b><br /><br />As an example use of graph-based machine learning, consider <i>emotion labeling</i>, a language understanding task in <a href="https://research.googleblog.com/2015/11/computer-respond-to-this-email.html">Smart Reply for Inbox</a>, where the goal is to label words occurring in natural language text with their fine-grained emotion categories. A neural network model is first applied to a text corpus to learn word embeddings, i.e., a mathematical vector representation of the meaning of each word. The dense embedding vectors are then used to build a sparse graph where nodes correspond to words and edges represent semantic relationships between them. Edge strength is computed using similarity between embedding vectors &#8212; low similarity edges are ignored. 
We seed the graph with emotion labels known <i>a priori</i> for a few nodes (e.g., laugh is labeled as &#8220;funny&#8221;) and then apply semi-supervised learning over the graph to discover emotion categories for remaining words (e.g., ROTFL gets labeled as &#8220;funny&#8221; owing to its multi-hop semantic connection to the word &#8220;laugh&#8221;). <br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-BHCy_9hWMYw/V_aC9m8P-_I/AAAAAAAABVU/OtEiUW-vPowSTNcrrfzD5QoYHOo6ymFWACLcB/s1600/image02.png"><img border="0" height="234" src="https://3.bp.blogspot.com/-BHCy_9hWMYw/V_aC9m8P-_I/AAAAAAAABVU/OtEiUW-vPowSTNcrrfzD5QoYHOo6ymFWACLcB/s640/image02.png" width="640"></a></td></tr><tr><td>Learning emotion associations using graph constructed from word embedding vectors</td></tr></tbody></table>For applications involving large datasets or dense representations that are observed (e.g., pixels from images) or learned using neural networks (e.g., embedding vectors), it is infeasible to compute pairwise similarity between all objects to construct edges in the graph. The Expander team <a href="https://arxiv.org/abs/1512.01752">solves</a> this problem by leveraging approximate, linear-time graph construction algorithms. 
<br /><br /><b>Graph-based Machine Intelligence in Action</b><br /><br />The Expander team&#8217;s machine learning system is now being used on massive graphs (containing billions of nodes and trillions of edges) to recognize and understand concepts in natural language, images, videos, and queries, powering Google products for applications like <a href="https://gmail.googleblog.com/2014/12/reminders-in-inbox-your-to-dos-on-your.html">reminders</a>, <a href="https://en.wikipedia.org/wiki/Question_answering">question answering</a>, <a href="http://googletranslate.blogspot.com/2016/02/from-amharic-to-xhosa-introducing.html">language translation</a>, <a href="https://en.wikipedia.org/wiki/Computer_vision#Recognition">visual object recognition</a>, <a href="https://en.wikipedia.org/wiki/Dialog_system">dialogue understanding</a>, and more. <br /><br />We are excited that with the <a href="https://googleblog.blogspot.com/2016/09/google-allo-smarter-messaging-app.html">recent release of Allo</a>, millions of chat users are now experiencing smart messaging technology powered by the Expander team&#8217;s system for understanding and assisting with chat conversations in multiple languages.  Also, this technology isn&#8217;t used only for large-scale models in the cloud - as <a href="http://android-developers.blogspot.com/2016/09/android-wear-2-0-developer-preview-3-play-store-and-more.html">announced this past week</a>, Android Wear has opened up an <a href="https://developer.android.com/wear/preview/features/notifications.html#messaging"> on-device Smart Reply capability</a> for developers that will provide smart replies for any messaging application.  We&#8217;re excited to tackle even more challenging Internet-scale problems with Expander in the years to come.  <br /><br /><b>Acknowledgements</b><br /><br />We wish to acknowledge the hard work of all the researchers, engineers, product managers, and leaders across Google who helped make this technology a success.  
In particular, we would like to highlight the efforts of Allan Heydon, Andrei Broder, Andrew Tomkins, Ariel Fuxman, Bo Pang, Dana Movshovitz-Attias, Fritz Obermeyer, Krishnamurthy Viswanathan, Patrick McGregor, Peter Young, Robin Dua, Sujith Ravi and Vivek Ramavajjala.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Sujith Ravi, Staff Research Scientist, Google Research</span><br /><br />Recently, there have been significant advances in <a href="https://en.wikipedia.org/wiki/Machine_learning">Machine Learning</a> that enable computer systems to solve complex real-world problems. One of those advances is Google’s large scale, <a href="https://en.wikipedia.org/wiki/Graph_theory">graph-based</a> machine learning platform, built by the Expander team in Google Research.  A technology that is behind many of the Google products and features you may use every day, graph-based machine learning is a powerful tool that can be used to power useful features such as <a href="https://gmail.googleblog.com/2014/12/reminders-in-inbox-your-to-dos-on-your.html">reminders in Inbox</a> and <a href="https://googleblog.blogspot.com/2016/09/google-allo-smarter-messaging-app.html">smart messaging in Allo</a>, or used in conjunction with deep neural networks to power the latest image recognition system in <a href="https://googleblog.blogspot.com/2016/05/google-photos-one-year-200-million.html">Google Photos</a>. <br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-c8w7GPULXBU/V_aDexkRXII/AAAAAAAABVg/33cVb3rPZ_UgETMTBgi3WP8ZRfdksJ61QCLcB/s1600/image01.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="378" src="https://3.bp.blogspot.com/-c8w7GPULXBU/V_aDexkRXII/AAAAAAAABVg/33cVb3rPZ_UgETMTBgi3WP8ZRfdksJ61QCLcB/s640/image01.png" width="640" /></a></div><b>Learning with Minimal Supervision</b><br /><br />Much of the recent success in <a href="https://en.wikipedia.org/wiki/Deep_learning">deep learning</a>, and machine learning in general, can be attributed to models that demonstrate high predictive capacity when trained on large amounts of labeled data -- often millions of training examples. 
This is commonly referred to as “<a href="https://en.wikipedia.org/wiki/Supervised_learning">supervised learning</a>” since it requires supervision, in the form of labeled data, to train the machine learning systems. (Conversely, some machine learning methods operate directly on raw data without any supervision, a paradigm referred to as <a href="https://en.wikipedia.org/wiki/Unsupervised_learning">unsupervised learning</a>.)<br /><br />However, the more difficult the task, the harder it is to get sufficient high-quality labeled data. It is often prohibitively labor intensive and time-consuming to collect labeled data for every new problem. This motivated the Expander research team to build new technology for powering machine learning applications at scale and with minimal supervision.  <br /><br />Expander’s technology draws inspiration from how humans learn to generalize and bridge the gap between what they already know (labeled information) and novel, unfamiliar observations (unlabeled information). Known as “<a href="https://en.wikipedia.org/wiki/Semi-supervised_learning">semi-supervised</a>” learning, this powerful technique enables us to build systems that can work in situations where training data may be sparse. Two key advantages of a graph-based semi-supervised machine learning approach are that (a) one models labeled and unlabeled data <i>jointly</i> during learning, leveraging the underlying structure in the data, and (b) one can easily combine multiple types of signals (for example, relational information from <a href="https://en.wikipedia.org/wiki/Knowledge_Graph">Knowledge Graph</a> along with raw features) into a single graph representation and learn over them. 
This is in contrast to other machine learning approaches, such as neural network methods, in which it is typical to <i>first</i> train a system using labeled data with features and <i>then</i> apply the trained system to unlabeled data.<br /><br /><b>Graph Learning: How It Works</b><br /><br />At its core, Expander’s platform combines semi-supervised machine learning with large-scale graph-based learning by building a multi-graph representation of the data with nodes corresponding to objects or concepts and edges connecting concepts that share similarities. The graph typically contains both labeled data (nodes associated with a known output category or label) and unlabeled data (nodes for which no labels were provided). Expander’s framework then performs semi-supervised learning to label all nodes jointly by propagating label information across the graph. <br /><br />However, this is easier said than done!  We have to (1) learn efficiently at scale with minimal supervision (i.e., tiny amount of labeled data), (2) operate over multi-modal data (i.e., heterogeneous representations and various sources of data), and (3) solve challenging prediction tasks (i.e., large, complex output spaces) involving high dimensional data that might be noisy.<br /><br />One of the primary ingredients in the entire learning process is the graph and choice of connections. Graphs come in all sizes, shapes and can be combined from multiple sources.  We have observed that it is often beneficial to learn over multi-graphs that combine information from multiple types of data representations (e.g., image pixels, object categories and chat response messages for <a href="https://research.googleblog.com/2016/05/aw-so-cute-allo-helps-you-respond-to.html">PhotoReply in Allo</a>). The Expander team’s graph learning platform automatically generates graphs directly from data based on the inferred or known relationships between data elements.  
The data can be structured (for example, <a href="https://en.wikipedia.org/wiki/Knowledge_Graph">relational data</a>) or unstructured (for example, <a href="https://en.wikipedia.org/wiki/Sparse_array">sparse</a> or dense feature representations extracted from raw data). <br /><br />To understand how Expander’s system learns, let us consider an example graph shown below. <br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-ZYBHfGDkDP8/V_aDXU1ewxI/AAAAAAAABVc/QPNCCcXi3w0MXOsVVGa9J_07mj0BFBVpQCLcB/s1600/image03.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="446" src="https://3.bp.blogspot.com/-ZYBHfGDkDP8/V_aDXU1ewxI/AAAAAAAABVc/QPNCCcXi3w0MXOsVVGa9J_07mj0BFBVpQCLcB/s640/image03.png" width="640" /></a></div><br />There are two types of nodes in the graph: “grey” represents unlabeled data whereas the colored nodes represent labeled data. Relationships between node data are represented via edges, and the thickness of each edge indicates the strength of the connection. We can formulate the semi-supervised learning problem on this toy graph as follows: <i>predict a color (“red” or “blue”) for every node in the graph</i>. Note that the specific choice of graph structure and colors depends on the task. For example, as shown in <a href="https://arxiv.org/abs/1606.04870">this research paper</a> we recently published, a graph that we built for the <a href="https://research.googleblog.com/2015/11/computer-respond-to-this-email.html">Smart Reply feature in Inbox</a> represents email messages as nodes and colors indicate semantic categories of user responses (e.g., “yes”, “awesome”, “funny”).<br /><br />The Expander graph learning framework solves this labeling task by treating it as an optimization problem. At the simplest level, it learns a color label assignment for every node in the graph such that neighboring nodes are assigned similar colors depending on the strength of their connection. 
A naive way to solve this would be to try to learn a label assignment for all nodes at once -- this method does not scale to large graphs. Instead, we can optimize the problem formulation by propagating colors from labeled nodes to their neighbors, and then repeating the process. In each step, an unlabeled node is assigned a label by inspecting color assignments of its neighbors. We can update every node’s label in this manner and iterate until the whole graph is colored. This process is a far more efficient way to optimize the same problem and the sequence of iterations converges to a unique solution in this case. The solution at the end of the graph propagation looks something like this:<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-aJ-VcuyhELs/V_aDJLa9TPI/AAAAAAAABVY/kdEy3rcl5oo_T1Zp47Lpki0lFPCg59MsQCLcB/s1600/image00.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="446" src="https://2.bp.blogspot.com/-aJ-VcuyhELs/V_aDJLa9TPI/AAAAAAAABVY/kdEy3rcl5oo_T1Zp47Lpki0lFPCg59MsQCLcB/s640/image00.gif" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Semi-supervised learning on a graph</td></tr></tbody></table>In practice, we use complex optimization functions defined over the graph structure, which incorporate additional information and constraints for semi-supervised graph learning that can lead to hard, <a href="https://en.wikipedia.org/wiki/Convex_optimization">non-convex</a> problems. The <i>real</i> challenge, however, is to scale this efficiently to graphs containing billions of nodes, trillions of edges and for complex tasks involving billions of different label types. 
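To make the iterative coloring process above concrete, here is a minimal toy sketch of weighted label propagation. This is an illustrative simplification only, not the Expander implementation; the chain graph, edge weights, and function names are invented for this example:

```python
# Minimal sketch of semi-supervised label propagation on a toy graph.
# Seed nodes keep their color; every other node repeatedly averages the
# label scores of its neighbors, weighted by edge strength.

def propagate_labels(edges, seeds, n_iters=20):
    """edges: {node: [(neighbor, weight), ...]}; seeds: {node: label}."""
    labels = sorted(set(seeds.values()))
    scores = {n: {l: 0.0 for l in labels} for n in edges}
    for n, l in seeds.items():
        scores[n][l] = 1.0
    for _ in range(n_iters):
        new_scores = {}
        for n, nbrs in edges.items():
            if n in seeds:  # labeled nodes stay clamped to their color
                new_scores[n] = scores[n]
                continue
            total = sum(w for _, w in nbrs)
            agg = {l: sum(w * scores[nbr][l] for nbr, w in nbrs) for l in labels}
            new_scores[n] = {l: v / total for l, v in agg.items()} if total else agg
        scores = new_scores
    # assign each node its highest-scoring color
    return {n: max(s, key=s.get) for n, s in scores.items()}

# Five-node chain: "a" is labeled red, "e" is labeled blue.
edges = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 0.5)],
    "c": [("b", 0.5), ("d", 0.5)],
    "d": [("c", 0.5), ("e", 1.0)],
    "e": [("d", 1.0)],
}
result = propagate_labels(edges, seeds={"a": "red", "e": "blue"})
```

Colors diffuse outward along the weighted edges, so nodes nearer the red seed end up red and nodes nearer the blue seed end up blue, mirroring the animated toy example above.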
<br /><br />To tackle this challenge, we created an approach outlined in <a href="https://arxiv.org/abs/1512.01752"><i>Large Scale Distributed Semi-Supervised Learning Using Streaming Approximation</i></a>, published last year. It introduces a <a href="https://en.wikipedia.org/wiki/Streaming_algorithm">streaming algorithm</a> to process information propagated from neighboring nodes in a distributed manner that makes it work on very large graphs. In addition, it addresses other practical concerns; notably, it guarantees that the space complexity (i.e., the memory requirements) of the system stays constant regardless of the difficulty of the task: the overall system uses the same amount of memory whether the number of prediction labels is two (as in the above toy example), a million, or even a billion. This enables wide-ranging applications for natural language understanding, machine perception, user modeling and even joint <a href="https://en.wikipedia.org/wiki/Multimodal_learning">multimodal</a> learning for tasks involving multiple modalities such as text, image and video inputs.   <br /><br /><b>Language Graphs for Learning Humor </b><br /><br />As an example use of graph-based machine learning, consider <i>emotion labeling</i>, a language understanding task in <a href="https://research.googleblog.com/2015/11/computer-respond-to-this-email.html">Smart Reply for Inbox</a>, where the goal is to label words occurring in natural language text with their fine-grained emotion categories. A neural network model is first applied to a text corpus to learn word embeddings, i.e., a mathematical vector representation of the meaning of each word. The dense embedding vectors are then used to build a sparse graph where nodes correspond to words and edges represent semantic relationships between them. Edge strength is computed using similarity between embedding vectors — low similarity edges are ignored. 
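For a toy vocabulary, the embedding-to-graph step just described can be sketched with brute-force pairwise cosine similarity. The 3-d vectors and the 0.8 threshold below are made up for illustration; at real scale, this brute-force pairwise step is precisely what approximate linear-time construction algorithms replace:

```python
# Sketch: build a sparse word graph from embedding vectors,
# keeping only high-similarity edges (toy vectors, not real embeddings).
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_graph(embeddings, threshold=0.8):
    """Return {word: [(neighbor, similarity), ...]}, dropping weak edges."""
    words = list(embeddings)
    graph = {w: [] for w in words}
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            sim = cosine(embeddings[w1], embeddings[w2])
            if sim >= threshold:  # low-similarity edges are ignored
                graph[w1].append((w2, sim))
                graph[w2].append((w1, sim))
    return graph

# Toy 3-d "embeddings": laugh and lol point the same way, tax does not.
vectors = {
    "laugh": [0.9, 0.1, 0.0],
    "lol":   [0.8, 0.2, 0.1],
    "tax":   [0.0, 0.1, 0.9],
}
graph = build_graph(vectors)
```

In the resulting graph, "laugh" and "lol" share a strong edge while "tax" is left isolated, which is exactly what lets a "funny" seed label reach "lol" during propagation.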
We seed the graph with emotion labels known <i>a priori</i> for a few nodes (e.g., laugh is labeled as “funny”) and then apply semi-supervised learning over the graph to discover emotion categories for remaining words (e.g., ROTFL gets labeled as “funny” owing to its multi-hop semantic connection to the word “laugh”). <br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-BHCy_9hWMYw/V_aC9m8P-_I/AAAAAAAABVU/OtEiUW-vPowSTNcrrfzD5QoYHOo6ymFWACLcB/s1600/image02.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="234" src="https://3.bp.blogspot.com/-BHCy_9hWMYw/V_aC9m8P-_I/AAAAAAAABVU/OtEiUW-vPowSTNcrrfzD5QoYHOo6ymFWACLcB/s640/image02.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Learning emotion associations using graph constructed from word embedding vectors</td></tr></tbody></table>For applications involving large datasets or dense representations that are observed (e.g., pixels from images) or learned using neural networks (e.g., embedding vectors), it is infeasible to compute pairwise similarity between all objects to construct edges in the graph. The Expander team <a href="https://arxiv.org/abs/1512.01752">solves</a> this problem by leveraging approximate, linear-time graph construction algorithms. 
<br /><br /><b>Graph-based Machine Intelligence in Action</b><br /><br />The Expander team’s machine learning system is now being used on massive graphs (containing billions of nodes and trillions of edges) to recognize and understand concepts in natural language, images, videos, and queries, powering Google products for applications like <a href="https://gmail.googleblog.com/2014/12/reminders-in-inbox-your-to-dos-on-your.html">reminders</a>, <a href="https://en.wikipedia.org/wiki/Question_answering">question answering</a>, <a href="http://googletranslate.blogspot.com/2016/02/from-amharic-to-xhosa-introducing.html">language translation</a>, <a href="https://en.wikipedia.org/wiki/Computer_vision#Recognition">visual object recognition</a>, <a href="https://en.wikipedia.org/wiki/Dialog_system">dialogue understanding</a>, and more. <br /><br />We are excited that with the <a href="https://googleblog.blogspot.com/2016/09/google-allo-smarter-messaging-app.html">recent release of Allo</a>, millions of chat users are now experiencing smart messaging technology powered by the Expander team’s system for understanding and assisting with chat conversations in multiple languages.  Also, this technology isn’t used only for large-scale models in the cloud - as <a href="http://android-developers.blogspot.com/2016/09/android-wear-2-0-developer-preview-3-play-store-and-more.html">announced this past week</a>, Android Wear has opened up an <a href="https://developer.android.com/wear/preview/features/notifications.html#messaging"> on-device Smart Reply capability</a> for developers that will provide smart replies for any messaging application.  We’re excited to tackle even more challenging Internet-scale problems with Expander in the years to come.  <br /><br /><b>Acknowledgements</b><br /><br />We wish to acknowledge the hard work of all the researchers, engineers, product managers, and leaders across Google who helped make this technology a success.  
In particular, we would like to highlight the efforts of Allan Heydon, Andrei Broder, Andrew Tomkins, Ariel Fuxman, Bo Pang, Dana Movshovitz-Attias, Fritz Obermeyer, Krishnamurthy Viswanathan, Patrick McGregor, Peter Young, Robin Dua, Sujith Ravi and Vivek Ramavajjala.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/graph-powered-machine-learning-at-google/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How Robots Can Acquire New Skills from Their Shared Experience</title>
		<link>https://googledata.org/google-research/how-robots-can-acquire-new-skills-from-their-shared-experience/</link>
		<comments>https://googledata.org/google-research/how-robots-can-acquire-new-skills-from-their-shared-experience/#comments</comments>
		<pubDate>Tue, 04 Oct 2016 00:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=64e4f393e1527b6ed14d0c5e71751a9c</guid>
		<description><![CDATA[<span>Posted by Sergey Levine (Google Brain Team), Timothy Lillicrap (DeepMind), Mrinal Kalakrishnan (X)</span><br /><br />The ability to learn from experience will likely be a key in enabling robots to help with complex real-world tasks, from assisting the elderly with chores and daily activities, to helping us in offices and hospitals, to performing jobs that are too dangerous or unpleasant for people. However, if each robot must learn its full repertoire of skills for these tasks only from its own experience, it could take far too long to acquire a rich enough range of behaviors to be useful. Could we bridge this gap by making it possible for robots to collectively learn from each other&#8217;s experiences?<br /><br />While machine learning algorithms have made great strides in <a href="https://research.googleblog.com/search/label/Natural%20Language%20Understanding">natural language understanding</a> and <a href="https://research.googleblog.com/2015/08/the-neural-networks-behind-google-voice.html">speech recognition</a>, the kind of symbolic high-level reasoning that allows people to communicate complex concepts in words remains out of reach for machines. However, robots can instantaneously transmit their experience to other robots over the network - sometimes known as "<a href="http://goldberg.berkeley.edu/cloud-robotics/">cloud robotics</a>" - and it is this ability that can let them learn from each other.<br /><br />This is true even for seemingly simple low-level skills. Humans and animals excel at adaptive motor control that integrates their senses, reflexes, and muscles in a closely coordinated feedback loop. Robots still struggle with these basic skills in the real world, where the variability and complexity of the environment demands well-honed behaviors that are not easily fooled by distractors. 
If we enable robots to transmit their experiences to each other, could they learn to perform motion skills in close coordination with sensing in realistic environments?<br /><br />We <a href="https://research.googleblog.com/2016/03/deep-learning-for-robots-learning-from.html">previously wrote</a> about how multiple robots could pool their experiences to learn a grasping task. Here, we will discuss new experiments that we conducted to investigate three possible approaches for general-purpose skill learning across multiple robots: learning motion skills directly from experience, learning internal models of physics, and learning skills with human assistance. In all three cases, multiple robots shared their experiences to build a common model of the skill. The skills learned by the robots are still relatively simple -- pushing objects and opening doors -- but by learning such skills more quickly and efficiently through collective learning, robots might in the future acquire richer behavioral repertoires that could eventually make it possible for them to assist us in our daily lives.<br /><br /><b>Learning from raw experience with model-free reinforcement learning.</b><br />Perhaps one of the simplest ways for robots to teach each other is to pool information about their successes and failures in the world.  Humans and animals acquire many skills by direct trial-and-error learning.  During this kind of &#8216;model-free&#8217; learning -- so called because there is no explicit model of the environment formed -- they explore variations on their existing behavior and then reinforce and exploit the variations that give bigger rewards. 
In combination with deep neural networks, model-free algorithms have recently proved to be surprisingly effective and have been key to <a href="https://research.googleblog.com/2015/02/from-pixels-to-actions-human-level.html">successes with the Atari video game system</a> and <a href="https://research.googleblog.com/2016/01/alphago-mastering-ancient-game-of-go.html">playing Go</a>. Having multiple robots allows us to experiment with sharing experiences to speed up this kind of direct learning in the real world.<br /><br />In these experiments we tasked robots with trying to move their arms to goal locations, or reaching to and opening a door.  Each robot has a copy of a neural network that allows it to estimate the value of taking a given action in a given state.  By querying this network, the robot can quickly decide what actions might be worth taking in the world.  When a robot acts, we add noise to the actions it selects, so the resulting behavior is sometimes a bit better than previously observed, and sometimes a bit worse.  This allows each robot to explore different ways of approaching a task.  Records of the actions taken by the robots, their behaviors, and the final outcomes are sent back to a central server.  The server collects the experiences from all of the robots and uses them to iteratively improve the neural network that estimates value for different states and actions.  The model-free algorithms we employed look across both good and bad experiences and distill these into a new network that is better at understanding how action and success are related.  Then, at regular intervals, each robot takes a copy of the updated network from the server and begins to act using the information in its new network.  Given that this updated network is a bit better at estimating the true value of actions in the world, the robots will produce better behavior. This cycle can then be repeated to continue improving on the task. 
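The collect-and-distill cycle described above can be sketched in miniature. Here a tabular value estimate stands in for the real neural network, and three simulated "robots" on a one-dimensional reaching task pool their transitions through a central update step; the task, constants, and names are all invented for illustration:

```python
# Toy sketch of shared-experience reinforcement learning: several simulated
# "robots" act with exploration noise, send transitions to a central store,
# which distills them into one shared value estimate that all robots copy.
import random

GOAL, ACTIONS = 5, (-1, +1)

def act(state, q, eps=0.3):
    # epsilon-greedy: mostly follow the shared policy, sometimes explore
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))

def run_episode(q, start):
    """One robot's episode; returns its transition records."""
    s, transitions = start, []
    for _ in range(20):
        a = act(s, q)
        s2 = max(0, min(GOAL, s + a))
        r = 1.0 if s2 == GOAL else 0.0
        transitions.append((s, a, r, s2))
        s = s2
        if r:
            break
    return transitions

def server_update(q, batch, alpha=0.5, gamma=0.9):
    # the central server distills pooled experience into the shared estimate
    for s, a, r, s2 in batch:
        target = r + gamma * max(q.get((s2, b), 0.0) for b in ACTIONS)
        q[(s, a)] = q.get((s, a), 0.0) + alpha * (target - q.get((s, a), 0.0))

random.seed(0)
q = {}  # the shared "network" (a table in this toy version)
for _ in range(200):                  # repeated collect/update cycles
    pooled = []
    for robot_start in (0, 2, 4):     # three robots, different start states
        pooled.extend(run_episode(q, robot_start))
    server_update(q, pooled)

greedy = max(ACTIONS, key=lambda a: q.get((4, a), 0.0))
```

After a few hundred cycles, the shared estimate favors stepping toward the goal, and every robot that copies it benefits from transitions the others gathered.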
In the video below, a robot explores the door opening task.<br /><div></div>With a few hours of practice, robots sharing their raw experience learn to make reaches to targets, and to open a door by making contact with the handle and pulling.  In the case of door opening, the robots learn to deal with the complex physics of the contacts between the hook and the door handle without building an explicit model of the world, as can be seen in the example below:<br /><div></div><b>Learning how the world works by interacting with objects. </b><br />Direct trial-and-error reinforcement learning is a great way to learn individual skills. However, humans and animals don&#8217;t learn exclusively by trial and error. We also build mental models about our environment and imagine how the world might change in response to our actions.<br /><br />We can start with the simplest of physical interactions, and have our robots learn the basics of cause and effect from reflecting on their own experiences. In this experiment, we had the robots play with a wide variety of common household objects by randomly prodding and pushing them inside a tabletop bin. The robots again shared their experiences with each other and together built a single predictive model that attempted to forecast what the world might look like in response to their actions. 
This predictive model can make simple, if slightly blurry, forecasts about future camera images when provided with the current image and a possible sequence of actions that the robot might execute:<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-ywvHCpUF8Bs/V_LS9pIX09I/AAAAAAAABUc/ELKnKLvPS6AOoyXxzrXEcZPdXnuJAi0_wCLcB/s1600/image02.gif"><img border="0" height="632" src="https://3.bp.blogspot.com/-ywvHCpUF8Bs/V_LS9pIX09I/AAAAAAAABUc/ELKnKLvPS6AOoyXxzrXEcZPdXnuJAi0_wCLcB/s640/image02.gif" width="640"></a></td></tr><tr><td>Top row: robotic arms interacting with common household items.<br />Bottom row: Predicted future camera images given an initial image and a sequence of actions.</td></tr></tbody></table>Once this model is trained, the robots can use it to perform purposeful manipulations, for example based on user commands. In our prototype, a user can command the robot to move a particular object simply by clicking on that object, and then clicking on the point where the object should go:<br /><div><a href="https://4.bp.blogspot.com/-4h96hfyfLDg/V_LTn4dTZiI/AAAAAAAABUk/EyYHBQdaM3c_OEdw0vWHB9BSqjw27ERXQCLcB/s1600/image03.gif"><img border="0" height="360" src="https://4.bp.blogspot.com/-4h96hfyfLDg/V_LTn4dTZiI/AAAAAAAABUk/EyYHBQdaM3c_OEdw0vWHB9BSqjw27ERXQCLcB/s640/image03.gif" width="640"></a></div>The robots in this experiment were not told anything about objects or physics: they only see that the command requires a particular pixel to be moved to a particular place. However, because they have seen so many object interactions in their shared past experiences, they can forecast how particular actions will affect particular pixels. In order for such an implicit understanding of physics to emerge, the robots must be provided with a sufficient breadth of experience. This requires either a lot of time, or sharing the combined experiences of many robots. 
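In miniature, that click-to-move behavior amounts to searching over candidate action sequences and scoring each by where the forward model predicts the tracked pixel will land. In the sketch below, a hand-written toy model stands in for the learned video-prediction network, and all names are illustrative:

```python
# Sketch of planning with a learned forward model: score candidate action
# sequences by the predicted final position of the tracked pixel, and pick
# the sequence that lands closest to the user's target.
import itertools

def predict(pixel, actions):
    """Toy forward model: each action nudges the tracked pixel."""
    x, y = pixel
    for dx, dy in actions:
        x, y = x + dx, y + dy
    return (x, y)

def plan(pixel, target, horizon=3):
    """Exhaustively search short action sequences; return the one whose
    predicted outcome lands closest to the target."""
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]
    best, best_dist = None, float("inf")
    for seq in itertools.product(moves, repeat=horizon):
        px, py = predict(pixel, seq)
        dist = (px - target[0]) ** 2 + (py - target[1]) ** 2
        if dist < best_dist:
            best, best_dist = seq, dist
    return best

# User clicks the object at (0, 0) and the destination (2, 1).
sequence = plan((0, 0), (2, 1))
end = predict((0, 0), sequence)
```

The real system replaces both the toy model and the exhaustive search with a learned video-prediction network and a more scalable optimizer, but the planning loop has the same shape: propose actions, predict outcomes, keep the best.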
An extended video on this project may be found <a href="https://www.youtube.com/watch?v=CKRWJEVSXMI">here</a>.<br /><table border="0" cellpadding="1" cellspacing="0"><tbody><tr><td><img border="0" src="https://1.bp.blogspot.com/-LbLR0XQkgUY/V_LUQDuWIeI/AAAAAAAABUs/c3BTaSQazYo6jEhnvipx14v2oB3-_PXJwCLcB/s320/image01.gif" width="100%"></td> <td><img border="0" src="https://1.bp.blogspot.com/-ZDhqDD0QLWU/V_LUS4UuQPI/AAAAAAAABUw/KRHkQutGtCMk0DlrSdW4oQbyc-LWulgFwCLcB/s320/image00.gif" width="100%"></td></tr></tbody></table><b>Learning with the help of humans.</b><br />So far, we discussed how robots can learn entirely on their own. However, human guidance is important, not just for telling the robot what to do, but also for helping the robots along. We have a lot of intuition about how various manipulation skills can be performed, and it only seems natural that transferring this intuition to robots can help them learn these skills a lot faster. In the next experiment, we provided each robot with a different door, and guided each of them by hand to show how these doors can be opened. These demonstrations are encoded into a single combined strategy for all robots, called a policy. The policy is a deep neural network which converts camera images to robot actions, and is maintained on a central server. The following video shows the instructor demonstrating the door-opening skill to a robot:<br /><div></div>Next, the robots collectively improve this policy through a trial-and-error learning process. Each robot attempts to open its own door using the latest available policy, with some added noise for exploration. These attempts allow each robot to plan a better strategy for opening the door the next time around, and improve the policy accordingly:<br /><div></div>Not surprisingly, we find that robots learn more effectively if they are trained on a curriculum of tasks that are gradually increasing in difficulty. 
In our experiment, each robot starts off by practicing the door-opening skill on a specific position and orientation of the door that the instructor had previously shown it. As it gets better at performing the task, the instructor starts to alter the position and orientation of the door to be just a bit beyond the current capabilities of the policy, but not so difficult that it fails entirely. This allows the robots to gradually increase their skill level over time, and expands the range of situations they can handle. The combination of human-guidance with trial-and-error learning allowed the robots to collectively learn the skill of door-opening in just a couple of hours. Since the robots were trained on doors that look different from each other, the final policy succeeds on a door with a handle that none of the robots had seen before:<br /><div></div>In all three of the experiments described above, the ability to communicate and exchange their experiences allows the robots to learn more quickly and effectively. This becomes particularly important when we combine robotic learning with deep learning, as is the case in all of the experiments discussed above. We&#8217;ve seen before that deep learning works best when provided with ample training data. For example, the popular ImageNet benchmark uses over 1.5 million labeled examples. While such a quantity of data is not impossible for a single robot to gather over a few years, it is much more efficient to gather the same volume of experience from multiple robots over the course of a few weeks. Besides faster learning times, this approach might benefit from the greater diversity of experience: a real-world deployment might involve multiple robots in different places and different settings, sharing heterogeneous, varied experiences to build a single highly generalizable representation.<br /><br />Of course, the kinds of behaviors that robots today can learn are still quite limited. 
Even basic motion skills, such as picking up objects and opening doors, remain in the realm of cutting edge research. In all of these experiments, a human engineer is still needed to tell the robots what they should learn to do by specifying a detailed objective function. However, as algorithms improve and robots are deployed more widely, their ability to share and pool their experiences could be instrumental for enabling them to assist us in our daily lives.<br /><br /><i>The experiments on learning by trial-and-error were conducted by Shixiang (Shane) Gu and Ethan Holly from the Google Brain team, and Timothy Lillicrap from DeepMind. Work on learning predictive models was conducted by Chelsea Finn from the Google Brain team, and the research on learning from demonstration was conducted by Yevgen Chebotar, Ali Yahya, Adrian Li, and Mrinal Kalakrishnan from X. We would also like to acknowledge contributions by Peter Pastor, Gabriel Dulac-Arnold, and Jon Scholz. Articles about each of the experiments discussed in this blog post can be found below:</i><br /><br /><a href="https://arxiv.org/abs/1610.00633">Deep Reinforcement Learning for Robotic Manipulation</a>. <i>Shixiang Gu, Ethan Holly, Timothy Lillicrap, Sergey Levine.</i>&#160;[<a href="https://sites.google.com/site/deeproboticmanipulation/">video</a>]<br /><a href="https://arxiv.org/abs/1610.00696"><br /></a> <a href="https://arxiv.org/abs/1610.00696">Deep Visual Foresight for Planning Robot Motion</a>. <i>Chelsea Finn, Sergey Levine.</i>&#160;[<a href="https://www.youtube.com/watch?v=CKRWJEVSXMI">video</a>] [<a href="https://sites.google.com/site/brainrobotdata/home">data</a>]<br /><br /><a href="https://arxiv.org/abs/1610.00673">Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search</a>. 
<br /><i>Ali Yahya, Adrian Li, Mrinal Kalakrishnan, Yevgen Chebotar, Sergey Levine.</i>&#160; [<a href="https://youtu.be/ZBFwe1gF0FU">video</a>]<br /><a href="https://arxiv.org/abs/1610.00529"><br /></a> <a href="https://arxiv.org/abs/1610.00529">Path Integral Guided Policy Search</a>. <i>Yevgen Chebotar, Mrinal Kalakrishnan, Ali Yahya, Adrian Li, Stefan Schaal, Sergey Levine. </i>[<a href="https://www.youtube.com/watch?v=ncp1kY5JV90">video</a>]]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Sergey Levine (Google Brain Team), Timothy Lillicrap (DeepMind), Mrinal Kalakrishnan (X)</span><br /><br />The ability to learn from experience will likely be a key in enabling robots to help with complex real-world tasks, from assisting the elderly with chores and daily activities, to helping us in offices and hospitals, to performing jobs that are too dangerous or unpleasant for people. However, if each robot must learn its full repertoire of skills for these tasks only from its own experience, it could take far too long to acquire a rich enough range of behaviors to be useful. Could we bridge this gap by making it possible for robots to collectively learn from each other’s experiences?<br /><br />While machine learning algorithms have made great strides in <a href="https://research.googleblog.com/search/label/Natural%20Language%20Understanding">natural language understanding</a> and <a href="https://research.googleblog.com/2015/08/the-neural-networks-behind-google-voice.html">speech recognition</a>, the kind of symbolic high-level reasoning that allows people to communicate complex concepts in words remains out of reach for machines. However, robots can instantaneously transmit their experience to other robots over the network - sometimes known as "<a href="http://goldberg.berkeley.edu/cloud-robotics/">cloud robotics</a>" - and it is this ability that can let them learn from each other.<br /><br />This is true even for seemingly simple low-level skills. Humans and animals excel at adaptive motor control that integrates their senses, reflexes, and muscles in a closely coordinated feedback loop. Robots still struggle with these basic skills in the real world, where the variability and complexity of the environment demands well-honed behaviors that are not easily fooled by distractors. 
If we enable robots to transmit their experiences to each other, could they learn to perform motion skills in close coordination with sensing in realistic environments?<br /><br />We <a href="https://research.googleblog.com/2016/03/deep-learning-for-robots-learning-from.html">previously wrote</a> about how multiple robots could pool their experiences to learn a grasping task. Here, we will discuss new experiments that we conducted to investigate three possible approaches for general-purpose skill learning across multiple robots: learning motion skills directly from experience, learning internal models of physics, and learning skills with human assistance. In all three cases, multiple robots shared their experiences to build a common model of the skill. The skills learned by the robots are still relatively simple -- pushing objects and opening doors -- but by learning such skills more quickly and efficiently through collective learning, robots might in the future acquire richer behavioral repertoires that could eventually make it possible for them to assist us in our daily lives.<br /><br /><b>Learning from raw experience with model-free reinforcement learning.</b><br />Perhaps one of the simplest ways for robots to teach each other is to pool information about their successes and failures in the world.  Humans and animals acquire many skills by direct trial-and-error learning.  During this kind of ‘model-free’ learning -- so called because there is no explicit model of the environment formed -- they explore variations on their existing behavior and then reinforce and exploit the variations that give bigger rewards. 
In combination with deep neural networks, model-free algorithms have recently proved to be surprisingly effective and have been key to <a href="https://research.googleblog.com/2015/02/from-pixels-to-actions-human-level.html">successes with the Atari video game system</a> and <a href="https://research.googleblog.com/2016/01/alphago-mastering-ancient-game-of-go.html">playing Go</a>. Having multiple robots allows us to experiment with sharing experiences to speed up this kind of direct learning in the real world.<br /><br />In these experiments we tasked robots with trying to move their arms to goal locations, or reaching to and opening a door.  Each robot has a copy of a neural network that allows it to estimate the value of taking a given action in a given state.  By querying this network, the robot can quickly decide what actions might be worth taking in the world.  When a robot acts, we add noise to the actions it selects, so the resulting behavior is sometimes a bit better than previously observed, and sometimes a bit worse.  This allows each robot to explore different ways of approaching a task.  Records of the actions taken by the robots, their behaviors, and the final outcomes are sent back to a central server.  The server collects the experiences from all of the robots and uses them to iteratively improve the neural network that estimates value for different states and actions.  The model-free algorithms we employed look across both good and bad experiences and distill these into a new network that is better at understanding how action and success are related.  Then, at regular intervals, each robot takes a copy of the updated network from the server and begins to act using the information in its new network.  Given that this updated network is a bit better at estimating the true value of actions in the world, the robots will produce better behavior. This cycle can then be repeated to continue improving on the task. 
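As a rough sketch of this collect-aggregate-sync cycle (not the actual system, which trains deep value networks on camera images), the toy below has several simulated robots send noisy trials back to a central "server" that distills them into an improved shared policy; the task, the scalar policy, and all numbers here are invented purely for illustration:

```python
import random

# Toy illustration of pooled trial-and-error learning. The hidden optimum
# stands in for, e.g., the arm motion that opens the door; the "policy" is a
# single scalar action instead of a neural network, so the loop is easy to see.

TARGET = 0.7  # hidden optimum, unknown to the robots

def reward(action):
    return -(action - TARGET) ** 2

def train(num_robots=4, rounds=20, noise=0.2, seed=0):
    rng = random.Random(seed)
    policy = 0.0  # the central server's current shared policy
    for _ in range(rounds):
        experiences = []  # (action, reward) records sent back to the server
        for _ in range(num_robots):
            # each robot copies the latest policy and explores with added noise
            action = policy + rng.gauss(0.0, noise)
            experiences.append((action, reward(action)))
        # the server distills the pooled experience into an improved policy,
        # which every robot then downloads before the next round
        best_action, _ = max(experiences, key=lambda e: e[1])
        policy = 0.5 * policy + 0.5 * best_action  # smoothed update
    return policy

print(round(train(), 3))
```

With more robots per round, the server sees more exploration per unit of wall-clock time, which is the core reason pooling speeds up this kind of direct learning.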
In the video below, a robot explores the door opening task.<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/iiD3Klvm96s/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/iiD3Klvm96s?rel=0&amp;feature=player_embedded" width="640"></iframe></div>With a few hours of practice, robots sharing their raw experience learn to make reaches to targets, and to open a door by making contact with the handle and pulling.  In the case of door opening, the robots learn to deal with the complex physics of the contacts between the hook and the door handle without building an explicit model of the world, as can be seen in the example below:<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/R2kEi_KKSsA/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/R2kEi_KKSsA?rel=0&amp;feature=player_embedded" width="640"></iframe></div><b>Learning how the world works by interacting with objects. </b><br />Direct trial-and-error reinforcement learning is a great way to learn individual skills. However, humans and animals don’t learn exclusively by trial and error. We also build mental models about our environment and imagine how the world might change in response to our actions.<br /><br />We can start with the simplest of physical interactions, and have our robots learn the basics of cause and effect from reflecting on their own experiences. In this experiment, we had the robots play with a wide variety of common household objects by randomly prodding and pushing them inside a tabletop bin. The robots again shared their experiences with each other and together built a single predictive model that attempted to forecast what the world might look like in response to their actions. 
This predictive model can make simple, if slightly blurry, forecasts about future camera images when provided with the current image and a possible sequence of actions that the robot might execute:<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-ywvHCpUF8Bs/V_LS9pIX09I/AAAAAAAABUc/ELKnKLvPS6AOoyXxzrXEcZPdXnuJAi0_wCLcB/s1600/image02.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="632" src="https://3.bp.blogspot.com/-ywvHCpUF8Bs/V_LS9pIX09I/AAAAAAAABUc/ELKnKLvPS6AOoyXxzrXEcZPdXnuJAi0_wCLcB/s640/image02.gif" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Top row: robotic arms interacting with common household items.<br />Bottom row: Predicted future camera images given an initial image and a sequence of actions.</td></tr></tbody></table>Once this model is trained, the robots can use it to perform purposeful manipulations, for example based on user commands. In our prototype, a user can command the robot to move a particular object simply by clicking on that object, and then clicking on the point where the object should go:<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-4h96hfyfLDg/V_LTn4dTZiI/AAAAAAAABUk/EyYHBQdaM3c_OEdw0vWHB9BSqjw27ERXQCLcB/s1600/image03.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="360" src="https://4.bp.blogspot.com/-4h96hfyfLDg/V_LTn4dTZiI/AAAAAAAABUk/EyYHBQdaM3c_OEdw0vWHB9BSqjw27ERXQCLcB/s640/image03.gif" width="640" /></a></div>The robots in this experiment were not told anything about objects or physics: they only see that the command requires a particular pixel to be moved to a particular place. 
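To make the planning idea concrete, here is a heavily simplified sketch: in place of the learned image-prediction network, a stand-in model predicts a single designated pixel's 1-D position under an action, and the planner samples candidate action sequences, imagines their outcomes with the model, and keeps the sequence whose predicted endpoint lands closest to the commanded goal. The model, parameter values, and names are all hypothetical stand-ins, not the authors' implementation:

```python
import random

def predict(pixel, action):
    # stand-in for the learned predictive model: one step of imagined motion
    return pixel + action

def plan(pixel, goal, horizon=5, samples=200, seed=0):
    """Sample action sequences, roll each through the model, keep the best."""
    rng = random.Random(seed)
    best_seq, best_cost = None, float("inf")
    for _ in range(samples):
        seq = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        p = pixel
        for a in seq:            # imagine the sequence with the forward model
            p = predict(p, a)
        cost = abs(p - goal)     # how far the pixel ends from the clicked goal
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost

seq, cost = plan(pixel=0.0, goal=3.0)
print(round(cost, 3))
```

The real planner re-plans after every executed step using fresh camera input, but the sample-imagine-select structure is the same.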
However, because they have seen so many object interactions in their shared past experiences, they can forecast how particular actions will affect particular pixels. In order for such an implicit understanding of physics to emerge, the robots must be provided with a sufficient breadth of experience. This requires either a lot of time, or sharing the combined experiences of many robots. An extended video on this project may be found <a href="https://www.youtube.com/watch?v=CKRWJEVSXMI">here</a>.<br /><table border="0" cellpadding="1" cellspacing="0" style="width: 100%;"><tbody><tr> <td><img border="0" src="https://1.bp.blogspot.com/-LbLR0XQkgUY/V_LUQDuWIeI/AAAAAAAABUs/c3BTaSQazYo6jEhnvipx14v2oB3-_PXJwCLcB/s320/image01.gif" width="100%" /></td> <td><img border="0" src="https://1.bp.blogspot.com/-ZDhqDD0QLWU/V_LUS4UuQPI/AAAAAAAABUw/KRHkQutGtCMk0DlrSdW4oQbyc-LWulgFwCLcB/s320/image00.gif" width="100%" /></td></tr></tbody></table><b>Learning with the help of humans.</b><br />So far, we discussed how robots can learn entirely on their own. However, human guidance is important, not just for telling the robot what to do, but also for helping the robots along. We have a lot of intuition about how various manipulation skills can be performed, and it only seems natural that transferring this intuition to robots can help them learn these skills a lot faster. In the next experiment, we provided each robot with a different door, and guided each of them by hand to show how these doors can be opened. These demonstrations are encoded into a single combined strategy for all robots, called a policy. The policy is a deep neural network which converts camera images to robot actions, and is maintained on a central server. 
The following video shows the instructor demonstrating the door-opening skill to a robot:<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/9-9udQtO1PY/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/9-9udQtO1PY?rel=0&amp;feature=player_embedded" width="640"></iframe></div>Next, the robots collectively improve this policy through a trial-and-error learning process. Each robot attempts to open its own door using the latest available policy, with some added noise for exploration. These attempts allow each robot to plan a better strategy for opening the door the next time around, and improve the policy accordingly:<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/QZvu8M02BeE/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/QZvu8M02BeE?rel=0&amp;feature=player_embedded" width="640"></iframe></div>Not surprisingly, we find that robots learn more effectively if they are trained on a curriculum of tasks that are gradually increasing in difficulty. In our experiment, each robot starts off by practicing the door-opening skill on a specific position and orientation of the door that the instructor had previously shown it. As it gets better at performing the task, the instructor starts to alter the position and orientation of the door to be just a bit beyond the current capabilities of the policy, but not so difficult that it fails entirely. This allows the robots to gradually increase their skill level over time, and expands the range of situations they can handle. The combination of human guidance with trial-and-error learning allowed the robots to collectively learn the skill of door-opening in just a couple of hours. 
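The curriculum schedule can be sketched in a few lines. In this toy version, "skill" and "difficulty" are scalars (in the experiment they are a neural-network policy and the door's physical position and orientation), and the update rules are invented for illustration only:

```python
import random

def run_curriculum(steps=200, seed=0):
    rng = random.Random(seed)
    skill, difficulty = 0.0, 0.1
    for _ in range(steps):
        # success is likely when the task sits at or below the current skill
        p_success = max(0.05, min(0.95, 1.0 - (difficulty - skill)))
        if rng.random() < p_success:
            skill += 0.02        # practice on feasible tasks builds skill
            difficulty += 0.01   # nudge the task just beyond current ability
        else:
            difficulty = max(0.1, difficulty - 0.01)  # back off if too hard
    return skill, difficulty

skill, difficulty = run_curriculum()
print(round(skill, 2), round(difficulty, 2))
```

The key property, mirrored from the experiment, is that difficulty only ratchets up while the policy keeps succeeding, so the robots rarely waste trials on tasks they cannot yet attempt.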
Since the robots were trained on doors that look different from each other, the final policy succeeds on a door with a handle that none of the robots had seen before:<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/YBVR-TRXEc4/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/YBVR-TRXEc4?rel=0&amp;feature=player_embedded" width="640"></iframe></div>In all three of the experiments described above, the ability to communicate and exchange their experiences allows the robots to learn more quickly and effectively. This becomes particularly important when we combine robotic learning with deep learning, as is the case in all of the experiments discussed above. We’ve seen before that deep learning works best when provided with ample training data. For example, the popular ImageNet benchmark uses over 1.5 million labeled examples. While such a quantity of data is not impossible for a single robot to gather over a few years, it is much more efficient to gather the same volume of experience from multiple robots over the course of a few weeks. Besides faster learning times, this approach might benefit from the greater diversity of experience: a real-world deployment might involve multiple robots in different places and different settings, sharing heterogeneous, varied experiences to build a single highly generalizable representation.<br /><br />Of course, the kinds of behaviors that robots today can learn are still quite limited. Even basic motion skills, such as picking up objects and opening doors, remain in the realm of cutting edge research. In all of these experiments, a human engineer is still needed to tell the robots what they should learn to do by specifying a detailed objective function. 
However, as algorithms improve and robots are deployed more widely, their ability to share and pool their experiences could be instrumental for enabling them to assist us in our daily lives.<br /><br /><i>The experiments on learning by trial-and-error were conducted by Shixiang (Shane) Gu and Ethan Holly from the Google Brain team, and Timothy Lillicrap from DeepMind. Work on learning predictive models was conducted by Chelsea Finn from the Google Brain team, and the research on learning from demonstration was conducted by Yevgen Chebotar, Ali Yahya, Adrian Li, and Mrinal Kalakrishnan from X. We would also like to acknowledge contributions by Peter Pastor, Gabriel Dulac-Arnold, and Jon Scholz. Articles about each of the experiments discussed in this blog post can be found below:</i><br /><br /><a href="https://arxiv.org/abs/1610.00633">Deep Reinforcement Learning for Robotic Manipulation</a>. <i>Shixiang Gu, Ethan Holly, Timothy Lillicrap, Sergey Levine.</i>&nbsp;[<a href="https://sites.google.com/site/deeproboticmanipulation/">video</a>]<br /><a href="https://arxiv.org/abs/1610.00696"><br /></a> <a href="https://arxiv.org/abs/1610.00696">Deep Visual Foresight for Planning Robot Motion</a>. <i>Chelsea Finn, Sergey Levine.</i>&nbsp;[<a href="https://www.youtube.com/watch?v=CKRWJEVSXMI">video</a>] [<a href="https://sites.google.com/site/brainrobotdata/home">data</a>]<br /><br /><a href="https://arxiv.org/abs/1610.00673">Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search</a>. <br /><i>Ali Yahya, Adrian Li, Mrinal Kalakrishnan, Yevgen Chebotar, Sergey Levine.</i>&nbsp; [<a href="https://youtu.be/ZBFwe1gF0FU">video</a>]<br /><a href="https://arxiv.org/abs/1610.00529"><br /></a> <a href="https://arxiv.org/abs/1610.00529">Path Integral Guided Policy Search</a>. <i>Yevgen Chebotar, Mrinal Kalakrishnan, Ali Yahya, Adrian Li, Stefan Schaal, Sergey Levine. 
</i>[<a href="https://www.youtube.com/watch?v=ncp1kY5JV90">video</a>]]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/how-robots-can-acquire-new-skills-from-their-shared-experience/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Introducing the Open Images Dataset</title>
		<link>https://googledata.org/google-research/introducing-the-open-images-dataset/</link>
		<comments>https://googledata.org/google-research/introducing-the-open-images-dataset/#comments</comments>
		<pubDate>Fri, 30 Sep 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=6fc034636a594021eeccd06948c99b99</guid>
		<description><![CDATA[<span>Posted by Ivan Krasin and Tom Duerig, Software Engineers</span><br /><br />In the last few years, advances in machine learning have enabled <a href="https://en.wikipedia.org/wiki/Computer_vision">Computer Vision</a> to progress rapidly, allowing for everything from systems that can <a href="https://research.googleblog.com/2016/09/show-and-tell-image-captioning-open.html">automatically caption images</a> to apps that can create <a href="https://research.googleblog.com/2016/05/aw-so-cute-allo-helps-you-respond-to.html">natural language replies in response to shared photos</a>. Much of this progress can be attributed to publicly available image datasets, such as <a href="http://image-net.org/">ImageNet</a> and <a href="http://mscoco.org/">COCO</a> for supervised learning, and <a href="http://webscope.sandbox.yahoo.com/catalog.php?datatype=i&#38;did=67">YFCC100M</a> for unsupervised learning.<br /><br />Today, we introduce <a href="https://github.com/openimages/dataset">Open Images</a>, a dataset consisting of ~9 million URLs to images that have been annotated with labels spanning over 6000 categories. We tried to make the dataset as practical as possible: the labels cover more real-life entities than the 1000 ImageNet classes, there are enough images to train a deep neural network from scratch, and the images are listed as having a <a href="https://creativecommons.org/licenses/by/2.0/">Creative Commons Attribution</a> license<a href="http://research.googleblog.com/#1" name="top1"><sup>*</sup></a>. <br /><br />The image-level annotations have been populated automatically with a vision model similar to <a href="https://cloud.google.com/vision/">Google Cloud Vision API</a>. For the validation set, we had human raters verify these automated labels to find and remove false positives. On average, each image has about 8 labels assigned. 
Here are some examples:<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-K7oXk2v5Buk/V-2c0Cd_M-I/AAAAAAAABUE/Zl4wdW_T5rAJiYTfgG2HcYsGOjc3hOGrgCLcB/s1600/Open%2BImages.png"><img border="0" height="220" src="https://3.bp.blogspot.com/-K7oXk2v5Buk/V-2c0Cd_M-I/AAAAAAAABUE/Zl4wdW_T5rAJiYTfgG2HcYsGOjc3hOGrgCLcB/s640/Open%2BImages.png" width="640"></a></td></tr><tr><td>Annotated images from the Open Images dataset. <b>Left:</b> <a href="https://www.flickr.com/photos/kevinkrejci/2957748348">Ghost Arches</a> by <a href="https://www.flickr.com/photos/kevinkrejci/">Kevin Krejci</a>. <b>Right:</b>  <a href="https://www.flickr.com/photos/lobsterstew/3197736453">Some Silverware</a> by <a href="https://www.flickr.com/photos/lobsterstew/">J B</a>. Both images used under <a href="https://creativecommons.org/licenses/by/2.0/">CC BY 2.0</a> license.</td></tr></tbody></table>We have trained an Inception v3 model based on Open Images annotations alone, and the model is good enough to be used for fine-tuning applications as well as for other things, like <a href="https://research.googleblog.com/2015/07/deepdream-code-example-for-visualizing.html">DeepDream</a> or <a href="https://arxiv.org/abs/1508.06576">artistic style transfer</a>, which require a well-developed hierarchy of filters. We hope to improve the quality of the annotations in Open Images in the coming months, and therefore the quality of models which can be trained.<br /><br />The dataset is a product of a collaboration between Google, CMU and Cornell universities, and there are a number of research papers built on top of the Open Images dataset in the works. 
It is our hope that datasets like <a href="https://github.com/openimages/dataset">Open Images</a> and the <a href="https://research.googleblog.com/2016/09/announcing-youtube-8m-large-and-diverse.html">recently released YouTube-8M</a> will be useful tools for the machine learning community.<br /><br /><span><br /><a name="1"><b>* </b></a>While we tried to identify images that are licensed under a Creative Commons Attribution license, we make no representations or warranties regarding the license status of each image and you should verify the license for each image yourself.<a href="http://research.googleblog.com/#top1"><sup>&#8617;</sup></a><br /></span>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Ivan Krasin and Tom Duerig, Software Engineers</span><br /><br />In the last few years, advances in machine learning have enabled <a href="https://en.wikipedia.org/wiki/Computer_vision">Computer Vision</a> to progress rapidly, allowing for everything from systems that can <a href="https://research.googleblog.com/2016/09/show-and-tell-image-captioning-open.html">automatically caption images</a> to apps that can create <a href="https://research.googleblog.com/2016/05/aw-so-cute-allo-helps-you-respond-to.html">natural language replies in response to shared photos</a>. Much of this progress can be attributed to publicly available image datasets, such as <a href="http://image-net.org/">ImageNet</a> and <a href="http://mscoco.org/">COCO</a> for supervised learning, and <a href="http://webscope.sandbox.yahoo.com/catalog.php?datatype=i&amp;did=67">YFCC100M</a> for unsupervised learning.<br /><br />Today, we introduce <a href="https://github.com/openimages/dataset">Open Images</a>, a dataset consisting of ~9 million URLs to images that have been annotated with labels spanning over 6000 categories. We tried to make the dataset as practical as possible: the labels cover more real-life entities than the 1000 ImageNet classes, there are enough images to train a deep neural network from scratch, and the images are listed as having a <a href="https://creativecommons.org/licenses/by/2.0/">Creative Commons Attribution</a> license<a href="http://research.googleblog.com/2016/09/introducing-open-images-dataset.html#1" name="top1"><sup>*</sup></a>. <br /><br />The image-level annotations have been populated automatically with a vision model similar to <a href="https://cloud.google.com/vision/">Google Cloud Vision API</a>. For the validation set, we had human raters verify these automated labels to find and remove false positives. On average, each image has about 8 labels assigned. 
Here are some examples:<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-K7oXk2v5Buk/V-2c0Cd_M-I/AAAAAAAABUE/Zl4wdW_T5rAJiYTfgG2HcYsGOjc3hOGrgCLcB/s1600/Open%2BImages.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="220" src="https://3.bp.blogspot.com/-K7oXk2v5Buk/V-2c0Cd_M-I/AAAAAAAABUE/Zl4wdW_T5rAJiYTfgG2HcYsGOjc3hOGrgCLcB/s640/Open%2BImages.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Annotated images from the Open Images dataset. <b>Left:</b> <a href="https://www.flickr.com/photos/kevinkrejci/2957748348">Ghost Arches</a> by <a href="https://www.flickr.com/photos/kevinkrejci/">Kevin Krejci</a>. <b>Right:</b>  <a href="https://www.flickr.com/photos/lobsterstew/3197736453">Some Silverware</a> by <a href="https://www.flickr.com/photos/lobsterstew/">J B</a>. Both images used under <a href="https://creativecommons.org/licenses/by/2.0/">CC BY 2.0</a> license.</td></tr></tbody></table>We have trained an Inception v3 model based on Open Images annotations alone, and the model is good enough to be used for fine-tuning applications as well as for other things, like <a href="https://research.googleblog.com/2015/07/deepdream-code-example-for-visualizing.html">DeepDream</a> or <a href="https://arxiv.org/abs/1508.06576">artistic style transfer</a>, which require a well-developed hierarchy of filters. We hope to improve the quality of the annotations in Open Images in the coming months, and therefore the quality of models which can be trained.<br /><br />The dataset is a product of a collaboration between Google, CMU and Cornell universities, and there are a number of research papers built on top of the Open Images dataset in the works. 
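As a sketch of how one might work with image-level annotations like these, the snippet below aggregates confident machine-generated labels per image from a tiny inline sample. The column names and label identifiers here are assumptions for illustration only, not the dataset's actual schema (the GitHub repository documents the real file formats):

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature of a machine-generated annotations CSV:
# one row per (image, label, confidence) triple. Real files are much larger
# and are keyed by MID-style label identifiers.
sample = """ImageID,Label,Confidence
img_001,/m/0k4j,0.97
img_001,/m/07yv9,0.91
img_002,/m/01g317,0.88
img_002,/m/04yx4,0.83
img_002,/m/0k4j,0.74
"""

labels_per_image = defaultdict(list)
for row in csv.DictReader(io.StringIO(sample)):
    if float(row["Confidence"]) >= 0.8:   # keep only confident labels
        labels_per_image[row["ImageID"]].append(row["Label"])

# e.g., check the average number of labels per image
average = sum(len(v) for v in labels_per_image.values()) / len(labels_per_image)
print(average)
```

The same pattern (filter by confidence, group by image) is how you would verify statistics such as the roughly 8 labels assigned per image mentioned above, once pointed at the real annotation files.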
It is our hope that datasets like <a href="https://github.com/openimages/dataset">Open Images</a> and the <a href="https://research.googleblog.com/2016/09/announcing-youtube-8m-large-and-diverse.html">recently released YouTube-8M</a> will be useful tools for the machine learning community.<br /><br /><span class="Apple-style-span" style="font-size: small;"><br /><a name="1"><b>* </b></a>While we tried to identify images that are licensed under a Creative Commons Attribution license, we make no representations or warranties regarding the license status of each image and you should verify the license for each image yourself.<a href="http://research.googleblog.com/2016/09/introducing-open-images-dataset.html#top1"><sup>↩</sup></a><br /></span>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/introducing-the-open-images-dataset/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Image Compression with Neural Networks</title>
		<link>https://googledata.org/google-research/image-compression-with-neural-networks/</link>
		<comments>https://googledata.org/google-research/image-compression-with-neural-networks/#comments</comments>
		<pubDate>Thu, 29 Sep 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=1d8b970222b611114a044e2a3f311aec</guid>
		<description><![CDATA[<span>Posted by Nick Johnston and David Minnen, Software Engineers</span><br /><br />Data compression is used nearly everywhere on the internet - the videos you watch online, the images you share, the music you listen to, even the blog you're reading right now. Compression techniques make sharing the content you want quick and efficient. Without data compression, the time and bandwidth costs for getting the information you need, when you need it, would be exorbitant!<br /><br />In "<a href="https://arxiv.org/abs/1608.05148"><i>Full Resolution Image Compression with Recurrent Neural Networks</i></a>", we expand on our <a href="https://arxiv.org/abs/1511.06085">previous research</a> on data compression using neural networks, exploring whether machine learning can provide better results for image compression like it has for <a href="https://research.googleblog.com/2016/08/improving-inception-and-image.html">image recognition</a> and <a href="https://research.googleblog.com/2016/08/text-summarization-with-tensorflow.html">text summarization</a>. Furthermore, we are <a href="https://github.com/tensorflow/models/tree/master/compression">releasing our compression model</a>  via <a href="https://www.tensorflow.org/">TensorFlow</a> so you can experiment with compressing your own images with our network. <br /><br />We introduce an architecture that uses a new variant of the <a href="https://en.wikipedia.org/wiki/Gated_recurrent_unit">Gated Recurrent Unit</a> (a type of  <a href="https://en.wikipedia.org/wiki/Recurrent_neural_network">RNN</a> that allows units to save activations and process sequences) called Residual Gated Recurrent Unit (Residual GRU). Our Residual GRU combines existing GRUs with the residual connections introduced in "<a href="https://arxiv.org/abs/1512.03385"><i>Deep Residual Learning for Image Recognition</i></a>" to achieve significant image quality gains for a given compression rate. 
Instead of using a DCT to generate a new bit representation like many compression schemes in use today, we train two sets of neural networks - one to create the codes from the image (encoder) and another to create the image from the codes (decoder). <br /><br />Our system works by iteratively refining a reconstruction of the original image, with both the encoder and decoder using Residual GRU layers so that additional information can pass from one iteration to the next. Each iteration adds more bits to the encoding, which allows for a higher quality reconstruction. Conceptually, the network operates as follows:<br /><ol><li>The initial residual, R[0], corresponds to the original image I: R[0] = I.</li><li>Set i=1 for the first iteration.</li><li>Iteration[i] takes R[i-1] as input and runs the encoder and binarizer to compress the image into B[i].</li><li>Iteration[i] runs the decoder on B[i] to generate a reconstructed image P[i].</li><li>The residual for Iteration[i] is calculated: R[i] = I - P[i].</li><li>Set i=i+1 and go to Step 3 (up to the desired number of iterations).</li></ol>The residual image represents how different the current version of the compressed image is from the original. This image is then given as input to the network with the goal of removing the compression errors from the next version of the compressed image. The compressed image is now represented by the concatenation of B[1] through B[N]. For larger values of N, the decoder gets more information on how to reduce the errors and generate a higher quality reconstruction of the original image.<br /><br />To understand how this works, consider the following example of the first two iterations of the image compression network, shown in the figures below.  We start with an image of a lighthouse. On the first pass through the network, the original image is given as an input (R[0] = I). P[1] is the reconstructed image. 
The difference between the original image and the reconstructed image is the residual, R[1], which represents the error in the compression. <br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-zrOb9rbWFSk/V-1FFXToERI/AAAAAAAABTc/qntzldIYTQoNBLYoqLl7_fLOkz0lg6gigCLcB/s1600/1.png"><img border="0" height="316" src="https://3.bp.blogspot.com/-zrOb9rbWFSk/V-1FFXToERI/AAAAAAAABTc/qntzldIYTQoNBLYoqLl7_fLOkz0lg6gigCLcB/s640/1.png" width="640"></a></td></tr><tr><td><b>Left:</b> Original image, I = R[0]. <b>Center:</b> Reconstructed image, P[1]. <b>Right:</b> the residual, R[1], which represents the error introduced by compression.</td></tr></tbody></table>On the second pass through the network, R[1] is given as the network&#8217;s input (see figure below). A higher quality image P[2] is then created. So how does the system recreate such a good image (P[2], center panel below) from the residual R[1]? Because the model uses recurrent nodes with memory, the network saves information from each iteration that it can use in the next one. It learned something about the original image in Iteration[1] that is used along with R[1] to generate a better P[2] from B[2].  Lastly, a new residual, R[2] (right), is generated by subtracting P[2] from the original image. This time the residual is smaller since there are fewer differences between the reconstructed image and what we started with.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-Y5UNSkEIHss/V-1FXPrSYWI/AAAAAAAABTg/hQjWHhorFsAj6mchR3wYdfCt7hYePLcOwCLcB/s1600/2.png"><img border="0" height="314" src="https://2.bp.blogspot.com/-Y5UNSkEIHss/V-1FXPrSYWI/AAAAAAAABTg/hQjWHhorFsAj6mchR3wYdfCt7hYePLcOwCLcB/s640/2.png" width="640"></a></td></tr><tr><td>The second pass through the network. <b>Left:</b> R[1] is given as input. <b>Center:</b> A higher quality reconstruction, P[2]. 
<b>Right:</b> A smaller residual R[2] is generated by subtracting P[2] from the original image.</td></tr></tbody></table>At each further iteration, the network gains more information about the errors introduced by compression (which is captured by the residual image). If it can use that information to predict the residuals even a little bit, the result is a better reconstruction. Our models are able to make use of the extra bits up to a point. We see diminishing returns, and at some point the representational power of the network is exhausted.<br /><br />To demonstrate file size and quality differences, we can take a photo of Vash, a <a href="https://en.wikipedia.org/wiki/Japanese_Chin">Japanese Chin</a>, and generate two compressed images, one JPEG and one Residual GRU. Both images target a perceptual similarity of 0.9 <a href="https://ece.uwaterloo.ca/~z70wang/publications/msssim.html">MS-SSIM</a>, a perceptual quality metric that reaches 1.0 for identical images. The image generated by our learned model results in a file 25% smaller than JPEG.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-mJj8zJF2Fuo/V-1FuZ0SfRI/AAAAAAAABTk/TeUe0Y6Njfkxun7rbt_GiDS9d7Zmw3BCwCLcB/s1600/3.png"><img border="0" height="274" src="https://1.bp.blogspot.com/-mJj8zJF2Fuo/V-1FuZ0SfRI/AAAAAAAABTk/TeUe0Y6Njfkxun7rbt_GiDS9d7Zmw3BCwCLcB/s640/3.png" width="640"></a></td></tr><tr><td><b>Left:</b> Original image (1419 KB PNG) at ~1.0 MS-SSIM. <b>Center:</b> JPEG (33 KB) at ~0.9 MS-SSIM. <b>Right:</b>  Residual GRU (24 KB) at ~0.9 MS-SSIM. This is 25% smaller for comparable image quality.</td></tr></tbody></table>Taking a look around his nose and mouth, we see that our method doesn&#8217;t have the magenta blocks and noise in the middle of the image as seen in JPEG. 
This is due to the <a href="https://en.wikipedia.org/wiki/Compression_artifact#Block_boundary_artifacts">blocking artifacts</a> produced by JPEG, whereas our compression network works on the entire image at once. However, there's a tradeoff -- in our model the details of the whiskers and texture are lost, but the system shows great promise in reducing artifacts.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-VDwRAX6wro0/V-1GAMqvFOI/AAAAAAAABTs/CPWE6nhomx0VSxpsYBdzYQh5BmA9xy7HQCLcB/s1600/4.png"><img border="0" height="208" src="https://3.bp.blogspot.com/-VDwRAX6wro0/V-1GAMqvFOI/AAAAAAAABTs/CPWE6nhomx0VSxpsYBdzYQh5BmA9xy7HQCLcB/s640/4.png" width="640"></a></td></tr><tr><td><b>Left:</b> Original. <b>Center:</b> JPEG. <b>Right:</b> Residual GRU.</td></tr></tbody></table>While today&#8217;s commonly used codecs perform well, our work shows that using neural networks to compress images results in a compression scheme with higher quality and smaller file sizes. To learn more about the details of our research and a comparison of other recurrent architectures, check out <a href="https://arxiv.org/abs/1608.05148">our paper</a>. Our future work will focus on even better compression quality and faster models, so stay tuned!<br /><br />]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Nick Johnston and David Minnen, Software Engineers</span><br /><br />Data compression is used nearly everywhere on the internet - the videos you watch online, the images you share, the music you listen to, even the blog you're reading right now. Compression techniques make sharing the content you want quick and efficient. Without data compression, the time and bandwidth costs for getting the information you need, when you need it, would be exorbitant!<br /><br />In "<a href="https://arxiv.org/abs/1608.05148"><i>Full Resolution Image Compression with Recurrent Neural Networks</i></a>", we expand on our <a href="https://arxiv.org/abs/1511.06085">previous research</a> on data compression using neural networks, exploring whether machine learning can provide better results for image compression like it has for <a href="https://research.googleblog.com/2016/08/improving-inception-and-image.html">image recognition</a> and <a href="https://research.googleblog.com/2016/08/text-summarization-with-tensorflow.html">text summarization</a>. Furthermore, we are <a href="https://github.com/tensorflow/models/tree/master/compression">releasing our compression model</a>  via <a href="https://www.tensorflow.org/">TensorFlow</a> so you can experiment with compressing your own images with our network. <br /><br />We introduce an architecture that uses a new variant of the <a href="https://en.wikipedia.org/wiki/Gated_recurrent_unit">Gated Recurrent Unit</a> (a type of  <a href="https://en.wikipedia.org/wiki/Recurrent_neural_network">RNN</a> that allows units to save activations and process sequences) called Residual Gated Recurrent Unit (Residual GRU). Our Residual GRU combines existing GRUs with the residual connections introduced in "<a href="https://arxiv.org/abs/1512.03385"><i>Deep Residual Learning for Image Recognition</i></a>" to achieve significant image quality gains for a given compression rate. 
Instead of using a DCT to generate a new bit representation like many compression schemes in use today, we train two sets of neural networks - one to create the codes from the image (encoder) and another to create the image from the codes (decoder). <br /><br />Our system works by iteratively refining a reconstruction of the original image, with both the encoder and decoder using Residual GRU layers so that additional information can pass from one iteration to the next. Each iteration adds more bits to the encoding, which allows for a higher quality reconstruction. Conceptually, the network operates as follows:<br /><ol><li>The initial residual, R[0], corresponds to the original image I: R[0] = I.</li><li>Set i=1 for the first iteration.</li><li>Iteration[i] takes R[i-1] as input and runs the encoder and binarizer to compress the image into B[i].</li><li>Iteration[i] runs the decoder on B[i] to generate a reconstructed image P[i].</li><li>The residual for Iteration[i] is calculated: R[i] = I - P[i].</li><li>Set i=i+1 and go to Step 3 (up to the desired number of iterations).</li></ol>The residual image represents how different the current version of the compressed image is from the original. This image is then given as input to the network with the goal of removing the compression errors from the next version of the compressed image. The compressed image is now represented by the concatenation of B[1] through B[N]. For larger values of N, the decoder gets more information on how to reduce the errors and generate a higher quality reconstruction of the original image.<br /><br />To understand how this works, consider the following example of the first two iterations of the image compression network, shown in the figures below.  We start with an image of a lighthouse. On the first pass through the network, the original image is given as an input (R[0] = I). P[1] is the reconstructed image. 
The difference between the original image and the reconstructed image is the residual, R[1], which represents the error in the compression. <br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-zrOb9rbWFSk/V-1FFXToERI/AAAAAAAABTc/qntzldIYTQoNBLYoqLl7_fLOkz0lg6gigCLcB/s1600/1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="316" src="https://3.bp.blogspot.com/-zrOb9rbWFSk/V-1FFXToERI/AAAAAAAABTc/qntzldIYTQoNBLYoqLl7_fLOkz0lg6gigCLcB/s640/1.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><b>Left:</b> Original image, I = R[0]. <b>Center:</b> Reconstructed image, P[1]. <b>Right:</b> the residual, R[1], which represents the error introduced by compression.</td></tr></tbody></table>On the second pass through the network, R[1] is given as the network’s input (see figure below). A higher quality image P[2] is then created. So how does the system recreate such a good image (P[2], center panel below) from the residual R[1]? Because the model uses recurrent nodes with memory, the network saves information from each iteration that it can use in the next one. It learned something about the original image in Iteration[1] that is used along with R[1] to generate a better P[2] from B[2].  Lastly, a new residual, R[2] (right), is generated by subtracting P[2] from the original image. 
This time the residual is smaller since there are fewer differences between the reconstructed image and what we started with.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-Y5UNSkEIHss/V-1FXPrSYWI/AAAAAAAABTg/hQjWHhorFsAj6mchR3wYdfCt7hYePLcOwCLcB/s1600/2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="314" src="https://2.bp.blogspot.com/-Y5UNSkEIHss/V-1FXPrSYWI/AAAAAAAABTg/hQjWHhorFsAj6mchR3wYdfCt7hYePLcOwCLcB/s640/2.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The second pass through the network. <b>Left:</b> R[1] is given as input. <b>Center:</b> A higher quality reconstruction, P[2]. <b>Right:</b> A smaller residual R[2] is generated by subtracting P[2] from the original image.</td></tr></tbody></table>At each further iteration, the network gains more information about the errors introduced by compression (which is captured by the residual image). If it can use that information to predict the residuals even a little bit, the result is a better reconstruction. Our models are able to make use of the extra bits up to a point. We see diminishing returns, and at some point the representational power of the network is exhausted.<br /><br />To demonstrate file size and quality differences, we can take a photo of Vash, a <a href="https://en.wikipedia.org/wiki/Japanese_Chin">Japanese Chin</a>, and generate two compressed images, one JPEG and one Residual GRU. Both images target a perceptual similarity of 0.9 <a href="https://ece.uwaterloo.ca/~z70wang/publications/msssim.html">MS-SSIM</a>, a perceptual quality metric that reaches 1.0 for identical images. 
The image generated by our learned model results in a file 25% smaller than JPEG.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-mJj8zJF2Fuo/V-1FuZ0SfRI/AAAAAAAABTk/TeUe0Y6Njfkxun7rbt_GiDS9d7Zmw3BCwCLcB/s1600/3.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="274" src="https://1.bp.blogspot.com/-mJj8zJF2Fuo/V-1FuZ0SfRI/AAAAAAAABTk/TeUe0Y6Njfkxun7rbt_GiDS9d7Zmw3BCwCLcB/s640/3.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><b>Left:</b> Original image (1419 KB PNG) at ~1.0 MS-SSIM. <b>Center:</b> JPEG (33 KB) at ~0.9 MS-SSIM. <b>Right:</b>  Residual GRU (24 KB) at ~0.9 MS-SSIM. This is 25% smaller for comparable image quality.</td></tr></tbody></table>Taking a look around his nose and mouth, we see that our method doesn’t have the magenta blocks and noise in the middle of the image as seen in JPEG. This is due to the <a href="https://en.wikipedia.org/wiki/Compression_artifact#Block_boundary_artifacts">blocking artifacts</a> produced by JPEG, whereas our compression network works on the entire image at once. 
However, there's a tradeoff -- in our model the details of the whiskers and texture are lost, but the system shows great promise in reducing artifacts.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-VDwRAX6wro0/V-1GAMqvFOI/AAAAAAAABTs/CPWE6nhomx0VSxpsYBdzYQh5BmA9xy7HQCLcB/s1600/4.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="208" src="https://3.bp.blogspot.com/-VDwRAX6wro0/V-1GAMqvFOI/AAAAAAAABTs/CPWE6nhomx0VSxpsYBdzYQh5BmA9xy7HQCLcB/s640/4.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><b>Left:</b> Original. <b>Center:</b> JPEG. <b>Right:</b> Residual GRU.</td></tr></tbody></table>While today’s commonly used codecs perform well, our work shows that using neural networks to compress images results in a compression scheme with higher quality and smaller file sizes. To learn more about the details of our research and a comparison of other recurrent architectures, check out <a href="https://arxiv.org/abs/1608.05148">our paper</a>. Our future work will focus on even better compression quality and faster models, so stay tuned!<br /><br />]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/image-compression-with-neural-networks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
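The iterative refinement loop in the post above can be sketched in a few lines of Python. This is a minimal stand-in, not the released TensorFlow model: the real encoder, binarizer, and decoder are learned Residual GRU networks that carry memory across iterations, whereas here a hypothetical coarse quantizer plays the encoder and the reconstruction simply accumulates the decoded residuals.

```python
import numpy as np

def compress(image, iterations=4, step=64.0):
    """Iterative refinement: each pass encodes the remaining residual
    R[i-1] into codes B[i], and the reconstruction P[i] improves as
    more bits are added."""
    codes = []
    recon = np.zeros_like(image, dtype=np.float64)   # P[0]
    residual = image.astype(np.float64)              # R[0] = I
    for _ in range(iterations):
        b = np.round(residual / step)   # encoder + binarizer stand-in -> B[i]
        codes.append(b)
        recon = recon + b * step        # decoder stand-in -> P[i]
        residual = image - recon        # R[i] = I - P[i]
        step /= 4.0                     # later passes spend bits on finer detail
    return codes, recon

img = np.random.default_rng(1).uniform(0.0, 255.0, (8, 8))
codes, recon = compress(img)
err = float(np.abs(img - recon).max())  # shrinks as iterations increase
```

As in the post, the compressed representation is the concatenation of the per-iteration codes, and stopping after fewer iterations trades reconstruction quality for a smaller encoding.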
		<item>
		<title>Announcing YouTube-8M: A Large and Diverse Labeled Video Dataset for Video Understanding Research</title>
		<link>https://googledata.org/youtube/announcing-youtube-8m-a-large-and-diverse-labeled-video-dataset-for-video-understanding-research/</link>
		<comments>https://googledata.org/youtube/announcing-youtube-8m-a-large-and-diverse-labeled-video-dataset-for-video-understanding-research/#comments</comments>
		<pubDate>Wed, 28 Sep 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Youtube]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=8564253f15fe1f87fac089f133532dce</guid>
		<description><![CDATA[<span>Posted by Sudheendra Vijayanarasimhan and Paul Natsev, Software Engineers</span><br /><br />Many recent breakthroughs in machine learning and machine perception have come from the availability of large labeled datasets, such as <a href="http://www.image-net.org/">ImageNet</a>, which has millions of images labeled with thousands of classes. Their availability has significantly accelerated research in image understanding, for example on <a href="http://googleresearch.blogspot.com/2014/09/building-deeper-understanding-of-images.html">detecting and classifying objects in static images</a>.<br /><br /><a href="https://research.googleblog.com/2015/04/beyond-short-snippets-deep-networks-for.html">Video analysis</a> provides even more information for detecting and recognizing objects, and understanding human actions and interactions with the world.  Improving video understanding can lead to better video search and discovery, similarly to how image understanding <a href="https://googleblog.blogspot.com/2015/05/picture-this-fresh-approach-to-photos.html">helped re-imagine the photos experience</a>. However, one of the key bottlenecks for further advancements in this area has been the lack of real-world video datasets with the same scale and diversity as image datasets. <br /><br />Today, we are excited to announce the release of <a href="https://research.google.com/youtube8m">YouTube-8M</a>, a dataset of 8 million YouTube video URLs (representing over 500,000 hours of video), along with video-level labels from a diverse set of 4800 <a href="https://developers.google.com/knowledge-graph/">Knowledge Graph</a> entities.  This represents a significant increase in scale and diversity compared to existing video datasets. 
For example, <a href="https://github.com/gtoderici/sports-1m-dataset">Sports-1M</a>, the largest existing labeled video dataset we are aware of, has around 1 million YouTube videos and 500 sports-specific classes--YouTube-8M represents nearly an <i>order of magnitude increase</i> in both number of videos <i>and</i> classes.<br /><div><a href="https://2.bp.blogspot.com/-7TeQwPP34iE/V-vnyZPp0dI/AAAAAAAABRA/uH19ST1iJdITzX73A8Uu5HMRjTrLvWc-QCEw/s1600/image08.png"><img border="0" height="74" src="https://2.bp.blogspot.com/-7TeQwPP34iE/V-vnyZPp0dI/AAAAAAAABRA/uH19ST1iJdITzX73A8Uu5HMRjTrLvWc-QCEw/s640/image08.png" width="640"></a></div>In order to construct a labeled video dataset of this scale, we needed to address two key challenges: (1) video is much more time-consuming to annotate manually than images, and (2) video is very computationally expensive to process and store. To overcome (1), we turned to YouTube and its video annotation system, which identifies relevant Knowledge Graph topics for all public YouTube videos.  While these annotations are machine-generated, they incorporate powerful user engagement signals from millions of users as well as video metadata and content analysis. As a result, the quality of these annotations is sufficiently high to be useful for video understanding research and benchmarking purposes. <br /><div> <br /></div>To ensure the stability and quality of the labeled video dataset, we used only public videos with more than 1000 views, and we constructed a diverse vocabulary of entities, which are visually observable and sufficiently frequent. The vocabulary construction was a combination of frequency analysis, automated filtering, verification by human raters that the entities are visually observable, and grouping into 24 top-level verticals (more details in our <a href="http://arxiv.org/abs/1609.08675">technical report</a>). 
The figures below depict the <a href="https://research.google.com/youtube8m/explore.html">dataset browser</a> and the distribution of videos along the top-level verticals, and illustrate the dataset&#8217;s scale and diversity.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-nyrn6uJpjtk/V-vwHkrY5JI/AAAAAAAABRU/uqjVCmFGq_gSfVRlEeyHRkcirHwE3BNfQCLcB/s1600/image07.png"><img border="0" height="428" src="https://4.bp.blogspot.com/-nyrn6uJpjtk/V-vwHkrY5JI/AAAAAAAABRU/uqjVCmFGq_gSfVRlEeyHRkcirHwE3BNfQCLcB/s640/image07.png" width="640"></a></td></tr><tr><td>A <a href="https://research.google.com/youtube8m/explore.html">dataset explorer </a>allows browsing and searching the full vocabulary of Knowledge Graph entities, grouped in 24 top-level verticals, along with corresponding videos. This screenshot depicts a subset of dataset videos annotated with the entity &#8220;Guitar&#8221;.</td></tr></tbody></table><div></div><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-a9S_rpU77wk/V-vwjxMX96I/AAAAAAAABRc/x8OI8UrD0tgyjSfNc49cwq2PLI1ooMntQCLcB/s1600/image09.png"><img border="0" height="480" src="https://1.bp.blogspot.com/-a9S_rpU77wk/V-vwjxMX96I/AAAAAAAABRc/x8OI8UrD0tgyjSfNc49cwq2PLI1ooMntQCLcB/s640/image09.png" width="640"></a></td></tr><tr><td>The distribution of videos in the top-level verticals illustrates the scope and diversity of the dataset and reflects the natural distribution of popular YouTube videos.</td></tr></tbody></table>To address (2), we had to overcome the storage and computational resource bottlenecks that researchers face when working with videos. Pursuing video understanding at YouTube-8M&#8217;s scale would normally require a petabyte of video storage and dozens of CPU-years worth of processing.  
To make the dataset useful to researchers and students with limited computational resources, we pre-processed the videos and extracted frame-level <a href="https://en.wikipedia.org/wiki/Feature_(machine_learning)">features</a> using a state-of-the-art deep learning model--the publicly available <a href="https://www.tensorflow.org/versions/r0.9/tutorials/image_recognition/index.html">Inception-V3 image annotation model</a> trained on ImageNet. These features are extracted at 1 frame-per-second temporal resolution, from 1.9 billion video frames, and are further compressed to fit on a single commodity hard disk (less than 1.5 TB).  This makes it possible to download this dataset and train a baseline <a href="https://www.tensorflow.org/">TensorFlow</a> model at full scale on a single GPU in less than a day!<br /><br />We believe this dataset can significantly accelerate research on video understanding as it enables researchers and students without access to big data or big machines to do their research at previously unprecedented scale. We hope this dataset will spur exciting new research on video modeling architectures and representation learning, especially approaches that deal effectively with noisy or incomplete labels, transfer learning and domain adaptation.  In fact, we show that pre-training models on this dataset and applying / fine-tuning on other external datasets leads to state of the art performance on them (e.g.&#160;<a href="http://activity-net.org/">ActivityNet,</a> <a href="https://github.com/gtoderici/sports-1m-dataset">Sports-1M</a>). You can read all about our experiments using this dataset, along with more details on how we constructed it, in our <a href="http://arxiv.org/abs/1609.08675">technical report</a>.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Sudheendra Vijayanarasimhan and Paul Natsev, Software Engineers</span><br /><br />Many recent breakthroughs in machine learning and machine perception have come from the availability of large labeled datasets, such as <a href="http://www.image-net.org/">ImageNet</a>, which has millions of images labeled with thousands of classes. Their availability has significantly accelerated research in image understanding, for example on <a href="http://googleresearch.blogspot.com/2014/09/building-deeper-understanding-of-images.html">detecting and classifying objects in static images</a>.<br /><br /><a href="https://research.googleblog.com/2015/04/beyond-short-snippets-deep-networks-for.html">Video analysis</a> provides even more information for detecting and recognizing objects, and understanding human actions and interactions with the world.  Improving video understanding can lead to better video search and discovery, similarly to how image understanding <a href="https://googleblog.blogspot.com/2015/05/picture-this-fresh-approach-to-photos.html">helped re-imagine the photos experience</a>. However, one of the key bottlenecks for further advancements in this area has been the lack of real-world video datasets with the same scale and diversity as image datasets. <br /><br />Today, we are excited to announce the release of <a href="https://research.google.com/youtube8m">YouTube-8M</a>, a dataset of 8 million YouTube video URLs (representing over 500,000 hours of video), along with video-level labels from a diverse set of 4800 <a href="https://developers.google.com/knowledge-graph/">Knowledge Graph</a> entities.  This represents a significant increase in scale and diversity compared to existing video datasets. 
For example, <a href="https://github.com/gtoderici/sports-1m-dataset">Sports-1M</a>, the largest existing labeled video dataset we are aware of, has around 1 million YouTube videos and 500 sports-specific classes--YouTube-8M represents nearly an <i>order of magnitude increase</i> in both number of videos <i>and</i> classes.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-7TeQwPP34iE/V-vnyZPp0dI/AAAAAAAABRA/uH19ST1iJdITzX73A8Uu5HMRjTrLvWc-QCEw/s1600/image08.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="74" src="https://2.bp.blogspot.com/-7TeQwPP34iE/V-vnyZPp0dI/AAAAAAAABRA/uH19ST1iJdITzX73A8Uu5HMRjTrLvWc-QCEw/s640/image08.png" width="640" /></a></div>In order to construct a labeled video dataset of this scale, we needed to address two key challenges: (1) video is much more time-consuming to annotate manually than images, and (2) video is very computationally expensive to process and store. To overcome (1), we turned to YouTube and its video annotation system, which identifies relevant Knowledge Graph topics for all public YouTube videos.  While these annotations are machine-generated, they incorporate powerful user engagement signals from millions of users as well as video metadata and content analysis. As a result, the quality of these annotations is sufficiently high to be useful for video understanding research and benchmarking purposes. <br /><div class="separator" style="clear: both; text-align: center;"><iframe height="400px" src="https://research.google.com/youtube8m/tagcloud.html" width="100%"> </iframe><br /></div>To ensure the stability and quality of the labeled video dataset, we used only public videos with more than 1000 views, and we constructed a diverse vocabulary of entities, which are visually observable and sufficiently frequent. 
The vocabulary construction was a combination of frequency analysis, automated filtering, verification by human raters that the entities are visually observable, and grouping into 24 top-level verticals (more details in our <a href="http://arxiv.org/abs/1609.08675">technical report</a>). The figures below depict the <a href="https://research.google.com/youtube8m/explore.html">dataset browser</a> and the distribution of videos along the top-level verticals, and illustrate the dataset’s scale and diversity.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-nyrn6uJpjtk/V-vwHkrY5JI/AAAAAAAABRU/uqjVCmFGq_gSfVRlEeyHRkcirHwE3BNfQCLcB/s1600/image07.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="428" src="https://4.bp.blogspot.com/-nyrn6uJpjtk/V-vwHkrY5JI/AAAAAAAABRU/uqjVCmFGq_gSfVRlEeyHRkcirHwE3BNfQCLcB/s640/image07.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">A <a href="https://research.google.com/youtube8m/explore.html">dataset explorer </a>allows browsing and searching the full vocabulary of Knowledge Graph entities, grouped in 24 top-level verticals, along with corresponding videos. 
This screenshot depicts a subset of dataset videos annotated with the entity “Guitar”.</td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-a9S_rpU77wk/V-vwjxMX96I/AAAAAAAABRc/x8OI8UrD0tgyjSfNc49cwq2PLI1ooMntQCLcB/s1600/image09.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="480" src="https://1.bp.blogspot.com/-a9S_rpU77wk/V-vwjxMX96I/AAAAAAAABRc/x8OI8UrD0tgyjSfNc49cwq2PLI1ooMntQCLcB/s640/image09.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The distribution of videos in the top-level verticals illustrates the scope and diversity of the dataset and reflects the natural distribution of popular YouTube videos.</td></tr></tbody></table>To address (2), we had to overcome the storage and computational resource bottlenecks that researchers face when working with videos. Pursuing video understanding at YouTube-8M’s scale would normally require a petabyte of video storage and dozens of CPU-years worth of processing.  To make the dataset useful to researchers and students with limited computational resources, we pre-processed the videos and extracted frame-level <a href="https://en.wikipedia.org/wiki/Feature_(machine_learning)">features</a> using a state-of-the-art deep learning model--the publicly available <a href="https://www.tensorflow.org/versions/r0.9/tutorials/image_recognition/index.html">Inception-V3 image annotation model</a> trained on ImageNet. These features are extracted at 1 frame-per-second temporal resolution, from 1.9 billion video frames, and are further compressed to fit on a single commodity hard disk (less than 1.5 TB).  
This makes it possible to download this dataset and train a baseline <a href="https://www.tensorflow.org/">TensorFlow</a> model at full scale on a single GPU in less than a day!<br /><br />We believe this dataset can significantly accelerate research on video understanding as it enables researchers and students without access to big data or big machines to do their research at previously unprecedented scale. We hope this dataset will spur exciting new research on video modeling architectures and representation learning, especially approaches that deal effectively with noisy or incomplete labels, transfer learning and domain adaptation.  In fact, we show that pre-training models on this dataset and applying / fine-tuning on other external datasets leads to state of the art performance on them (e.g.&nbsp;<a href="http://activity-net.org/">ActivityNet,</a> <a href="https://github.com/gtoderici/sports-1m-dataset">Sports-1M</a>). You can read all about our experiments using this dataset, along with more details on how we constructed it, in our <a href="http://arxiv.org/abs/1609.08675">technical report</a>.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/youtube/announcing-youtube-8m-a-large-and-diverse-labeled-video-dataset-for-video-understanding-research/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
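The pre-processing baseline described above (frame-level features aggregated to a video-level representation) can be sketched as follows. The arrays here are random stand-ins for the released features (assumed to be 1024-dimensional, one vector per second of video), and the 5-class sigmoid head is a toy substitute for a learned classifier over the dataset's 4800 entities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for the released features: one [num_frames, 1024]
# array per video, one feature vector per second of video.
videos = [rng.normal(size=(int(rng.integers(30, 120)), 1024)) for _ in range(8)]

def video_level_features(frames):
    """Average-pool frame-level features into a single video-level
    vector -- a simple baseline aggregation."""
    return frames.mean(axis=0)

X = np.stack([video_level_features(v) for v in videos])

# Toy multi-label head: one sigmoid per entity (5 classes here instead
# of the dataset's 4800, with untrained random weights).
W = rng.normal(scale=0.01, size=(1024, 5))
probs = 1.0 / (1.0 + np.exp(-(X @ W)))
```

Because each video collapses to a single fixed-size vector, even a large corpus fits in modest memory, which is what makes single-GPU baseline training feasible at this scale.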
		<item>
		<title>A Neural Network for Machine Translation, at Production Scale</title>
		<link>https://googledata.org/google-translate/a-neural-network-for-machine-translation-at-production-scale/</link>
		<comments>https://googledata.org/google-translate/a-neural-network-for-machine-translation-at-production-scale/#comments</comments>
		<pubDate>Tue, 27 Sep 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Google Translate]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=29cafe99ba6a5d7c27b9cb9b57a8c2a4</guid>
		<description><![CDATA[<span>Posted by Quoc V. Le &#38; Mike Schuster, Research Scientists, Google Brain Team</span><br /><br />Ten years ago, we announced the <a href="https://research.googleblog.com/2006/04/statistical-machine-translation-live.html">launch of Google Translate</a>, together with the use of <a href="https://en.wikipedia.org/wiki/Statistical_machine_translation#Phrase-based_translation">Phrase-Based Machine Translation</a> as the key algorithm behind this service. Since then, rapid advances in machine intelligence have improved our <a href="https://research.googleblog.com/2012/08/speech-recognition-and-deep-learning.html">speech recognition</a> and <a href="https://research.googleblog.com/2014/09/building-deeper-understanding-of-images.html">image recognition</a> capabilities, but improving machine translation remains a challenging goal.<br /><br />Today we announce the Google Neural Machine Translation system (GNMT), which utilizes state-of-the-art training techniques to achieve the largest improvements to date for machine translation quality. Our full research results are described in a new technical report we are releasing today: &#8220;<i><a href="http://arxiv.org/abs/1609.08144">Google&#8217;s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation</a></i>&#8221; [1]. <br /><br />A few years ago we started using <a href="https://en.wikipedia.org/wiki/Recurrent_neural_network">Recurrent Neural Networks</a> (RNNs) to directly learn the mapping between an input sequence (e.g. a sentence in one language) to an output sequence (that same sentence in another language) [2]. 
Whereas Phrase-Based Machine Translation (PBMT) breaks an input sentence into words and phrases to be translated largely independently, Neural Machine Translation (NMT) considers the entire input sentence as a unit for translation. The advantage of this approach is that it requires fewer engineering design choices than previous Phrase-Based translation systems. When it first came out, NMT showed accuracy comparable to existing Phrase-Based translation systems on modest-sized public benchmark data sets.<br /><br />Since then, researchers have proposed many techniques to improve NMT, including work on handling rare words by mimicking an external alignment model [3], using attention to align input words and output words [4] and breaking words into smaller units to cope with rare words [5,6]. Despite these improvements, NMT wasn't fast or accurate enough to be used in a production system, such as Google Translate. Our new paper [1] describes how we overcame the many challenges to make NMT work on very large data sets and built a system that is fast and accurate enough to provide better translations for Google&#8217;s users and services.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-jOLa-LdidQU/V-qV2oJn1aI/AAAAAAAABPg/-6OhKKPhxT89Vs9HhyKMEnyG_0ncWGjJQCLcB/s1600/image00.png"><img border="0" height="370" src="https://1.bp.blogspot.com/-jOLa-LdidQU/V-qV2oJn1aI/AAAAAAAABPg/-6OhKKPhxT89Vs9HhyKMEnyG_0ncWGjJQCLcB/s640/image00.png" width="640"></a></td></tr><tr><td>Data from side-by-side evaluations, where human raters compare the quality of translations for a given source sentence. Scores range from 0 to 6, with 0 meaning &#8220;completely nonsense translation&#8221;, and 6 meaning &#8220;perfect translation."</td></tr></tbody></table>The following visualization shows the progression of GNMT as it translates a Chinese sentence to English. 
First, the network encodes the Chinese words as a list of vectors, where each vector represents the meaning of all words read so far (&#8220;Encoder&#8221;). Once the entire sentence is read, the decoder begins, generating the English sentence one word at a time (&#8220;Decoder&#8221;). To generate the translated word at each step, the decoder pays attention to a weighted distribution over the encoded Chinese vectors most relevant to generate the English word (&#8220;Attention&#8221;; the blue link transparency represents how much the decoder pays attention to an encoded word).<br /><div><a href="https://3.bp.blogspot.com/-3Pbj_dvt0Vo/V-qe-Nl6P5I/AAAAAAAABQc/z0_6WtVWtvARtMk0i9_AtLeyyGyV6AI4wCLcB/s1600/nmt-model-fast.gif"><img border="0" height="324" src="https://3.bp.blogspot.com/-3Pbj_dvt0Vo/V-qe-Nl6P5I/AAAAAAAABQc/z0_6WtVWtvARtMk0i9_AtLeyyGyV6AI4wCLcB/s640/nmt-model-fast.gif" width="640"></a></div>Using human-rated side-by-side comparison as a metric, the GNMT system produces translations that are vastly improved compared to the previous phrase-based production system. GNMT reduces translation errors by more than 55%-85% on several major language pairs measured on sampled sentences from Wikipedia and news websites with the help of bilingual human raters.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-TAEq5oc14jQ/V-qWTeqaA7I/AAAAAAAABPo/IEmOBO6x7nIkzLqomgk_DwVtzvpEtJF1QCLcB/s1600/img3.png"><img border="0" height="168" src="https://1.bp.blogspot.com/-TAEq5oc14jQ/V-qWTeqaA7I/AAAAAAAABPo/IEmOBO6x7nIkzLqomgk_DwVtzvpEtJF1QCLcB/s640/img3.png" width="640"></a></td></tr><tr><td>An example of a translation produced by our system for an input sentence sampled from a news site. 
Go <a href="https://drive.google.com/file/d/0B4-Ig7UAZe3BSUYweVo3eVhNY3c/view?usp=sharing">here</a> for more examples of translations for input sentences sampled randomly from news sites and books.</td></tr></tbody></table>In addition to releasing this research paper today, we are announcing the launch of GNMT in production on a notoriously difficult language pair: Chinese to English. The Google Translate mobile and web apps are now using GNMT for 100% of machine translations from Chinese to English&#8212;about 18 million translations per day. The production deployment of GNMT was made possible by use of our publicly available machine learning toolkit <a href="https://www.tensorflow.org/">TensorFlow</a> and our <a href="https://en.wikipedia.org/wiki/Tensor_processing_unit">Tensor Processing Units</a> (TPUs), which provide sufficient computational power to deploy these powerful GNMT models while meeting the stringent latency requirements of the Google Translate product. Translating from Chinese to English is one of the more than 10,000 language pairs supported by Google Translate, and we will be working to roll out GNMT to many more of these over the coming months. <br /><br />Machine translation is by no means solved. GNMT can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms, and translating sentences in isolation rather than considering the context of the paragraph or page. There is still a lot of work we can do to serve our users better. However, GNMT represents a significant milestone. We would like to celebrate it with the many researchers and engineers&#8212;both within Google and the wider community&#8212;who have contributed to this direction of research in the past few years. 
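The attention step described above can be illustrated with a toy dot-product attention computation. This is a hand-rolled sketch with made-up dimensions, not GNMT's learned attention network, which is trained end-to-end and considerably more elaborate:

```python
import numpy as np

def toy_attention(decoder_state, encoder_states):
    # Score each encoded source vector against the current decoder state,
    # turn the scores into a probability distribution (softmax), and
    # blend the source vectors into a single context vector that the
    # decoder can use to generate the next target word.
    scores = encoder_states @ decoder_state
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    context = weights @ encoder_states
    return weights, context

rng = np.random.default_rng(42)
encoded = rng.normal(size=(5, 8))  # 5 source words, 8-d encodings (illustrative)
state = rng.normal(size=8)
weights, context = toy_attention(state, encoded)
# weights sums to 1; larger entries mark the source words being "attended" to,
# which is what the blue link transparency visualizes above.
```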
<br /><br /><b>Acknowledgements:</b><br />We thank members of the <a href="http://g.co/brain">Google Brain team</a> and the <a href="https://translate.google.com/">Google Translate team</a> for their help with the project. We thank Nikhil Thorat and the <a href="https://research.google.com/bigpicture/">Big Picture team</a> for the visualization.<br /><br /><b>References:</b><br />[1] <a href="http://arxiv.org/abs/1609.08144">Google&#8217;s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation</a>, <i>Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, &#321;ukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, Jeffrey Dean. Technical Report, 2016.</i><br />[2] <a href="https://arxiv.org/abs/1409.3215">Sequence to Sequence Learning with Neural Networks</a>,<i> Ilya Sutskever, Oriol Vinyals, Quoc V. Le.  Advances in Neural Information Processing Systems, 2014.</i><br />[3] <a href="https://arxiv.org/abs/1410.8206">Addressing the rare word problem in neural machine translation</a>, <i>Minh-Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, and Wojciech Zaremba. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, 2015.</i><br />[4] <a href="https://arxiv.org/abs/1409.0473">Neural Machine Translation by Jointly Learning to Align and Translate</a>, <i>Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. International Conference on Learning Representations, 2015.</i><br />[5] <a href="http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37842.pdf">Japanese and Korean voice search</a>, <i>Mike Schuster, and Kaisuke Nakajima. 
IEEE International Conference on Acoustics, Speech and Signal Processing, 2012.</i><br />[6] <a href="http://arxiv.org/abs/1508.07909">Neural Machine Translation of Rare Words with Subword Units</a>, <i>Rico Sennrich, Barry Haddow, Alexandra Birch. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016.</i><br /><br />]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Quoc V. Le &amp; Mike Schuster, Research Scientists, Google Brain Team</span><br /><br />Ten years ago, we announced the <a href="https://research.googleblog.com/2006/04/statistical-machine-translation-live.html">launch of Google Translate</a>, together with the use of <a href="https://en.wikipedia.org/wiki/Statistical_machine_translation#Phrase-based_translation">Phrase-Based Machine Translation</a> as the key algorithm behind this service. Since then, rapid advances in machine intelligence have improved our <a href="https://research.googleblog.com/2012/08/speech-recognition-and-deep-learning.html">speech recognition</a> and <a href="https://research.googleblog.com/2014/09/building-deeper-understanding-of-images.html">image recognition</a> capabilities, but improving machine translation remains a challenging goal.<br /><br />Today we announce the Google Neural Machine Translation system (GNMT), which utilizes state-of-the-art training techniques to achieve the largest improvements to date for machine translation quality. Our full research results are described in a new technical report we are releasing today: “<i><a href="http://arxiv.org/abs/1609.08144">Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation</a></i>” [1]. <br /><br />A few years ago we started using <a href="https://en.wikipedia.org/wiki/Recurrent_neural_network">Recurrent Neural Networks</a> (RNNs) to directly learn the mapping between an input sequence (e.g. a sentence in one language) to an output sequence (that same sentence in another language) [2]. 
Whereas Phrase-Based Machine Translation (PBMT) breaks an input sentence into words and phrases to be translated largely independently, Neural Machine Translation (NMT) considers the entire input sentence as a unit for translation. The advantage of this approach is that it requires fewer engineering design choices than previous Phrase-Based translation systems. When it first came out, NMT showed accuracy comparable to existing Phrase-Based translation systems on modest-sized public benchmark data sets.<br /><br />Since then, researchers have proposed many techniques to improve NMT, including work on handling rare words by mimicking an external alignment model [3], using attention to align input words and output words [4] and breaking words into smaller units to cope with rare words [5,6]. Despite these improvements, NMT wasn't fast or accurate enough to be used in a production system, such as Google Translate. Our new paper [1] describes how we overcame the many challenges to make NMT work on very large data sets and built a system that is fast and accurate enough to provide better translations for Google’s users and services.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-jOLa-LdidQU/V-qV2oJn1aI/AAAAAAAABPg/-6OhKKPhxT89Vs9HhyKMEnyG_0ncWGjJQCLcB/s1600/image00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="370" src="https://1.bp.blogspot.com/-jOLa-LdidQU/V-qV2oJn1aI/AAAAAAAABPg/-6OhKKPhxT89Vs9HhyKMEnyG_0ncWGjJQCLcB/s640/image00.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Data from side-by-side evaluations, where human raters compare the quality of translations for a given source sentence. 
Scores range from 0 to 6, with 0 meaning “completely nonsense translation”, and 6 meaning “perfect translation."</td></tr></tbody></table>The following visualization shows the progression of GNMT as it translates a Chinese sentence to English. First, the network encodes the Chinese words as a list of vectors, where each vector represents the meaning of all words read so far (“Encoder”). Once the entire sentence is read, the decoder begins, generating the English sentence one word at a time (“Decoder”). To generate the translated word at each step, the decoder pays attention to a weighted distribution over the encoded Chinese vectors most relevant to generate the English word (“Attention”; the blue link transparency represents how much the decoder pays attention to an encoded word).<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-3Pbj_dvt0Vo/V-qe-Nl6P5I/AAAAAAAABQc/z0_6WtVWtvARtMk0i9_AtLeyyGyV6AI4wCLcB/s1600/nmt-model-fast.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="324" src="https://3.bp.blogspot.com/-3Pbj_dvt0Vo/V-qe-Nl6P5I/AAAAAAAABQc/z0_6WtVWtvARtMk0i9_AtLeyyGyV6AI4wCLcB/s640/nmt-model-fast.gif" width="640" /></a></div>Using human-rated side-by-side comparison as a metric, the GNMT system produces translations that are vastly improved compared to the previous phrase-based production system. 
GNMT reduces translation errors by more than 55%-85% on several major language pairs measured on sampled sentences from Wikipedia and news websites with the help of bilingual human raters.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-TAEq5oc14jQ/V-qWTeqaA7I/AAAAAAAABPo/IEmOBO6x7nIkzLqomgk_DwVtzvpEtJF1QCLcB/s1600/img3.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="168" src="https://1.bp.blogspot.com/-TAEq5oc14jQ/V-qWTeqaA7I/AAAAAAAABPo/IEmOBO6x7nIkzLqomgk_DwVtzvpEtJF1QCLcB/s640/img3.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">An example of a translation produced by our system for an input sentence sampled from a news site. Go <a href="https://drive.google.com/file/d/0B4-Ig7UAZe3BSUYweVo3eVhNY3c/view?usp=sharing">here</a> for more examples of translations for input sentences sampled randomly from news sites and books.</td></tr></tbody></table>In addition to releasing this research paper today, we are announcing the launch of GNMT in production on a notoriously difficult language pair: Chinese to English. The Google Translate mobile and web apps are now using GNMT for 100% of machine translations from Chinese to English—about 18 million translations per day. The production deployment of GNMT was made possible by use of our publicly available machine learning toolkit <a href="https://www.tensorflow.org/">TensorFlow</a> and our <a href="https://en.wikipedia.org/wiki/Tensor_processing_unit">Tensor Processing Units</a> (TPUs), which provide sufficient computational power to deploy these powerful GNMT models while meeting the stringent latency requirements of the Google Translate product. 
Translating from Chinese to English is one of the more than 10,000 language pairs supported by Google Translate, and we will be working to roll out GNMT to many more of these over the coming months. <br /><br />Machine translation is by no means solved. GNMT can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms, and translating sentences in isolation rather than considering the context of the paragraph or page. There is still a lot of work we can do to serve our users better. However, GNMT represents a significant milestone. We would like to celebrate it with the many researchers and engineers—both within Google and the wider community—who have contributed to this direction of research in the past few years. <br /><br /><b>Acknowledgements:</b><br />We thank members of the <a href="http://g.co/brain">Google Brain team</a> and the <a href="https://translate.google.com/">Google Translate team</a> for their help with the project. We thank Nikhil Thorat and the <a href="https://research.google.com/bigpicture/">Big Picture team</a> for the visualization.<br /><br /><b>References:</b><br />[1] <a href="http://arxiv.org/abs/1609.08144">Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation</a>, <i>Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, Jeffrey Dean. Technical Report, 2016.</i><br />[2] <a href="https://arxiv.org/abs/1409.3215">Sequence to Sequence Learning with Neural Networks</a>,<i> Ilya Sutskever, Oriol Vinyals, Quoc V. Le.  
Advances in Neural Information Processing Systems, 2014.</i><br />[3] <a href="https://arxiv.org/abs/1410.8206">Addressing the rare word problem in neural machine translation</a>, <i>Minh-Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, and Wojciech Zaremba. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, 2015.</i><br />[4] <a href="https://arxiv.org/abs/1409.0473">Neural Machine Translation by Jointly Learning to Align and Translate</a>, <i>Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. International Conference on Learning Representations, 2015.</i><br />[5] <a href="http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37842.pdf">Japanese and Korean voice search</a>, <i>Mike Schuster, and Kaisuke Nakajima. IEEE International Conference on Acoustics, Speech and Signal Processing, 2012.</i><br />[6] <a href="http://arxiv.org/abs/1508.07909">Neural Machine Translation of Rare Words with Subword Units</a>, <i>Rico Sennrich, Barry Haddow, Alexandra Birch. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016.</i><br /><br />]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-translate/a-neural-network-for-machine-translation-at-production-scale/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Show and Tell: image captioning open sourced in TensorFlow</title>
		<link>https://googledata.org/google-research/show-and-tell-image-captioning-open-sourced-in-tensorflow/</link>
		<comments>https://googledata.org/google-research/show-and-tell-image-captioning-open-sourced-in-tensorflow/#comments</comments>
		<pubDate>Thu, 22 Sep 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=0204f8dea2e46da552cc7d2f1611b7a8</guid>
		<description><![CDATA[<span>Posted by Chris Shallue, Software Engineer, Google Brain Team</span><br /><br />In 2014, research scientists on the <a href="http://g.co/brain">Google Brain team</a> trained a <a href="https://research.googleblog.com/2014/11/a-picture-is-worth-thousand-coherent.html">machine learning system to automatically produce captions that accurately describe images</a>. Further development of that system led to its success in the <a href="http://mscoco.org/dataset/#captions-leaderboard">Microsoft COCO 2015 image captioning challenge</a>, a competition to compare the best algorithms for computing accurate image captions, where it tied for first place.<br /><br />Today, we&#8217;re making the latest version of our image captioning system <a href="https://github.com/tensorflow/models/tree/master/im2txt">available as an open source model</a> in <a href="https://www.tensorflow.org/">TensorFlow</a>. This release contains significant improvements to the computer vision component of the captioning system, is much faster to train, and produces more detailed and accurate descriptions compared to the original system. 
These improvements are outlined and analyzed in the paper <i><a href="http://arxiv.org/abs/1609.06647">Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge</a></i>, published in <a href="http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=34">IEEE Transactions on Pattern Analysis and Machine Intelligence</a>.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-0QMvTn0tWAY/V-L8xK8LxsI/AAAAAAAABO0/pLVC06wzDE4rNFhL1iTTVaU4iFR7VUZHwCLcB/s1600/Caption4.png"><img border="0" height="400" src="https://2.bp.blogspot.com/-0QMvTn0tWAY/V-L8xK8LxsI/AAAAAAAABO0/pLVC06wzDE4rNFhL1iTTVaU4iFR7VUZHwCLcB/s400/Caption4.png" width="373"></a></td></tr><tr><td>Automatically captioned by our system.</td></tr></tbody></table><b>So what&#8217;s new?</b><br /><br />Our 2014 system used the <a href="https://arxiv.org/abs/1409.4842">Inception V1</a> image classification model to initialize the image encoder, which produces the encodings that are useful for recognizing different objects in the images. This was the best image model available at the time, achieving 89.6% top-5 accuracy on the benchmark ImageNet 2012 image classification task. We replaced this in 2015 with the newer <a href="http://arxiv.org/abs/1502.03167">Inception V2</a> image classification model, which achieves 91.8% accuracy on the same task. The improved vision component gave our captioning system an accuracy boost of 2 points in the BLEU-4 metric (which is commonly used in machine translation to evaluate the quality of generated sentences) and was an important factor in its success in the captioning challenge.<br /><br />Today&#8217;s code release initializes the image encoder using the&#160;<a href="https://arxiv.org/abs/1512.00567">Inception V3</a> model, which achieves 93.9% accuracy on the ImageNet classification task. 
Initializing the image encoder with a better vision model gives the image captioning system a better ability to recognize different objects in the images, allowing it to generate more detailed and accurate descriptions. This gives an additional 2 points of improvement in the BLEU-4 metric over the system used in the captioning challenge.<br /><br />Another key improvement to the vision component comes from <i>fine-tuning</i> the image model. This step addresses the problem that the image encoder is initialized by a model trained to <i>classify</i> objects in images, whereas the goal of the captioning system is to <i>describe</i> the objects in images using the encodings produced by the image model. For example, an image classification model will tell you that a dog, grass and a frisbee are in the image, but a natural description should also tell you the color of the grass and how the dog relates to the frisbee.<br /><br />In the fine-tuning phase, the captioning system is improved by jointly training its vision and language components on human-generated captions. This allows the captioning system to transfer information from the image that is specifically useful for generating descriptive captions, but which was not necessary for classifying objects. In particular, after fine-tuning it becomes better at correctly describing the colors of objects. Importantly, the fine-tuning phase must occur after the language component has already learned to generate captions; otherwise, the noisiness of the randomly initialized language component causes irreversible corruption to the vision component. 
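The ordering constraint above matters enough to spell out. The sketch below uses hypothetical names and step counts, and models only which component is trainable at each step, not the actual TensorFlow training loop:

```python
def two_phase_schedule(caption_only_steps, finetune_steps):
    # Phase 1: the pretrained image encoder stays frozen while the randomly
    # initialized language component learns to generate captions.
    # Phase 2: both components train jointly, so the encoder can pick up
    # caption-specific detail (e.g. object colors) without being corrupted
    # by early, noisy gradients from the untrained language component.
    schedule = []
    for step in range(caption_only_steps + finetune_steps):
        encoder_trainable = step >= caption_only_steps
        schedule.append("joint-finetune" if encoder_trainable else "language-only")
    return schedule

schedule = two_phase_schedule(caption_only_steps=3, finetune_steps=2)
```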
For more details, read the full paper <a href="http://arxiv.org/abs/1609.06647">here</a>.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-NZnzMnEzW28/V-L899rK04I/AAAAAAAABO4/sPFHtUnKbi0Oxt3DwMjJXaji6-5VST9NgCLcB/s1600/Caption1.png"><img border="0" height="302" src="https://4.bp.blogspot.com/-NZnzMnEzW28/V-L899rK04I/AAAAAAAABO4/sPFHtUnKbi0Oxt3DwMjJXaji6-5VST9NgCLcB/s640/Caption1.png" width="640"></a></td></tr><tr><td><b>Left:</b> the better image model allows the captioning model to generate more detailed and accurate descriptions.<b> Right:</b> after fine-tuning the image model, the image captioning system is more likely to describe the colors of objects correctly.</td></tr></tbody></table>Until recently our image captioning system was implemented in the <a href="http://research.google.com/pubs/pub40565.html">DistBelief software framework</a>. The TensorFlow implementation released today achieves the same level of accuracy with significantly faster performance: time per training step is just 0.7 seconds in TensorFlow compared to 3 seconds in DistBelief on an Nvidia K20 GPU, meaning that total training time is just 25% of the time previously required.<br /><br />A natural question is whether our captioning system can generate novel descriptions of previously unseen contexts and interactions. The system is trained by showing it hundreds of thousands of images that were captioned manually by humans, and it often re-uses human captions when presented with scenes similar to what it&#8217;s seen before. 
<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-EzSERqpzCtw/V-L9a_j2kjI/AAAAAAAABPA/BUrcQ6N7R2knUQBEXOy1nM79uxfx3PmlgCLcB/s1600/Caption2b.png"><img border="0" height="416" src="https://2.bp.blogspot.com/-EzSERqpzCtw/V-L9a_j2kjI/AAAAAAAABPA/BUrcQ6N7R2knUQBEXOy1nM79uxfx3PmlgCLcB/s640/Caption2b.png" width="640"></a></td></tr><tr><td>When the model is presented with scenes similar to what it&#8217;s seen before, it will often re-use human generated captions.</td></tr></tbody></table>So does it really understand the objects and their interactions in each image? Or does it always regurgitate descriptions from the training data? Excitingly, our model <i>does indeed</i> develop the ability to generate accurate new captions when presented with completely new scenes, indicating a deeper understanding of the objects and context in the images. Moreover, it learns how to express that knowledge in natural-sounding English phrases despite receiving no additional language training other than reading the human captions.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-gmCkHgRKbp0/V-L9sTZuevI/AAAAAAAABPE/0Im_E8pWqGoS3eC90DpYHS17vuFgLBh5gCLcB/s1600/Caption3c.png"><img border="0" height="416" src="https://1.bp.blogspot.com/-gmCkHgRKbp0/V-L9sTZuevI/AAAAAAAABPE/0Im_E8pWqGoS3eC90DpYHS17vuFgLBh5gCLcB/s640/Caption3c.png" width="640"></a></td></tr><tr><td>Our model generates a completely new caption using concepts learned from similar scenes in the training set.</td></tr></tbody></table>We hope that sharing this model in TensorFlow will help push forward image captioning research and applications, and will also allow interested people to learn and have fun. 
To get started training your own image captioning system, and for more details on the neural network architecture, navigate to the model&#8217;s home page <a href="https://github.com/tensorflow/models/tree/master/im2txt">here</a>. While our system uses the Inception V3 image classification model, you could even try training our system with the <a href="https://research.googleblog.com/2016/08/improving-inception-and-image.html">recently released Inception-ResNet-v2 model</a> to see if it can do even better!]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Chris Shallue, Software Engineer, Google Brain Team</span><br /><br />In 2014, research scientists on the <a href="http://g.co/brain">Google Brain team</a> trained a <a href="https://research.googleblog.com/2014/11/a-picture-is-worth-thousand-coherent.html">machine learning system to automatically produce captions that accurately describe images</a>. Further development of that system led to its success in the <a href="http://mscoco.org/dataset/#captions-leaderboard">Microsoft COCO 2015 image captioning challenge</a>, a competition to compare the best algorithms for computing accurate image captions, where it tied for first place.<br /><br />Today, we’re making the latest version of our image captioning system <a href="https://github.com/tensorflow/models/tree/master/im2txt">available as an open source model</a> in <a href="https://www.tensorflow.org/">TensorFlow</a>. This release contains significant improvements to the computer vision component of the captioning system, is much faster to train, and produces more detailed and accurate descriptions compared to the original system. 
These improvements are outlined and analyzed in the paper <i><a href="http://arxiv.org/abs/1609.06647">Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge</a></i>, published in <a href="http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=34">IEEE Transactions on Pattern Analysis and Machine Intelligence</a>.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-0QMvTn0tWAY/V-L8xK8LxsI/AAAAAAAABO0/pLVC06wzDE4rNFhL1iTTVaU4iFR7VUZHwCLcB/s1600/Caption4.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="400" src="https://2.bp.blogspot.com/-0QMvTn0tWAY/V-L8xK8LxsI/AAAAAAAABO0/pLVC06wzDE4rNFhL1iTTVaU4iFR7VUZHwCLcB/s400/Caption4.png" width="373" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Automatically captioned by our system.</td></tr></tbody></table><b>So what’s new?</b><br /><br />Our 2014 system used the <a href="https://arxiv.org/abs/1409.4842">Inception V1</a> image classification model to initialize the image encoder, which produces the encodings that are useful for recognizing different objects in the images. This was the best image model available at the time, achieving 89.6% top-5 accuracy on the benchmark ImageNet 2012 image classification task. We replaced this in 2015 with the newer <a href="http://arxiv.org/abs/1502.03167">Inception V2</a> image classification model, which achieves 91.8% accuracy on the same task. 
The improved vision component gave our captioning system an accuracy boost of 2 points in the BLEU-4 metric (which is commonly used in machine translation to evaluate the quality of generated sentences) and was an important factor in its success in the captioning challenge.<br /><br />Today’s code release initializes the image encoder using the&nbsp;<a href="https://arxiv.org/abs/1512.00567">Inception V3</a> model, which achieves 93.9% accuracy on the ImageNet classification task. Initializing the image encoder with a better vision model gives the image captioning system a better ability to recognize different objects in the images, allowing it to generate more detailed and accurate descriptions. This gives an additional 2 points of improvement in the BLEU-4 metric over the system used in the captioning challenge.<br /><br />Another key improvement to the vision component comes from <i>fine-tuning</i> the image model. This step addresses the problem that the image encoder is initialized by a model trained to <i>classify</i> objects in images, whereas the goal of the captioning system is to <i>describe</i> the objects in images using the encodings produced by the image model. For example, an image classification model will tell you that a dog, grass and a frisbee are in the image, but a natural description should also tell you the color of the grass and how the dog relates to the frisbee.<br /><br />In the fine-tuning phase, the captioning system is improved by jointly training its vision and language components on human-generated captions. This allows the captioning system to transfer information from the image that is specifically useful for generating descriptive captions, but which was not necessary for classifying objects. In particular, after fine-tuning it becomes better at correctly describing the colors of objects. 
Importantly, the fine-tuning phase must occur after the language component has already learned to generate captions; otherwise, the noisiness of the randomly initialized language component causes irreversible corruption to the vision component. For more details, read the full paper <a href="http://arxiv.org/abs/1609.06647">here</a>.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-NZnzMnEzW28/V-L899rK04I/AAAAAAAABO4/sPFHtUnKbi0Oxt3DwMjJXaji6-5VST9NgCLcB/s1600/Caption1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="302" src="https://4.bp.blogspot.com/-NZnzMnEzW28/V-L899rK04I/AAAAAAAABO4/sPFHtUnKbi0Oxt3DwMjJXaji6-5VST9NgCLcB/s640/Caption1.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><b>Left:</b> the better image model allows the captioning model to generate more detailed and accurate descriptions.<b> Right:</b> after fine-tuning the image model, the image captioning system is more likely to describe the colors of objects correctly.</td></tr></tbody></table>Until recently, our image captioning system was implemented in the <a href="http://research.google.com/pubs/pub40565.html">DistBelief software framework</a>. The TensorFlow implementation released today achieves the same level of accuracy with significantly faster performance: time per training step is just 0.7 seconds in TensorFlow compared to 3 seconds in DistBelief on an Nvidia K20 GPU, meaning that total training time is just 25% of the time previously required.<br /><br />A natural question is whether our captioning system can generate novel descriptions of previously unseen contexts and interactions. 
The system is trained by showing it hundreds of thousands of images that were captioned manually by humans, and it often re-uses human captions when presented with scenes similar to what it’s seen before. <br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-EzSERqpzCtw/V-L9a_j2kjI/AAAAAAAABPA/BUrcQ6N7R2knUQBEXOy1nM79uxfx3PmlgCLcB/s1600/Caption2b.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="416" src="https://2.bp.blogspot.com/-EzSERqpzCtw/V-L9a_j2kjI/AAAAAAAABPA/BUrcQ6N7R2knUQBEXOy1nM79uxfx3PmlgCLcB/s640/Caption2b.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">When the model is presented with scenes similar to what it’s seen before, it will often re-use human generated captions.</td></tr></tbody></table>So does it really understand the objects and their interactions in each image? Or does it always regurgitate descriptions from the training data? Excitingly, our model <i>does indeed</i> develop the ability to generate accurate new captions when presented with completely new scenes, indicating a deeper understanding of the objects and context in the images. 
Moreover, it learns how to express that knowledge in natural-sounding English phrases despite receiving no additional language training other than reading the human captions.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-gmCkHgRKbp0/V-L9sTZuevI/AAAAAAAABPE/0Im_E8pWqGoS3eC90DpYHS17vuFgLBh5gCLcB/s1600/Caption3c.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="416" src="https://1.bp.blogspot.com/-gmCkHgRKbp0/V-L9sTZuevI/AAAAAAAABPE/0Im_E8pWqGoS3eC90DpYHS17vuFgLBh5gCLcB/s640/Caption3c.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Our model generates a completely new caption using concepts learned from similar scenes in the training set.</td></tr></tbody></table>We hope that sharing this model in TensorFlow will help push forward image captioning research and applications, and will also allow interested people to learn and have fun. To get started training your own image captioning system, and for more details on the neural network architecture, navigate to the model’s home-page <a href="https://github.com/tensorflow/models/tree/master/im2txt">here</a>. While our system uses the Inception V3 image classification model, you could even try training our system with the <a href="https://research.googleblog.com/2016/08/improving-inception-and-image.html">recently released Inception-ResNet-v2 model</a> to see if it can do even better!]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/show-and-tell-image-captioning-open-sourced-in-tensorflow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The 280-Year-Old Algorithm Inside Google Trips</title>
		<link>https://googledata.org/google-research/the-280-year-old-algorithm-inside-google-trips/</link>
		<comments>https://googledata.org/google-research/the-280-year-old-algorithm-inside-google-trips/#comments</comments>
		<pubDate>Tue, 20 Sep 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=9eccfaba32ed8b467072047b82083021</guid>
		<description><![CDATA[<span>Posted by Bogdan Arsintescu, Software Engineer &#38; Sreenivas Gollapudi, Kostas Kollias, Tamas Sarlos and Andrew Tomkins, Research Scientists<br /></span><br /><br /><a href="https://en.wikipedia.org/wiki/Algorithm_engineering">Algorithms Engineering</a> is a lot of fun because algorithms do not go out of fashion: one never knows when an oldie-but-goodie might come in handy.  Case in point: Yesterday, Google <a href="https://googleblog.blogspot.com/2016/09/see-more-plan-less-try-google-trips.html">announced Google Trips</a>, a new app to assist you in your travels  by helping you create your own &#8220;perfect day&#8221; in a city.  Surprisingly, deep inside Google Trips, there is an algorithm that was invented 280 years ago. <br /><br />In 1736, <a href="https://en.wikipedia.org/wiki/Leonhard_Euler">Leonhard Euler</a> authored a brief but <a href="http://eulerarchive.maa.org//docs/originals/E053.pdf">beautiful mathematical paper</a> regarding the town of K&#246;nigsberg and its 7 bridges, shown here:<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-HRkkgmmGB3Y/V-Fk43DGYkI/AAAAAAAABNw/j5c6gQMUsjAWtMWwMBnQ-D35sA8l0-McQCLcB/s1600/image05.png"><img border="0" height="504" src="https://4.bp.blogspot.com/-HRkkgmmGB3Y/V-Fk43DGYkI/AAAAAAAABNw/j5c6gQMUsjAWtMWwMBnQ-D35sA8l0-McQCLcB/s640/image05.png" width="640"></a></td></tr><tr><td>Image from <a href="https://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg">Wikipedia</a></td></tr></tbody></table>In the paper, Euler studied the following question: is it possible to walk through the city crossing each bridge exactly once? As it turns out, for the city of K&#246;nigsberg, the answer is no. 
To reach this answer, Euler developed a general approach to represent any layout of landmasses and bridges in terms of what he dubbed the <i>Geometriam Situs</i> (the &#8220;Geometry of Place&#8221;), which we now call <a href="https://en.wikipedia.org/wiki/Graph_theory">Graph Theory</a>.  He represented each landmass as a &#8220;node&#8221; in the graph, and each bridge as an &#8220;edge,&#8221; like this:<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-wRl8WjuCA-c/V-FkPmiZK-I/AAAAAAAABNs/Z9htxAzYeyg_C44uNKdjCYVLoQqaaQHuwCLcB/s1600/Screen%2BShot%2B2016-09-20%2Bat%2B9.26.46%2BAM.png"><img border="0" height="146" src="https://1.bp.blogspot.com/-wRl8WjuCA-c/V-FkPmiZK-I/AAAAAAAABNs/Z9htxAzYeyg_C44uNKdjCYVLoQqaaQHuwCLcB/s640/Screen%2BShot%2B2016-09-20%2Bat%2B9.26.46%2BAM.png" width="640"></a></td></tr><tr><td>Image from <a href="https://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg">Wikipedia</a></td></tr></tbody></table>Euler noticed that if all the nodes in the graph have an even number of edges (such graphs are called &#8220;Eulerian&#8221; in his honor) then, and only then, a cycle can be found that visits every edge exactly once.  Keep this in mind, as we&#8217;ll rely on this fact later in the post.<br /><br />Our team in Google Research has been fascinated by the &#8220;Geometry of Place&#8221; for some time, and we started investigating a question related to Euler&#8217;s:  rather than visiting just the bridges, how can we visit as many interesting places as possible during a particular trip?  We call this the &#8220;itineraries&#8221; problem.  
Euler didn&#8217;t study it, but it is a well known topic in Optimization, where it is often called the &#8220;<a href="http://chekuri.cs.illinois.edu/papers/orienteering-journal.pdf">Orienteering</a>&#8221; problem.<br /><br />While Euler&#8217;s problem has an efficient and exact solution, the itineraries problem is not just hard to solve, it is hard to even <i>approximately</i> solve!  The difficulty lies in the interplay between two conflicting goals:  first, we should pick great places to visit, but second, we should pick them to allow a good itinerary:  not too much travel time; don&#8217;t visit places when they&#8217;re closed; don&#8217;t visit too many museums, etc.  Embedded in such problems is the challenge of finding efficient routes, often referred to as the <a href="https://en.wikipedia.org/wiki/Travelling_salesman_problem">Travelling Salesman Problem</a> (TSP).<br /><br /><b>Algorithms for Travel Itineraries</b><br /><br />Fortunately, the real world has a property called the &#8220;<a href="https://en.wikipedia.org/wiki/Triangle_inequality">triangle inequality</a>&#8221; that says adding an extra stop to a route never makes it shorter.  When the underlying geometry satisfies the triangle inequality, the TSP can be approximately solved using another <a href="https://en.wikipedia.org/wiki/Christofides_algorithm">algorithm discovered by Christofides</a> in 1976.  This is an important part of our solution, and builds on Euler&#8217;s paper, so we&#8217;ll give a quick four-step rundown of how it works here:<br /><ol><li>We start with all our destinations separate, and repeatedly connect together the closest two that aren&#8217;t yet connected.  
This doesn&#8217;t yet give us an itinerary, but it does connect all the destinations via a <a href="https://en.wikipedia.org/wiki/Minimum_spanning_tree">minimum spanning tree</a> of the graph.</li><li>We take all the destinations that have an odd number of connections in this tree (Euler proved there must be an even number of these), and carefully pair them up.</li><li>Because all the destinations now have an even number of edges, we&#8217;ve created an Eulerian graph, so we create a route that crosses each edge exactly once.</li><li>We now have a great route, but it might visit some places more than once.  No problem, we find any double visits and simply bypass them, going directly from the predecessor to the successor.</li></ol>Christofides gave an elegant proof that the resulting route is always close to the shortest possible.  Here&#8217;s an example of the Christofides&#8217; algorithm in action on a location graph with the nodes representing places and the edges with costs representing the travel time between the places.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-DYlLoNxg-S8/V-FlOHLF8TI/AAAAAAAABN0/kNISdLQvX6cfWAmjT8k-LKPEMJA63nX-ACLcB/s1600/image04.png"><img border="0" height="172" src="https://3.bp.blogspot.com/-DYlLoNxg-S8/V-FlOHLF8TI/AAAAAAAABN0/kNISdLQvX6cfWAmjT8k-LKPEMJA63nX-ACLcB/s640/image04.png" width="640"></a></td></tr><tr><td>Construction of an Eulerian Tour in a location graph</td></tr></tbody></table>Armed with this efficient route-finding subroutine, we can now start building itineraries one step at a time.  At each step, we estimate the benefit to the user of each possible new place to visit, and likewise estimate the cost using the Christofides algorithm.  A user&#8217;s benefit can be derived from a host of natural factors such as the popularity of the place and how different the place is relative to places already visited on the tour. 
We then pick whichever new place has the best benefit per unit of extra cost (e.g., time needed to include the new place in the tour).  Here&#8217;s an example of our algorithm actually building a route in London using the location graph shown above:<br /><div><a href="https://3.bp.blogspot.com/-y_t6IG5RMuE/V-Flb6RQktI/AAAAAAAABN4/91je5cQumFcOI5bgJIliDkk3tX3MoPoAgCLcB/s1600/image01.png"><img border="0" height="274" src="https://3.bp.blogspot.com/-y_t6IG5RMuE/V-Flb6RQktI/AAAAAAAABN4/91je5cQumFcOI5bgJIliDkk3tX3MoPoAgCLcB/s640/image01.png" width="640"></a></div><b>Itineraries in Google Trips</b><br /><br />With our first good approximate solution to the itineraries problem in hand, we started working with our colleagues from the Google Trips team, and we realized we&#8217;d barely scratched the surface. For instance, even if we produce the absolute perfect itinerary, any particular user of the system will very reasonably say, &#8220;That&#8217;s great, but all my friends say I also need to visit this other place.  Plus, I&#8217;m only around for the morning, and I don&#8217;t want to miss this place you listed in the afternoon. And I&#8217;ve already seen Big Ben twice.&#8221;  So rather than just producing an itinerary once and calling it a perfect day, we needed a fast dynamic algorithm for itineraries that users can modify on the fly to suit their individual taste.  And because many people have bad data connections while traveling, the solution had to be efficient enough to run disconnected on a phone.<br /><br /><b>Better Itineraries Through the Wisdom of Crowds</b><br /><br />While the algorithmic aspects of the problem were highly challenging, we realized that producing high-quality itineraries was just as dependent on our understanding of the many possible stopping points on the itinerary.  
We had Google&#8217;s extensive travel database to identify the interesting places to visit, and we also had great data from Google&#8217;s existing systems about how to travel from any place to any other.  But we didn&#8217;t have a good sense for how people typically move through this geometry of places.  <br /><br />For this, we turned to the wisdom of crowds.  This type of wisdom is used by Google to <a href="https://googleblog.blogspot.com/2007/02/stuck-in-traffic.html">estimate delays on highways</a>, and to discover <a href="https://support.google.com/business/answer/6263531?hl=en">when restaurants are most busy</a>.  Here, we use the same techniques to learn about common visit sequences that we can stitch together into itineraries that feel good to our users. We combine Google's knowledge of <a href="https://techcrunch.com/2015/07/28/google-search-now-shows-you-when-local-businesses-are-busiest/">when places are popular</a>, with the directions between those places to gather an idea of what tourists like to do when travelling.<br /><br />And the crowd has a lot more wisdom to offer in the future.  For example, we noticed that visits to Buckingham Palace spike around 11:30 and stay a bit longer than at other times of the day.  This seemed a little strange to us, but when we looked more closely, it turns out to be the time of the <a href="https://www.royalcollection.org.uk/visit/buckinghampalace/what-to-see-and-do/changing-the-guard">Changing of the Guard</a>.  We&#8217;re looking now at ways to incorporate this type of timing information into the itinerary selection algorithms.<br /><br />So give it a try:  Google Trips, available now on <a href="https://play.google.com/store/apps/details?id=com.google.android.apps.travel.onthego">Android</a> and <a href="https://itunes.apple.com/app/id1081561570?mt=8">iOS</a>, has you covered from departure to return.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Bogdan Arsintescu, Software Engineer &amp; Sreenivas Gollapudi, Kostas Kollias, Tamas Sarlos and Andrew Tomkins, Research Scientists<br /></span><br /><br /><a href="https://en.wikipedia.org/wiki/Algorithm_engineering">Algorithms Engineering</a> is a lot of fun because algorithms do not go out of fashion: one never knows when an oldie-but-goodie might come in handy.  Case in point: Yesterday, Google <a href="https://googleblog.blogspot.com/2016/09/see-more-plan-less-try-google-trips.html">announced Google Trips</a>, a new app to assist you in your travels  by helping you create your own “perfect day” in a city.  Surprisingly, deep inside Google Trips, there is an algorithm that was invented 280 years ago. <br /><br />In 1736, <a href="https://en.wikipedia.org/wiki/Leonhard_Euler">Leonhard Euler</a> authored a brief but <a href="http://eulerarchive.maa.org//docs/originals/E053.pdf">beautiful mathematical paper</a> regarding the town of Königsberg and its 7 bridges, shown here:<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-HRkkgmmGB3Y/V-Fk43DGYkI/AAAAAAAABNw/j5c6gQMUsjAWtMWwMBnQ-D35sA8l0-McQCLcB/s1600/image05.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="504" src="https://4.bp.blogspot.com/-HRkkgmmGB3Y/V-Fk43DGYkI/AAAAAAAABNw/j5c6gQMUsjAWtMWwMBnQ-D35sA8l0-McQCLcB/s640/image05.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Image from <a href="https://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg">Wikipedia</a></td></tr></tbody></table>In the paper, Euler studied the following question: is it possible to walk through the city crossing each bridge exactly once? As it turns out, for the city of Königsberg, the answer is no. 
To reach this answer, Euler developed a general approach to represent any layout of landmasses and bridges in terms of what he dubbed the <i>Geometriam Situs</i> (the “Geometry of Place”), which we now call <a href="https://en.wikipedia.org/wiki/Graph_theory">Graph Theory</a>.  He represented each landmass as a “node” in the graph, and each bridge as an “edge,” like this:<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-wRl8WjuCA-c/V-FkPmiZK-I/AAAAAAAABNs/Z9htxAzYeyg_C44uNKdjCYVLoQqaaQHuwCLcB/s1600/Screen%2BShot%2B2016-09-20%2Bat%2B9.26.46%2BAM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="146" src="https://1.bp.blogspot.com/-wRl8WjuCA-c/V-FkPmiZK-I/AAAAAAAABNs/Z9htxAzYeyg_C44uNKdjCYVLoQqaaQHuwCLcB/s640/Screen%2BShot%2B2016-09-20%2Bat%2B9.26.46%2BAM.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Image from <a href="https://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg">Wikipedia</a></td></tr></tbody></table>Euler noticed that if all the nodes in the graph have an even number of edges (such graphs are called “Eulerian” in his honor) then, and only then, a cycle can be found that visits every edge exactly once.  Keep this in mind, as we’ll rely on this fact later in the post.<br /><br />Our team in Google Research has been fascinated by the “Geometry of Place” for some time, and we started investigating a question related to Euler’s:  rather than visiting just the bridges, how can we visit as many interesting places as possible during a particular trip?  We call this the “itineraries” problem.  
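Euler's even-degree criterion is easy to check in code. Here is the Königsberg graph itself, with the four landmasses labeled A through D and the seven bridges as edges:

```python
from collections import Counter

# The seven bridges of Königsberg: two pairs of parallel bridges to the
# central island A, plus three more connecting the remaining landmasses.
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

degree = Counter()
for u, v in bridges:
    degree[u] += 1
    degree[v] += 1

# Eulerian circuit exists (in a connected graph) iff every degree is even.
is_eulerian = all(d % 2 == 0 for d in degree.values())
print(dict(degree))   # A has degree 5; B, C, D each have degree 3
print(is_eulerian)    # False: no walk crosses each bridge exactly once
```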
Euler didn’t study it, but it is a well known topic in Optimization, where it is often called the “<a href="http://chekuri.cs.illinois.edu/papers/orienteering-journal.pdf">Orienteering</a>” problem.<br /><br />While Euler’s problem has an efficient and exact solution, the itineraries problem is not just hard to solve, it is hard to even <i>approximately</i> solve!  The difficulty lies in the interplay between two conflicting goals:  first, we should pick great places to visit, but second, we should pick them to allow a good itinerary:  not too much travel time; don’t visit places when they’re closed; don’t visit too many museums, etc.  Embedded in such problems is the challenge of finding efficient routes, often referred to as the <a href="https://en.wikipedia.org/wiki/Travelling_salesman_problem">Travelling Salesman Problem</a> (TSP).<br /><br /><b>Algorithms for Travel Itineraries</b><br /><br />Fortunately, the real world has a property called the “<a href="https://en.wikipedia.org/wiki/Triangle_inequality">triangle inequality</a>” that says adding an extra stop to a route never makes it shorter.  When the underlying geometry satisfies the triangle inequality, the TSP can be approximately solved using another <a href="https://en.wikipedia.org/wiki/Christofides_algorithm">algorithm discovered by Christofides</a> in 1976.  This is an important part of our solution, and builds on Euler’s paper, so we’ll give a quick four-step rundown of how it works here:<br /><ol><li>We start with all our destinations separate, and repeatedly connect together the closest two that aren’t yet connected.  
This doesn’t yet give us an itinerary, but it does connect all the destinations via a <a href="https://en.wikipedia.org/wiki/Minimum_spanning_tree">minimum spanning tree</a> of the graph.</li><li>We take all the destinations that have an odd number of connections in this tree (Euler proved there must be an even number of these), and carefully pair them up.</li><li>Because all the destinations now have an even number of edges, we’ve created an Eulerian graph, so we create a route that crosses each edge exactly once.</li><li>We now have a great route, but it might visit some places more than once.  No problem, we find any double visits and simply bypass them, going directly from the predecessor to the successor.</li></ol>Christofides gave an elegant proof that the resulting route is always close to the shortest possible.  Here’s an example of the Christofides’ algorithm in action on a location graph with the nodes representing places and the edges with costs representing the travel time between the places.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-DYlLoNxg-S8/V-FlOHLF8TI/AAAAAAAABN0/kNISdLQvX6cfWAmjT8k-LKPEMJA63nX-ACLcB/s1600/image04.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="172" src="https://3.bp.blogspot.com/-DYlLoNxg-S8/V-FlOHLF8TI/AAAAAAAABN0/kNISdLQvX6cfWAmjT8k-LKPEMJA63nX-ACLcB/s640/image04.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Construction of an Eulerian Tour in a location graph</td></tr></tbody></table>Armed with this efficient route-finding subroutine, we can now start building itineraries one step at a time.  At each step, we estimate the benefit to the user of each possible new place to visit, and likewise estimate the cost using the Christofides algorithm.  
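The four-step rundown above can be sketched compactly on a toy distance matrix. One simplification to note: step 2 below pairs the odd-degree nodes greedily rather than with the minimum-weight perfect matching the true Christofides algorithm uses, so this illustrates the pipeline, not the full approximation guarantee:

```python
import math
from collections import defaultdict

def tour(dist):
    """dist: {(u, v): d} symmetric distances over a set of destinations."""
    nodes = sorted({u for u, v in dist} | {v for u, v in dist})

    def d(u, v):
        return dist[(u, v)] if (u, v) in dist else dist[(v, u)]

    # Step 1: connect everything via a minimum spanning tree (Prim's algorithm).
    in_tree, edges = {nodes[0]}, []
    while len(in_tree) < len(nodes):
        u, v = min(((a, b) for a in in_tree for b in nodes if b not in in_tree),
                   key=lambda e: d(*e))
        edges.append((u, v))
        in_tree.add(v)

    # Step 2: pair up the odd-degree nodes (greedy stand-in for matching).
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    odd = [n for n in nodes if deg[n] % 2 == 1]  # Euler: always an even count
    while odd:
        u = odd.pop()
        v = min(odd, key=lambda w: d(u, w))
        odd.remove(v)
        edges.append((u, v))

    # Step 3: every degree is now even, so find an Eulerian circuit
    # (Hierholzer's algorithm on the resulting multigraph).
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    stack, circuit = [nodes[0]], []
    while stack:
        u = stack[-1]
        if adj[u]:
            v = adj[u].pop()
            adj[v].remove(u)
            stack.append(v)
        else:
            circuit.append(stack.pop())

    # Step 4: shortcut repeated visits, keeping only the first occurrence.
    seen, route = set(), []
    for u in circuit:
        if u not in seen:
            seen.add(u)
            route.append(u)
    return route + [route[0]]

# Four destinations at the corners of a unit square.
coords = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}
dist = {(u, v): math.dist(coords[u], coords[v])
        for u in coords for v in coords if u < v}
print(tour(dist))  # a closed route visiting each corner exactly once
```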
A user’s benefit can be derived from a host of natural factors such as the popularity of the place and how different the place is relative to places already visited on the tour. We then pick whichever new place has the best benefit per unit of extra cost (e.g., time needed to include the new place in the tour).  Here’s an example of our algorithm actually building a route in London using the location graph shown above:<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-y_t6IG5RMuE/V-Flb6RQktI/AAAAAAAABN4/91je5cQumFcOI5bgJIliDkk3tX3MoPoAgCLcB/s1600/image01.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="274" src="https://3.bp.blogspot.com/-y_t6IG5RMuE/V-Flb6RQktI/AAAAAAAABN4/91je5cQumFcOI5bgJIliDkk3tX3MoPoAgCLcB/s640/image01.png" width="640" /></a></div><b>Itineraries in Google Trips</b><br /><br />With our first good approximate solution to the itineraries problem in hand, we started working with our colleagues from the Google Trips team, and we realized we’d barely scratched the surface. For instance, even if we produce the absolute perfect itinerary, any particular user of the system will very reasonably say, “That’s great, but all my friends say I also need to visit this other place.  Plus, I’m only around for the morning, and I don’t want to miss this place you listed in the afternoon. And I’ve already seen Big Ben twice.”  So rather than just producing an itinerary once and calling it a perfect day, we needed a fast dynamic algorithm for itineraries that users can modify on the fly to suit their individual taste.  
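The selection rule just described, taking the place with the best benefit per unit of extra cost, can be sketched as a simple greedy loop. The benefit scores and the flat marginal-cost function below are made-up stand-ins for the real signals (popularity, diversity, and the Christofides-based tour cost):

```python
def build_itinerary(places, benefit, extra_cost, time_budget):
    """Greedily grow an itinerary under a total time budget (minutes)."""
    itinerary, spent = [], 0.0
    remaining = list(places)
    while remaining:
        # Score each remaining place by benefit per unit of extra tour cost.
        scored = []
        for place in remaining:
            cost = extra_cost(itinerary, place)
            if spent + cost <= time_budget:
                scored.append((benefit[place] / cost, cost, place))
        if not scored:
            break  # nothing else fits in the budget
        _, cost, best = max(scored)
        itinerary.append(best)
        spent += cost
        remaining.remove(best)
    return itinerary

# Toy example: three sights, a flat 30-minute marginal cost, 60-minute budget.
benefit = {"museum": 9.0, "palace": 8.0, "market": 3.0}
flat_cost = lambda itin, place: 30.0
print(build_itinerary(list(benefit), benefit, flat_cost, time_budget=60))
# → ['museum', 'palace']
```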
And because many people have bad data connections while traveling, the solution had to be efficient enough to run disconnected on a phone.<br /><br /><b>Better Itineraries Through the Wisdom of Crowds</b><br /><br />While the algorithmic aspects of the problem were highly challenging, we realized that producing high-quality itineraries was just as dependent on our understanding of the many possible stopping points on the itinerary.  We had Google’s extensive travel database to identify the interesting places to visit, and we also had great data from Google’s existing systems about how to travel from any place to any other.  But we didn’t have a good sense for how people typically move through this geometry of places.  <br /><br />For this, we turned to the wisdom of crowds.  This type of wisdom is used by Google to <a href="https://googleblog.blogspot.com/2007/02/stuck-in-traffic.html">estimate delays on highways</a>, and to discover <a href="https://support.google.com/business/answer/6263531?hl=en">when restaurants are most busy</a>.  Here, we use the same techniques to learn about common visit sequences that we can stitch together into itineraries that feel good to our users. We combine Google's knowledge of <a href="https://techcrunch.com/2015/07/28/google-search-now-shows-you-when-local-businesses-are-busiest/">when places are popular</a>, with the directions between those places to gather an idea of what tourists like to do when travelling.<br /><br />And the crowd has a lot more wisdom to offer in the future.  For example, we noticed that visits to Buckingham Palace spike around 11:30 and stay a bit longer than at other times of the day.  This seemed a little strange to us, but when we looked more closely, it turns out to be the time of the <a href="https://www.royalcollection.org.uk/visit/buckinghampalace/what-to-see-and-do/changing-the-guard">Changing of the Guard</a>.  
We’re looking now at ways to incorporate this type of timing information into the itinerary selection algorithms.<br /><br />So give it a try:  Google Trips, available now on <a href="https://play.google.com/store/apps/details?id=com.google.android.apps.travel.onthego">Android</a> and <a href="https://itunes.apple.com/app/id1081561570?mt=8">iOS</a>, has you covered from departure to return. ]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/the-280-year-old-algorithm-inside-google-trips/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The 2016 Google Earth Engine User Summit: Turning pixels into insights</title>
		<link>https://googledata.org/google-research/the-2016-google-earth-engine-user-summit-turning-pixels-into-insights/</link>
		<comments>https://googledata.org/google-research/the-2016-google-earth-engine-user-summit-turning-pixels-into-insights/#comments</comments>
		<pubDate>Mon, 19 Sep 2016 16:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=73470801a8654dc38a04449f2e0ac8b1</guid>
		<description><![CDATA[Posted by Chris Herwig, Program Manager, Google Earth Engine"We are trying new methods [of flood modeling] in Earth Engine based on machine learning techniques which we think are cheaper, more scalable, and could exponentially drive down the cost of fl...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Chris Herwig, Program Manager, Google Earth Engine</span><br /><br />"<i>We are trying new methods [of flood modeling] in Earth Engine based on machine learning techniques which we think are cheaper, more scalable, and could exponentially drive down the cost of flood mapping and make it accessible to everyone."</i><br />-Beth Tellman, Arizona State University and <a href="http://www.cloudtostreet.info/">Cloud to Street</a><br /><i><br /></i> Recently, Google headquarters hosted the <a href="http://earthenginesummit2016.earthoutreach.org/">Google Earth Engine User Summit 2016</a>, a three-day hands-on technical workshop for scientists and students interested in using Google Earth Engine for planetary-scale cloud-based geospatial analysis. Earth Engine combines a multi-petabyte catalog of satellite imagery and geospatial datasets with a simple, yet powerful API backed by Google's cloud, which scientists and researchers use to detect, measure, and predict changes to the Earth's surface. 
<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-VP3fyu1IeXM/V9xeL-MeJTI/AAAAAAAABM8/Q2ocLh1eQyEr68LRDi2g0SGDeuxa0CyfACLcB/s1600/image03.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="404" src="https://3.bp.blogspot.com/-VP3fyu1IeXM/V9xeL-MeJTI/AAAAAAAABM8/Q2ocLh1eQyEr68LRDi2g0SGDeuxa0CyfACLcB/s640/image03.jpg" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Earth Engine founder Rebecca Moore kicking off the first day of the summit</td></tr></tbody></table>Summit attendees could choose among <a href="http://earthenginesummit2016.earthoutreach.org/breakout-sessions-descriptions">twenty-five hands-on workshops</a> over the course of the three day summit, most generated for the summit specifically, giving attendees an exclusive introduction to the latest features in our platform. The sessions covered a wide range of topics and Earth Engine experience levels, from image classifiers and classifications, time series analysis, building custom web applications, all the way to arrays, matrices, and linear algebra in Earth Engine. 
<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-nRhQuB-rmC4/V9xeVFKCtgI/AAAAAAAABNA/rT3zRK_bbQUem6A_sjl7ef0Gh9QIJL50ACLcB/s1600/image00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="360" src="https://4.bp.blogspot.com/-nRhQuB-rmC4/V9xeVFKCtgI/AAAAAAAABNA/rT3zRK_bbQUem6A_sjl7ef0Gh9QIJL50ACLcB/s640/image00.png" width="640" /></a></div><a href="https://terrabella.google.com/">Terra Bella</a> Product Manager, Kristi Bohl, taught a <a href="https://docs.google.com/presentation/d/1SDxitSE8cLcWop2F-Q_KoF3GhdodqUucu8mv5vHwOPA/edit?usp=sharing">session on using SkySat imagery</a>, like the image above over Sydney, Australia, for change detection. Workshop attendees also learned how to take advantage of the deep temporal stack the SkySat archive offers for change-over-time analyses.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-gSGYN2_cWNY/V9xefCInbRI/AAAAAAAABNE/G9C7RuXsg_8lhUoXnWUiGRdzRKi4O0PDACLcB/s1600/image01.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="344" src="https://1.bp.blogspot.com/-gSGYN2_cWNY/V9xefCInbRI/AAAAAAAABNE/G9C7RuXsg_8lhUoXnWUiGRdzRKi4O0PDACLcB/s640/image01.jpg" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Cross-correlation between Landsat 8 NDVI and the sum of CHIRPS precipitation. Red is high cross-correlation and blue is low. The gap in data is because CHIRPS is masked over water.</td></tr></tbody></table>Nick Clinton, a developer advocate for Earth Engine, taught <a href="https://www.youtube.com/watch?v=_lCV3rZm6sg&amp;index=18&amp;list=PLWw80tqUZ5J9_3E_9C_bK8zt0mGHfvOrj">a time series session</a> that covered statistical techniques as applied to satellite imagery data. 
Students learned how to make graphics like the above, which shows the cross-correlation between <a href="http://landsat.usgs.gov/landsat8.php">Landsat 8 NDVI</a> and the sum of <a href="http://chg.geog.ucsb.edu/data/chirps/">CHIRPS precipitation</a> from the previous month over San Francisco, CA. The correlation should be high for relatively <a href="https://en.wikipedia.org/wiki/R/K_selection_theory">r-selected</a> plants like grasses and weeds and relatively low for perennials, shrubs, or forest.<br /><br />My <a href="https://www.youtube.com/watch?v=C_Yvg2XGZdI&amp;index=5&amp;list=PLWw80tqUZ5J9_3E_9C_bK8zt0mGHfvOrj">workshop session</a> covered how users can upload their own data into Earth Engine and the many different ways to take the results of their analyses with them, including rendering static map tiles hosted on Google Cloud Storage, exporting images, creating new assets, and even making movies, like this timelapse video of all the <a href="http://www.esa.int/Our_Activities/Observing_the_Earth/Copernicus/Sentinel-2">Sentinel 2A</a> images captured over Sydney, Australia.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-XaMdTBJ5Tuw/V9xemb4vQDI/AAAAAAAABNI/3AWhfJF7x4sLsUJJjSeXf3v5G7c2GeduQCLcB/s1600/image04.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="304" src="https://4.bp.blogspot.com/-XaMdTBJ5Tuw/V9xemb4vQDI/AAAAAAAABNI/3AWhfJF7x4sLsUJJjSeXf3v5G7c2GeduQCLcB/s640/image04.gif" width="640" /></a></div>Along with the workshop sessions, we hosted five plenary speakers and 18 lightning talk presenters. These presenters shared how Earth Engine fits into their research, spanning drought monitoring, agriculture, conservation, flood risk mapping, and hydrological analysis. 
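The lagged NDVI-precipitation correlation from the time series session above can be sketched outside Earth Engine in plain Python. The `rain` and `grass` series below are hypothetical monthly values for a single pixel, not real CHIRPS or Landsat data:

```python
from math import sqrt

def lagged_corr(ndvi, precip, lag=1):
    # Pearson correlation between NDVI and precipitation `lag` months earlier.
    x = ndvi[lag:] if lag else list(ndvi)
    y = precip[:-lag] if lag else list(precip)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

# A grass-like (r-selected) pixel whose NDVI tracks the previous month's rain:
rain = [10, 80, 60, 5, 0, 40, 90, 20]
grass = [0.2] + [0.2 + 0.005 * r for r in rain[:-1]]
print(round(lagged_corr(grass, rain), 3))  # 1.0
```

In Earth Engine itself this per-pixel statistic is computed over image collections rather than Python lists, but the underlying calculation is the same.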
<br /><br /><b>Plenary Speakers</b><br /><ul><li><a href="https://www.youtube.com/watch?v=E6mF64QSUYY&amp;feature=youtu.be">Agriculture in the Sentinel era: scaling up with Earth Engine</a>, Guido Lemoine, European Commission's Joint Research Centre</li><li><a href="https://www.youtube.com/watch?v=W7Ogja4ukSI&amp;feature=youtu.be">Flood Vulnerability from the Cloud to the Street (and back!) powered by Google Earth Engine</a>, Beth Tellman, Arizona State University and Cloud to Street</li><li><a href="https://www.youtube.com/watch?v=bMh63zR_5sY&amp;feature=youtu.be">Accelerating Rangeland Conservation</a>, Brady Allred, University of Montana</li><li><a href="https://www.youtube.com/watch?v=t4JoQyzWAE8&amp;feature=youtu.be">Monitoring Drought with Google Earth Engine: From Archives to Answers</a>, Justin Huntington, Desert Research Institute / Western Regional Climate Center</li><li><a href="https://www.youtube.com/watch?v=NiOv3MjD4Sw&amp;feature=youtu.be">Automated methods for surface water detection</a>, Gennadii Donchytes, Deltares</li></ul><b>Lightning Presentations</b><br /><ul><li><a href="https://youtu.be/TFUdGgderz0">Mapping the Behavior of Rivers</a>, Alex Bryk, University of California, Berkeley</li><li><a href="https://youtu.be/Bv78nra-Lx0">Climate Data for Crisis and Health Applications</a>, Pietro Ceccato, Columbia University</li><li><a href="https://www.youtube.com/watch?v=AXFfKVDUkAc&amp;feature=youtu.be">Appalachian Communities at Risk</a>, Matt Wasson, Jeff Deal, Appalachian Voices</li><li><a href="https://www.youtube.com/watch?v=MZz2XrAEB7I&amp;feature=youtu.be">Water, Wildlife and Working Lands</a>, Patrick Donnelly, U.S. 
Fish and Wildlife Service</li><li><a href="https://www.youtube.com/watch?v=AsNdMvuMYeU&amp;feature=youtu.be">Stream-side NDVI and The Salmonid Population Viability Project</a>, Kurt Fesenmyer, Trout Unlimited</li><li><a href="https://www.youtube.com/watch?v=tJ73FTEfFz0&amp;feature=youtu.be">Mapping Evapotranspiration for Water Use and Availability</a>, Mac Friedrichs, USGS</li><li><a href="https://www.youtube.com/watch?v=zbHsi9HQ7F0&amp;feature=youtu.be">Dynamic Wildfire Modeling in Earth Engine</a>, Miranda Gray, Conservation Science Partners</li><li><a href="https://www.youtube.com/watch?v=ckA6ilW_gqA&amp;feature=youtu.be">Fishing at Scale, now also in Earth Engine</a>, David Kroodsma, Skytruth</li><li><a href="https://www.youtube.com/watch?v=7IfJuMAVe6g&amp;feature=youtu.be">Mapping crop yields from field to national scales in Earth Engine</a>, David Lobell, Stanford University</li><li><a href="https://www.youtube.com/watch?v=QX6XdBoKtdM&amp;feature=youtu.be">Mapping Pacific Wildfires Impacts with Earth Engine</a>, Matthew Lucas, University of Hawaii</li><li><a href="https://www.youtube.com/watch?v=W7WJRN-8aJw&amp;feature=youtu.be">EarthEnv.org - Environmental layers for accessing status and trends in biodiversity, ecosystems and climate</a>, Jeremy Malczyk, Map of Life</li><li><a href="https://www.youtube.com/watch?v=8taCA9NXJlg&amp;feature=youtu.be">Building a Landsat 8 Mosaic of Antarctica</a>, Allen Pope, University of Colorado Boulder</li><li><a href="https://www.youtube.com/watch?v=prd8UQjF0cc&amp;feature=youtu.be">Monitoring Primary Production at Broad Spatial and Temporal Scales</a>, Nathaniel Robinson, University of Montana</li><li><a href="https://www.youtube.com/watch?v=Gjs-hSJZskI&amp;feature=youtu.be">Assessing Urbanization Trends for Public Health: Modelling Nighttime Lights Imagery in Africa with Earth Engine</a>, David Savory, University of California, San Francisco</li><li><a 
href="https://www.youtube.com/watch?v=7vW3ntkRs_k&amp;feature=youtu.be">National-scale mapping of forest carbon</a>, Ty Wilson, US Forest Service</li><li><a href="https://www.youtube.com/watch?v=tUGT1o4Qm08&amp;feature=youtu.be">Utilizing Google Earth Engine to Enhance Decision-Making Capabilities</a>, Brittany Zajic, NASA DEVELOP National Program</li></ul><b>Keeping our users first</b><br /><b><br /></b> It is always inspiring to see such a diverse group of people come together to celebrate, learn, and share all the amazing and wondrous things people are doing with Earth Engine. It is not only an opportunity for our users to learn the latest techniques; it is also a way for the Earth Engine team to experience the new and exciting ways people are harnessing Earth Engine to solve some of the most pressing environmental issues facing humanity.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-PI6cPvrctgQ/V9xet3gAU4I/AAAAAAAABNM/L55U_70EiCYySRk9FLdji2ENsKp_Z81kgCLcB/s1600/image02.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="386" src="https://2.bp.blogspot.com/-PI6cPvrctgQ/V9xet3gAU4I/AAAAAAAABNM/L55U_70EiCYySRk9FLdji2ENsKp_Z81kgCLcB/s640/image02.jpg" width="640" /></a></div>We've already begun planning for next year's user summit, and based on the success of this year's, we're hoping to hold an even larger one. <br /><br />]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/the-2016-google-earth-engine-user-summit-turning-pixels-into-insights/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Research from VLDB 2016: Improved Friend Suggestion using Ego-Net Analysis</title>
		<link>https://googledata.org/google-research/research-from-vldb-2016-improved-friend-suggestion-using-ego-net-analysis/</link>
		<comments>https://googledata.org/google-research/research-from-vldb-2016-improved-friend-suggestion-using-ego-net-analysis/#comments</comments>
		<pubDate>Thu, 15 Sep 2016 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=66cfe71b47b6cc7833d27e70a56e4b31</guid>
		<description><![CDATA[<span>Posted by Alessandro Epasto, Research Scientist, Google Research NY</span><br /><br />On September 5 - 9, New Delhi, India hosted the <a href="http://vldb2016.persistent.com/">42nd International Conference on Very Large Data Bases</a> (VLDB), a premier annual forum for academic and industry research on databases, data management, data mining and data analytics. Over the past several years, Google has actively participated in VLDB, both as official sponsor and with numerous contributions to the research and industrial tracks. In this post, we would like to share the research presented in one of the Google papers from VLDB 2016. <br /><br />In <a href="http://www.vldb.org/pvldb/vol9/p324-epasto.pdf"><i>Ego-net Community Mining Applied to Friend Suggestion</i></a>, co-authored by Googlers <a href="http://research.google.com/pubs/SilvioLattanzi.html">Silvio Lattanzi</a>, <a href="http://research.google.com/pubs/mirrokni.html">Vahab Mirrokni</a>, Ismail Oner Sebe, <a href="http://research.google.com/pubs/AhmedTaei.html">Ahmed Taei</a>, Sunita Verma and <a href="http://research.google.com/pubs/AlessandroEpasto.html">myself</a>, we explore how social networks can provide better friend suggestions to users, a challenging practical problem faced by all social network platforms.<br /><br />Friend suggestion &#8211; the task of suggesting to a user the contacts she might already know in the network but that she hasn&#8217;t added yet &#8211; is a major driver of user engagement and social connection in all online social networks. Designing a high-quality system that can provide relevant and useful friend recommendations is very challenging, and requires state-of-the-art machine learning algorithms based on a multitude of parameters.  
<br /><br />An effective family of features for friend suggestion consists of <a href="https://en.wikipedia.org/wiki/Graph_(mathematics)">graph</a> features such as the <i>number of common friends </i>between two users. While widely used, the number of common friends has some major drawbacks, including the following, which is shown in Figure 1.<br /><div></div><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-88-zF6lIbX0/V9rekNAup4I/AAAAAAAABMY/EdJuYRKPC3sE6xfgOeV5xJogkaewvG3JACLcB/s1600/image01.png"><img border="0" height="492" src="https://4.bp.blogspot.com/-88-zF6lIbX0/V9rekNAup4I/AAAAAAAABMY/EdJuYRKPC3sE6xfgOeV5xJogkaewvG3JACLcB/s640/image01.png" width="640"></a></td></tr><tr><td>Figure 1: Ego-net of Sally.</td></tr></tbody></table>In this figure we represent the social connections of Sally and her friends &#8211; the <i>ego-net</i> of Sally. An ego-net of a node (in this case, Sally) is defined as the graph that contains the node itself, all of the node&#8217;s neighbors and the connections among <i>those</i> nodes. Sally has 6 friends in her ego-net: <b>A</b>lbert (her husband), <b>B</b>rian (her son), <b>C</b>harlotte (her mother) as well as <b>U</b>ma (her boss), <b>V</b>incent and <b>W</b>ally (two of her team members). Notice how <b>A</b>, <b>B</b> and <b>C</b> are all connected with each other while they do not know <b>U</b>, <b>V</b> or <b>W</b>. On the other hand <b>U</b>, <b>V</b> and <b>W</b> have all added each other as their friend (except <b>U</b> and <b>W</b>, who are good friends but somehow forgot to add each other).<br /><br />Notice how each of <b>A</b>, <b>B</b>, <b>C</b> has a common friend with each of <b>U</b>, <b>V</b> and <b>W</b>: Sally herself. A friend recommendation system based on common neighbors might suggest to Sally&#8217;s son (for instance) to add Sally&#8217;s boss as his friend! 
In reality the situation is even more complicated because users&#8217; online and offline friends span several different social circles or communities (family, work, school, sports, etc). <br /><br />In our paper we introduce a novel technique for friend suggestions based on independently analyzing the ego-net structure. The main contribution of the paper is to show that it is possible to provide friend suggestions efficiently by constructing all ego-nets of the nodes in the graph and then independently applying community detection algorithms on them in large-scale <a href="https://en.wikipedia.org/wiki/MapReduce">distributed systems</a>. <br /><br />Specifically, the algorithm proceeds by constructing the ego-nets of all nodes and applying, independently on each of them, a community detection algorithm. More precisely, the algorithm operates on so-called &#8220;ego-net-minus-ego&#8221; graphs, where the ego-net-minus-ego of a node is defined as the graph including only the neighbors of that node, as shown in the figure below.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-4taXubdlCw0/V9re8hjeM3I/AAAAAAAABMg/p8HMW4Ztsy4NL973u2fT-xHz0HHyfaM1ACLcB/s1600/image00.png"><img border="0" height="530" src="https://2.bp.blogspot.com/-4taXubdlCw0/V9re8hjeM3I/AAAAAAAABMg/p8HMW4Ztsy4NL973u2fT-xHz0HHyfaM1ACLcB/s640/image00.png" width="640"></a></td></tr><tr><td>Figure 2: Clustering of the ego-net of Sally.</td></tr></tbody></table>Notice how in this example the ego-net-minus-ego of Sally has two very clear communities: her family (<b>A</b>, <b>B</b>, <b>C</b>) and her co-workers (<b>U</b>, <b>V</b>, <b>W</b>), which are easily separated. Intuitively, this is because one might expect that while nodes (e.g. Sally) participate in many communities, there is usually a single (or a limited number of) contexts in which two specific neighbors interact. While Sally is part of both her family and work communities, Sally and Uma interact <i>only</i> at work. 
Through extensive experimental evaluation on large-scale public social networks and formally through a simple mathematical model, our paper confirms this intuition. It seems that while communities are hard to separate in a global graph, they are easier to identify at the local level of ego-nets. <br /><br />This allows for a novel graph-based method for friend suggestion which intuitively only allows suggestion of pairs of users that are clustered together in the same community from the point of view of their common friends. With this method, <b>U</b> and <b>W</b> will be suggested to add each other (as they are in the same community and they are not yet connected) while <b>B</b> and <b>U</b> will <i>not</i> be suggested as friends as they span two different communities. <br /><br />From an algorithmic point of view, the paper introduces efficient parallel and distributed techniques for computing and clustering all ego-nets of very large graphs at the same time &#8211; a fundamental aspect enabling use of the system on the entire worldwide Google+ graph. We have applied this feature in the &#8220;You May Know&#8221; system of Google+, resulting in a clear positive impact on the prediction task, improving the acceptance rate by more than 1.5% and decreasing the rejection rate by more than 3.3% (a significant impact at Google scale).<br /><br />We believe that many future directions of work might stem from our preliminary results. For instance, ego-net analysis could potentially be used to automatically classify a user&#8217;s contacts into circles and to detect spam. Another interesting direction is the study of ego-network evolution in dynamic graphs.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Alessandro Epasto, Research Scientist, Google Research NY</span><br /><br />On September 5 - 9, New Delhi, India hosted the <a href="http://vldb2016.persistent.com/">42nd International Conference on Very Large Data Bases</a> (VLDB), a premier annual forum for academic and industry research on databases, data management, data mining and data analytics. Over the past several years, Google has actively participated in VLDB, both as official sponsor and with numerous contributions to the research and industrial tracks. In this post, we would like to share the research presented in one of the Google papers from VLDB 2016. <br /><br />In <a href="http://www.vldb.org/pvldb/vol9/p324-epasto.pdf"><i>Ego-net Community Mining Applied to Friend Suggestion</i></a>, co-authored by Googlers <a href="http://research.google.com/pubs/SilvioLattanzi.html">Silvio Lattanzi</a>, <a href="http://research.google.com/pubs/mirrokni.html">Vahab Mirrokni</a>, Ismail Oner Sebe, <a href="http://research.google.com/pubs/AhmedTaei.html">Ahmed Taei</a>, Sunita Verma and <a href="http://research.google.com/pubs/AlessandroEpasto.html">myself</a>, we explore how social networks can provide better friend suggestions to users, a challenging practical problem faced by all social network platforms.<br /><br />Friend suggestion – the task of suggesting to a user the contacts she might already know in the network but that she hasn’t added yet – is a major driver of user engagement and social connection in all online social networks. Designing a high-quality system that can provide relevant and useful friend recommendations is very challenging, and requires state-of-the-art machine learning algorithms based on a multitude of parameters.  
<br /><br />An effective family of features for friend suggestion consists of <a href="https://en.wikipedia.org/wiki/Graph_(mathematics)">graph</a> features such as the <i>number of common friends </i>between two users. While widely used, the number of common friends has some major drawbacks, including the following, which is shown in Figure 1.<br /><div class="separator" style="clear: both; text-align: center;"></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-88-zF6lIbX0/V9rekNAup4I/AAAAAAAABMY/EdJuYRKPC3sE6xfgOeV5xJogkaewvG3JACLcB/s1600/image01.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="492" src="https://4.bp.blogspot.com/-88-zF6lIbX0/V9rekNAup4I/AAAAAAAABMY/EdJuYRKPC3sE6xfgOeV5xJogkaewvG3JACLcB/s640/image01.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Figure 1: Ego-net of Sally.</td></tr></tbody></table>In this figure we represent the social connections of Sally and her friends – the <i>ego-net</i> of Sally. An ego-net of a node (in this case, Sally) is defined as the graph that contains the node itself, all of the node’s neighbors and the connections among <i>those</i> nodes. Sally has 6 friends in her ego-net: <b>A</b>lbert (her husband), <b>B</b>rian (her son), <b>C</b>harlotte (her mother) as well as <b>U</b>ma (her boss), <b>V</b>incent and <b>W</b>ally (two of her team members). Notice how <b>A</b>, <b>B</b> and <b>C</b> are all connected with each other while they do not know <b>U</b>, <b>V</b> or <b>W</b>. 
On the other hand <b>U</b>, <b>V</b> and <b>W</b> have all added each other as their friend (except <b>U</b> and <b>W</b>, who are good friends but somehow forgot to add each other).<br /><br />Notice how each of <b>A</b>, <b>B</b>, <b>C</b> has a common friend with each of <b>U</b>, <b>V</b> and <b>W</b>: Sally herself. A friend recommendation system based on common neighbors might suggest to Sally’s son (for instance) to add Sally’s boss as his friend! In reality the situation is even more complicated because users’ online and offline friends span several different social circles or communities (family, work, school, sports, etc). <br /><br />In our paper we introduce a novel technique for friend suggestions based on independently analyzing the ego-net structure. The main contribution of the paper is to show that it is possible to provide friend suggestions efficiently by constructing all ego-nets of the nodes in the graph and then independently applying community detection algorithms on them in large-scale <a href="https://en.wikipedia.org/wiki/MapReduce">distributed systems</a>. <br /><br />Specifically, the algorithm proceeds by constructing the ego-nets of all nodes and applying, independently on each of them, a community detection algorithm. 
More precisely, the algorithm operates on so-called “ego-net-minus-ego” graphs, where the ego-net-minus-ego of a node is defined as the graph including only the neighbors of that node, as shown in the figure below.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-4taXubdlCw0/V9re8hjeM3I/AAAAAAAABMg/p8HMW4Ztsy4NL973u2fT-xHz0HHyfaM1ACLcB/s1600/image00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="530" src="https://2.bp.blogspot.com/-4taXubdlCw0/V9re8hjeM3I/AAAAAAAABMg/p8HMW4Ztsy4NL973u2fT-xHz0HHyfaM1ACLcB/s640/image00.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Figure 2: Clustering of the ego-net of Sally.</td></tr></tbody></table>Notice how in this example the ego-net-minus-ego of Sally has two very clear communities: her family (<b>A</b>, <b>B</b>, <b>C</b>) and her co-workers (<b>U</b>, <b>V</b>, <b>W</b>), which are easily separated. Intuitively, this is because one might expect that while nodes (e.g. Sally) participate in many communities, there is usually a single (or a limited number of) contexts in which two specific neighbors interact. While Sally is part of both her family and work communities, Sally and Uma interact <i>only</i> at work. Through extensive experimental evaluation on large-scale public social networks and formally through a simple mathematical model, our paper confirms this intuition. It seems that while communities are hard to separate in a global graph, they are easier to identify at the local level of ego-nets. <br /><br />This allows for a novel graph-based method for friend suggestion which intuitively only allows suggestion of pairs of users that are clustered together in the same community from the point of view of their common friends. 
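A minimal sketch of this per-node pipeline, using a hypothetical edge list reconstructed from Figure 2 and plain connected components as a simple stand-in for the community detection algorithms the paper actually applies:

```python
from itertools import combinations

# Sally's ego-net-minus-ego (hypothetical, from Figure 2):
# family A-B, A-C, B-C; co-workers U-V, V-W (U-W not yet connected).
edges = {("A", "B"), ("A", "C"), ("B", "C"), ("U", "V"), ("V", "W")}
nodes = {n for e in edges for n in e}

def components(nodes, edges):
    """Connected components, standing in for a community detection step."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for n in nodes:
        if n in seen:
            continue
        comp, stack = set(), [n]
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def suggestions(nodes, edges):
    """Suggest only unconnected pairs that share a local community."""
    out = []
    for comp in components(nodes, edges):
        for a, b in combinations(sorted(comp), 2):
            if (a, b) not in edges and (b, a) not in edges:
                out.append((a, b))
    return out

print(suggestions(nodes, edges))  # [('U', 'W')] — B and U are never paired
```

In the paper this clustering runs independently on every node's ego-net across the whole graph, which is what makes the computation embarrassingly parallel.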
With this method, <b>U</b> and <b>W</b> will be suggested to add each other (as they are in the same community and they are not yet connected) while <b>B</b> and <b>U</b> will <i>not</i> be suggested as friends as they span two different communities. <br /><br />From an algorithmic point of view, the paper introduces efficient parallel and distributed techniques for computing and clustering all ego-nets of very large graphs at the same time – a fundamental aspect enabling use of the system on the entire worldwide Google+ graph. We have applied this feature in the “You May Know” system of Google+, resulting in a clear positive impact on the prediction task, improving the acceptance rate by more than 1.5% and decreasing the rejection rate by more than 3.3% (a significant impact at Google scale).<br /><br />We believe that many future directions of work might stem from our preliminary results. For instance, ego-net analysis could potentially be used to automatically classify a user’s contacts into circles and to detect spam. Another interesting direction is the study of ego-network evolution in dynamic graphs. ]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/research-from-vldb-2016-improved-friend-suggestion-using-ego-net-analysis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computational Thinking from a Dispositions Perspective</title>
		<link>https://googledata.org/google-research/computational-thinking-from-a-dispositions-perspective/</link>
		<comments>https://googledata.org/google-research/computational-thinking-from-a-dispositions-perspective/#comments</comments>
		<pubDate>Tue, 13 Sep 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[education]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=e401d7d4ae199a92c7567c152c2fd552</guid>
		<description><![CDATA[<span>Posted by Chris Stephenson, Head of Computer Science Education Programs at Google, and Joyce Malyn-Smith, Managing Project Director at Education Development Center (EDC)<br /></span> <br /><br /><i>(Cross-posted on the <a href="http://googleforeducation.blogspot.com/2016/09/Computational-Thinking.html">Google for Education Blog</a>)</i><br /><br />In K&#8211;12 computer science (CS) education, much of the discussion about what students need to learn and do has centered around computational thinking (CT). While much of the current work in CT education is focused on core concepts and their application, the one area of CT that has not been well explored is the relationship between CT as a problem-solving model and the dispositions or habits of mind that it can build in students of all ages. <br /><br />Exploring the mindset that CT education can engender depends, in part, on the definition of CT itself. While there are a number of definitions of CT in circulation, <a href="http://www.amanyadav.org/CEP991A/wp-content/uploads/2014/08/Barr_Stephenson_2011.pdf">Valerie Barr and I</a> defined it in the following way:<br /><blockquote><i>CT is an approach to solving problems in a way that can be implemented with a computer. Students become not merely tool users but tool builders. They use a set of concepts, such as abstraction, recursion, and iteration, to process and analyze data, and to create real and virtual artifacts. 
CT is a problem solving methodology that can be automated and transferred and applied across subjects.</i></blockquote>Like many others, our view of CT also included the core CT concepts: abstraction, algorithms and procedures, automation, data collection and analysis, data representation, modeling and simulation, parallelization and problem decomposition.<br /><div><a href="https://3.bp.blogspot.com/-tBUwm1ickkM/V9YNM68H74I/AAAAAAAABLs/NfOpCTV8jG4sLH4O2VrHciXgPXCel0-LgCLcB/s1600/GoEDU_comp_thinking_0906_r2_option3.png"><img border="0" height="320" src="https://3.bp.blogspot.com/-tBUwm1ickkM/V9YNM68H74I/AAAAAAAABLs/NfOpCTV8jG4sLH4O2VrHciXgPXCel0-LgCLcB/s640/GoEDU_comp_thinking_0906_r2_option3.png" width="640"></a></div>The idea of dispositions, however, comes from the field of vocational education and research on career development which focuses on the <a href="http://googleforeducation.blogspot.com/2015/04/the-skills-agenda-preparing-students.html">personal qualities or soft skills needed for employment</a> (<i>see full report from Economist Intelligence Unit <a href="https://www.google.com/edu/resources/global-education/economist-intelligence-report/">here</a></i>). These skills traditionally include being responsible, adaptable, flexible, self-directed, and self-motivated; being able to solve simple and complex problems, having integrity, self-confidence, and self-control. They can also include the ability to work with people of different ages and cultures, collaboration, complex communication and expert thinking. <br /><a href="http://jwilson.coe.uga.edu/EMAT7050/Cuoco.HabitsOfMind.pdf"><br /></a> <a href="http://jwilson.coe.uga.edu/EMAT7050/Cuoco.HabitsOfMind.pdf">Cuoco, Goldenberg, and Mark&#8217;s</a> research also provided examples of what students should learn to develop the habits of mind used by scientists across numerous disciplines. These are: recognizing patterns, experimenting, describing, tinkering, inventing, visualizing, and conjecturing. 
<a href="http://dl.acm.org/citation.cfm?id=2751967&#38;CFID=659482068&#38;CFTOKEN=50296269">Potter and Vickers</a> also found that in the burgeoning field of cyber security &#8220;there is significant overlap between the roles for many soft skills, including analysis, consulting and process skills, leadership, and relationship management. Both communication and presentation skills were valued.&#8221;<br /><div><a href="https://4.bp.blogspot.com/-jqgJJ0dr-W8/V9YN4qt0BMI/AAAAAAAABLw/jkf7SSG416gGdQzVUCi8_0Xa6zhfWNCzACLcB/s1600/GoEDU_comp_thinking_0906_r2.png"><img border="0" height="320" src="https://4.bp.blogspot.com/-jqgJJ0dr-W8/V9YN4qt0BMI/AAAAAAAABLw/jkf7SSG416gGdQzVUCi8_0Xa6zhfWNCzACLcB/s640/GoEDU_comp_thinking_0906_r2.png" width="640"></a></div>CT, because of its emphasis on problem solving, provides a natural environment for embedding the idea of dispositions into K-12. According to the <a href="https://www.iste.org/explore/articleDetail?articleid=152">International Society for Technology in Education</a> and the <a href="https://csta.acm.org/Curriculum/sub/CompThinking.html">Computer Science Teachers Association</a>, the set of dispositions that students practice and internalize while learning about CT can include:<br /><ul><li>confidence in dealing with complexity,</li><li>persistence in working with difficult problems,</li><li>the ability to handle ambiguity,</li><li>the ability to deal with open-ended problems,</li><li>setting aside differences to work with others to achieve a common goal or solution, and</li><li>knowing one's strengths and weaknesses when working with others.</li></ul>Any teacher in any discipline is likely to tell you that persistence, problem solving, collaboration and awareness of one&#8217;s strengths and limitations are critical to successful learning for all students. So how do we make these dispositions a more explicit part of the CT curriculum? 
One of the ways to do so is to call them out directly to students and explain why they are important in all areas of their study, career, and lives. In addition, educators can:<br /><ul><li>Post a list of the Dispositions Leading to Success in the classroom,</li><li>Help familiarize students with these dispositions by using the terms when talking with students and referring to the work they are doing. &#8220;Today we are going to be solving an open-ended problem. What do you think that means?&#8221;</li><li>Help students understand that they are developing these dispositions by congratulating them when these dispositions lead to success: &#8220;Great problem-solving skills!&#8221;; &#8220;Great job! Your persistence helped solve the problem&#8221;; &#8220;You dealt with ambiguity really well!&#8221;.</li><li>Engage students in discussions about the dispositions: &#8220;Today we are going to work in teams. What does it mean to be on a team? What types of people would you want on your team and why?&#8221;</li><li>Help students articulate their dispositions when developing their resumes or preparing for job interviews.</li></ul>Guest speakers from industry might also:<br /><ul><li>Integrate the importance of dispositions into their talks with students: examples of the problems they have solved, how the different skills of team members led to different solutions, the role persistence played in solving a problem/developing a product or service&#8230;</li><li>Talk about the importance of dispositions to employers and how they contribute to their own organizational culture, the ways employers ask interviewees about their dispositions or how interviewees might respond (e.g. 
use the terms and give examples).</li></ul>As Google&#8217;s Director of Education and University Relations, Maggie Johnson noted in a <a href="http://googleforeducation.blogspot.com/2016/08/computational-thinking-for-all-students_3.html">recent blog post</a>, CT represents a core set of skills that are necessary for all students:<br /><blockquote><i>If we can make these explicit connections for students, they will see how the devices and apps that they use everyday are powered by algorithms and programs. They will learn the importance of data in making decisions. They will learn skills that will prepare them for a workforce that will be doing vastly different tasks than the workforce of today.</i></blockquote>In addition to these concepts, we can now add developing critical dispositions for success in computing and in life to the list of benefits for teaching CT to all students.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Chris Stephenson, Head of Computer Science Education Programs at Google, and Joyce Malyn-Smith, Managing Project Director at Education Development Center (EDC)<br /></span> <br /><br /><i>(Cross-posted on the <a href="http://googleforeducation.blogspot.com/2016/09/Computational-Thinking.html">Google for Education Blog</a>)</i><br /><br />In K–12 computer science (CS) education, much of the discussion about what students need to learn and do has centered around computational thinking (CT). While much of the current work in CT education is focused on core concepts and their application, the one area of CT that has not been well explored is the relationship between CT as a problem-solving model and the dispositions or habits of mind that it can build in students of all ages. <br /><br />Exploring the mindset that CT education can engender depends, in part, on the definition of CT itself. While there are a number of definitions of CT in circulation, <a href="http://www.amanyadav.org/CEP991A/wp-content/uploads/2014/08/Barr_Stephenson_2011.pdf">Valerie Barr and I</a> defined it in the following way:<br /><blockquote class="tr_bq"><i>CT is an approach to solving problems in a way that can be implemented with a computer. Students become not merely tool users but tool builders. They use a set of concepts, such as abstraction, recursion, and iteration, to process and analyze data, and to create real and virtual artifacts. 
CT is a problem solving methodology that can be automated and transferred and applied across subjects.</i></blockquote>Like many others, our view of CT also included the core CT concepts: abstraction, algorithms and procedures, automation, data collection and analysis, data representation, modeling and simulation, parallelization and problem decomposition.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-tBUwm1ickkM/V9YNM68H74I/AAAAAAAABLs/NfOpCTV8jG4sLH4O2VrHciXgPXCel0-LgCLcB/s1600/GoEDU_comp_thinking_0906_r2_option3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://3.bp.blogspot.com/-tBUwm1ickkM/V9YNM68H74I/AAAAAAAABLs/NfOpCTV8jG4sLH4O2VrHciXgPXCel0-LgCLcB/s640/GoEDU_comp_thinking_0906_r2_option3.png" width="640" /></a></div>The idea of dispositions, however, comes from the field of vocational education and research on career development, which focuses on the <a href="http://googleforeducation.blogspot.com/2015/04/the-skills-agenda-preparing-students.html">personal qualities or soft skills needed for employment</a> (<i>see the full report from the Economist Intelligence Unit <a href="https://www.google.com/edu/resources/global-education/economist-intelligence-report/">here</a></i>). These skills traditionally include being responsible, adaptable, flexible, self-directed, and self-motivated; being able to solve simple and complex problems; and having integrity, self-confidence, and self-control. They can also include the ability to work with people of different ages and cultures, collaboration, complex communication, and expert thinking. <br /><br /><a href="http://jwilson.coe.uga.edu/EMAT7050/Cuoco.HabitsOfMind.pdf">Cuoco, Goldenberg, and Mark’s</a> research also provided examples of what students should learn to develop the habits of mind used by scientists across numerous disciplines. 
These are: recognizing patterns, experimenting, describing, tinkering, inventing, visualizing, and conjecturing. <a href="http://dl.acm.org/citation.cfm?id=2751967&amp;CFID=659482068&amp;CFTOKEN=50296269">Potter and Vickers</a> also found that in the burgeoning field of cyber security “there is significant overlap between the roles for many soft skills, including analysis, consulting and process skills, leadership, and relationship management. Both communication and presentation skills were valued.”<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-jqgJJ0dr-W8/V9YN4qt0BMI/AAAAAAAABLw/jkf7SSG416gGdQzVUCi8_0Xa6zhfWNCzACLcB/s1600/GoEDU_comp_thinking_0906_r2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://4.bp.blogspot.com/-jqgJJ0dr-W8/V9YN4qt0BMI/AAAAAAAABLw/jkf7SSG416gGdQzVUCi8_0Xa6zhfWNCzACLcB/s640/GoEDU_comp_thinking_0906_r2.png" width="640" /></a></div>CT, because of its emphasis on problem solving, provides a natural environment for embedding the idea of dispositions into K-12. 
According to the <a href="https://www.iste.org/explore/articleDetail?articleid=152">International Society for Technology in Education</a> and the <a href="https://csta.acm.org/Curriculum/sub/CompThinking.html">Computer Science Teachers Association</a>, the set of dispositions that students practice and internalize while learning about CT can include:<br /><ul><li>confidence in dealing with complexity,</li><li>persistence in working with difficult problems,</li><li>the ability to handle ambiguity,</li><li>the ability to deal with open-ended problems,</li><li>setting aside differences to work with others to achieve a common goal or solution, and</li><li>knowing one's strengths and weaknesses when working with others.</li></ul>Any teacher in any discipline is likely to tell you that persistence, problem solving, collaboration, and awareness of one’s strengths and limitations are critical to successful learning for all students. So how do we make these dispositions a more explicit part of the CT curriculum? One of the ways to do so is to call them out directly to students and explain why they are important in all areas of their study, career, and lives. In addition, educators can:<br /><ul><li>Post in the classroom a list of the Dispositions Leading to Success,</li><li>Help familiarize students with these dispositions by using the terms when talking with students and referring to the work they are doing. “Today we are going to be solving an open-ended problem. What do you think that means?”</li><li>Help students understand that they are developing these dispositions by congratulating them when these dispositions lead to success: “Great problem-solving skills!”; “Great job! Your persistence helped solve the problem”; “You dealt with ambiguity really well!”.</li><li>Engage students in discussions about the dispositions: “Today we are going to work in teams. What does it mean to be on a team? 
What types of people would you want on your team and why?”</li><li>Help students articulate their dispositions when developing their resumes or preparing for job interviews.</li></ul>Guest speakers from industry might also:<br /><ul><li>Integrate the importance of dispositions into their talks with students: examples of the problems they have solved, how the different skills of team members led to different solutions, the role persistence played in solving a problem/developing a product or service…</li><li>Talk about the importance of dispositions to employers and how they contribute to their own organizational culture, the ways employers ask interviewees about their dispositions or how interviewees might respond (e.g. use the terms and give examples).</li></ul>As Google’s Director of Education and University Relations, Maggie Johnson noted in a <a href="http://googleforeducation.blogspot.com/2016/08/computational-thinking-for-all-students_3.html">recent blog post</a>, CT represents a core set of skills that are necessary for all students:<br /><blockquote class="tr_bq"><i>If we can make these explicit connections for students, they will see how the devices and apps that they use everyday are powered by algorithms and programs. They will learn the importance of data in making decisions. They will learn skills that will prepare them for a workforce that will be doing vastly different tasks than the workforce of today.</i></blockquote>In addition to these concepts, we can now add developing critical dispositions for success in computing and in life to the list of benefits for teaching CT to all students.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/computational-thinking-from-a-dispositions-perspective/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Announcing the First Annual Global PhD Fellowship Summit and the 2016 Google PhD Fellows</title>
		<link>https://googledata.org/google-research/announcing-the-first-annual-global-phd-fellowship-summit-and-the-2016-google-phd-fellows/</link>
		<comments>https://googledata.org/google-research/announcing-the-first-annual-global-phd-fellowship-summit-and-the-2016-google-phd-fellows/#comments</comments>
		<pubDate>Wed, 07 Sep 2016 19:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=beb506b885a46ed1b15a329c0d425af5</guid>
		<description><![CDATA[<span>Posted by Michael Rennaker, Program Manager, University Relations</span><br /><br />In 2009, Google created the <a href="http://research.google.com/research-outreach.html#/research-outreach/graduate-fellowships">PhD Fellowship Program</a> to recognize and support outstanding graduate students doing exceptional research in Computer Science and related disciplines. Now in its eighth year, our Fellowships have helped support over 250 graduate students in <a href="http://google-au.blogspot.com.au/2015/08/phd-fellowships-to-support-cutting-edge.html">Australia</a>, <a href="http://www.google.cn/intl/en/university/research/phdfellowship.html">China and East Asia</a>, <a href="https://research.google.com/university/relations/phd_fellowship_india.html">India</a>, <a href="http://googleresearch.blogspot.com/2015/02/announcing-2015-north-american-google.html">North America</a>, <a href="http://googleresearch.blogspot.com/2015/06/announcing-2015-google-european.html">Europe and the Middle East</a> who seek to shape and influence the future of technology.<br /><br />Recently, Google PhD Fellows from around the globe converged on our Mountain View campus for the first annual Global PhD Fellowship Summit.  The students heard talks from researchers like <a href="http://research.google.com/pubs/jeff.html">Jeff Dean</a>, <a href="http://research.google.com/pubs/author21120.html">Fran&#231;oise Beaufays</a>, <a href="http://research.google.com/pubs/author205.html">Peter Norvig</a>, <a href="http://research.google.com/pubs/MayaGupta.html">Maya Gupta</a> and <a href="http://research.google.com/pubs/AminVahdat.html">Amin Vahdat</a>, and got a glimpse into some of the state-of-the-art research pursued across Google.  
<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-FKBH8S5My3A/V9BVvjGxl-I/AAAAAAAABLM/oa1bwsjCrYMFv8GOePgQsPgeR9PAPACwQCLcB/s1600/image02.png"><img border="0" height="288" src="https://4.bp.blogspot.com/-FKBH8S5My3A/V9BVvjGxl-I/AAAAAAAABLM/oa1bwsjCrYMFv8GOePgQsPgeR9PAPACwQCLcB/s640/image02.png" width="640"></a></td></tr><tr><td>Senior Google Fellow Jeff Dean shares how TensorFlow is used at Google</td></tr></tbody></table>Fellows also had the chance to connect one-on-one with Googlers to discuss their research, as well as receive feedback from leaders in their fields. The event wrapped up with a panel discussion with <a href="http://research.google.com/pubs/author173.html">Dan Russell</a>, <a href="http://research.google.com/pubs/KristenLeFevre.html">Kristen LeFevre</a>, <a href="http://research.google.com/pubs/author39086.html">Douglas Eck</a> and <a href="http://research.google.com/pubs/author21120.html">Fran&#231;oise Beaufays</a> about their unique career paths. 
<a href="https://research.google.com/pubs/104788.html">Maggie Johnson</a> concluded the Summit by describing the different types of research environments across academia and industry.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-8qdHYEnUzoM/V9BV3Q_ycQI/AAAAAAAABLQ/H0sGdTPDX7AasHoWHZ_0nwb25dRnKjUiwCLcB/s1600/img2.png"><img border="0" height="202" src="https://2.bp.blogspot.com/-8qdHYEnUzoM/V9BV3Q_ycQI/AAAAAAAABLQ/H0sGdTPDX7AasHoWHZ_0nwb25dRnKjUiwCLcB/s640/img2.png" width="640"></a></td></tr><tr><td>(Left) PhD Fellows share their work with Google researchers during the poster session<br />(Right) Research panelists share their journeys through academia and industry</td></tr></tbody></table>Our PhD Fellows represent some of the best and brightest young researchers around the globe in Computer Science, and it is our ongoing goal to support them as they make their mark on the world.<br /><br />We&#8217;d also like to welcome the newest class of Google PhD Fellows recently awarded in China and East Asia, India, and Australia. 
We look forward to seeing each of them at next year&#8217;s summit!<br /><br /><b>2016 Global PhD Fellows</b><br /><br /><b>Computational Neuroscience</b><br />Cameron (Po-Hsuan) Chen, <i>Princeton University</i><br />Grace Lindsay, <i>Columbia University</i><br />Martino Sorbaro Sindaci, <i>The University of Edinburgh</i><br /><br /><b>Human-Computer Interaction</b><br />Dana McKay, <i>University of Melbourne</i><br />Koki Nagano, <i>University of Southern California</i><br />Arvind Satyanarayan, <i>Stanford University</i><br />Amy Xian Zhang, <i>Massachusetts Institute of Technology</i><br /><br /><b>Machine Learning</b><br />Olivier Bachem, <i>Swiss Federal Institute of Technology Zurich</i><br />Tianqi Chen, <i>University of Washington</i><br />Emily Denton, <i>New York University</i><br />Kwan Hui Lim, <i>University of Melbourne</i><br />Yves-Laurent Kom Samo, <i>University of Oxford</i><br />Woosang Lim, <i>Korea Advanced Institute of Science and Technology</i><br />Anirban Santara, <i>Indian Institute of Technology Kharagpur</i><br />Daniel Jaymin Mankowitz, <i>Technion - Israel Institute of Technology</i><br />Lucas Maystre, <i>&#201;cole Polytechnique F&#233;d&#233;rale de Lausanne</i><br />Arvind Neelakantan, <i>University of Massachusetts, Amherst</i><br />Ludwig Schmidt, <i>Massachusetts Institute of Technology</i><br />Quanming Yao, <i>The Hong Kong University of Science and Technology</i><br />Shandian Zhe, <i>Purdue University, West Lafayette</i><br /><br /><b>Machine Perception, Speech Technology and Computer Vision</b><br />Eugen Beck, <i>RWTH Aachen University</i><br />Yu-Wei Chao, <i>University of Michigan, Ann Arbor</i><br />Wei Liu, <i>University of North Carolina at Chapel Hill</i><br />Aron Monszpart, <i>University College London</i><br />Thomas Schoeps, <i>Swiss Federal Institute of Technology Zurich</i><br />Tian Tan, <i>Shanghai Jiao Tong University</i><br />Chia-Yin Tsai, <i>Carnegie Mellon University</i><br />Weitao Xu, <i>University of 
Queensland</i><br /><br /><b>Market Algorithms</b><br />Hossein Esfandiari, <i>University of Maryland, College Park</i><br />Sandy Heydrich, <i>Saarland University - Saarbrucken GSCS</i><br />Rad Niazadeh, <i>Cornell University</i><br />Sadra Yazdanbod, <i>Georgia Institute of Technology</i><br /><br /><b>Mobile Computing</b><br />Lei Kang, <i>University of Wisconsin</i><br />Tauhidur Rahman, <i>Cornell University</i><br />Chungkuk Yoo, <i>Korea Advanced Institute of Science and Technology</i><br />Yuhao Zhu, <i>University of Texas, Austin</i><br /><br /><b>Natural Language Processing</b><br />Tamer Alkhouli, <i>RWTH Aachen University</i><br />Jose Camacho Collados, <i>Sapienza - Universit&#224; di Roma</i><br /><br /><b>Privacy and Security</b><br />Chitra Javali, <i>University of New South Wales</i><br />Kartik Nayak, <i>University of Maryland, College Park</i><br />Nicolas Papernot, <i>Pennsylvania State University</i><br />Damian Vizar, <i>&#201;cole Polytechnique F&#233;d&#233;rale de Lausanne</i><br />Xi Wu, <i>University of Wisconsin</i><br /><br /><b>Programming Languages, Algorithms and Software Engineering</b><br />Marcelo Sousa, <i>University of Oxford</i><br />Arpita Biswas, <i>Indian Institute of Science</i><br /><br /><b>Structured Data and Database Management</b><br />Xiang Ren, <i>University of Illinois, Urbana-Champaign</i><br /><br /><b>Systems and Networking</b><br />Ying Chen, <i>Tsinghua University</i><br />Andrew Crotty, <i>Brown University</i><br />Aniruddha Singh Kushwaha, <i>Indian Institute of Technology Bombay</i><br />Ilias Marinos, <i>University of Cambridge</i><br />Kay Ousterhout, <i>University of California, Berkeley</i><br />Hong Zhang, <i>The Hong Kong University of Science and Technology</i>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Michael Rennaker, Program Manager, University Relations</span><br /><br />In 2009, Google created the <a href="http://research.google.com/research-outreach.html#/research-outreach/graduate-fellowships">PhD Fellowship Program</a> to recognize and support outstanding graduate students doing exceptional research in Computer Science and related disciplines. Now in its eighth year, our Fellowships have helped support over 250 graduate students in <a href="http://google-au.blogspot.com.au/2015/08/phd-fellowships-to-support-cutting-edge.html">Australia</a>, <a href="http://www.google.cn/intl/en/university/research/phdfellowship.html">China and East Asia</a>, <a href="https://research.google.com/university/relations/phd_fellowship_india.html">India</a>, <a href="http://googleresearch.blogspot.com/2015/02/announcing-2015-north-american-google.html">North America</a>, <a href="http://googleresearch.blogspot.com/2015/06/announcing-2015-google-european.html">Europe and the Middle East</a> who seek to shape and influence the future of technology.<br /><br />Recently, Google PhD Fellows from around the globe converged on our Mountain View campus for the first annual Global PhD Fellowship Summit.  The students heard talks from researchers like <a href="http://research.google.com/pubs/jeff.html">Jeff Dean</a>, <a href="http://research.google.com/pubs/author21120.html">Françoise Beaufays</a>, <a href="http://research.google.com/pubs/author205.html">Peter Norvig</a>, <a href="http://research.google.com/pubs/MayaGupta.html">Maya Gupta</a> and <a href="http://research.google.com/pubs/AminVahdat.html">Amin Vahdat</a>, and got a glimpse into some of the state-of-the-art research pursued across Google.  
<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-FKBH8S5My3A/V9BVvjGxl-I/AAAAAAAABLM/oa1bwsjCrYMFv8GOePgQsPgeR9PAPACwQCLcB/s1600/image02.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="288" src="https://4.bp.blogspot.com/-FKBH8S5My3A/V9BVvjGxl-I/AAAAAAAABLM/oa1bwsjCrYMFv8GOePgQsPgeR9PAPACwQCLcB/s640/image02.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Senior Google Fellow Jeff Dean shares how TensorFlow is used at Google</td></tr></tbody></table>Fellows also had the chance to connect one-on-one with Googlers to discuss their research, as well as receive feedback from leaders in their fields. The event wrapped up with a panel discussion with <a href="http://research.google.com/pubs/author173.html">Dan Russell</a>, <a href="http://research.google.com/pubs/KristenLeFevre.html">Kristen LeFevre</a>, <a href="http://research.google.com/pubs/author39086.html">Douglas Eck</a> and <a href="http://research.google.com/pubs/author21120.html">Françoise Beaufays</a> about their unique career paths. 
<a href="https://research.google.com/pubs/104788.html">Maggie Johnson</a> concluded the Summit by describing the different types of research environments across academia and industry.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-8qdHYEnUzoM/V9BV3Q_ycQI/AAAAAAAABLQ/H0sGdTPDX7AasHoWHZ_0nwb25dRnKjUiwCLcB/s1600/img2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="202" src="https://2.bp.blogspot.com/-8qdHYEnUzoM/V9BV3Q_ycQI/AAAAAAAABLQ/H0sGdTPDX7AasHoWHZ_0nwb25dRnKjUiwCLcB/s640/img2.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">(Left) PhD Fellows share their work with Google researchers during the poster session<br />(Right) Research panelists share their journeys through academia and industry</td></tr></tbody></table>Our PhD Fellows represent some of the best and brightest young researchers around the globe in Computer Science, and it is our ongoing goal to support them as they make their mark on the world.<br /><br />We’d also like to welcome the newest class of Google PhD Fellows recently awarded in China and East Asia, India, and Australia. 
We look forward to seeing each of them at next year’s summit!<br /><br /><b>2016 Global PhD Fellows</b><br /><br /><b>Computational Neuroscience</b><br />Cameron (Po-Hsuan) Chen, <i>Princeton University</i><br />Grace Lindsay, <i>Columbia University</i><br />Martino Sorbaro Sindaci, <i>The University of Edinburgh</i><br /><br /><b>Human-Computer Interaction</b><br />Dana McKay, <i>University of Melbourne</i><br />Koki Nagano, <i>University of Southern California</i><br />Arvind Satyanarayan, <i>Stanford University</i><br />Amy Xian Zhang, <i>Massachusetts Institute of Technology</i><br /><br /><b>Machine Learning</b><br />Olivier Bachem, <i>Swiss Federal Institute of Technology Zurich</i><br />Tianqi Chen, <i>University of Washington</i><br />Emily Denton, <i>New York University</i><br />Kwan Hui Lim, <i>University of Melbourne</i><br />Yves-Laurent Kom Samo, <i>University of Oxford</i><br />Woosang Lim, <i>Korea Advanced Institute of Science and Technology</i><br />Anirban Santara, <i>Indian Institute of Technology Kharagpur</i><br />Daniel Jaymin Mankowitz, <i>Technion - Israel Institute of Technology</i><br />Lucas Maystre, <i>École Polytechnique Fédérale de Lausanne</i><br />Arvind Neelakantan, <i>University of Massachusetts, Amherst</i><br />Ludwig Schmidt, <i>Massachusetts Institute of Technology</i><br />Quanming Yao, <i>The Hong Kong University of Science and Technology</i><br />Shandian Zhe, <i>Purdue University, West Lafayette</i><br /><br /><b>Machine Perception, Speech Technology and Computer Vision</b><br />Eugen Beck, <i>RWTH Aachen University</i><br />Yu-Wei Chao, <i>University of Michigan, Ann Arbor</i><br />Wei Liu, <i>University of North Carolina at Chapel Hill</i><br />Aron Monszpart, <i>University College London</i><br />Thomas Schoeps, <i>Swiss Federal Institute of Technology Zurich</i><br />Tian Tan, <i>Shanghai Jiao Tong University</i><br />Chia-Yin Tsai, <i>Carnegie Mellon University</i><br />Weitao Xu, <i>University of Queensland</i><br 
/><br /><b>Market Algorithms</b><br />Hossein Esfandiari, <i>University of Maryland, College Park</i><br />Sandy Heydrich, <i>Saarland University - Saarbrucken GSCS</i><br />Rad Niazadeh, <i>Cornell University</i><br />Sadra Yazdanbod, <i>Georgia Institute of Technology</i><br /><br /><b>Mobile Computing</b><br />Lei Kang, <i>University of Wisconsin</i><br />Tauhidur Rahman, <i>Cornell University</i><br />Chungkuk Yoo, <i>Korea Advanced Institute of Science and Technology</i><br />Yuhao Zhu, <i>University of Texas, Austin</i><br /><br /><b>Natural Language Processing</b><br />Tamer Alkhouli, <i>RWTH Aachen University</i><br />Jose Camacho Collados, <i>Sapienza - Università di Roma</i><br /><br /><b>Privacy and Security</b><br />Chitra Javali, <i>University of New South Wales</i><br />Kartik Nayak, <i>University of Maryland, College Park</i><br />Nicolas Papernot, <i>Pennsylvania State University</i><br />Damian Vizar, <i>École Polytechnique Fédérale de Lausanne</i><br />Xi Wu, <i>University of Wisconsin</i><br /><br /><b>Programming Languages, Algorithms and Software Engineering</b><br />Marcelo Sousa, <i>University of Oxford</i><br />Arpita Biswas, <i>Indian Institute of Science</i><br /><br /><b>Structured Data and Database Management</b><br />Xiang Ren, <i>University of Illinois, Urbana-Champaign</i><br /><br /><b>Systems and Networking</b><br />Ying Chen, <i>Tsinghua University</i><br />Andrew Crotty, <i>Brown University</i><br />Aniruddha Singh Kushwaha, <i>Indian Institute of Technology Bombay</i><br />Ilias Marinos, <i>University of Cambridge</i><br />Kay Ousterhout, <i>University of California, Berkeley</i><br />Hong Zhang, <i>The Hong Kong University of Science and Technology</i>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/announcing-the-first-annual-global-phd-fellowship-summit-and-the-2016-google-phd-fellows/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Reproducible Science: Cancer Researchers Embrace Containers in the Cloud</title>
		<link>https://googledata.org/google-research/reproducible-science-cancer-researchers-embrace-containers-in-the-cloud/</link>
		<comments>https://googledata.org/google-research/reproducible-science-cancer-researchers-embrace-containers-in-the-cloud/#comments</comments>
		<pubDate>Tue, 06 Sep 2016 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=e48e8d165ac0bdc98bb5b95683101ad4</guid>
		<description><![CDATA[<span>Posted by Dr. Kyle Ellrott, Oregon Health and Sciences University, Dr. Josh Stuart, University of California Santa Cruz, and Dr. Paul Boutros, Ontario Institute for Cancer Research</span><br /><br /><i>Today we hear from the principal investigators of the ICGC-TCGA DREAM Somatic Mutation Calling Challenges about how they are encouraging cancer researchers to make use of Docker and Google Cloud Platform to gain a deeper understanding of the complex genetic mutations that occur in cancer, while doing so in a reproducible way.</i><br /><i>&#8211; Nicole Deflaux and Jonathan Bingham, Google Genomics</i><br /><br />Today&#8217;s genomic analysis software tools often give different answers when run in different computing environments - that&#8217;s like getting a different diagnosis from your doctor depending on which examination room you&#8217;re sitting in. <a href="https://en.wikipedia.org/wiki/Reproducibility">Reproducible</a> science matters, especially in cancer research where so many lives are at stake.  The <a href="http://www.cancermoonshot2020.org/">Cancer Moonshot</a> has called for the research world to '<a href="https://www.whitehouse.gov/the-press-office/2016/01/12/inspiring-new-generation-defy-bounds-innovation-moonshot-cure-cancer">Break down silos and bring all the cancer fighters together</a>'.  Portable software &#8220;<a href="https://en.wikipedia.org/wiki/Software_container">containers</a>&#8221; and cloud computing hold the potential to help achieve these goals by making scientific data analysis more reproducible, reusable and scalable. 
<br /><br />Our team of researchers from the <a href="http://oicr.on.ca/">Ontario Institute for Cancer Research</a>, <a href="https://genomics.soe.ucsc.edu/">University of California Santa Cruz</a>, <a href="http://sagebase.org/">Sage Bionetworks</a> and <a href="https://www.ohsu.edu/xd/education/schools/school-of-medicine/departments/computational-biology/">Oregon Health and Sciences University</a> is pushing the frontiers by encouraging scientists to package up their software in reusable <a href="https://www.docker.com/">Docker</a> containers and make use of cloud-resident data from the <a href="https://cbiit.nci.nih.gov/ncip/nci-cancer-genomics-cloud-pilots">Cancer Cloud Pilots funded by the National Cancer Institute</a>.<br /><br />In 2014 we initiated the <a href="https://research.googleblog.com/2014/07/facilitating-genomics-research-with.html">ICGC-TCGA DREAM Somatic Mutation Calling (SMC) Challenges</a> where Google provided credits on <a href="https://cloud.google.com/">Google Cloud Platform</a>. The first result of this collaboration was the DREAM-SMC DNA challenge, a public challenge that engaged cancer researchers from around the world to find the best methods for discovering <a href="https://en.wikipedia.org/wiki/Mutation#Somatic_mutations">DNA somatic mutations</a>. By the end of the challenge, over 400 registered participants competed by submitting 3,500 open-source entries for 14 test genomes, <a href="http://www.nature.com/nmeth/journal/v12/n7/full/nmeth.3407.html">providing key insights</a> on the strengths and limitations of the current mutation detection methods.<br /><br />The SMC-DNA challenge enabled comparison of results, but it did little to facilitate the exchange of cross-platform software tools. 
Accessing extremely large genome sequence input files and shepherding complex software pipelines created a &#8220;double whammy&#8221; that discouraged data sharing and software reuse.<br /><br />How can we overcome these barriers?<br /><br />Exciting developments have taken place in the past couple of years that may annihilate these last barriers. The availability of cloud technologies and <a href="https://en.wikipedia.org/wiki/Docker_(software)">containerization</a> can serve as the vanguards of reproducibility and interoperability.<br /><br />Thus, a new way of creating open DREAM challenges has emerged: rather than encouraging the status quo where participants run their own methods themselves on their own systems, and the results cannot be verified, the new challenge design requires participants to submit open-source code packaged in Docker containers so that anyone can run their methods and verify the results. Real-time leaderboards show which entries are winning, and top performers have a chance to claim a prize. <br /><br />Working with Google Genomics and Google Cloud Platform, the DREAM-SMC organizers are now using cloud and containerization technologies to enable portability and reproducibility as a core part of the DREAM challenges. The latest SMC installments, the <a href="https://www.synapse.org/SMCHet">SMC-Het Challenge</a> and the <a href="https://www.synapse.org/SMC_RNA">SMC-RNA Challenge</a>, have implemented this new plan:<br /><br /><ul><li><a href="https://www.synapse.org/SMCHet">SMC-Het Challenge</a>: Tumour biopsies are composed of many different cell types in addition to tumour cells, including normal tissue and infiltrating immune cells. Furthermore, the tumours themselves are made of a mixture of different subpopulations, all related to one another through cell division and mutation. Critically, each sub-population can have distinct clinical outcomes, with some more resistant to treatment or more likely to metastasize than others. 
The goal of the SMC-Het Challenge is to identify the best methods for predicting <a href="https://en.wikipedia.org/wiki/Tumour_heterogeneity">tumor subpopulations</a> and their &#8220;family tree&#8221; of relatedness from genome sequencing data.</li><li><a href="https://www.synapse.org/SMC_RNA">SMC-RNA Challenge</a>: The alteration of RNA production is a fundamental mechanism by which cancer cells rewire cellular circuitry. Genomic rearrangements in cancer cells can produce fused protein products that can bestow Frankenstein-like properties. Both RNA abundances and novel fusions can serve as the basis for clinically important prognostic biomarkers. The SMC-RNA Challenge will identify the best methods to detect such rogue expressed RNAs in cancer cells.</li></ul><br />Ultimately, success will be gauged by the amount of serious participation in these latest competitions. So far, the signs are encouraging. SMC-Het, which focuses on a very new research area, launched in November 2015 and has already enlisted 18 teams contributing over 70 submissions. SMC-RNA just recently launched and will run until early 2017, with several of the world leaders in the field starting to prepare entries. What&#8217;s great about the submissions being packaged in containers is that even after the challenges end, the tested methods can be applied and further adapted by anyone around the world.<br /><br />Thus, the moon shot need not be a lucky solo attempt made by one hero in one moment of inspiration. Instead, the new informatics of clouds and containers will enable us to combine intelligence so we can build a series of bridges from here to there.<br /><br />To participate in the DREAM challenges, visit the <a href="https://www.synapse.org/SMCHet">SMC-Het</a> and <a href="https://www.synapse.org/SMC_RNA">SMC-RNA</a> Challenge sites.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Dr. Kyle Ellrott, Oregon Health and Sciences University, Dr. Josh Stuart, University of California Santa Cruz, and Dr. Paul Boutros, Ontario Institute for Cancer Research</span><br /><br /><i>Today we hear from the principal investigators of the ICGC-TCGA DREAM Somatic Mutation Calling Challenges about how they are encouraging cancer researchers to make use of Docker and Google Cloud Platform to gain a deeper understanding of the complex genetic mutations that occur in cancer, while doing so in a reproducible way.</i><br /><i>– Nicole Deflaux and Jonathan Bingham, Google Genomics</i><br /><br />Today’s genomic analysis software tools often give different answers when run in different computing environments - that’s like getting a different diagnosis from your doctor depending on which examination room you’re sitting in. <a href="https://en.wikipedia.org/wiki/Reproducibility">Reproducible</a> science matters, especially in cancer research where so many lives are at stake.  The <a href="http://www.cancermoonshot2020.org/">Cancer Moonshot</a> has called for the research world to '<a href="https://www.whitehouse.gov/the-press-office/2016/01/12/inspiring-new-generation-defy-bounds-innovation-moonshot-cure-cancer">Break down silos and bring all the cancer fighters together</a>'.  Portable software “<a href="https://en.wikipedia.org/wiki/Software_container">containers</a>” and cloud computing hold the potential to help achieve these goals by making scientific data analysis more reproducible, reusable and scalable. 
<br /><br />Our team of researchers from the <a href="http://oicr.on.ca/">Ontario Institute for Cancer Research</a>, <a href="https://genomics.soe.ucsc.edu/">University of California Santa Cruz</a>, <a href="http://sagebase.org/">Sage Bionetworks</a> and <a href="https://www.ohsu.edu/xd/education/schools/school-of-medicine/departments/computational-biology/">Oregon Health and Sciences University</a> is pushing the frontiers by encouraging scientists to package up their software in reusable <a href="https://www.docker.com/">Docker</a> containers and make use of cloud-resident data from the <a href="https://cbiit.nci.nih.gov/ncip/nci-cancer-genomics-cloud-pilots">Cancer Cloud Pilots funded by the National Cancer Institute</a>.<br /><br />In 2014 we initiated the <a href="https://research.googleblog.com/2014/07/facilitating-genomics-research-with.html">ICGC-TCGA DREAM Somatic Mutation Calling (SMC) Challenges</a> where Google provided credits on <a href="https://cloud.google.com/">Google Cloud Platform</a>. The first result of this collaboration was the DREAM-SMC DNA challenge, a public challenge that engaged cancer researchers from around the world to find the best methods for discovering <a href="https://en.wikipedia.org/wiki/Mutation#Somatic_mutations">DNA somatic mutations</a>. By the end of the challenge, over 400 registered participants competed by submitting 3,500 open-source entries for 14 test genomes, <a href="http://www.nature.com/nmeth/journal/v12/n7/full/nmeth.3407.html">providing key insights</a> on the strengths and limitations of the current mutation detection methods.<br /><br />The SMC-DNA challenge enabled comparison of results, but it did little to facilitate the exchange of cross-platform software tools. 
Accessing extremely large genome sequence input files and shepherding complex software pipelines created a “double whammy” to discourage data sharing and software reuse.<br /><br />How can we overcome these barriers?<br /><br />Exciting developments have taken place in the past couple of years that may annihilate these last barriers. The availability of cloud technologies and <a href="https://en.wikipedia.org/wiki/Docker_(software)">containerization</a> can serve as the vanguards of reproducibility and interoperability.<br /><br />Thus, a new way of creating open DREAM challenges has emerged: rather than encouraging the status quo where participants run their own methods themselves on their own systems, and the results cannot be verified, the new challenge design requires participants to submit open-source code packaged in Docker containers so that anyone can run their methods and verify the results. Real-time leaderboards show which entries are winning and top performers have a chance to claim a prize. <br /><br />Working with Google Genomics and Google Cloud Platform, the DREAM-SMC organizers are now using cloud and containerization technologies to enable portability and reproducibility as a core part of the DREAM challenges. The latest SMC installments, the <a href="https://www.synapse.org/SMCHet">SMC-Het Challenge</a> and the <a href="https://www.synapse.org/SMC_RNA">SMC-RNA Challenge</a> have implemented this new plan:<br /><br /><ul><li><a href="https://www.synapse.org/SMCHet">SMC-Het Challenge</a>: Tumour biopsies are composed of many different cell types in addition to tumour cells, including normal tissue and infiltrating immune cells. Furthermore, the tumours themselves are made of a mixture of different subpopulations, all related to one another through cell division and mutation. Critically, each sub-population can have distinct clinical outcomes, with some more resistant to treatment or more likely to metastasize than others. 
The goal of the SMC-Het Challenge is to identify the best methods for predicting <a href="https://en.wikipedia.org/wiki/Tumour_heterogeneity">tumor subpopulations</a> and their “family tree” of relatedness from genome sequencing data.</li><li><a href="https://www.synapse.org/SMC_RNA">SMC-RNA Challenge</a>: The alteration of RNA production is a fundamental mechanism by which cancer cells rewire cellular circuitry. Genomic rearrangements in cancer cells can produce fused protein products that can bestow Frankenstein-like properties. Both RNA abundances and novel fusions can serve as the basis for clinically-important prognostic biomarkers. The SMC-RNA Challenge will identify the best methods to detect such rogue expressed RNAs in cancer cells.</li></ul><br />Ultimately, success will be gauged by the level of serious participation in these latest competitions. So far, the signs are encouraging. SMC-Het, which focuses on a very new research area, launched in November 2015 and has already enlisted 18 teams contributing over 70 submissions. SMC-RNA just recently launched and will run until early 2017, with several of the world leaders in the field starting to prepare entries. Because the submissions are packaged in containers, even after the challenges end the tested methods can be applied and further adapted by anyone around the world.<br /><br />Thus, the moon shot need not be a lucky solo attempt made by one hero in one moment of inspiration. Instead, the new informatics of clouds and containers will enable us to combine intelligence so we can build a series of bridges from here to there.<br /><br />To participate in the DREAM challenges, visit the <a href="https://www.synapse.org/SMCHet">SMC-Het</a> and <a href="https://www.synapse.org/SMC_RNA">SMC-RNA</a> Challenge sites.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/reproducible-science-cancer-researchers-embrace-containers-in-the-cloud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Improving Inception and Image Classification in TensorFlow</title>
		<link>https://googledata.org/google-research/improving-inception-and-image-classification-in-tensorflow/</link>
		<comments>https://googledata.org/google-research/improving-inception-and-image-classification-in-tensorflow/#comments</comments>
		<pubDate>Wed, 31 Aug 2016 19:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=148fb035b492b738fa7c8267ca9d78d7</guid>
		<description><![CDATA[<span>Posted by Alex Alemi, Software Engineer</span><br /><a href="https://research.googleblog.com/2016/08/tf-slim-high-level-library-to-define.html"><br /></a> Earlier this week, <a href="https://research.googleblog.com/2016/08/tf-slim-high-level-library-to-define.html">we announced the latest release of the TF-Slim library</a> for TensorFlow, a lightweight package for defining, training and evaluating models, as well as checkpoints and model definitions for several competitive networks in the field of image classification. <br /><br />In order to spur even further progress in the field, today we are happy to announce the release of <a href="https://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py">Inception-ResNet-v2</a>, a convolutional neural network (CNN) that achieves a new state of the art in terms of accuracy on the <a href="http://image-net.org/challenges/LSVRC/2012/">ILSVRC image classification benchmark</a>. Inception-ResNet-v2 is a variation of our earlier <a href="http://arxiv.org/abs/1512.00567">Inception V3</a> model, which borrows some ideas from Microsoft's ResNet papers <a href="https://arxiv.org/abs/1512.03385">[1]</a><a href="https://arxiv.org/abs/1603.05027">[2]</a>. The full details of the model are in our arXiv preprint <a href="http://arxiv.org/abs/1602.07261"><i>Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning</i></a>.<br /><br />Residual connections allow shortcuts in the model and have allowed researchers to successfully train even deeper neural networks, which have led to even better performance. This has also enabled significant simplification of the Inception blocks. 
Just compare the model architectures in the figures below:<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-9KD48z54MBs/V8cVz11fM0I/AAAAAAAABKM/sCC0vVEz_dMOsyb0D8AFwqkrrCavdlkSACLcB/s1600/image02.png"><img border="0" height="238" src="https://2.bp.blogspot.com/-9KD48z54MBs/V8cVz11fM0I/AAAAAAAABKM/sCC0vVEz_dMOsyb0D8AFwqkrrCavdlkSACLcB/s640/image02.png" width="640"></a></td></tr><tr><td>Schematic diagram of Inception V3</td></tr></tbody></table><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-O7AznVGY9js/V8cV_wKKsMI/AAAAAAAABKQ/maO7n2w3dT4Pkcmk7wgGqiSX5FUW2sfZgCLcB/s1600/image00.png"><img border="0" height="402" src="https://1.bp.blogspot.com/-O7AznVGY9js/V8cV_wKKsMI/AAAAAAAABKQ/maO7n2w3dT4Pkcmk7wgGqiSX5FUW2sfZgCLcB/s640/image00.png" width="640"></a></td></tr><tr><td>Schematic diagram of Inception-ResNet-v2</td></tr></tbody></table>At the top of the second Inception-ResNet-v2 figure, you'll see the full network expanded.  Notice that this network is considerably deeper than the previous Inception V3.  Below in the main figure is an easier to read version of the same network where the repeated residual blocks have been compressed.  Here, notice that the inception blocks have been simplified, containing fewer parallel towers than the previous Inception V3.<br /><br />The Inception-ResNet-v2 architecture is more accurate than previous state of the art models, as shown in the table below, which reports the Top-1 and Top-5 validation accuracies on the <a href="http://image-net.org/challenges/LSVRC/2012/">ILSVRC 2012 image classification benchmark</a> based on a single crop of the image.  
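Top-k accuracy simply counts a prediction as correct if the true class appears among the k highest-scoring classes. The following is a generic pure-Python sketch of the metric (not the evaluation code used for these benchmarks):

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of examples whose true label is among the k highest-scoring
    classes. `scores` is a list of per-class score lists; `labels` holds the
    ground-truth class indices."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores, highest first.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)

scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
labels = [1, 2]  # the second example's true class is only third-ranked
print(top_k_accuracy(scores, labels, 1))  # 0.5
print(top_k_accuracy(scores, labels, 3))  # 1.0
```

Top-1 is the usual notion of accuracy; Top-5, the other column reported below, is more forgiving of confusions among the 1000 fine-grained ILSVRC classes.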
Furthermore, this new model only requires roughly twice the memory and computation compared to Inception V3.<br /><br /><table border="1" cellspacing="0"><tbody><tr><td><b></b><br /><b>Model</b></td> <td><div><b>Architecture</b></div></td> <td><b></b><br /><b>Checkpoint</b></td> <td><b></b><br /><b>Top-1 Accuracy</b></td> <td><b></b><br /><b>Top-5 Accuracy</b></td> </tr><tr><td><div><a href="http://arxiv.org/abs/1602.07261" target="_blank">Inception-ResNet-v2</a></div></td> <td><a href="https://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py" target="_blank"></a><br /><a href="https://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py" target="_blank">Code</a></td> <td><div><a href="http://download.tensorflow.org/models/inception_resnet_v2_2016_08_30.tar.gz" target="_blank">inception_resnet_v2_2016_08_30.tar.gz</a></div></td> <td>80.4</td> <td>95.3</td> </tr><tr><td><div><a href="http://arxiv.org/abs/1512.00567" target="_blank">Inception V3</a></div></td> <td><a href="https://github.com/tensorflow/models/blob/master/slim/nets/inception_v3.py" target="_blank"></a><br /><a href="https://github.com/tensorflow/models/blob/master/slim/nets/inception_v3.py" target="_blank">Code</a></td> <td><div><a href="http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz" target="_blank">inception_v3_2016_08_28.tar.gz</a></div></td> <td>78.0</td> <td>93.9</td> </tr><tr><td><div><a href="https://arxiv.org/abs/1512.03385" target="_blank">ResNet 152</a></div></td> <td><a href="https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v1.py" target="_blank"></a><br /><a href="https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v1.py" target="_blank">Code</a></td> <td><div><a href="http://download.tensorflow.org/models/resnet_v1_152_2016_08_28.tar.gz" target="_blank">resnet_v1_152_2016_08_28.tar.gz</a></div></td> <td>76.8</td> <td>93.2</td> </tr><tr><td><div><a href="http://arxiv.org/abs/1603.05027" 
target="_blank">ResNet V2 200</a></div></td> <td><a href="https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v2.py" target="_blank"></a><br /><a href="https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v2.py" target="_blank">Code</a></td> <td><div>TBA</div></td> <td>79.9*</td> <td>95.2*</td> </tr></tbody></table><div><span>(*): Results quoted in ResNet paper.</span></div><br />As an example, while both Inception V3 and Inception-ResNet-v2 models excel at identifying individual dog breeds, the new model does noticeably better. For instance, whereas the old model mistakenly reported Alaskan Malamute for the picture on the right, the new Inception-ResNet-v2 model correctly identifies the dog breeds in both images.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-icBtPPVrGF4/V8cXIbbXxaI/AAAAAAAABLA/PTu1-rdmS5A7zn0nFcjA4XmyUpFE3epnwCEw/s1600/performance.png"><img border="0" height="242" src="https://1.bp.blogspot.com/-icBtPPVrGF4/V8cXIbbXxaI/AAAAAAAABLA/PTu1-rdmS5A7zn0nFcjA4XmyUpFE3epnwCEw/s640/performance.png" width="640"></a></td></tr><tr><td>An <a href="https://en.wikipedia.org/wiki/Alaskan_Malamute">Alaskan Malamute</a> (<a href="https://en.wikipedia.org/wiki/Alaskan_Malamute#/media/File:Alaskan_Malamute.jpg">left</a>) and a <a href="https://en.wikipedia.org/wiki/Siberian_Husky">Siberian Husky</a> (<a href="https://commons.wikimedia.org/wiki/Siberian_Husky#/media/File:Siberian-husky.jpg">right</a>). 
Images from Wikipedia</td></tr></tbody></table>In order to allow people to  immediately begin experimenting, we are also releasing a <a href="http://download.tensorflow.org/models/inception_resnet_v2_2016_08_30.tar.gz">pre-trained instance</a> of the new Inception-ResNet-v2, as part of the <a href="https://github.com/tensorflow/models/blob/master/slim/">TF-Slim Image Model Library</a>.<br /><br />We are excited to see what the community does with this improved model, following along as people adapt it and compare its performance on various tasks. Want to get started? See the accompanying <a href="https://github.com/tensorflow/models/blob/master/slim/README.md">instructions</a> on how to train, evaluate or fine-tune a network.<br /><br />As always, releasing the code was a team effort. Specific thanks are due to:<br /><ul><li><b>Model Architecture</b> - Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi</li><li><b>Systems Infrastructure</b> - Jon Shlens, Benoit Steiner, Mark Sandler, and David Andersen</li><li><b>TensorFlow-Slim</b> - Sergio Guadarrama and Nathan Silberman</li><li><b>Model Visualization</b> - Fernanda Vi&#233;gas and James Wexler</li></ul>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Alex Alemi, Software Engineer</span><br /><a href="https://research.googleblog.com/2016/08/tf-slim-high-level-library-to-define.html"><br /></a> Earlier this week, <a href="https://research.googleblog.com/2016/08/tf-slim-high-level-library-to-define.html">we announced the latest release of the TF-Slim library</a> for TensorFlow, a lightweight package for defining, training and evaluating models, as well as checkpoints and model definitions for several competitive networks in the field of image classification. <br /><br />In order to spur even further progress in the field, today we are happy to announce the release of <a href="https://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py">Inception-ResNet-v2</a>, a convolutional neural network (CNN) that achieves a new state of the art in terms of accuracy on the <a href="http://image-net.org/challenges/LSVRC/2012/">ILSVRC image classification benchmark</a>. Inception-ResNet-v2 is a variation of our earlier <a href="http://arxiv.org/abs/1512.00567">Inception V3</a> model, which borrows some ideas from Microsoft's ResNet papers <a href="https://arxiv.org/abs/1512.03385">[1]</a><a href="https://arxiv.org/abs/1603.05027">[2]</a>. The full details of the model are in our arXiv preprint <a href="http://arxiv.org/abs/1602.07261"><i>Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning</i></a>.<br /><br />Residual connections allow shortcuts in the model and have allowed researchers to successfully train even deeper neural networks, which have led to even better performance. This has also enabled significant simplification of the Inception blocks. 
Just compare the model architectures in the figures below:<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-9KD48z54MBs/V8cVz11fM0I/AAAAAAAABKM/sCC0vVEz_dMOsyb0D8AFwqkrrCavdlkSACLcB/s1600/image02.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="238" src="https://2.bp.blogspot.com/-9KD48z54MBs/V8cVz11fM0I/AAAAAAAABKM/sCC0vVEz_dMOsyb0D8AFwqkrrCavdlkSACLcB/s640/image02.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Schematic diagram of Inception V3</td></tr></tbody></table><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-O7AznVGY9js/V8cV_wKKsMI/AAAAAAAABKQ/maO7n2w3dT4Pkcmk7wgGqiSX5FUW2sfZgCLcB/s1600/image00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="402" src="https://1.bp.blogspot.com/-O7AznVGY9js/V8cV_wKKsMI/AAAAAAAABKQ/maO7n2w3dT4Pkcmk7wgGqiSX5FUW2sfZgCLcB/s640/image00.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Schematic diagram of Inception-ResNet-v2</td></tr></tbody></table>At the top of the second Inception-ResNet-v2 figure, you'll see the full network expanded.  Notice that this network is considerably deeper than the previous Inception V3.  Below in the main figure is an easier to read version of the same network where the repeated residual blocks have been compressed.  
Here, notice that the inception blocks have been simplified, containing fewer parallel towers than the previous Inception V3.<br /><br />The Inception-ResNet-v2 architecture is more accurate than previous state of the art models, as shown in the table below, which reports the Top-1 and Top-5 validation accuracies on the <a href="http://image-net.org/challenges/LSVRC/2012/">ILSVRC 2012 image classification benchmark</a> based on a single crop of the image.  Furthermore, this new model only requires roughly twice the memory and computation compared to Inception V3.<br /><br /><table border="1" bordercolor="#888" cellspacing="0" style="border-collapse: collapse; border-color: rgb(136 , 136 , 136); border-width: 1px; width: 100%px;"><tbody><tr> <td><b></b><br /><center><b>Model</b></center></td> <td><div style="text-align: center;"><b>Architecture</b></div></td> <td><b></b><br /><center><b>Checkpoint</b></center></td> <td><b></b><br /><center><b>Top-1 Accuracy</b></center></td> <td><b></b><br /><center><b>Top-5 Accuracy</b></center></td> </tr><tr> <td><div style="text-align: center;"><a href="http://arxiv.org/abs/1602.07261" >Inception-ResNet-v2</a></div></td> <td><a href="https://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py" ></a><br /><center><a href="https://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py" >Code</a></center></td> <td><div style="text-align: center;"><a href="http://download.tensorflow.org/models/inception_resnet_v2_2016_08_30.tar.gz" >inception_resnet_v2_2016_08_30.tar.gz</a></div></td> <td><center>80.4</center></td> <td><center>95.3</center></td> </tr><tr> <td><div style="text-align: center;"><a href="http://arxiv.org/abs/1512.00567" >Inception V3</a></div></td> <td><a href="https://github.com/tensorflow/models/blob/master/slim/nets/inception_v3.py" ></a><br /><center><a href="https://github.com/tensorflow/models/blob/master/slim/nets/inception_v3.py" >Code</a></center></td> <td><div 
style="text-align: center;"><a href="http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz" >inception_v3_2016_08_28.tar.gz</a></div></td> <td><center>78.0</center></td> <td><center>93.9</center></td> </tr><tr> <td><div style="text-align: center;"><a href="https://arxiv.org/abs/1512.03385" >ResNet 152</a></div></td> <td><a href="https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v1.py" ></a><br /><center><a href="https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v1.py" >Code</a></center></td> <td><div style="text-align: center;"><a href="http://download.tensorflow.org/models/resnet_v1_152_2016_08_28.tar.gz" >resnet_v1_152_2016_08_28.tar.gz</a></div></td> <td><center>76.8</center></td> <td><center>93.2</center></td> </tr><tr> <td><div style="text-align: center;"><a href="http://arxiv.org/abs/1603.05027" >ResNet V2 200</a></div></td> <td><a href="https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v2.py" ></a><br /><center><a href="https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v2.py" >Code</a></center></td> <td><div style="text-align: center;">TBA</div></td> <td><center>79.9*</center></td> <td><center>95.2*</center></td> </tr></tbody> </table><div style="text-align: center;"><span style="font-size: x-small;">(*): Results quoted in ResNet paper.</span></div><br />As an example, while both Inception V3 and Inception-ResNet-v2 models excel at identifying individual dog breeds, the new model does noticeably better. 
For instance, whereas the old model mistakenly reported Alaskan Malamute for the picture on the right, the new Inception-ResNet-v2 model correctly identifies the dog breeds in both images.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-icBtPPVrGF4/V8cXIbbXxaI/AAAAAAAABLA/PTu1-rdmS5A7zn0nFcjA4XmyUpFE3epnwCEw/s1600/performance.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="242" src="https://1.bp.blogspot.com/-icBtPPVrGF4/V8cXIbbXxaI/AAAAAAAABLA/PTu1-rdmS5A7zn0nFcjA4XmyUpFE3epnwCEw/s640/performance.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">An <a href="https://en.wikipedia.org/wiki/Alaskan_Malamute">Alaskan Malamute</a> (<a href="https://en.wikipedia.org/wiki/Alaskan_Malamute#/media/File:Alaskan_Malamute.jpg">left</a>) and a <a href="https://en.wikipedia.org/wiki/Siberian_Husky">Siberian Husky</a> (<a href="https://commons.wikimedia.org/wiki/Siberian_Husky#/media/File:Siberian-husky.jpg">right</a>). Images from Wikipedia</td></tr></tbody></table>In order to allow people to  immediately begin experimenting, we are also releasing a <a href="http://download.tensorflow.org/models/inception_resnet_v2_2016_08_30.tar.gz">pre-trained instance</a> of the new Inception-ResNet-v2, as part of the <a href="https://github.com/tensorflow/models/blob/master/slim/">TF-Slim Image Model Library</a>.<br /><br />We are excited to see what the community does with this improved model, following along as people adapt it and compare its performance on various tasks. Want to get started? See the accompanying <a href="https://github.com/tensorflow/models/blob/master/slim/README.md">instructions</a> on how to train, evaluate or fine-tune a network.<br /><br />As always, releasing the code was a team effort. 
Specific thanks are due to:<br /><ul><li><b>Model Architecture</b> - Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi</li><li><b>Systems Infrastructure</b> - Jon Shlens, Benoit Steiner, Mark Sandler, and David Andersen</li><li><b>TensorFlow-Slim</b> - Sergio Guadarrama and Nathan Silberman</li><li><b>Model Visualization</b> - Fernanda Viégas and James Wexler</li></ul>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/improving-inception-and-image-classification-in-tensorflow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>TF-Slim: A high level library to define complex models in TensorFlow</title>
		<link>https://googledata.org/google-research/tf-slim-a-high-level-library-to-define-complex-models-in-tensorflow/</link>
		<comments>https://googledata.org/google-research/tf-slim-a-high-level-library-to-define-complex-models-in-tensorflow/#comments</comments>
		<pubDate>Tue, 30 Aug 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=fafc75db26a416e91d2020a332955d5e</guid>
		<description><![CDATA[Posted by Nathan Silberman and Sergio Guadarrama, Google Research  Earlier this year, we released a TensorFlow implementation of a state-of-the-art image classification model known as Inception-V3. This code allowed users to train the model on the  Ima...]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Nathan Silberman and Sergio Guadarrama, Google Research</span>  <br /><br />Earlier this year, <a href="https://research.googleblog.com/2016/03/train-your-own-image-classifier-with.html">we released</a> a TensorFlow implementation of a state-of-the-art image classification model known as <a href="http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44903.pdf">Inception-V3</a>. This code allowed users to train the model on the  <a href="http://image-net.org/challenges/LSVRC/2012/">ImageNet classification dataset</a> via synchronized gradient descent, using either a single local machine or a cluster of machines. The Inception-V3 model was built on an experimental <a href="https://www.tensorflow.org/">TensorFlow</a> library called <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim">TF-Slim</a>, a lightweight package for defining, training and evaluating models in TensorFlow. The TF-Slim library provides common abstractions which enable users to define models quickly and concisely, while keeping the model architecture transparent and its hyperparameters explicit.<br /><br />Since that release, TF-Slim has grown substantially, with many types of <a href="https://www.tensorflow.org/api_docs/python/contrib.layers.html#layers-contrib">layers</a>, <a href="https://www.tensorflow.org/api_docs/python/contrib.losses.html#losses-contrib">loss functions</a>, and <a href="https://www.tensorflow.org/api_docs/python/contrib.metrics.html#metrics-contrib">evaluation metrics</a> added, along with handy routines for <a href="https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/slim/python/slim/learning.py">training</a> and <a href="https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/slim/python/slim/evaluation.py">evaluating</a> models. 
These routines take care of all the details you need to worry about when working at scale, such as reading data in parallel, deploying models on multiple machines, and more. Additionally, we have created the <a href="https://github.com/tensorflow/models/tree/master/slim">TF-Slim Image Models library</a>, which provides definitions and training scripts for many widely used image classification models, using standard datasets. TF-Slim and its components are already widely used within Google, and many of these improvements have already been integrated into <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim">tf.contrib.slim</a>.<br /><br />Today, we are proud to share the latest release of TF-Slim with the TF community. Some highlights of this release include:<br /><ul><li>Many new kinds of <a href="https://www.tensorflow.org/api_docs/python/contrib.layers.html#layers-contrib">layers</a> (such as <a href="http://arxiv.org/abs/1606.00915">Atrous Convolution</a> and <a href="http://www.matthewzeiler.com/pubs/cvpr2010/cvpr2010.pdf">Deconvolution</a>) enabling a much richer family of neural network architectures.</li><li>Support for more loss functions and <a href="https://www.tensorflow.org/api_docs/python/contrib.metrics.html#metrics-contrib">evaluation metrics</a> (e.g., mAP, IoU).</li><li>A <a href="https://github.com/tensorflow/models/blob/master/slim/deployment/model_deploy.py">deployment library</a> to make it easier to perform synchronous or asynchronous training using multiple GPUs/CPUs, on the same machine or on multiple machines.</li><li><a href="https://github.com/tensorflow/models/tree/master/slim/nets">Code</a> to define and train many widely used image classification models (e.g., <a href="http://arxiv.org/abs/1512.00567">Inception</a><sup>[1][2][3]</sup>, <a href="http://www.robots.ox.ac.uk/~vgg/research/very_deep/">VGG</a><sup>[4]</sup>, <a 
href="https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf">AlexNet</a><sup>[5]</sup>, <a href="https://arxiv.org/abs/1512.03385">ResNet</a><sup>[6]</sup>).</li><li><a href="https://github.com/tensorflow/models/tree/master/slim#pre-trained-models">Pre-trained</a> model weights for the above image classification models. These models have been trained on the <a href="http://image-net.org/challenges/LSVRC/2012/">ImageNet classification dataset</a>, but can be used for many other computer vision tasks. As a simple example, we provide code to <a href="https://github.com/tensorflow/models/tree/master/slim#fine-tuning-a-model-from-an-existing-checkpoint">fine-tune</a> these classifiers to a new set of output labels.</li><li><a href="https://github.com/tensorflow/models/tree/master/slim/datasets">Tools</a> to easily process standard image datasets, such as <a href="http://www.image-net.org/challenges/LSVRC/">ImageNet</a>, <a href="https://www.cs.toronto.edu/~kriz/cifar.html">CIFAR10</a> and <a href="http://yann.lecun.com/exdb/mnist/">MNIST</a>.</li></ul>Want to get started using TF-Slim? See the <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim">README</a> for details. Interested in working with image classification models? See these <a href="https://github.com/tensorflow/models/blob/master/slim/README.md">instructions</a> or this <a href="https://github.com/tensorflow/models/blob/master/slim/slim_walkthough.ipynb">Jupyter notebook</a>.<br /><br />The release of the TF-Slim library and the pre-trained model zoo has been the result of widespread collaboration within Google Research. 
In particular we want to highlight the vital contributions of the following researchers:<br /><ul><li><b>TF-Slim:</b> Sergio Guadarrama, Nathan Silberman.</li><li><b>Model Definitions and Checkpoints:</b> Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Jon Shlens, Zbigniew Wojna, Vivek Rathod, George Papandreou, Alex Alemi</li><li><b>Systems Infrastructure:</b> Jon Shlens, Matthieu Devin, Martin Wicke</li><li><b>Jupyter notebook:</b> Nathan Silberman, Kevin Murphy</li></ul><b><i>References:</i></b><br />[1] <a href="http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43022.pdf">Going deeper with convolutions</a>, <i>Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, CVPR 2015</i><br />[2] <a href="http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43442.pdf">Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift</a> <i>Sergey Ioffe, Christian Szegedy, ICML 2015</i><br />[3] <a href="http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44903.pdf">Rethinking the Inception Architecture for Computer Vision</a>, <i>Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Zbigniew Wojna, arXiv technical report 2015</i><br />[4] <a href="http://arxiv.org/pdf/1409.1556">Very Deep Convolutional Networks for Large-Scale Image Recognition</a>, <i>Karen Simonyan, Andrew Zisserman, ICLR 2015</i><br />[5] <a href="https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf">ImageNet Classification with Deep Convolutional Neural Networks</a>, <i>Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, NIPS 2012</i><br />[6] <a href="https://arxiv.org/abs/1512.03385">Deep Residual Learning for Image Recognition</a>, <i>Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, CVPR 2016</i>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/tf-slim-a-high-level-library-to-define-complex-models-in-tensorflow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Text summarization with TensorFlow</title>
		<link>https://googledata.org/google-research/text-summarization-with-tensorflow/</link>
		<comments>https://googledata.org/google-research/text-summarization-with-tensorflow/#comments</comments>
		<pubDate>Wed, 24 Aug 2016 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=da8915c0bce466d04b40d2971d75e41c</guid>
		<description><![CDATA[<span>Posted by Peter Liu and Xin Pan, Software Engineers, Google Brain Team</span><br /><br />Every day, people rely on a wide variety of sources to stay informed -- from news stories to social media posts to search results. Being able to develop Machine Learning models that can automatically deliver accurate summaries of longer text can be useful for digesting such large amounts of information in a compressed form, and is a long-term goal of the <a href="https://research.google.com/teams/brain/">Google Brain team</a>. <br /><br />Summarization can also serve as an interesting reading comprehension test for machines. To summarize well, machine learning models need to be able to comprehend documents and distill the important information, tasks which are highly challenging for computers, especially as the length of a document increases.<br /><br />In an effort to push this research forward, we&#8217;re open-sourcing <a href="https://github.com/tensorflow/models/tree/master/textsum">TensorFlow model code</a> for the task of generating news headlines on <a href="https://catalog.ldc.upenn.edu/LDC2012T21">Annotated English Gigaword</a>, a dataset often used in summarization research. We also specify the hyper-parameters in the documentation that achieve better than published state-of-the-art on the most commonly used <a href="https://en.wikipedia.org/wiki/ROUGE_(metric)">metric</a> as of the time of writing. Below we also provide samples generated by the model.<br /><b><br /></b> <b>Extractive and Abstractive summarization</b><br /><br />One approach to summarization is to extract parts of the document that are deemed interesting by some metric (for example, <a href="https://en.wikipedia.org/wiki/Tf%E2%80%93idf">inverse-document frequency</a>) and join them to form a summary. Algorithms of this flavor are called extractive summarization.<br /><blockquote>Original Text: <i><b>Alice and Bob</b> took the train to <b>visit the zoo</b>. 
They <b>saw</b> a baby giraffe, a lion, and <b>a flock of</b> colorful tropical <b>birds</b>.</i>&#160;</blockquote><blockquote>Extractive Summary: <i>Alice and Bob visit the zoo. saw a flock of birds.</i></blockquote>Above we extract the words bolded in the original text and concatenate them to form a summary. As we can see, sometimes the extractive constraint can make the summary awkward or grammatically strange. <br /><br />Another approach is to simply summarize as humans do, which is to not impose the extractive constraint and allow for rephrasings. This is called abstractive summarization.<br /><blockquote>Abstractive summary: <i>Alice and Bob visited the zoo and saw animals and birds.</i></blockquote>In this example, we used words not in the original text, maintaining more of the information in a similar number of words. It&#8217;s clear we would prefer good abstractive summarizations, but how could an algorithm begin to do this?<br /><br /><b>About the TensorFlow model</b><br /><br />It turns out that for shorter texts, summarization can be learned end-to-end with a deep learning technique called <a href="http://arxiv.org/abs/1409.3215">sequence-to-sequence learning</a>, similar to what makes <a href="https://research.googleblog.com/2015/11/computer-respond-to-this-email.html">Smart Reply for Inbox</a> possible. In particular, we&#8217;re able to train such models to produce very good headlines for news articles. In this case, the model reads the article text and writes a suitable headline.<br /><br />To get an idea of what the model produces, you can take a look at some examples below. 
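To make the extractive constraint concrete, here is a toy sketch in plain Python (illustrative only, not the released model): choose a set of word positions -- hand-picked here to mirror the bolded words above -- and concatenate them verbatim.

```python
def extractive_summary(text, keep):
    """Concatenate the chosen word positions of the original text, verbatim."""
    words = text.split()
    return " ".join(words[i] for i in keep)

original = ("Alice and Bob took the train to visit the zoo. "
            "They saw a baby giraffe, a lion, and a flock of colorful tropical birds.")
# Positions of the bolded words; a real extractive system would score words
# with a metric such as tf-idf rather than hand-picking them.
print(extractive_summary(original, [0, 1, 2, 7, 8, 9, 11, 18, 19, 20, 23]))
# -> Alice and Bob visit the zoo. saw a flock of birds.
```

The slightly awkward output is exactly the extractive artifact discussed above: the summary can only reuse the original words in place.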
The first column shows the first sentence of a news article which is the model input, and the second column shows what headline the model has written.<br /><br /><div dir="ltr"><table><colgroup><col width="314"><col width="202"></colgroup><tbody><tr><td><div dir="ltr"><span>Input: Article 1st sentence</span></div></td><td><div dir="ltr"><span>Model-written headline</span></div></td></tr><tr><td><div dir="ltr"><span>metro-goldwyn-mayer reported a third-quarter net loss of dlrs 16 million due mainly to the effect of accounting rules adopted this year </span></div></td><td><div dir="ltr"><span>mgm reports 16 million net loss on higher revenue </span></div></td></tr><tr><td><div dir="ltr"><span>starting from july 1, the island province of hainan in southern china will implement strict market access control on all incoming livestock and animal products to prevent the possible spread of epidemic diseases </span></div></td><td><div dir="ltr"><span>hainan to curb spread of diseases</span></div></td></tr><tr><td><div dir="ltr"><span>australian wine exports hit a record 52.1 million liters worth 260 million dollars (143 million us) in september, the government statistics office reported on monday </span></div></td><td><div dir="ltr"><span>australian wine exports hit record high in september</span></div></td></tr></tbody></table></div><br /><b>Future Research</b><br /><br />We&#8217;ve observed that due to the nature of news headlines, the model can generate good headlines from reading just a few sentences from the beginning of the article. Although this task serves as a nice proof-of-concept, we started looking at more difficult datasets where reading the entire document is necessary to produce good summaries. In those tasks training from scratch with this model architecture does not do as well as some other techniques we&#8217;re researching, but it serves as a baseline. 
We hope <a href="https://github.com/tensorflow/models/tree/master/textsum">this release</a> can also serve as a baseline for others in their summarization research.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Peter Liu and Xin Pan, Software Engineers, Google Brain Team</span><br /><br />Every day, people rely on a wide variety of sources to stay informed -- from news stories to social media posts to search results. Being able to develop Machine Learning models that can automatically deliver accurate summaries of longer text can be useful for digesting such large amounts of information in a compressed form, and is a long-term goal of the <a href="https://research.google.com/teams/brain/">Google Brain team</a>. <br /><br />Summarization can also serve as an interesting reading comprehension test for machines. To summarize well, machine learning models need to be able to comprehend documents and distill the important information, tasks which are highly challenging for computers, especially as the length of a document increases.<br /><br />In an effort to push this research forward, we’re open-sourcing <a href="https://github.com/tensorflow/models/tree/master/textsum">TensorFlow model code</a> for the task of generating news headlines on <a href="https://catalog.ldc.upenn.edu/LDC2012T21">Annotated English Gigaword</a>, a dataset often used in summarization research. We also specify the hyper-parameters in the documentation that achieve better than published state-of-the-art on the most commonly used <a href="https://en.wikipedia.org/wiki/ROUGE_(metric)">metric</a> as of the time of writing. Below we also provide samples generated by the model.<br /><b><br /></b> <b>Extractive and Abstractive summarization</b><br /><br />One approach to summarization is to extract parts of the document that are deemed interesting by some metric (for example, <a href="https://en.wikipedia.org/wiki/Tf%E2%80%93idf">inverse-document frequency</a>) and join them to form a summary. 
Algorithms of this flavor are called extractive summarization.<br /><blockquote class="tr_bq">Original Text: <i><b>Alice and Bob</b> took the train to <b>visit the zoo</b>. They <b>saw</b> a baby giraffe, a lion, and <b>a flock of</b> colorful tropical <b>birds</b>.</i>&nbsp;</blockquote><blockquote class="tr_bq">Extractive Summary: <i>Alice and Bob visit the zoo. saw a flock of birds.</i></blockquote>Above we extract the words bolded in the original text and concatenate them to form a summary. As we can see, sometimes the extractive constraint can make the summary awkward or grammatically strange. <br /><br />Another approach is to simply summarize as humans do, which is to not impose the extractive constraint and allow for rephrasings. This is called abstractive summarization.<br /><blockquote class="tr_bq">Abstractive summary: <i>Alice and Bob visited the zoo and saw animals and birds.</i></blockquote>In this example, we used words not in the original text, maintaining more of the information in a similar number of words. It’s clear we would prefer good abstractive summarizations, but how could an algorithm begin to do this?<br /><br /><b>About the TensorFlow model</b><br /><br />It turns out that for shorter texts, summarization can be learned end-to-end with a deep learning technique called <a href="http://arxiv.org/abs/1409.3215">sequence-to-sequence learning</a>, similar to what makes <a href="https://research.googleblog.com/2015/11/computer-respond-to-this-email.html">Smart Reply for Inbox</a> possible. In particular, we’re able to train such models to produce very good headlines for news articles. In this case, the model reads the article text and writes a suitable headline.<br /><br />To get an idea of what the model produces, you can take a look at some examples below. 
The first column shows the first sentence of a news article which is the model input, and the second column shows what headline the model has written.<br /><br /><div dir="ltr" style="margin-left: 33pt;"><table style="border-collapse: collapse; border: none;"><colgroup><col width="314"></col><col width="202"></col></colgroup><tbody><tr style="height: 0px;"><td style="border-bottom: solid #000000 1px; border-left: solid #000000 1px; border-right: solid #000000 1px; border-top: solid #000000 1px; padding: 7px 7px 7px 7px; vertical-align: top;"><div dir="ltr" style="line-height: 1; margin-bottom: 0pt; margin-top: 0pt; text-align: center;"><span style="background-color: transparent; color: black; font-family: &quot;times&quot; , &quot;times new roman&quot; , serif; font-style: normal; font-variant: normal; font-weight: bold; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Input: Article 1st sentence</span></div></td><td style="border-bottom: solid #000000 1px; border-left: solid #000000 1px; border-right: solid #000000 1px; border-top: solid #000000 1px; padding: 7px 7px 7px 7px; vertical-align: top;"><div dir="ltr" style="line-height: 1; margin-bottom: 0pt; margin-top: 0pt; text-align: center;"><span style="background-color: transparent; color: black; font-family: &quot;times&quot; , &quot;times new roman&quot; , serif; font-style: normal; font-variant: normal; font-weight: bold; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Model-written headline</span></div></td></tr><tr style="height: 0px;"><td style="border-bottom: solid #000000 1px; border-left: solid #000000 1px; border-right: solid #000000 1px; border-top: solid #000000 1px; padding: 7px 7px 7px 7px; vertical-align: top;"><div dir="ltr" style="line-height: 1; margin-bottom: 0pt; margin-top: 0pt;"><span style="background-color: transparent; color: black; font-family: &quot;times&quot; , &quot;times new roman&quot; , serif; font-style: normal; font-variant: 
normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">metro-goldwyn-mayer reported a third-quarter net loss of dlrs 16 million due mainly to the effect of accounting rules adopted this year </span></div></td><td style="border-bottom: solid #000000 1px; border-left: solid #000000 1px; border-right: solid #000000 1px; border-top: solid #000000 1px; padding: 7px 7px 7px 7px; vertical-align: top;"><div dir="ltr" style="line-height: 1; margin-bottom: 0pt; margin-top: 0pt;"><span style="background-color: transparent; font-family: &quot;times&quot; , &quot;times new roman&quot; , serif; font-style: normal; font-weight: normal; vertical-align: baseline; white-space: pre-wrap;">mgm reports 16 million net loss on higher revenue </span></div></td></tr><tr style="height: 0px;"><td style="border-bottom: solid #000000 1px; border-left: solid #000000 1px; border-right: solid #000000 1px; border-top: solid #000000 1px; padding: 7px 7px 7px 7px; vertical-align: top;"><div dir="ltr" style="line-height: 1; margin-bottom: 0pt; margin-top: 0pt;"><span style="background-color: transparent; color: black; font-family: &quot;times&quot; , &quot;times new roman&quot; , serif; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">starting from july 1, the island province of hainan in southern china will implement strict market access control on all incoming livestock and animal products to prevent the possible spread of epidemic diseases </span></div></td><td style="border-bottom: solid #000000 1px; border-left: solid #000000 1px; border-right: solid #000000 1px; border-top: solid #000000 1px; padding: 7px 7px 7px 7px; vertical-align: top;"><div dir="ltr" style="line-height: 1; margin-bottom: 0pt; margin-top: 0pt;"><span style="background-color: transparent; font-family: &quot;times&quot; , &quot;times new roman&quot; , serif; font-style: normal; font-weight: 
normal; vertical-align: baseline; white-space: pre-wrap;">hainan to curb spread of diseases</span></div></td></tr><tr style="height: 0px;"><td style="border-bottom: solid #000000 1px; border-left: solid #000000 1px; border-right: solid #000000 1px; border-top: solid #000000 1px; padding: 7px 7px 7px 7px; vertical-align: top;"><div dir="ltr" style="line-height: 1; margin-bottom: 0pt; margin-top: 0pt;"><span style="background-color: transparent; color: black; font-family: &quot;times&quot; , &quot;times new roman&quot; , serif; font-style: normal; font-variant: normal; font-weight: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">australian wine exports hit a record 52.1 million liters worth 260 million dollars (143 million us) in september, the government statistics office reported on monday </span></div></td><td style="border-bottom: solid #000000 1px; border-left: solid #000000 1px; border-right: solid #000000 1px; border-top: solid #000000 1px; padding: 7px 7px 7px 7px; vertical-align: top;"><div dir="ltr" style="line-height: 1; margin-bottom: 0pt; margin-top: 0pt;"><span style="background-color: transparent; font-family: &quot;times&quot; , &quot;times new roman&quot; , serif; font-style: normal; font-weight: normal; vertical-align: baseline; white-space: pre-wrap;">australian wine exports hit record high in september</span></div></td></tr></tbody></table></div><br /><b>Future Research</b><br /><br />We’ve observed that due to the nature of news headlines, the model can generate good headlines from reading just a few sentences from the beginning of the article. Although this task serves as a nice proof-of-concept, we started looking at more difficult datasets where reading the entire document is necessary to produce good summaries. In those tasks training from scratch with this model architecture does not do as well as some other techniques we’re researching, but it serves as a baseline. 
We hope <a href="https://github.com/tensorflow/models/tree/master/textsum">this release</a> can also serve as a baseline for others in their summarization research.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/text-summarization-with-tensorflow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Meet Parsey’s Cousins: Syntax for 40 languages, plus new SyntaxNet capabilities</title>
		<link>https://googledata.org/google-research/meet-parseys-cousins-syntax-for-40-languages-plus-new-syntaxnet-capabilities/</link>
		<comments>https://googledata.org/google-research/meet-parseys-cousins-syntax-for-40-languages-plus-new-syntaxnet-capabilities/#comments</comments>
		<pubDate>Mon, 08 Aug 2016 16:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=dc99d77d3c44b5acd487023f768a41a0</guid>
		<description><![CDATA[<span>Posted by Chris Alberti, Dave Orr &#38; Slav Petrov, Google Natural Language Understanding Team</span><br /><br />Just in time for <a href="http://acl2016.org/">ACL 2016</a>, we are pleased to announce that Parsey McParseface, <a href="https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html">released in May as part of SyntaxNet</a>&#160;and the basis for the <a href="https://cloud.google.com/natural-language/">Cloud Natural Language API</a>, now has 40 cousins! Parsey&#8217;s Cousins is a collection of pretrained syntactic models for 40 languages, capable of analyzing the native language of more than half of the world&#8217;s population at often unprecedented <a href="https://github.com/tensorflow/models/blob/master/syntaxnet/universal.md">accuracy</a>. To better address the linguistic phenomena occurring in these languages we have endowed SyntaxNet with new abilities for <a href="https://en.wikipedia.org/wiki/Text_segmentation">Text Segmentation</a> and <a href="https://en.wikipedia.org/wiki/Morphology_(linguistics)">Morphological Analysis</a>.<br /><br />When we released Parsey, we were already planning to expand to more languages, and it soon became clear that this was both urgent and important, because researchers were having trouble creating top notch SyntaxNet models for other languages.<br /><br />The reason for that is a little bit subtle. SyntaxNet, like other <a href="https://www.tensorflow.org/">TensorFlow</a> models, has a lot of knobs to turn, which affect accuracy and speed. These knobs are called hyperparameters, and control things like the learning rate and its decay, momentum, and random initialization. Because neural networks are more sensitive to the choice of these hyperparameters than many other machine learning algorithms, picking the right hyperparameter setting is very important. 
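In practice, picking them usually means sampling many settings and keeping the best. The toy Python sketch below illustrates that recipe; the training-and-evaluation function is a synthetic stand-in, since each real call is a training run that can take days.

```python
import random

# Synthetic stand-in for "train a model and measure held-out accuracy":
# a smooth function with a sweet spot near lr=0.05, momentum=0.9.
def train_and_eval(learning_rate, momentum):
    return 1.0 - (learning_rate - 0.05) ** 2 - (momentum - 0.9) ** 2

random.seed(0)  # deterministic for the sake of the example
trials = [
    {"learning_rate": random.uniform(0.001, 0.5),
     "momentum": random.uniform(0.0, 0.99)}
    for _ in range(70)  # the post reports ~70 models trained per language
]
best = max(trials, key=lambda hp: train_and_eval(**hp))
# A follow-up round would sample new settings near `best` to fine-tune further.
```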
Unfortunately there is no tested and proven way of doing this and picking good hyperparameters is mostly an empirical science -- we try a bunch of settings and see what works best.<br /><br />An additional challenge is that training these models can take a long time, several days on very fast hardware. Our solution is to train many models in parallel via <a href="https://en.wikipedia.org/wiki/MapReduce">MapReduce</a>, and when one looks promising, train a bunch more models with similar settings to fine-tune the results. This can really add up -- on average, we train more than 70 models per language. The plot below shows how the accuracy varies depending on the hyperparameters as training progresses. The best models are up to 4% absolute more accurate than ones trained without hyperparameter tuning.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-6kU_Cu49gq0/V6hY_5jpw3I/AAAAAAAABJM/D5_odV19sVgvSWai1VRXQl8wtRnJZpAcwCLcB/s1600/image05.png"><img border="0" height="436" src="https://1.bp.blogspot.com/-6kU_Cu49gq0/V6hY_5jpw3I/AAAAAAAABJM/D5_odV19sVgvSWai1VRXQl8wtRnJZpAcwCLcB/s640/image05.png" width="640"></a></td></tr><tr><td>Held-out set accuracy for various English parsing models with different hyperparameters (each line corresponds to one training run with specific hyperparameters). In some cases training is a lot slower and in many cases a suboptimal choice of hyperparameters leads to significantly lower accuracy. We are releasing the best model that we were able to train for each language.</td></tr></tbody></table>In order to do a good job at analyzing the grammar of other languages, it was not sufficient to just fine-tune our English setup. We also had to expand the capabilities of SyntaxNet. The first extension is a model for text segmentation, which is the task of identifying word boundaries. In languages like English, this isn&#8217;t very hard -- you can mostly look for spaces and punctuation. 
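For a space-delimited language, that heuristic really is a one-liner -- a toy segmenter, not SyntaxNet's segmentation model:

```python
import re

def naive_segment(text):
    """Split into word and punctuation tokens -- adequate for English."""
    return re.findall(r"\w+|[^\w\s]", text)

print(naive_segment("The model reads the text."))
# -> ['The', 'model', 'reads', 'the', 'text', '.']

# With no spaces between words, the same rule lumps everything together:
print(naive_segment("我喜欢猫"))
# -> ['我喜欢猫']  (correct segmentation: 我 / 喜欢 / 猫, "I / like / cats")
```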
In Chinese, however, this can be very challenging, because words are not separated by spaces. To correctly analyze dependencies between Chinese words, SyntaxNet needs to understand text segmentation -- and now it does.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-SH6wmh38Nq0/V6hZqbHD4LI/AAAAAAAABJQ/zJwbHpKvCqAJy5E0ng0yPiskJupFlrzbwCEw/s1600/image03.png"><img border="0" height="176" src="https://3.bp.blogspot.com/-SH6wmh38Nq0/V6hZqbHD4LI/AAAAAAAABJQ/zJwbHpKvCqAJy5E0ng0yPiskJupFlrzbwCEw/s640/image03.png" width="640"></a></td></tr><tr><td>Analysis of a Chinese string into a parse tree showing dependency labels, word tokens, and parts of speech (read top to bottom for each word token).</td></tr></tbody></table>The second extension is a model for morphological analysis. Morphology is a language feature that is poorly represented in English. It describes inflection: i.e., how the grammatical function and meaning of the word changes as its spelling changes. In English, we add an -s to a word to indicate plurality. In Russian, a <a href="https://en.wikipedia.org/wiki/Russian_grammar">heavily inflected language</a>, morphology can indicate number, gender, whether the word is the subject or object of a sentence, possessives, prepositional phrases, and more. 
To understand the syntax of a sentence in Russian, SyntaxNet needs to understand morphology -- and now it does.<br /><div><a href="https://4.bp.blogspot.com/-0WFKVxMUcyw/V6hbG3jKHDI/AAAAAAAABJc/X-BoqS_91CY8JQhSkVq4GGlLY3Cw3NkBwCLcB/s1600/image00.png"><img border="0" height="148" src="https://4.bp.blogspot.com/-0WFKVxMUcyw/V6hbG3jKHDI/AAAAAAAABJc/X-BoqS_91CY8JQhSkVq4GGlLY3Cw3NkBwCLcB/s640/image00.png" width="640"></a></div><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-Vj4qA9EdxXY/V6hbz7rdO0I/AAAAAAAABJo/Jj7XLGqgHhEQ8qb1ou_DwgUFqSCZmvOzQCLcB/s1600/image01.png"><img border="0" height="234" src="https://2.bp.blogspot.com/-Vj4qA9EdxXY/V6hbz7rdO0I/AAAAAAAABJo/Jj7XLGqgHhEQ8qb1ou_DwgUFqSCZmvOzQCLcB/s640/image01.png" width="640"></a></td></tr><tr><td>Parse trees showing dependency labels, parts of speech, and morphology.</td></tr></tbody></table>As you might have noticed, the parse trees for all of the sentences above look very similar. This is because we follow the content-head principle, under which dependencies are drawn between content words, with function words becoming leaves in the parse tree. This idea was developed by the <a href="http://universaldependencies.org/">Universal Dependencies</a> project in order to increase parallelism between languages. Parsey&#8217;s Cousins are trained on <a href="https://en.wikipedia.org/wiki/Treebank">treebanks</a> provided by this project and are designed to be cross-linguistically consistent and thus easier to use in multi-lingual language understanding applications.<br /><br />Using the same set of labels across languages can help us understand how sentences in different languages, or variations in the same language, convey the same meaning. In all of the above examples, the root indicates the main verb of the sentence and there is a passive nominal subject (indicated by the arc labeled with &#8216;nsubjpass&#8217;) and a passive auxiliary (&#8216;auxpass&#8217;). 
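One way to picture such a content-head parse is as a list of (token, head, label) arcs, with function words as leaves. The sentence and helper below are invented for illustration (this is not SyntaxNet's output format), using Universal Dependencies v1 labels:

```python
# A content-head parse as (index, word, head index, label) arcs; head 0 is
# the root. Sentence and labels (Universal Dependencies v1) are illustrative.
parse = [
    (1, "The",     2, "det"),
    (2, "book",    4, "nsubjpass"),  # passive nominal subject
    (3, "was",     4, "auxpass"),    # passive auxiliary
    (4, "written", 0, "root"),       # main verb of the sentence
    (5, "by",      7, "case"),       # function word: a leaf under "author"
    (6, "the",     7, "det"),
    (7, "author",  4, "nmod"),       # content word attaches to the verb
]

def dependents(tree, head):
    """Words whose head is the given token index."""
    return [word for _, word, h, _ in tree if h == head]

print(dependents(parse, 4))  # -> ['book', 'was', 'author']
print(dependents(parse, 5))  # -> [] ("by" heads nothing: it is a leaf)
```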
If you look closely, you will also notice some differences because the grammar of each language differs. For example, English uses the preposition &#8216;by,&#8217; where Russian uses morphology to mark that the phrase &#8216;the publisher (&#1080;&#1079;&#1076;&#1072;&#1090;&#1077;&#1083;&#1077;&#1084;)&#8217; is in <a href="https://en.wikipedia.org/wiki/Instrumental_case">instrumental case</a> -- the meaning is the same, it is just expressed differently. <br /><br />Google has been involved in the Universal Dependencies project since its <a href="http://universaldependencies.org/introduction.html#history">inception</a> and we are very excited to be able to bring together our efforts on datasets and modeling. We hope that this release will facilitate research progress in building computer systems that can understand all of the world&#8217;s languages.<br /><br />Parsey's Cousins can be found on <a href="https://github.com/tensorflow/models/blob/master/syntaxnet/universal.md">GitHub</a>, along with <a href="https://github.com/tensorflow/models/tree/master/syntaxnet/syntaxnet/models/parsey_mcparseface">Parsey McParseface</a> and <a href="https://github.com/tensorflow/models/tree/master/syntaxnet">SyntaxNet</a>.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Chris Alberti, Dave Orr &amp; Slav Petrov, Google Natural Language Understanding Team</span><br /><br />Just in time for <a href="http://acl2016.org/">ACL 2016</a>, we are pleased to announce that Parsey McParseface, <a href="https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html">released in May as part of SyntaxNet</a>&nbsp;and the basis for the <a href="https://cloud.google.com/natural-language/">Cloud Natural Language API</a>, now has 40 cousins! Parsey’s Cousins is a collection of pretrained syntactic models for 40 languages, capable of analyzing the native language of more than half of the world’s population at often unprecedented <a href="https://github.com/tensorflow/models/blob/master/syntaxnet/universal.md">accuracy</a>. To better address the linguistic phenomena occurring in these languages we have endowed SyntaxNet with new abilities for <a href="https://en.wikipedia.org/wiki/Text_segmentation">Text Segmentation</a> and <a href="https://en.wikipedia.org/wiki/Morphology_(linguistics)">Morphological Analysis</a>.<br /><br />When we released Parsey, we were already planning to expand to more languages, and it soon became clear that this was both urgent and important, because researchers were having trouble creating top notch SyntaxNet models for other languages.<br /><br />The reason for that is a little bit subtle. SyntaxNet, like other <a href="https://www.tensorflow.org/">TensorFlow</a> models, has a lot of knobs to turn, which affect accuracy and speed. These knobs are called hyperparameters, and control things like the learning rate and its decay, momentum, and random initialization. Because neural networks are more sensitive to the choice of these hyperparameters than many other machine learning algorithms, picking the right hyperparameter setting is very important. 
Unfortunately there is no tested and proven way of doing this and picking good hyperparameters is mostly an empirical science -- we try a bunch of settings and see what works best.<br /><br />An additional challenge is that training these models can take a long time, several days on very fast hardware. Our solution is to train many models in parallel via <a href="https://en.wikipedia.org/wiki/MapReduce">MapReduce</a>, and when one looks promising, train a bunch more models with similar settings to fine-tune the results. This can really add up -- on average, we train more than 70 models per language. The plot below shows how the accuracy varies depending on the hyperparameters as training progresses. The best models are up to 4% absolute more accurate than ones trained without hyperparameter tuning.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-6kU_Cu49gq0/V6hY_5jpw3I/AAAAAAAABJM/D5_odV19sVgvSWai1VRXQl8wtRnJZpAcwCLcB/s1600/image05.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="436" src="https://1.bp.blogspot.com/-6kU_Cu49gq0/V6hY_5jpw3I/AAAAAAAABJM/D5_odV19sVgvSWai1VRXQl8wtRnJZpAcwCLcB/s640/image05.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Held-out set accuracy for various English parsing models with different hyperparameters (each line corresponds to one training run with specific hyperparameters). In some cases training is a lot slower and in many cases a suboptimal choice of hyperparameters leads to significantly lower accuracy. We are releasing the best model that we were able to train for each language.</td></tr></tbody></table>In order to do a good job at analyzing the grammar of other languages, it was not sufficient to just fine-tune our English setup. 
We also had to expand the capabilities of SyntaxNet. The first extension is a model for text segmentation, which is the task of identifying word boundaries. In languages like English, this isn’t very hard -- you can mostly look for spaces and punctuation. In Chinese, however, this can be very challenging, because words are not separated by spaces. To correctly analyze dependencies between Chinese words, SyntaxNet needs to understand text segmentation -- and now it does.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-SH6wmh38Nq0/V6hZqbHD4LI/AAAAAAAABJQ/zJwbHpKvCqAJy5E0ng0yPiskJupFlrzbwCEw/s1600/image03.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="176" src="https://3.bp.blogspot.com/-SH6wmh38Nq0/V6hZqbHD4LI/AAAAAAAABJQ/zJwbHpKvCqAJy5E0ng0yPiskJupFlrzbwCEw/s640/image03.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Analysis of a Chinese string into a parse tree showing dependency labels, word tokens, and parts of speech (read top to bottom for each word token).</td></tr></tbody></table>The second extension is a model for morphological analysis. Morphology is a language feature that is poorly represented in English. It describes inflection: i.e., how the grammatical function and meaning of the word changes as its spelling changes. In English, we add an -s to a word to indicate plurality. In Russian, a <a href="https://en.wikipedia.org/wiki/Russian_grammar">heavily inflected language</a>, morphology can indicate number, gender, whether the word is the subject or object of a sentence, possessives, prepositional phrases, and more. 
To understand the syntax of a sentence in Russian, SyntaxNet needs to understand morphology -- and now it does.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-0WFKVxMUcyw/V6hbG3jKHDI/AAAAAAAABJc/X-BoqS_91CY8JQhSkVq4GGlLY3Cw3NkBwCLcB/s1600/image00.png" imageanchor="1"><img border="0" height="148" src="https://4.bp.blogspot.com/-0WFKVxMUcyw/V6hbG3jKHDI/AAAAAAAABJc/X-BoqS_91CY8JQhSkVq4GGlLY3Cw3NkBwCLcB/s640/image00.png" width="640" /></a></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-Vj4qA9EdxXY/V6hbz7rdO0I/AAAAAAAABJo/Jj7XLGqgHhEQ8qb1ou_DwgUFqSCZmvOzQCLcB/s1600/image01.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="234" src="https://2.bp.blogspot.com/-Vj4qA9EdxXY/V6hbz7rdO0I/AAAAAAAABJo/Jj7XLGqgHhEQ8qb1ou_DwgUFqSCZmvOzQCLcB/s640/image01.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Parse trees showing dependency labels, parts of speech, and morphology.</td></tr></tbody></table>As you might have noticed, the parse trees for all of the sentences above look very similar. This is because we follow the content-head principle, under which dependencies are drawn between content words, with function words becoming leaves in the parse tree. This idea was developed by the <a href="http://universaldependencies.org/">Universal Dependencies</a> project in order to increase parallelism between languages. 
Parsey’s Cousins are trained on <a href="https://en.wikipedia.org/wiki/Treebank">treebanks</a> provided by this project and are designed to be cross-linguistically consistent and thus easier to use in multi-lingual language understanding applications.<br /><br />Using the same set of labels across languages can help us understand how sentences in different languages, or variations in the same language, convey the same meaning. In all of the above examples, the root indicates the main verb of the sentence and there is a passive nominal subject (indicated by the arc labeled with ‘nsubjpass’) and a passive auxiliary (‘auxpass’). If you look closely, you will also notice some differences because the grammar of each language differs. For example, English uses the preposition ‘by,’ where Russian uses morphology to mark that the phrase ‘the publisher (издателем)’ is in <a href="https://en.wikipedia.org/wiki/Instrumental_case">instrumental case</a> -- the meaning is the same, it is just expressed differently. <br /><br />Google has been involved in the Universal Dependencies project since its <a href="http://universaldependencies.org/introduction.html#history">inception</a> and we are very excited to be able to bring together our efforts on datasets and modeling. We hope that this release will facilitate research progress in building computer systems that can understand all of the world’s languages.<br /><br />Parsey's Cousins can be found on <a href="https://github.com/tensorflow/models/blob/master/syntaxnet/universal.md">GitHub</a>, along with <a href="https://github.com/tensorflow/models/tree/master/syntaxnet/syntaxnet/models/parsey_mcparseface">Parsey McParseface</a> and <a href="https://github.com/tensorflow/models/tree/master/syntaxnet">SyntaxNet</a>.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/meet-parseys-cousins-syntax-for-40-languages-plus-new-syntaxnet-capabilities/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ACL 2016 &amp; Research at Google</title>
		<link>https://googledata.org/google-research/acl-2016-research-at-google/</link>
		<comments>https://googledata.org/google-research/acl-2016-research-at-google/#comments</comments>
		<pubDate>Sun, 07 Aug 2016 12:17:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=45e392ed1740436d8a135f0f7aa34bfd</guid>
		<description><![CDATA[<span>Posted by Slav Petrov, Research Scientist</span><br /><br />This week, Berlin hosts the <a href="http://acl2016.org/">2016 Annual Meeting of the Association for Computational Linguistics</a> (ACL 2016), the premier conference of the field of computational linguistics, covering a broad spectrum of diverse research areas that are concerned with computational approaches to natural language. As a leader in <a href="https://en.wikipedia.org/wiki/Natural_language_processing">Natural Language Processing</a> (NLP) and a Platinum Sponsor of the conference, Google will be on hand to showcase research interests that include syntax, semantics, discourse, conversation, multilingual modeling, sentiment analysis, question answering, summarization, and generally building better learners using labeled and unlabeled data, state-of-the-art modeling, and learning from indirect supervision. <br /><br />Our systems are used in numerous ways across Google, impacting user experience in search, mobile, apps, ads, translate and more. Our work spans the range of traditional NLP tasks, with general-purpose syntax and semantic algorithms underpinning more specialized systems.<br />Our researchers are experts in natural language processing and machine learning, and combine methodological research with applied science, and our engineers are equally involved in long-term research efforts and driving immediate applications of our technology. <br /><br />If you&#8217;re attending ACL 2016, we hope that you&#8217;ll stop by the booth to check out some demos,  meet our researchers and discuss projects and opportunities at Google that go into solving interesting problems for billions of people. 
Learn more about Google research being presented at ACL 2016 below (Googlers highlighted in <span><span>blue</span></span>), and visit the Natural Language Understanding Team page at <a href="http://g.co/NLUTeam">g.co/NLUTeam</a>.<br /><br /><b><u>Papers</u></b><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1015.pdf">Generalized Transition-based Dependency Parsing via Control Parameters</a><br /><i><span>Bernd Bohnet</span>,<span>&#160;Ryan McDonald</span>,<span>&#160;Emily Pitler</span>,<span>&#160;Ji Ma</span></i><br /><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1013.pdf">Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning</a><br /><i>Yulia Tsvetkov, Manaal Faruqui, <span>Wang Ling (Google DeepMind)</span>,<span>&#160;Chris Dyer (Google DeepMind)</span></i><br /><br /><a href="https://transacl.org/ojs/index.php/tacl/article/view/730/166">Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning</a> (<a href="https://www.transacl.org/ojs/index.php/tacl">TACL</a>)<br /><i>Manaal Faruqui, <span>Ryan McDonald</span>,<span>&#160;Radu Soricut</span></i><br /><br /><a href="https://transacl.org/ojs/index.php/tacl/article/view/892/207">Many Languages, One Parser</a> (<a href="https://www.transacl.org/ojs/index.php/tacl">TACL</a>)<br /><i>Waleed Ammar, George Mulcaire, Miguel Ballesteros, <span>Chris Dyer (Google DeepMind)</span><a href="http://research.googleblog.com/#1" name="top1"><sup>*</sup></a>, Noah A. 
Smith</i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1057.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1057.pdf">Latent Predictor Networks for Code Generation</a><br /><i><span>Wang Ling (Google DeepMind)</span>,<span>&#160;Phil Blunsom (Google DeepMind)</span>,<span>&#160;Edward Grefenstette (Google DeepMind)</span>,<span>&#160;Karl Moritz Hermann (Google DeepMind)</span>,<span>&#160;Tom&#225;&#353; Ko&#269;isk&#253; (Google DeepMind)</span>,<span>&#160;Fumin Wang (Google DeepMind)</span>,<span>&#160;Andrew Senior (Google DeepMind)</span></i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1059.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1059.pdf">Collective Entity Resolution with Multi-Focal Attention</a><br /><i><span>Amir Globerson</span>,<span>&#160;Nevena Lazic</span>,<span>&#160;</span>Soumen Chakrabarti, <span>Amarnag Subramanya</span>,<span>&#160;Michael Ringgaard</span>,<span>&#160;Fernando Pereira</span></i><br /><br /><a href="http://www.aclweb.org/anthology/Q/Q15/Q15-1036.pdf">Plato: A Selective Context Model for Entity Resolution</a> (<a href="https://www.transacl.org/ojs/index.php/tacl">TACL</a>)<br /><i><span>Nevena Lazic</span>,<span>&#160;Amarnag Subramanya</span>,<span>&#160;Michael Ringgaard</span>,<span>&#160;Fernando Pereira</span></i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1145.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1145.pdf">WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia</a><br /><i><span>Daniel Hewlett</span>,<span>&#160;Alexandre Lacoste</span>,<span>&#160;Llion Jones</span>,<span>&#160;Illia Polosukhin</span>,<span>&#160;Andrew Fandrianto</span>,<span>&#160;Jay Han</span>,<span>&#160;Matthew Kelcey</span>,<span>&#160;David Berthelot</span></i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1147.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1147.pdf">Stack-propagation: 
Improved Representation Learning for Syntax</a><br /><i>Yuan Zhang, <span>David Weiss</span></i><br /><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1157.pdf">Cross-lingual Models of Word Embeddings: An Empirical Comparison</a><br /><i>Shyam Upadhyay, Manaal Faruqui, <span>Chris Dyer (Google DeepMind)</span>,&#160;<span>Dan Roth</span></i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1231.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1231.pdf">Globally Normalized Transition-Based Neural Networks</a><b><i> (Outstanding Papers Session)</i></b><br /><i><span>Daniel Andor</span>,<span>&#160;Chris Alberti</span>,<span>&#160;David Weiss</span>,<span>&#160;Aliaksei Severyn</span>,<span>&#160;Alessandro Presta</span>,<span>&#160;Kuzman Ganchev</span>,&#160;<span>Slav Petrov</span>,<span>&#160;Michael Collins</span></i><br /><br /><b><u>Posters</u></b><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-2014.pdf">Cross-lingual projection for class-based language models</a><br /><i><span>Beat Gfeller</span>,<span>&#160;Vlad Schogol</span>,<span>&#160;Keith Hall</span></i><br /><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1103.pdf">Synthesizing Compound Words for Machine Translation</a><br /><i>Austin Matthews, <span>Eva Schlinger</span><a href="http://research.googleblog.com/#1" name="top1"><sup>*</sup></a>, Alon Lavie, <span>Chris Dyer (Google DeepMind)</span><a href="http://research.googleblog.com/#1" name="top1"><sup>*</sup></a></i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1184.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1184.pdf">Cross-Lingual Morphological Tagging for Low-Resource Languages</a><br /><i>Jan Buys, <span>Jan A. 
Botha</span></i><br /><br /><b><u>Workshops</u></b><br /><a href="https://sites.google.com/site/repl4nlp2016/">1st Workshop on Representation Learning for NLP</a><br />Keynote Speakers include: <i><span>Raia Hadsell (Google DeepMind)</span></i><br />Workshop Organizers include: <i><span>Edward Grefenstette (Google DeepMind)</span></i><i>,</i><i><span>&#160;Phil Blunsom (Google DeepMind)</span></i><i>,</i><i><span>&#160;Karl Moritz Hermann (Google DeepMind)</span></i><br />Program Committee members include: <i><span>Tom&#225;&#353; Ko&#269;isk&#253; (Google DeepMind)</span></i><i>,</i><i><span>&#160;Wang Ling (Google DeepMind)</span></i><i>,</i><i><span>&#160;Ankur Parikh (Google)</span></i><i>,</i><i><span>&#160;John Platt (Google)</span></i><i>,</i><i><span>&#160;Oriol Vinyals (Google DeepMind)</span></i><br /><br /><a href="https://sites.google.com/site/repevalacl16/">1st Workshop on Evaluating Vector-Space Representations for NLP</a><br />Contributed Papers:<br /><a href="http://aclweb.org/anthology/W/W16/W16-2506.pdf">Problems With Evaluation of Word Embeddings Using Word Similarity Tasks</a><br /><i>Manaal Faruqui, Yulia Tsvetkov, Pushpendre Rastogi, <span>Chris Dyer (Google DeepMind)</span><a href="http://research.googleblog.com/#1" name="top1"><sup>*</sup></a></i><br /><br /><a href="http://aclweb.org/anthology/W/W16/W16-2520.pdf">Correlation-based Intrinsic Evaluation of Word Vector Representations</a><br /><i>Yulia Tsvetkov, Manaal Faruqui, <span>Chris Dyer (Google DeepMind)</span></i><br /><br /><a href="http://zwei.dwds.de/statfsm/">SIGFSM Workshop on Statistical NLP and Weighted Automata</a><br />Contributed Papers:<br /><a href="http://aclweb.org/anthology/W/W16/W16-2404.pdf">Distributed representation and estimation of WFST-based n-gram models</a><br /><i><span>Cyril Allauzen</span></i><i>,</i><i><span>&#160;Michael Riley</span></i><i>,</i><i><span>&#160;Brian Roark</span></i><br /><br /><a href="http://aclweb.org/anthology/W/W16/W16-2409.pdf">Pynini: 
A Python library for weighted finite-state grammar compilation</a><br /><i><span>Kyle Gorman</span></i><br /><br /><span><br /><a name="1"><b>* </b></a>Work completed at CMU<a href="http://research.googleblog.com/#top1"><sup>&#8617;</sup></a><br /></span>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Slav Petrov, Research Scientist</span><br /><br />This week, Berlin hosts the <a href="http://acl2016.org/">2016 Annual Meeting of the Association for Computational Linguistics</a> (ACL 2016), the premier conference of the field of computational linguistics, covering a broad spectrum of diverse research areas that are concerned with computational approaches to natural language. As a leader in <a href="https://en.wikipedia.org/wiki/Natural_language_processing">Natural Language Processing</a> (NLP) and a Platinum Sponsor of the conference, Google will be on hand to showcase research interests that include syntax, semantics, discourse, conversation, multilingual modeling, sentiment analysis, question answering, summarization, and generally building better learners using labeled and unlabeled data, state-of-the-art modeling, and learning from indirect supervision. <br /><br />Our systems are used in numerous ways across Google, impacting user experience in search, mobile, apps, ads, translate and more. Our work spans the range of traditional NLP tasks, with general-purpose syntax and semantic algorithms underpinning more specialized systems.<br />Our researchers are experts in natural language processing and machine learning, and combine methodological research with applied science, and our engineers are equally involved in long-term research efforts and driving immediate applications of our technology. <br /><br />If you’re attending ACL 2016, we hope that you’ll stop by the booth to check out some demos,  meet our researchers and discuss projects and opportunities at Google that go into solving interesting problems for billions of people. 
Learn more about Google research being presented at ACL 2016 below (Googlers highlighted in <span style="background-color: white;"><span style="color: #3d85c6;">blue</span></span>), and visit the Natural Language Understanding Team page at <a href="http://g.co/NLUTeam">g.co/NLUTeam</a>.<br /><br /><b><u>Papers</u></b><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1015.pdf">Generalized Transition-based Dependency Parsing via Control Parameters</a><br /><i><span style="color: #3d85c6;">Bernd Bohnet</span>,<span style="color: #3d85c6;">&nbsp;Ryan McDonald</span>,<span style="color: #3d85c6;">&nbsp;Emily Pitler</span>,<span style="color: #3d85c6;">&nbsp;Ji Ma</span></i><br /><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1013.pdf">Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning</a><br /><i>Yulia Tsvetkov, Manaal Faruqui, <span style="color: #3d85c6;">Wang Ling (Google DeepMind)</span>,<span style="color: #3d85c6;">&nbsp;Chris Dyer (Google DeepMind)</span></i><br /><br /><a href="https://transacl.org/ojs/index.php/tacl/article/view/730/166">Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning</a> (<a href="https://www.transacl.org/ojs/index.php/tacl">TACL</a>)<br /><i>Manaal Faruqui, <span style="color: #3d85c6;">Ryan McDonald</span>,<span style="color: #3d85c6;">&nbsp;Radu Soricut</span></i><br /><br /><a href="https://transacl.org/ojs/index.php/tacl/article/view/892/207">Many Languages, One Parser</a> (<a href="https://www.transacl.org/ojs/index.php/tacl">TACL</a>)<br /><i>Waleed Ammar, George Mulcaire, Miguel Ballesteros, <span style="color: #3d85c6;">Chris Dyer (Google DeepMind)</span><a href="http://research.googleblog.com/2016/08/acl-2016-research-at-google.html#1" name="top1"><sup>*</sup></a>, Noah A. 
Smith</i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1057.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1057.pdf">Latent Predictor Networks for Code Generation</a><br /><i><span style="color: #3d85c6;">Wang Ling (Google DeepMind)</span>,<span style="color: #3d85c6;">&nbsp;Phil Blunsom (Google DeepMind)</span>,<span style="color: #3d85c6;">&nbsp;Edward Grefenstette (Google DeepMind)</span>,<span style="color: #3d85c6;">&nbsp;Karl Moritz Hermann (Google DeepMind)</span>,<span style="color: #3d85c6;">&nbsp;Tomáš Kočiský (Google DeepMind)</span>,<span style="color: #3d85c6;">&nbsp;Fumin Wang (Google DeepMind)</span>,<span style="color: #3d85c6;">&nbsp;Andrew Senior (Google DeepMind)</span></i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1059.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1059.pdf">Collective Entity Resolution with Multi-Focal Attention</a><br /><i><span style="color: #3d85c6;">Amir Globerson</span>,<span style="color: #3d85c6;">&nbsp;Nevena Lazic</span>,<span style="color: #3d85c6;">&nbsp;</span>Soumen Chakrabarti, <span style="color: #3d85c6;">Amarnag Subramanya</span>,<span style="color: #3d85c6;">&nbsp;Michael Ringgaard</span>,<span style="color: #3d85c6;">&nbsp;Fernando Pereira</span></i><br /><br /><a href="http://www.aclweb.org/anthology/Q/Q15/Q15-1036.pdf">Plato: A Selective Context Model for Entity Resolution</a> (<a href="https://www.transacl.org/ojs/index.php/tacl">TACL</a>)<br /><i><span style="color: #3d85c6;">Nevena Lazic</span>,<span style="color: #3d85c6;">&nbsp;Amarnag Subramanya</span>,<span style="color: #3d85c6;">&nbsp;Michael Ringgaard</span>,<span style="color: #3d85c6;">&nbsp;Fernando Pereira</span></i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1145.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1145.pdf">WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia</a><br /><i><span style="color: #3d85c6;">Daniel 
Hewlett</span>,<span style="color: #3d85c6;">&nbsp;Alexandre Lacoste</span>,<span style="color: #3d85c6;">&nbsp;Llion Jones</span>,<span style="color: #3d85c6;">&nbsp;Illia Polosukhin</span>,<span style="color: #3d85c6;">&nbsp;Andrew Fandrianto</span>,<span style="color: #3d85c6;">&nbsp;Jay Han</span>,<span style="color: #3d85c6;">&nbsp;Matthew Kelcey</span>,<span style="color: #3d85c6;">&nbsp;David Berthelot</span></i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1147.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1147.pdf">Stack-propagation: Improved Representation Learning for Syntax</a><br /><i>Yuan Zhang, <span style="color: #3d85c6;">David Weiss</span></i><br /><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1157.pdf">Cross-lingual Models of Word Embeddings: An Empirical Comparison</a><br /><i>Shyam Upadhyay, Manaal Faruqui, <span style="color: #3d85c6;">Chris Dyer (Google DeepMind)</span>,&nbsp;<span style="color: #3d85c6;">Dan Roth</span></i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1231.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1231.pdf">Globally Normalized Transition-Based Neural Networks</a><b><i> (Outstanding Papers Session)</i></b><br /><i><span style="color: #3d85c6;">Daniel Andor</span>,<span style="color: #3d85c6;">&nbsp;Chris Alberti</span>,<span style="color: #3d85c6;">&nbsp;David Weiss</span>,<span style="color: #3d85c6;">&nbsp;Aliaksei Severyn</span>,<span style="color: #3d85c6;">&nbsp;Alessandro Presta</span>,<span style="color: #3d85c6;">&nbsp;Kuzman Ganchev</span>,&nbsp;<span style="color: #3d85c6;">Slav Petrov</span>,<span style="color: #3d85c6;">&nbsp;Michael Collins</span></i><br /><br /><b><u>Posters</u></b><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-2014.pdf">Cross-lingual projection for class-based language models</a><br /><i><span style="color: #3d85c6;">Beat Gfeller</span>,<span style="color: #3d85c6;">&nbsp;Vlad Schogol</span>,<span 
style="color: #3d85c6;">&nbsp;Keith Hall</span></i><br /><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1103.pdf">Synthesizing Compound Words for Machine Translation</a><br /><i>Austin Matthews, <span style="color: #3d85c6;">Eva Schlinger</span><a href="http://research.googleblog.com/2016/08/acl-2016-research-at-google.html#1" name="top1"><sup>*</sup></a>, Alon Lavie, <span style="color: #3d85c6;">Chris Dyer (Google DeepMind)</span><a href="http://research.googleblog.com/2016/08/acl-2016-research-at-google.html#1" name="top1"><sup>*</sup></a></i><br /><a href="http://www.aclweb.org/anthology/P/P16/P16-1184.pdf"><br /></a> <a href="http://www.aclweb.org/anthology/P/P16/P16-1184.pdf">Cross-Lingual Morphological Tagging for Low-Resource Languages</a><br /><i>Jan Buys, <span style="color: #3d85c6;">Jan A. Botha</span></i><br /><br /><b><u>Workshops</u></b><br /><a href="https://sites.google.com/site/repl4nlp2016/">1st Workshop on Representation Learning for NLP</a><br />Keynote Speakers include: <i><span style="color: #3d85c6;">Raia Hadsell (Google DeepMind)</span></i><br />Workshop Organizers include: <i><span style="color: #3d85c6;">Edward Grefenstette (Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Phil Blunsom (Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Karl Moritz Hermann (Google DeepMind)</span></i><br />Program Committee members include: <i><span style="color: #3d85c6;">Tomáš Kočiský (Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Wang Ling (Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Ankur Parikh (Google)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;John Platt (Google)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Oriol Vinyals (Google DeepMind)</span></i><br /><br /><a href="https://sites.google.com/site/repevalacl16/">1st Workshop on Evaluating Vector-Space Representations for NLP</a><br />Contributed Papers:<br 
/><a href="http://aclweb.org/anthology/W/W16/W16-2506.pdf">Problems With Evaluation of Word Embeddings Using Word Similarity Tasks</a><br /><i>Manaal Faruqui, Yulia Tsvetkov, Pushpendre Rastogi, <span style="color: #3d85c6;">Chris Dyer (Google DeepMind)</span><a href="http://research.googleblog.com/2016/08/acl-2016-research-at-google.html#1" name="top1"><sup>*</sup></a></i><br /><br /><a href="http://aclweb.org/anthology/W/W16/W16-2520.pdf">Correlation-based Intrinsic Evaluation of Word Vector Representations</a><br /><i>Yulia Tsvetkov, Manaal Faruqui, <span style="color: #3d85c6;">Chris Dyer (Google DeepMind)</span></i><br /><br /><a href="http://zwei.dwds.de/statfsm/">SIGFSM Workshop on Statistical NLP and Weighted Automata</a><br />Contributed Papers:<br /><a href="http://aclweb.org/anthology/W/W16/W16-2404.pdf">Distributed representation and estimation of WFST-based n-gram models</a><br /><i><span style="color: #3d85c6;">Cyril Allauzen</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Michael Riley</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Brian Roark</span></i><br /><br /><a href="http://aclweb.org/anthology/W/W16/W16-2409.pdf">Pynini: A Python library for weighted finite-state grammar compilation</a><br /><i><span style="color: #3d85c6;">Kyle Gorman</span></i><br /><br /><span class="Apple-style-span" style="font-size: small;"><br /><a name="1"><b>* </b></a>Work completed at CMU<a href="http://research.googleblog.com/2016/08/acl-2016-research-at-google.html#top1"><sup>↩</sup></a><br /></span>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/acl-2016-research-at-google/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computational Thinking for All Students</title>
		<link>https://googledata.org/google-research/computational-thinking-for-all-students/</link>
		<comments>https://googledata.org/google-research/computational-thinking-for-all-students/#comments</comments>
		<pubDate>Wed, 03 Aug 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[education]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=ee175c39da2ac4a8a10a01fdf0c7eec1</guid>
		<description><![CDATA[<span>Posted by Maggie Johnson, Director of Education and University Relations, Google</span><br /><br /><i>(Crossposted on the <a href="http://googleforeducation.blogspot.com/2016/08/computational-thinking-for-all-students_3.html">Google for Education Blog</a>, and the the <a href="http://www.huffingtonpost.com/entry/to-all-students-learning-computational-thinking-will_us_57a15f7ee4b004301c522b7c">Huffington Post</a>)</i><br /><br />Last year, I wrote about <a href="https://research.googleblog.com/2015/07/should-my-kid-learn-to-code_14.html" target="_blank">the importance of teaching computational thinking to all K-12 students</a>. Given the growing use of computing, algorithms and data in all fields from the humanities to medicine to business, it&#8217;s becoming increasingly important for students to understand the basics of computer science (CS). One lesson we have learned through Google&#8217;s CS education outreach efforts is that these skills can be accessible to all students, if we introduce them early in K-5. These are truly 21st century skills which can, over time, produce a workforce ready for a technology-enabled and driven economy. <br /><br />How can teachers start introducing computational thinking in early school curriculum? It is already present in many topic areas - algorithms for solving math problems, for example. However, what is often missing in current examples of computational thinking is the explicit connection between what students are learning and its application in computing. For example, once a student has mastered adding multi-digit numbers, the following algorithm could be presented:<br /><ol><li>Add together the digits in the ones place. If the result is &#60; 10, it becomes the ones digit of the answer. 
If it's &#62;= 10, the ones digit of the result becomes the ones digit of the answer, and you add 1 to the next column.</li><li>Add together the digits in the tens place, plus the 1 carried over from the ones place, if necessary. If the answer is &#60; 10, it becomes the tens digit of the answer; if it's &#62;= 10, the ones digit becomes the tens digit of the answer and 1 is added to the next column.</li><li>Repeat this process for any additional columns until they are all added.</li></ol><div><a href="https://4.bp.blogspot.com/-W_QBteVg0nY/V6DKkWyQ95I/AAAAAAAABI0/MXZ1MIfpBDoeCRPNK4zXjAah1aFIQv_IACLcB/s1600/image00.png"><img border="0" height="320" src="https://4.bp.blogspot.com/-W_QBteVg0nY/V6DKkWyQ95I/AAAAAAAABI0/MXZ1MIfpBDoeCRPNK4zXjAah1aFIQv_IACLcB/s640/image00.png" width="640"></a></div>This allows a teacher to present the concept of an algorithm and its use in computing, as well as the most important elements of any computer program: conditional branching (&#8220;if the result is less than 10&#8230;&#8221;) and iteration (&#8220;repeat this process&#8230;&#8221;). Going a step further, a teacher translating the algorithm into a running program can have a compelling effect. When something that students have used to solve an instance of a problem can automatically solve all instances of that problem, it&#8217;s quite a powerful moment for them even if they don&#8217;t do the coding themselves. <br /><br />Google has created <a href="https://computationalthinkingcourse.withgoogle.com/course?use_last_location=true" target="_blank">an online course for K-12 teachers to learn about computational thinking</a> and how to make these explicit connections for their students. We also have a <a href="https://www.google.com/edu/resources/programs/exploring-computational-thinking/index.html#!ct-materials" target="_blank">large repository of lessons</a>, explorations and programs to support teachers and students. 
Our videos illustrate real-world examples of the application of computational thinking in Google&#8217;s products and services, and we have compiled a set of <a href="https://www.google.com/edu/resources/programs/exploring-computational-thinking/index.html#!resources" target="_blank">great resources showing how to integrate computational thinking into existing curriculum</a>. We also recently announced <a href="https://projectbloks.withgoogle.com/" target="_blank">Project Bloks</a> to engage younger children in computational thinking. Finally, <a href="http://code.org/">code.org</a>, for whom Google is a primary sponsor, has <a href="https://code.org/educate/curriculum/elementary-school" target="_blank">curriculum and materials</a> for K-5 teachers and students. <br /><br />We feel that computational thinking is a core skill for all students. If we can make these explicit connections for students, they will see how the devices and apps that they use every day are powered by algorithms and programs. They will learn the importance of data in making decisions. They will learn skills that will prepare them for a workforce that will be doing vastly different tasks than the workforce of today. We owe it to all students to give them every possible opportunity to be productive and successful members of society.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Maggie Johnson, Director of Education and University Relations, Google</span><br /><br /><i>(Crossposted on the <a href="http://googleforeducation.blogspot.com/2016/08/computational-thinking-for-all-students_3.html">Google for Education Blog</a>, and the <a href="http://www.huffingtonpost.com/entry/to-all-students-learning-computational-thinking-will_us_57a15f7ee4b004301c522b7c">Huffington Post</a>)</i><br /><br />Last year, I wrote about <a href="https://research.googleblog.com/2015/07/should-my-kid-learn-to-code_14.html" >the importance of teaching computational thinking to all K-12 students</a>. Given the growing use of computing, algorithms and data in all fields from the humanities to medicine to business, it’s becoming increasingly important for students to understand the basics of computer science (CS). One lesson we have learned through Google’s CS education outreach efforts is that these skills can be accessible to all students, if we introduce them early in K-5. These are truly 21st century skills which can, over time, produce a workforce ready for a technology-enabled and driven economy. <br /><br />How can teachers start introducing computational thinking in early school curriculum? It is already present in many topic areas - algorithms for solving math problems, for example. However, what is often missing in current examples of computational thinking is the explicit connection between what students are learning and its application in computing. For example, once a student has mastered adding multi-digit numbers, the following algorithm could be presented:<br /><ol><li>Add together the digits in the ones place. If the result is &lt; 10, it becomes the ones digit of the answer. 
If it's &gt;= 10, the ones digit of the result becomes the ones digit of the answer, and you add 1 to the next column.</li><li>Add together the digits in the tens place, plus the 1 carried over from the ones place, if necessary. If the answer is &lt; 10, it becomes the tens digit of the answer; if it's &gt;= 10, the ones digit becomes the tens digit of the answer and 1 is added to the next column.</li><li>Repeat this process for any additional columns until they are all added.</li></ol><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-W_QBteVg0nY/V6DKkWyQ95I/AAAAAAAABI0/MXZ1MIfpBDoeCRPNK4zXjAah1aFIQv_IACLcB/s1600/image00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://4.bp.blogspot.com/-W_QBteVg0nY/V6DKkWyQ95I/AAAAAAAABI0/MXZ1MIfpBDoeCRPNK4zXjAah1aFIQv_IACLcB/s640/image00.png" width="640" /></a></div>This allows a teacher to present the concept of an algorithm and its use in computing, as well as the most important elements of any computer program: conditional branching (“if the result is less than 10…”) and iteration (“repeat this process…”). Going a step further, a teacher translating the algorithm into a running program can have a compelling effect. When something that students have used to solve an instance of a problem can automatically solve all instances of that problem, it’s quite a powerful moment for them even if they don’t do the coding themselves. <br /><br />Google has created <a href="https://computationalthinkingcourse.withgoogle.com/course?use_last_location=true" >an online course for K-12 teachers to learn about computational thinking</a> and how to make these explicit connections for their students. We also have a <a href="https://www.google.com/edu/resources/programs/exploring-computational-thinking/index.html#!ct-materials" >large repository of lessons</a>, explorations and programs to support teachers and students. 
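The multi-digit addition algorithm described above can be sketched as a short program (a hypothetical classroom example, not taken from Google's course materials), with the conditional branching and iteration called out in comments:

```python
# A runnable sketch of the column-by-column addition algorithm above,
# illustrating iteration ("repeat for each column") and conditional
# branching ("if the column sum is >= 10, carry 1").

def add_multi_digit(a, b):
    digits_a = [int(d) for d in str(a)][::-1]  # ones place first
    digits_b = [int(d) for d in str(b)][::-1]
    result, carry = [], 0
    for i in range(max(len(digits_a), len(digits_b))):  # iteration over columns
        column = carry
        column += digits_a[i] if i < len(digits_a) else 0
        column += digits_b[i] if i < len(digits_b) else 0
        if column >= 10:                                # conditional branching
            result.append(column - 10)                  # ones digit of the sum
            carry = 1                                   # carry to next column
        else:
            result.append(column)
            carry = 0
    if carry:
        result.append(carry)                            # final leftover carry
    return int("".join(str(d) for d in reversed(result)))

print(add_multi_digit(478, 365))  # 843
```

Running such a sketch on inputs the students choose themselves is one way to show that the procedure they learned by hand solves every instance of the problem.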
Our videos illustrate real-world examples of the application of computational thinking in Google’s products and services, and we have compiled a set of <a href="https://www.google.com/edu/resources/programs/exploring-computational-thinking/index.html#!resources" >great resources showing how to integrate computational thinking into existing curriculum</a>. We also recently announced <a href="https://projectbloks.withgoogle.com/" >Project Bloks</a> to engage younger children in computational thinking. Finally, <a href="http://code.org/">code.org</a>, of which Google is a primary sponsor, has <a href="https://code.org/educate/curriculum/elementary-school" >curriculum and materials</a> for K-5 teachers and students. <br /><br />We feel that computational thinking is a core skill for all students. If we can make these explicit connections for students, they will see how the devices and apps that they use every day are powered by algorithms and programs. They will learn the importance of data in making decisions. They will learn skills that will prepare them for a workforce that will be doing vastly different tasks than the workforce of today. We owe it to all students to give them every possible opportunity to be productive and successful members of society.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/computational-thinking-for-all-students/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Announcing an Open Source ADC board for BeagleBone</title>
		<link>https://googledata.org/google-research/announcing-an-open-source-adc-board-for-beaglebone-2/</link>
		<comments>https://googledata.org/google-research/announcing-an-open-source-adc-board-for-beaglebone-2/#comments</comments>
		<pubDate>Wed, 20 Jul 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=ad7d6463323ea6dca10c91ad01d6a1d9</guid>
		<description><![CDATA[<span>Posted by Jason Holt, Software Engineer</span><br /><br /><i>(Cross-posted on the <a href="http://google-opensource.blogspot.com/2016/07/announcing-open-source-adc-board-for.html">Google Open Source Blog</a>)</i><br /><br />Working with electronics, we often find ourselves soldering up a half baked electronic circuit to detect some sort of signal. For example, last year we wanted to measure the strength of a <a href="https://en.wikipedia.org/wiki/Carrier_wave">carrier</a>. We started with traditional analog circuits &#8212; <a href="https://en.wikipedia.org/wiki/Operational_amplifier">amplifier</a>, <a href="https://en.wikipedia.org/wiki/LC_circuit">filter</a>, <a href="https://en.wikipedia.org/wiki/Envelope_detector">envelope detector</a>, <a href="https://en.wikipedia.org/wiki/Comparator">threshold</a>. You can see some of our prototypes in the image below; they get pretty messy.<br /><div><a href="https://4.bp.blogspot.com/-pWzu2OS_v0o/V46NilT-u3I/AAAAAAAABIU/oB_yBhuJVZYT0o5Ag8zQVjkyl_10SxD_ACEw/s1600/image01.png"><img border="0" height="584" src="https://4.bp.blogspot.com/-pWzu2OS_v0o/V46NilT-u3I/AAAAAAAABIU/oB_yBhuJVZYT0o5Ag8zQVjkyl_10SxD_ACEw/s640/image01.png" width="640"></a></div>While there's a certain satisfaction in taming a signal using the physical properties of capacitors, coils of wire and transistors, it's usually easier to digitize the signal with an <a href="https://en.wikipedia.org/wiki/Analog-to-digital_converter">Analog to Digital Converter</a> (ADC) and manage it with <a href="https://en.wikipedia.org/wiki/Digital_signal_processing">Digital Signal Processing</a> (DSP) instead of electronic parts. 
Tweaking software doesn't require a soldering iron, and lets us modify signals in ways that would be impractical or impossible with analog circuits.<br /><br />There are several standard solutions for digitizing a signal: connect a laptop to an oscilloscope or <a href="https://en.wikipedia.org/wiki/Data_acquisition">Data Acquisition System</a> (DAQ) via USB or Ethernet, or use the onboard ADCs of a maker board like an <a href="https://www.arduino.cc/">Arduino</a>. The former are sensitive and accurate, but also big and power hungry. The latter are cheap and tiny, but slower and have enough RAM for only milliseconds' worth of high speed sample data.  <br /><br />That led us to investigate single board computers like the <a href="http://beagleboard.org/bone">BeagleBone</a> and <a href="https://www.raspberrypi.org/">Raspberry Pi</a>, which are small and cheap like an Arduino, but have specs like a smartphone.  And crucially, the BeagleBone's <a href="https://en.wikipedia.org/wiki/System_on_a_chip">system-on-a-chip</a> (SoC) combines a beefy ARMv7 CPU with two smaller Programmable Realtime Units (PRUs) that have access to all 512MB of system RAM.  This lets us dedicate the PRUs to the time-sensitive and repetitive task of reading each sample out of an external ADC, while the main CPU lets us use the data with the GNU/Linux tools we're used to.<br /><br />The result is an open source <a href="http://elinux.org/Beagleboard:BeagleBone_Capes">BeagleBone cape</a> we've named <a href="https://github.com/google/prudaq/wiki">PRUDAQ</a>.  It's built around the Analog Devices AD9201 ADC, which samples two inputs simultaneously at up to 20 megasamples per second, per channel.  Simultaneous sampling and high sample rates make it useful for <a href="https://en.wikipedia.org/wiki/Software-defined_radio">software-defined radio</a> (SDR) and scientific applications where a built-in ADC isn't quite up to the task.  
<br /><div><a href="https://2.bp.blogspot.com/-Y_H0QecZ6kg/V46OZ2vEiGI/AAAAAAAABIc/i4wW5aW5v84x63t9OM4ormEA6rgY6UPigCLcB/s1600/image00.png"><img border="0" height="530" src="https://2.bp.blogspot.com/-Y_H0QecZ6kg/V46OZ2vEiGI/AAAAAAAABIc/i4wW5aW5v84x63t9OM4ormEA6rgY6UPigCLcB/s640/image00.png" width="640"></a></div><br />Our open source electrical design and sample code are available on <a href="https://github.com/google/prudaq/wiki">GitHub</a>, and <a href="https://groupgets.com/manufacturers/getlab/products/prudaq">GroupGets</a> has boards ready to ship for $79.  We were also fortunate to have help from Google intern Kumar Abhishek.  He added support for PRUDAQ to his <a href="https://summerofcode.withgoogle.com/">Google Summer of Code</a> project <a href="https://github.com/abhishek-kakkar/BeagleLogic">BeagleLogic</a>, which performs much better than our sample code.<br /><br />We started <a href="https://github.com/google/prudaq/wiki">PRUDAQ</a> for our own needs, but quickly realized that others might also find it useful.  We're excited to get your feedback through the <a href="https://groups.google.com/d/forum/prudaq-users">email list</a>.  Tell us what can be done with inexpensive fast ADCs paired with inexpensive fast CPUs!]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Jason Holt, Software Engineer</span><br /><br /><i>(Cross-posted on the <a href="http://google-opensource.blogspot.com/2016/07/announcing-open-source-adc-board-for.html">Google Open Source Blog</a>)</i><br /><br />Working with electronics, we often find ourselves soldering up a half baked electronic circuit to detect some sort of signal. For example, last year we wanted to measure the strength of a <a href="https://en.wikipedia.org/wiki/Carrier_wave">carrier</a>. We started with traditional analog circuits — <a href="https://en.wikipedia.org/wiki/Operational_amplifier">amplifier</a>, <a href="https://en.wikipedia.org/wiki/LC_circuit">filter</a>, <a href="https://en.wikipedia.org/wiki/Envelope_detector">envelope detector</a>, <a href="https://en.wikipedia.org/wiki/Comparator">threshold</a>. You can see some of our prototypes in the image below; they get pretty messy.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-pWzu2OS_v0o/V46NilT-u3I/AAAAAAAABIU/oB_yBhuJVZYT0o5Ag8zQVjkyl_10SxD_ACEw/s1600/image01.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="584" src="https://4.bp.blogspot.com/-pWzu2OS_v0o/V46NilT-u3I/AAAAAAAABIU/oB_yBhuJVZYT0o5Ag8zQVjkyl_10SxD_ACEw/s640/image01.png" width="640" /></a></div>While there's a certain satisfaction in taming a signal using the physical properties of capacitors, coils of wire and transistors, it's usually easier to digitize the signal with an <a href="https://en.wikipedia.org/wiki/Analog-to-digital_converter">Analog to Digital Converter</a> (ADC) and manage it with <a href="https://en.wikipedia.org/wiki/Digital_signal_processing">Digital Signal Processing</a> (DSP) instead of electronic parts. 
Tweaking software doesn't require a soldering iron, and lets us modify signals in ways that would be impractical or impossible with analog circuits.<br /><br />There are several standard solutions for digitizing a signal: connect a laptop to an oscilloscope or <a href="https://en.wikipedia.org/wiki/Data_acquisition">Data Acquisition System</a> (DAQ) via USB or Ethernet, or use the onboard ADCs of a maker board like an <a href="https://www.arduino.cc/">Arduino</a>. The former are sensitive and accurate, but also big and power hungry. The latter are cheap and tiny, but slower and have enough RAM for only milliseconds' worth of high speed sample data.  <br /><br />That led us to investigate single board computers like the <a href="http://beagleboard.org/bone">BeagleBone</a> and <a href="https://www.raspberrypi.org/">Raspberry Pi</a>, which are small and cheap like an Arduino, but have specs like a smartphone.  And crucially, the BeagleBone's <a href="https://en.wikipedia.org/wiki/System_on_a_chip">system-on-a-chip</a> (SoC) combines a beefy ARMv7 CPU with two smaller Programmable Realtime Units (PRUs) that have access to all 512MB of system RAM.  This lets us dedicate the PRUs to the time-sensitive and repetitive task of reading each sample out of an external ADC, while the main CPU lets us use the data with the GNU/Linux tools we're used to.<br /><br />The result is an open source <a href="http://elinux.org/Beagleboard:BeagleBone_Capes">BeagleBone cape</a> we've named <a href="https://github.com/google/prudaq/wiki">PRUDAQ</a>.  It's built around the Analog Devices AD9201 ADC, which samples two inputs simultaneously at up to 20 megasamples per second, per channel.  Simultaneous sampling and high sample rates make it useful for <a href="https://en.wikipedia.org/wiki/Software-defined_radio">software-defined radio</a> (SDR) and scientific applications where a built-in ADC isn't quite up to the task.  
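To make the "DSP instead of electronic parts" point concrete, here is a small, hedged numpy sketch (an illustration only, not PRUDAQ's actual sample code): a software envelope detector that rectifies digitized samples and low-pass filters them with a moving average, the digital counterpart of the analog diode-and-RC chain described above.

```python
import numpy as np

# Simulated ADC capture: a 100 kHz carrier, amplitude-modulated at 1 kHz,
# sampled at 1 MS/s (well under PRUDAQ's 20 MS/s per channel).
fs = 1_000_000
t = np.arange(0, 0.01, 1 / fs)
modulation = 1.0 + 0.5 * np.sin(2 * np.pi * 1_000 * t)
samples = modulation * np.sin(2 * np.pi * 100_000 * t)

# Digital envelope detector: full-wave rectify, then moving-average
# low-pass over two carrier periods. The pi/2 factor rescales the mean
# of |sin| back to the peak amplitude.
window = 20  # samples: two periods of the 100 kHz carrier at 1 MS/s
kernel = np.ones(window) / window
envelope = (np.pi / 2) * np.convolve(np.abs(samples), kernel, mode="same")
```

Away from the edge-effect region at either end, `envelope` tracks the 1 kHz modulation; changing the filter or threshold is a one-line edit rather than a soldering job.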
<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-Y_H0QecZ6kg/V46OZ2vEiGI/AAAAAAAABIc/i4wW5aW5v84x63t9OM4ormEA6rgY6UPigCLcB/s1600/image00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="530" src="https://2.bp.blogspot.com/-Y_H0QecZ6kg/V46OZ2vEiGI/AAAAAAAABIc/i4wW5aW5v84x63t9OM4ormEA6rgY6UPigCLcB/s640/image00.png" width="640" /></a></div><br />Our open source electrical design and sample code are available on <a href="https://github.com/google/prudaq/wiki">GitHub</a>, and <a href="https://groupgets.com/manufacturers/getlab/products/prudaq">GroupGets</a> has boards ready to ship for $79.  We were also fortunate to have help from Google intern Kumar Abhishek.  He added support for PRUDAQ to his <a href="https://summerofcode.withgoogle.com/">Google Summer of Code</a> project <a href="https://github.com/abhishek-kakkar/BeagleLogic">BeagleLogic</a>, which performs much better than our sample code.<br /><br />We started <a href="https://github.com/google/prudaq/wiki">PRUDAQ</a> for our own needs, but quickly realized that others might also find it useful.  We're excited to get your feedback through the <a href="https://groups.google.com/d/forum/prudaq-users">email list</a>.  Tell us what can be done with inexpensive fast ADCs paired with inexpensive fast CPUs!]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/announcing-an-open-source-adc-board-for-beaglebone-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Towards an exact (quantum) description of chemistry</title>
		<link>https://googledata.org/google-research/towards-an-exact-quantum-description-of-chemistry/</link>
		<comments>https://googledata.org/google-research/towards-an-exact-quantum-description-of-chemistry/#comments</comments>
		<pubDate>Mon, 18 Jul 2016 20:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=5c7d56e7d3a504ef56959ce73db4864b</guid>
		<description><![CDATA[<span>Posted by Ryan Babbush, Quantum Software Engineer</span><br /><br />&#8220;<i>...nature isn't classical, dammit, and if you want to make a simulation of nature, you'd better make it quantum mechanical...</i>&#8221; - <a href="https://en.wikipedia.org/wiki/Richard_Feynman">Richard Feynman</a>, <a href="http://people.eecs.berkeley.edu/~christos/classics/Feynman.pdf">Simulating Physics with Computers</a><br /><br />One of the most promising applications of quantum computing is the ability to efficiently model quantum systems in nature that are considered intractable for classical computers. Now, in collaboration with the <a href="http://aspuru.chem.harvard.edu/">Aspuru-Guzik group at Harvard</a> and researchers from Lawrence Berkeley National Labs, UC Santa Barbara, Tufts University and University College London, we have performed the first completely scalable quantum simulation of a molecule. Our experimental results are detailed in the paper <a href="https://journals.aps.org/prx/pdf/10.1103/PhysRevX.6.031007"><i>Scalable Quantum Simulation of Molecular Energies</i></a>, which recently appeared in <a href="https://journals.aps.org/prx/"><i>Physical Review X</i></a>.<br /><br />The goal of our experiment was to use quantum hardware to efficiently solve the <a href="https://en.wikipedia.org/wiki/Computational_chemistry#Electronic_structure">molecular electronic structure problem</a>, which seeks the solution for the lowest energy configuration of electrons in the presence of a given nuclear configuration. In order to predict chemical reaction rates (which govern the mechanism of chemical reactions), one must make these calculations to extremely high precision. The ability to predict such rates could revolutionize the design of solar cells, industrial catalysts, batteries, flexible electronics, medicines, materials and more. 
The primary difficulty is that molecular systems form <a href="https://en.wikipedia.org/wiki/Quantum_entanglement">highly entangled quantum superposition states</a> which require exponentially many classical computing resources in order to represent to sufficiently high precision. For example, exactly computing the energies of methane (CH<sub>4</sub>) takes about one second, but the same calculation takes about ten minutes for ethane (C<sub>2</sub>H<sub>6</sub>) and about ten days for propane (C<sub>3</sub>H<sub>8</sub>).<br /><br />In our experiment, we focus on an approach known as the <a href="http://iopscience.iop.org/article/10.1088/1367-2630/18/2/023023/meta">variational quantum eigensolver</a> (VQE), which can be understood as a quantum analog of a neural network. Whereas a classical neural network is a parameterized mapping that one trains in order to model classical data, VQE is a parameterized mapping (e.g. a quantum circuit) that one trains in order to model quantum data (e.g. a molecular wavefunction). The training objective for VQE is the molecular energy function, which is always minimized by the true ground state. The quantum advantage of VQE is that quantum bits can efficiently represent the molecular wavefunction whereas exponentially many classical bits would be required.<br /><br />Using VQE, we quantum computed the energy landscape of molecular hydrogen, H<sub>2</sub>. We compared the performance of VQE to another quantum algorithm for chemistry, the <a href="http://science.sciencemag.org/content/309/5741/1704">phase estimation algorithm</a> (PEA). Experimentally computed energies, as a function of the H - H bond length, are shown below alongside the exact curve. We were able to obtain such high performance with VQE because the neural-network-like training loop helped to establish experimentally optimal circuit parameters for representing the wavefunction in the presence of systematic control errors. 
One can understand this by considering a hardware implementation of a neural network with a faulty weight, e.g. the weight is only represented half as strong as it should be. Because the weights of the neural network are established via a closed-loop training procedure which can compensate for such systematic errors, the hardware neural network is robust against such imperfections. Likewise, despite systematic errors in our implementation of the VQE circuit, we are still able to learn an accurate model for the wavefunction. This robustness inspires hope that VQE may be able to solve classically intractable problems without <a href="https://research.googleblog.com/2015/03/a-step-closer-to-quantum-computation.html">quantum error correction</a>.<br /><div><a href="https://2.bp.blogspot.com/-nDPJWmMCQ5Q/V405VGumkNI/AAAAAAAABH8/vvWfnVAHes89GRdaONr2Y0-DeY-fmoSNACLcB/s1600/image00.png"><img border="0" height="480" src="https://2.bp.blogspot.com/-nDPJWmMCQ5Q/V405VGumkNI/AAAAAAAABH8/vvWfnVAHes89GRdaONr2Y0-DeY-fmoSNACLcB/s640/image00.png" width="640"></a></div>While the energies of molecular hydrogen can be computed classically (albeit inefficiently), as one scales up quantum hardware it becomes possible to simulate even larger chemical systems, including classically intractable ones. For instance, with only about a hundred reliable quantum bits one could model the process by which <a href="https://en.wikipedia.org/wiki/Nitrogen_fixation#Biological_nitrogen_fixation">bacteria produce fertilizer</a> at room temperature. Elucidating this mechanism is a famous open problem in chemistry because the way <a href="https://en.wikipedia.org/wiki/Haber_process">humans produce fertilizer</a> is extremely inefficient and consumes 1-2% of the world's energy annually. 
Such calculations could also assist with breakthroughs in fundamental science, for instance, in the understanding of <a href="https://en.wikipedia.org/wiki/High-temperature_superconductivity">high temperature superconductivity</a>.<br /><br />Though many theoretical and experimental challenges lie ahead, a quantum-enabled paradigm shift from qualitative / descriptive chemistry simulations to quantitative / predictive chemistry simulations could modernize the field so dramatically that the examples imaginable today are just the tip of the iceberg.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Ryan Babbush, Quantum Software Engineer</span><br /><br />“<i>...nature isn't classical, dammit, and if you want to make a simulation of nature, you'd better make it quantum mechanical...</i>” - <a href="https://en.wikipedia.org/wiki/Richard_Feynman">Richard Feynman</a>, <a href="http://people.eecs.berkeley.edu/~christos/classics/Feynman.pdf">Simulating Physics with Computers</a><br /><br />One of the most promising applications of quantum computing is the ability to efficiently model quantum systems in nature that are considered intractable for classical computers. Now, in collaboration with the <a href="http://aspuru.chem.harvard.edu/">Aspuru-Guzik group at Harvard</a> and researchers from Lawrence Berkeley National Labs, UC Santa Barbara, Tufts University and University College London, we have performed the first completely scalable quantum simulation of a molecule. Our experimental results are detailed in the paper <a href="https://journals.aps.org/prx/pdf/10.1103/PhysRevX.6.031007"><i>Scalable Quantum Simulation of Molecular Energies</i></a>, which recently appeared in <a href="https://journals.aps.org/prx/"><i>Physical Review X</i></a>.<br /><br />The goal of our experiment was to use quantum hardware to efficiently solve the <a href="https://en.wikipedia.org/wiki/Computational_chemistry#Electronic_structure">molecular electronic structure problem</a>, which seeks the solution for the lowest energy configuration of electrons in the presence of a given nuclear configuration. In order to predict chemical reaction rates (which govern the mechanism of chemical reactions), one must make these calculations to extremely high precision. The ability to predict such rates could revolutionize the design of solar cells, industrial catalysts, batteries, flexible electronics, medicines, materials and more. 
The primary difficulty is that molecular systems form <a href="https://en.wikipedia.org/wiki/Quantum_entanglement">highly entangled quantum superposition states</a> which require exponentially many classical computing resources in order to represent to sufficiently high precision. For example, exactly computing the energies of methane (CH<sub>4</sub>) takes about one second, but the same calculation takes about ten minutes for ethane (C<sub>2</sub>H<sub>6</sub>) and about ten days for propane (C<sub>3</sub>H<sub>8</sub>).<br /><br />In our experiment, we focus on an approach known as the <a href="http://iopscience.iop.org/article/10.1088/1367-2630/18/2/023023/meta">variational quantum eigensolver</a> (VQE), which can be understood as a quantum analog of a neural network. Whereas a classical neural network is a parameterized mapping that one trains in order to model classical data, VQE is a parameterized mapping (e.g. a quantum circuit) that one trains in order to model quantum data (e.g. a molecular wavefunction). The training objective for VQE is the molecular energy function, which is always minimized by the true ground state. The quantum advantage of VQE is that quantum bits can efficiently represent the molecular wavefunction whereas exponentially many classical bits would be required.<br /><br />Using VQE, we quantum computed the energy landscape of molecular hydrogen, H<sub>2</sub>. We compared the performance of VQE to another quantum algorithm for chemistry, the <a href="http://science.sciencemag.org/content/309/5741/1704">phase estimation algorithm</a> (PEA). Experimentally computed energies, as a function of the H - H bond length, are shown below alongside the exact curve. We were able to obtain such high performance with VQE because the neural-network-like training loop helped to establish experimentally optimal circuit parameters for representing the wavefunction in the presence of systematic control errors. 
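To make the neural-network analogy concrete, here is a hedged classical toy (not the paper's actual experiment): a one-parameter "ansatz" state ψ(θ) = (cos θ, sin θ) is trained to minimize the energy ⟨ψ(θ)|H|ψ(θ)⟩ of a small stand-in Hamiltonian, and the training loop converges to the true ground-state energy, just as VQE's outer loop does for a molecular wavefunction.

```python
import numpy as np

# A fixed 2x2 symmetric matrix standing in for the molecular Hamiltonian.
H = np.array([[1.0, 0.5],
              [0.5, -1.0]])

def energy(theta):
    """Training objective: the expectation value <psi(theta)|H|psi(theta)>."""
    psi = np.array([np.cos(theta), np.sin(theta)])  # one-parameter ansatz state
    return psi @ H @ psi

# Closed-loop "training": scan the single variational parameter and keep the
# minimum, mimicking VQE's outer optimization over circuit parameters.
thetas = np.linspace(0.0, np.pi, 2001)
vqe_energy = min(energy(th) for th in thetas)

# Exact ground-state energy for comparison (smallest eigenvalue of H).
exact_ground = np.linalg.eigvalsh(H).min()
```

Because the objective is minimized by the true ground state, even a crude optimizer lands on `exact_ground`; in the real experiment the same closed loop also absorbs systematic control errors in the hardware.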
One can understand this by considering a hardware implementation of a neural network with a faulty weight, e.g. the weight is only represented half as strong as it should be. Because the weights of the neural network are established via a closed-loop training procedure which can compensate for such systematic errors, the hardware neural network is robust against such imperfections. Likewise, despite systematic errors in our implementation of the VQE circuit, we are still able to learn an accurate model for the wavefunction. This robustness inspires hope that VQE may be able to solve classically intractable problems without <a href="https://research.googleblog.com/2015/03/a-step-closer-to-quantum-computation.html">quantum error correction</a>.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-nDPJWmMCQ5Q/V405VGumkNI/AAAAAAAABH8/vvWfnVAHes89GRdaONr2Y0-DeY-fmoSNACLcB/s1600/image00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://2.bp.blogspot.com/-nDPJWmMCQ5Q/V405VGumkNI/AAAAAAAABH8/vvWfnVAHes89GRdaONr2Y0-DeY-fmoSNACLcB/s640/image00.png" width="640" /></a></div>While the energies of molecular hydrogen can be computed classically (albeit inefficiently), as one scales up quantum hardware it becomes possible to simulate even larger chemical systems, including classically intractable ones. For instance, with only about a hundred reliable quantum bits one could model the process by which <a href="https://en.wikipedia.org/wiki/Nitrogen_fixation#Biological_nitrogen_fixation">bacteria produce fertilizer</a> at room temperature. Elucidating this mechanism is a famous open problem in chemistry because the way <a href="https://en.wikipedia.org/wiki/Haber_process">humans produce fertilizer</a> is extremely inefficient and consumes 1-2% of the world's energy annually. 
Such calculations could also assist with breakthroughs in fundamental science, for instance, in the understanding of <a href="https://en.wikipedia.org/wiki/High-temperature_superconductivity">high temperature superconductivity</a>.<br /><br />Though many theoretical and experimental challenges lie ahead, a quantum-enabled paradigm shift from qualitative / descriptive chemistry simulations to quantitative / predictive chemistry simulations could modernize the field so dramatically that the examples imaginable today are just the tip of the iceberg.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/towards-an-exact-quantum-description-of-chemistry/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Wide &amp; Deep Learning: Better Together with TensorFlow</title>
		<link>https://googledata.org/google-research/wide-deep-learning-better-together-with-tensorflow/</link>
		<comments>https://googledata.org/google-research/wide-deep-learning-better-together-with-tensorflow/#comments</comments>
		<pubDate>Wed, 29 Jun 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=5ab3d63636ea67de6a655787d3f794d9</guid>
		<description><![CDATA[<span>Posted by Heng-Tze Cheng, Senior Software Engineer, Google Research</span><br /><br />The human brain is a sophisticated learning machine, forming rules by memorizing everyday events (&#8220;sparrows can fly&#8221; and &#8220;pigeons can fly&#8221;) and generalizing those learnings to apply to things we haven't seen before (&#8220;animals with wings can fly&#8221;). Perhaps more powerfully, memorization also allows us to further refine our generalized rules with exceptions (&#8220;penguins can't fly&#8221;). As we were exploring how to advance machine intelligence, we asked ourselves the question&#8212;can we teach computers to learn like humans do, by combining the power of memorization and generalization?<br /><br />It's not an easy question to answer, but by jointly training a wide linear model (for memorization) alongside a deep neural network (for generalization), one can combine the strengths of both to bring us one step closer. At Google, we call it <b>Wide &#38; Deep Learning</b>. It's useful for generic large-scale regression and classification problems with sparse inputs (<a href="https://en.wikipedia.org/wiki/Categorical_variable">categorical features</a> with a large number of possible feature values), such as recommender systems, search, and ranking problems.<br /><div><a href="https://1.bp.blogspot.com/-Dw1mB9am1l8/V3MgtOzp3uI/AAAAAAAABGs/mP-3nZQCjWwdk6qCa5WraSpK8A7rSPj3ACLcB/s1600/image04.png"><img border="0" height="142" src="https://1.bp.blogspot.com/-Dw1mB9am1l8/V3MgtOzp3uI/AAAAAAAABGs/mP-3nZQCjWwdk6qCa5WraSpK8A7rSPj3ACLcB/s640/image04.png" width="640"></a></div>Today we&#8217;re open-sourcing our implementation of Wide &#38; Deep Learning as part of the <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/learn/python/learn">TF.Learn API</a> so that you can easily train a model yourself. 
Please check out the TensorFlow tutorials on <a href="https://www.tensorflow.org/tutorials/wide/">Linear Models</a> and <a href="https://www.tensorflow.org/tutorials/wide_and_deep/">Wide &#38; Deep Learning</a>, as well as our <a href="http://arxiv.org/abs/1606.07792">research paper</a> to learn more.<br /><div></div><br /><b>How Wide &#38; Deep Learning works.</b><br />Let's say one day you wake up with an idea for a new app called <i>FoodIO</i><a href="http://research.googleblog.com/#1" name="top1"><sup>*</sup></a>. A user of the app just needs to say out loud what kind of food he/she is craving (the <i>query</i>). The app magically predicts the dish that the user will like best, and the dish gets delivered to the user's front door (the <i>item</i>). Your key metric is consumption rate&#8212;if a dish was eaten by the user, the score is 1; otherwise it's 0 (the <i>label</i>).<br /><br />You come up with some simple rules to start, like returning the items that match the most characters in the query, and you release the first version of FoodIO. Unfortunately, you find that the consumption rate is pretty low because the matches are too crude to be really useful (people shouting &#8220;fried chicken&#8221; end up getting &#8220;chicken fried rice&#8221;), so you decide to add machine learning to learn from the data.<br /><br /><div></div><b>The Wide model.</b><br />In the 2nd version, you want to memorize what items work the best for each query. So, you train a linear model in TensorFlow with a <b>wide</b> set of cross-product feature transformations to capture how the co-occurrence of a query-item feature pair correlates with the target label (whether or not an item is consumed). The model predicts the probability of consumption P(consumption &#124; query, item) for each item, and FoodIO delivers the top item with the highest predicted consumption rate. 
For example, the model learns that feature <span>AND(query="fried chicken", item="chicken and waffles")</span> is a huge win, while <span>AND(query="fried chicken", item="chicken fried rice")</span> doesn't get as much love even though the character match is higher. In other words, FoodIO 2.0 does a pretty good job <b>memorizing</b> what users like, and it starts to get more traction.<br /><div><a href="https://2.bp.blogspot.com/-I_YshHCoxNs/V3Mg5QG4s-I/AAAAAAAABG8/6hHCKiUhcF03kJrLTVJd6Al-MX4sR_bUACKgB/s1600/image02.png"><img border="0" height="178" src="https://2.bp.blogspot.com/-I_YshHCoxNs/V3Mg5QG4s-I/AAAAAAAABG8/6hHCKiUhcF03kJrLTVJd6Al-MX4sR_bUACKgB/s640/image02.png" width="640"></a></div><b>The Deep model.</b><br />Later on you discover that many users are saying that they're tired of the recommendations. They're eager to discover similar but different cuisines with a &#8220;surprise me&#8221; state of mind. So you brush up on your TensorFlow toolkit again and train a <b>deep</b> feed-forward neural network for FoodIO 3.0. With your deep model, you're learning lower-dimensional dense representations (usually called embedding vectors) for every query and item. With that, FoodIO is able to <b>generalize</b> by matching items to queries that are close to each other in the embedding space. For example, you find that people who asked for &#8220;fried chicken&#8221; often don't mind having &#8220;burgers&#8221; as well.<br /><div><a href="https://3.bp.blogspot.com/-O6Ssu0m0_O8/V3MhQWN10AI/AAAAAAAABHE/V1PtDHKp2MQQ9jfuyHxs2HHR7Ovg5M6LQCLcB/s1600/image01.png"><img border="0" height="274" src="https://3.bp.blogspot.com/-O6Ssu0m0_O8/V3MhQWN10AI/AAAAAAAABHE/V1PtDHKp2MQQ9jfuyHxs2HHR7Ovg5M6LQCLcB/s640/image01.png" width="640"></a></div><b>Combining Wide and Deep models.</b><br />However, you discover that the deep neural network sometimes generalizes too much and recommends irrelevant dishes. 
You dig into the historical traffic, and find that there are actually two distinct types of query-item relationships in the data.<br /><br />The first type of query is very targeted. People shouting very specific items like &#8220;iced decaf latte with nonfat milk&#8221; really mean it. Just because it's pretty close to &#8220;hot latte with whole milk&#8221; in the embedding space doesn't mean it's an acceptable alternative. And there are millions of these rules where the transitivity of embeddings may actually do more harm than good. On the other hand, queries that are more exploratory like &#8220;seafood&#8221; or &#8220;italian food&#8221; may be open to more generalization and discovering a diverse set of related items. Having realized this, you have an epiphany: Why do I have to choose either wide or deep models? Why not both?<br /><div><a href="https://2.bp.blogspot.com/-wkrmRibw_GM/V3Mg3O3Q0-I/AAAAAAAABG0/Jm3Nl4-VcYIJ44dA5nSz6vpTyCKF2KWQgCKgB/s1600/image03.png"><img border="0" height="274" src="https://2.bp.blogspot.com/-wkrmRibw_GM/V3Mg3O3Q0-I/AAAAAAAABG0/Jm3Nl4-VcYIJ44dA5nSz6vpTyCKF2KWQgCKgB/s640/image03.png" width="640"></a></div>Finally, you build FoodIO 4.0 with Wide &#38; Deep Learning in TensorFlow. As shown in the graph above, the sparse features like&#160;<span><span><span>query="fried chicken"</span></span></span>&#160;and <span>item="chicken fried rice"</span> are used in both the wide part (left) and&#160;the deep part (right) of the model. During training, the prediction errors are backpropagated to both sides to train the model parameters. The cross-feature transformation in the wide model component can memorize all those sparse, specific rules, while the deep model component can generalize to similar items via embeddings.<br /><br /><b>Wider. Deeper. 
Together.</b><br />We're excited to share the TensorFlow API and implementation of Wide &#38; Deep Learning with you, so you can try out your ideas with it and share your findings with everyone else. To get started, check out the code on <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/learn/python/learn">GitHub</a> and our TensorFlow tutorials on <a href="https://www.tensorflow.org/tutorials/wide/">Linear Models</a> and <a href="https://www.tensorflow.org/tutorials/wide_and_deep/">Wide &#38; Deep Learning</a>.<br /><br /><b>Acknowledgement</b><br />Bringing Wide &#38; Deep from idea and research to implementation has been a huge team effort. We'd like to thank all the people who have contributed to the project or have given us advice, including: Heng-Tze Cheng, Mustafa Ispir, Zakaria Haque, Lichan Hong, Rohan Anil, Denis Baylor, Vihan Jain, Salem Haykal, Robson Araujo, Xiaobing Liu, Yonghui Wu, Thomas Strohmann, Tal Shaked, Jeremiah Harmsen, Greg Corrado, Glen Anderson, D. Sculley, Tushar Chandra, Ed Chi, Rajat Monga, Rob von Behren, Jarek Wilkiewicz, Christine Robson, Illia Polosukhin, Martin Wicke, Gus Katsiapis, Alexandre Passos, Olivier Chapelle, Levent Koc, Akshay Naresh Modi, Wei Chai, Hrishi Aradhye, Othar Hansson, Xinran He, Martin Zinkevich, Joe Toth, Anton Rusanov, Hemal Shah, Petros Mol, Frank Li, Yutaka Suematsu, Sameer Ahuja, Eugene Brevdo, Philip Tucker, Shanqing Cai, Kester Tong, and more.<br /><span><br /><a name="1"><b>* </b></a>For illustration only. FoodIO is not a real app.<a href="http://research.googleblog.com/#top1"><sup>&#8617;</sup></a><br /></span>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Heng-Tze Cheng, Senior Software Engineer, Google Research</span><br /><br />The human brain is a sophisticated learning machine, forming rules by memorizing everyday events (“sparrows can fly” and “pigeons can fly”) and generalizing those learnings to apply to things we haven't seen before (“animals with wings can fly”). Perhaps more powerfully, memorization also allows us to further refine our generalized rules with exceptions (“penguins can't fly”). As we were exploring how to advance machine intelligence, we asked ourselves the question—can we teach computers to learn like humans do, by combining the power of memorization and generalization?<br /><br />It's not an easy question to answer, but by jointly training a wide linear model (for memorization) alongside a deep neural network (for generalization), one can combine the strengths of both to bring us one step closer. At Google, we call it <b>Wide &amp; Deep Learning</b>. It's useful for generic large-scale regression and classification problems with sparse inputs (<a href="https://en.wikipedia.org/wiki/Categorical_variable">categorical features</a> with a large number of possible feature values), such as recommender systems, search, and ranking problems.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-Dw1mB9am1l8/V3MgtOzp3uI/AAAAAAAABGs/mP-3nZQCjWwdk6qCa5WraSpK8A7rSPj3ACLcB/s1600/image04.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="142" src="https://1.bp.blogspot.com/-Dw1mB9am1l8/V3MgtOzp3uI/AAAAAAAABGs/mP-3nZQCjWwdk6qCa5WraSpK8A7rSPj3ACLcB/s640/image04.png" width="640" /></a></div>Today we’re open-sourcing our implementation of Wide &amp; Deep Learning as part of the <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/learn/python/learn">TF.Learn API</a> so that you can easily train a model yourself. 
Please check out the TensorFlow tutorials on <a href="https://www.tensorflow.org/tutorials/wide/">Linear Models</a> and <a href="https://www.tensorflow.org/tutorials/wide_and_deep/">Wide &amp; Deep Learning</a>, as well as our <a href="http://arxiv.org/abs/1606.07792">research paper</a> to learn more.<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/Xmw9SWJ0L50/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/Xmw9SWJ0L50?rel=0&amp;feature=player_embedded" width="640"></iframe></div><br /><b>How Wide &amp; Deep Learning works.</b><br />Let's say one day you wake up with an idea for a new app called <i>FoodIO</i><a href="http://research.googleblog.com/2016/06/wide-deep-learning-better-together-with.html#1" name="top1"><sup>*</sup></a>. A user of the app just needs to say out loud what kind of food he/she is craving (the <i>query</i>). The app magically predicts the dish that the user will like best, and the dish gets delivered to the user's front door (the <i>item</i>). Your key metric is consumption rate—if a dish was eaten by the user, the score is 1; otherwise it's 0 (the <i>label</i>).<br /><br />You come up with some simple rules to start, like returning the items that match the most characters in the query, and you release the first version of FoodIO. Unfortunately, you find that the consumption rate is pretty low because the matches are too crude to be really useful (people shouting “fried chicken” end up getting “chicken fried rice”), so you decide to add machine learning to learn from the data.<br /><br /><div class="separator" style="clear: both; text-align: center;"></div><b>The Wide model.</b><br />In the 2nd version, you want to memorize what items work best for each query. 
So, you train a linear model in TensorFlow with a <b>wide</b> set of cross-product feature transformations to capture how the co-occurrence of a query-item feature pair correlates with the target label (whether or not an item is consumed). The model predicts the probability of consumption P(consumption | query, item) for each item, and FoodIO delivers the top item with the highest predicted consumption rate. For example, the model learns that feature <span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">AND(query="fried chicken", item="chicken and waffles")</span> is a huge win, while <span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">AND(query="fried chicken", item="chicken fried rice")</span> doesn't get as much love even though the character match is higher. In other words, FoodIO 2.0 does a pretty good job <b>memorizing</b> what users like, and it starts to get more traction.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-I_YshHCoxNs/V3Mg5QG4s-I/AAAAAAAABG8/6hHCKiUhcF03kJrLTVJd6Al-MX4sR_bUACKgB/s1600/image02.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="178" src="https://2.bp.blogspot.com/-I_YshHCoxNs/V3Mg5QG4s-I/AAAAAAAABG8/6hHCKiUhcF03kJrLTVJd6Al-MX4sR_bUACKgB/s640/image02.png" width="640" /></a></div><b>The Deep model.</b><br />Later on you discover that many users are saying that they're tired of the recommendations. They're eager to discover similar but different cuisines with a “surprise me” state of mind. So you brush up on your TensorFlow toolkit again and train a <b>deep</b> feed-forward neural network for FoodIO 3.0. With your deep model, you're learning lower-dimensional dense representations (usually called embedding vectors) for every query and item. With that, FoodIO is able to <b>generalize</b> by matching items to queries that are close to each other in the embedding space. 
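That notion of "close in the embedding space" can be made concrete with a toy similarity check. This is an illustrative sketch only, not the TF.Learn API: the vectors below are invented by hand, whereas a real deep model learns them from data.

```python
import math

# Hand-made toy embeddings (invented for illustration; a deep model
# would learn these jointly with the rest of the network).
embeddings = {
    "fried chicken":    [0.9, 0.1, 0.3],
    "burgers":          [0.8, 0.2, 0.4],
    "iced decaf latte": [0.1, 0.9, 0.2],
}

def cosine(u, v):
    """Cosine similarity: queries/items with nearby embeddings score high."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# "fried chicken" sits much closer to "burgers" than to "iced decaf latte",
# which is how the deep model can generalize between related cuisines.
assert cosine(embeddings["fried chicken"], embeddings["burgers"]) > \
       cosine(embeddings["fried chicken"], embeddings["iced decaf latte"])
```

The same nearness that enables this generalization is what causes the over-generalization problem described next.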
For example, you find that people who asked for “fried chicken” often don't mind having “burgers” as well.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-O6Ssu0m0_O8/V3MhQWN10AI/AAAAAAAABHE/V1PtDHKp2MQQ9jfuyHxs2HHR7Ovg5M6LQCLcB/s1600/image01.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="274" src="https://3.bp.blogspot.com/-O6Ssu0m0_O8/V3MhQWN10AI/AAAAAAAABHE/V1PtDHKp2MQQ9jfuyHxs2HHR7Ovg5M6LQCLcB/s640/image01.png" width="640" /></a></div><b>Combining Wide and Deep models.</b><br />However, you discover that the deep neural network sometimes generalizes too much and recommends irrelevant dishes. You dig into the historical traffic, and find that there are actually two distinct types of query-item relationships in the data.<br /><br />The first type of query is very targeted. People shouting very specific items like “iced decaf latte with nonfat milk” really mean it. Just because it's pretty close to “hot latte with whole milk” in the embedding space doesn't mean it's an acceptable alternative. And there are millions of these rules where the transitivity of embeddings may actually do more harm than good. On the other hand, queries that are more exploratory like “seafood” or “italian food” may be open to more generalization and discovering a diverse set of related items. Having realized this, you have an epiphany: Why do I have to choose either wide or deep models? 
Why not both?<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-wkrmRibw_GM/V3Mg3O3Q0-I/AAAAAAAABG0/Jm3Nl4-VcYIJ44dA5nSz6vpTyCKF2KWQgCKgB/s1600/image03.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="274" src="https://2.bp.blogspot.com/-wkrmRibw_GM/V3Mg3O3Q0-I/AAAAAAAABG0/Jm3Nl4-VcYIJ44dA5nSz6vpTyCKF2KWQgCKgB/s640/image03.png" width="640" /></a></div>Finally, you build FoodIO 4.0 with Wide &amp; Deep Learning in TensorFlow. As shown in the graph above, the sparse features like&nbsp;<span id="docs-internal-guid-0298828a-99c1-5961-d83f-fdfa6eb23129"><span style="vertical-align: baseline; white-space: pre-wrap;"><span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">query="fried chicken"</span></span></span>&nbsp;and <span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">item="chicken fried rice"</span> are used in both the wide part (left) and&nbsp;the deep part (right) of the model. During training, the prediction errors are backpropagated to both sides to train the model parameters. The cross-feature transformation in the wide model component can memorize all those sparse, specific rules, while the deep model component can generalize to similar items via embeddings.<br /><br /><b>Wider. Deeper. Together.</b><br />We're excited to share the TensorFlow API and implementation of Wide &amp; Deep Learning with you, so you can try out your ideas with it and share your findings with everyone else. 
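The joint scoring just described can be sketched in miniature: a memorized cross-feature weight (the wide side) and an embedding dot product (the deep side) are summed into a single logit before the sigmoid. All weights and vectors below are invented for illustration; in the real model both sides are learned jointly by backpropagation, as the post explains.

```python
import math

# Wide part: weights memorized for specific AND(query, item) cross-features.
# (Values are made up for this sketch.)
cross_weights = {
    ("fried chicken", "chicken and waffles"): 2.0,   # memorized win
    ("fried chicken", "chicken fried rice"): -1.5,   # memorized miss
}

# Deep part: toy low-dimensional embeddings for queries and items.
embeddings = {
    "fried chicken":       [0.9, 0.1],
    "burgers":             [0.8, 0.2],
    "chicken and waffles": [0.85, 0.15],
    "chicken fried rice":  [0.1, 0.9],
}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def predict(query, item):
    """P(consumption | query, item) from the sum of wide and deep logits."""
    wide_logit = cross_weights.get((query, item), 0.0)  # 0 if never memorized
    deep_logit = dot(embeddings[query], embeddings[item])
    return 1.0 / (1.0 + math.exp(-(wide_logit + deep_logit)))

# The memorized rule suppresses the misleading character match...
assert predict("fried chicken", "chicken and waffles") > \
       predict("fried chicken", "chicken fried rice")
# ...while the embeddings still let the model generalize to nearby items.
assert predict("fried chicken", "burgers") > \
       predict("fried chicken", "chicken fried rice")
```

The point of the sketch is only the additive structure: memorized exceptions and learned generalization contribute to the same prediction.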
To get started, check out the code on <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/learn/python/learn">GitHub</a> and our TensorFlow tutorials on <a href="https://www.tensorflow.org/tutorials/wide/">Linear Models</a> and <a href="https://www.tensorflow.org/tutorials/wide_and_deep/">Wide &amp; Deep Learning</a>.<br /><br /><b>Acknowledgement</b><br />Bringing Wide &amp; Deep from idea and research to implementation has been a huge team effort. We'd like to thank all the people who have contributed to the project or have given us advice, including: Heng-Tze Cheng, Mustafa Ispir, Zakaria Haque, Lichan Hong, Rohan Anil, Denis Baylor, Vihan Jain, Salem Haykal, Robson Araujo, Xiaobing Liu, Yonghui Wu, Thomas Strohmann, Tal Shaked, Jeremiah Harmsen, Greg Corrado, Glen Anderson, D. Sculley, Tushar Chandra, Ed Chi, Rajat Monga, Rob von Behren, Jarek Wilkiewicz, Christine Robson, Illia Polosukhin, Martin Wicke, Gus Katsiapis, Alexandre Passos, Olivier Chapelle, Levent Koc, Akshay Naresh Modi, Wei Chai, Hrishi Aradhye, Othar Hansson, Xinran He, Martin Zinkevich, Joe Toth, Anton Rusanov, Hemal Shah, Petros Mol, Frank Li, Yutaka Suematsu, Sameer Ahuja, Eugene Brevdo, Philip Tucker, Shanqing Cai, Kester Tong, and more.<br /><span class="Apple-style-span" style="font-size: small;"><br /><a name="1"><b>* </b></a>For illustration only. FoodIO is not a real app.<a href="http://research.googleblog.com/2016/06/wide-deep-learning-better-together-with.html#top1"><sup>↩</sup></a><br /></span>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/wide-deep-learning-better-together-with-tensorflow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>CVPR 2016 &amp; Research at Google</title>
		<link>https://googledata.org/google-research/cvpr-2016-research-at-google/</link>
		<comments>https://googledata.org/google-research/cvpr-2016-research-at-google/#comments</comments>
		<pubDate>Tue, 28 Jun 2016 16:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=89f132acd381e89f76d646d19d14356e</guid>
		<description><![CDATA[<span>Posted by Rahul Sukthankar, Research Scientist</span><br /><br />This week, Las Vegas hosts the <a href="http://cvpr2016.thecvf.com/">2016 Conference on Computer Vision and Pattern Recognition</a> (CVPR 2016), the premier annual computer vision event comprising the main conference and several co-located workshops and short courses. As a leader in computer vision research, Google has a strong presence at CVPR 2016, with many Googlers presenting papers and invited talks at the conference, tutorials and workshops.<br /><br />We congratulate Google Research Scientist Ce Liu and Google Faculty Advisor <a href="http://www.cs.cmu.edu/~abhinavg/">Abhinav Gupta</a>, who were selected as this year&#8217;s recipients of the <a href="https://www.computer.org/web/tcpami/young-researcher-award">PAMI Young Researcher Award</a> for outstanding research contributions within computer vision. We also congratulate Googler Henrik Stewenius for receiving the <a href="https://en.wikipedia.org/wiki/Longuet-Higgins_Prize">Longuet-Higgins Prize</a>, a retrospective award that recognizes up to two CVPR papers from ten years ago that have made a significant impact on computer vision research, for his 2006 CVPR paper &#8220;<a href="http://dl.acm.org/citation.cfm?id=1153548">Scalable Recognition with a Vocabulary Tree</a>&#8221;, co-authored with David Nister, during their time at University of Kentucky.<br /><br />If you are attending CVPR this year, please stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for hundreds of millions of people. 
The Google booth will also showcase several recent efforts, including the technology behind <a href="https://research.googleblog.com/2016/06/motion-stills-create-beautiful-gifs.html">Motion Stills</a>, a live demo of neural network-based image compression and <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim">TensorFlow-Slim</a>, the lightweight library for defining, training and evaluating models in TensorFlow. Learn more about our research being presented at CVPR 2016 in the list below (Googlers highlighted in <span>blue</span>).<br /><br /><b><u>Oral Presentations</u></b><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Mao_Generation_and_Comprehension_CVPR_2016_paper.pdf">Generation and Comprehension of Unambiguous Object Descriptions</a><br /><i>Junhua Mao, <span>Jonathan Huang</span>, <span>Alexander Toshev</span>, Oana Camburu, Alan L. Yuille, <span>Kevin Murphy</span></i><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Ramanathan_Detecting_Events_and_CVPR_2016_paper.pdf"><br /></a> <a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Ramanathan_Detecting_Events_and_CVPR_2016_paper.pdf">Detecting Events and Key Actors in Multi-Person Videos</a><br /><i>Vignesh Ramanathan, <span>Jonathan Huang</span></i><i>,</i><i><span>&#160;Sami Abu-El-Haija</span></i><i>,</i><i><span>&#160;Alexander Gorban</span></i><i>,</i><i><span>&#160;Kevin Murphy</span>, Li Fei-Fei</i><br /><br /><b><u>Spotlight Session: 3D Reconstruction</u></b><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Flynn_DeepStereo_Learning_to_CVPR_2016_paper.pdf">DeepStereo: Learning to Predict New Views From the World&#8217;s Imagery</a><br /><i>John Flynn, <span>Ivan Neulander</span>, James Philbin, <span>Noah Snavely</span></i><br /><br /><b><u>Posters</u></b><br /><a 
href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Del_Pero_Discovering_the_Physical_CVPR_2016_paper.pdf">Discovering the Physical Parts of an Articulated Object Class From Multiple Videos</a><br /><i>Luca Del Pero, <span>Susanna Ricco</span>, <span>Rahul Sukthankar</span>, Vittorio Ferrari</i><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Murdock_Blockout_Dynamic_Model_CVPR_2016_paper.pdf"><br /></a> <a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Murdock_Blockout_Dynamic_Model_CVPR_2016_paper.pdf">Blockout: Dynamic Model Selection for Hierarchical Deep Networks</a><br /><i>Calvin Murdock, <span>Zhen Li</span>, <span>Howard Zhou</span>, <span>Tom Duerig</span></i><br /><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.pdf">Rethinking the Inception Architecture for Computer Vision</a><br /><i><span>Christian Szegedy</span></i><i>,</i><i><span>&#160;Vincent Vanhoucke</span></i><i>,</i><i><span>&#160;Sergey Ioffe</span></i><i>,</i><i><span>&#160;Jon Shlens</span>, Zbigniew Wojna</i><br /><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Zheng_Improving_the_Robustness_CVPR_2016_paper.pdf">Improving the Robustness of Deep Neural Networks via Stability Training</a><br /><i>Stephan Zheng, <span>Yang Song</span></i><i>,</i><i><span>&#160;Thomas Leung</span></i><i>,</i><i><span>&#160;Ian Goodfellow</span></i><br /><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Chen_Semantic_Image_Segmentation_CVPR_2016_paper.pdf">Semantic Image Segmentation With Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain Transform</a><br /><i>Liang-Chieh Chen, <span>Jonathan T. Barron</span></i><i>,</i><i><span>&#160;George Papandreou</span></i><i>,</i><i><span>&#160;Kevin Murphy</span>, Alan L. 
Yuille</i><br /><br /><b><u>Tutorial</u></b><br /><a href="http://www.ccs.neu.edu/home/eelhami/tutorial_cvpr2016.htm">Optimization Algorithms for Subset Selection and Summarization in Large Data Sets</a><br /><i>Ehsan Elhamifar, Jeff Bilmes, <span>Alex Kulesza</span>, Michael Gygli</i><br /><br /><b><u>Workshops</u></b><br /><a href="http://pocv16.eecs.berkeley.edu/">Perceptual Organization in Computer Vision: The Role of Feedback in Recognition and Reorganization</a><br />Organizers: <i><span>Katerina Fragkiadaki</span>, Phillip Isola, <span>Joao Carreira</span></i><br />Invited talks: <i><span>Viren Jain</span>, <span>Jitendra Malik</span></i><br /><br /><a href="http://visualqa.org/workshop.html">VQA Challenge Workshop</a><br />Invited talks: <i><span>Jitendra Malik</span>, <span>Kevin Murphy</span></i><br /><br /><a href="https://sites.google.com/site/wicv2016/home">Women in Computer Vision</a><br />Invited talk: <i><span>Caroline Pantofaru</span></i><br /><br /><a href="http://www.cs.ucf.edu/~smkhan/CMLA2016/">Computational Models for Learning Systems and Educational Assessment</a><br />Invited talk: <i><span>Jonathan Huang</span></i><br /><br /><a href="http://lsun.cs.princeton.edu/2016/">Large-Scale Scene Understanding (LSUN) Challenge</a><br />Invited talk: <i><span>Jitendra Malik</span></i><br /><br /><a href="http://www.cs.virginia.edu/~vicente/bigvision2016/">Large Scale Visual Recognition and Retrieval: BigVision 2016</a><br />General Chairs: <i>Jason Corso, Fei-Fei Li, <span>Samy Bengio</span></i><br /><br /><a href="http://gesture.chalearn.org/2016-looking-at-people-cvpr-challenge/workshop">ChaLearn Looking at People</a><br />Invited talk: <i><span>Florian Schroff</span></i><br /><br /><a href="https://sites.google.com/site/cvprmcv16/">Medical Computer Vision</a><br />Invited talk: <span><span><i>Ramin Zabih</i></span></span>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Rahul Sukthankar, Research Scientist</span><br /><br />This week, Las Vegas hosts the <a href="http://cvpr2016.thecvf.com/">2016 Conference on Computer Vision and Pattern Recognition</a> (CVPR 2016), the premier annual computer vision event comprising the main conference and several co-located workshops and short courses. As a leader in computer vision research, Google has a strong presence at CVPR 2016, with many Googlers presenting papers and invited talks at the conference, tutorials and workshops.<br /><br />We congratulate Google Research Scientist Ce Liu and Google Faculty Advisor <a href="http://www.cs.cmu.edu/~abhinavg/">Abhinav Gupta</a>, who were selected as this year’s recipients of the <a href="https://www.computer.org/web/tcpami/young-researcher-award">PAMI Young Researcher Award</a> for outstanding research contributions within computer vision. We also congratulate Googler Henrik Stewenius for receiving the <a href="https://en.wikipedia.org/wiki/Longuet-Higgins_Prize">Longuet-Higgins Prize</a>, a retrospective award that recognizes up to two CVPR papers from ten years ago that have made a significant impact on computer vision research, for his 2006 CVPR paper “<a href="http://dl.acm.org/citation.cfm?id=1153548">Scalable Recognition with a Vocabulary Tree</a>”, co-authored with David Nister, during their time at University of Kentucky.<br /><br />If you are attending CVPR this year, please stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for hundreds of millions of people. 
The Google booth will also showcase several recent efforts, including the technology behind <a href="https://research.googleblog.com/2016/06/motion-stills-create-beautiful-gifs.html">Motion Stills</a>, a live demo of neural network-based image compression and <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim">TensorFlow-Slim</a>, the lightweight library for defining, training and evaluating models in TensorFlow. Learn more about our research being presented at CVPR 2016 in the list below (Googlers highlighted in <span style="color: #3d85c6;">blue</span>).<br /><br /><b><u>Oral Presentations</u></b><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Mao_Generation_and_Comprehension_CVPR_2016_paper.pdf">Generation and Comprehension of Unambiguous Object Descriptions</a><br /><i>Junhua Mao, <span style="color: #3d85c6;">Jonathan Huang</span>, <span style="color: #3d85c6;">Alexander Toshev</span>, Oana Camburu, Alan L. Yuille, <span style="color: #3d85c6;">Kevin Murphy</span></i><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Ramanathan_Detecting_Events_and_CVPR_2016_paper.pdf"><br /></a> <a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Ramanathan_Detecting_Events_and_CVPR_2016_paper.pdf">Detecting Events and Key Actors in Multi-Person Videos</a><br /><i>Vignesh Ramanathan, <span style="color: #3d85c6;">Jonathan Huang</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Sami Abu-El-Haija</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Alexander Gorban</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Kevin Murphy</span>, Li Fei-Fei</i><br /><br /><b><u>Spotlight Session: 3D Reconstruction</u></b><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Flynn_DeepStereo_Learning_to_CVPR_2016_paper.pdf">DeepStereo: Learning to Predict New Views From the World’s Imagery</a><br /><i>John Flynn, <span 
style="color: #3d85c6;">Ivan Neulander</span>, James Philbin, <span style="color: #3d85c6;">Noah Snavely</span></i><br /><br /><b><u>Posters</u></b><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Del_Pero_Discovering_the_Physical_CVPR_2016_paper.pdf">Discovering the Physical Parts of an Articulated Object Class From Multiple Videos</a><br /><i>Luca Del Pero, <span style="color: #3d85c6;">Susanna Ricco</span>, <span style="color: #3d85c6;">Rahul Sukthankar</span>, Vittorio Ferrari</i><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Murdock_Blockout_Dynamic_Model_CVPR_2016_paper.pdf"><br /></a> <a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Murdock_Blockout_Dynamic_Model_CVPR_2016_paper.pdf">Blockout: Dynamic Model Selection for Hierarchical Deep Networks</a><br /><i>Calvin Murdock, <span style="color: #3d85c6;">Zhen Li</span>, <span style="color: #3d85c6;">Howard Zhou</span>, <span style="color: #3d85c6;">Tom Duerig</span></i><br /><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.pdf">Rethinking the Inception Architecture for Computer Vision</a><br /><i><span style="color: #3d85c6;">Christian Szegedy</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Vincent Vanhoucke</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Sergey Ioffe</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Jon Shlens</span>, Zbigniew Wojna</i><br /><br /><a href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Zheng_Improving_the_Robustness_CVPR_2016_paper.pdf">Improving the Robustness of Deep Neural Networks via Stability Training</a><br /><i>Stephan Zheng, <span style="color: #3d85c6;">Yang Song</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Thomas Leung</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Ian Goodfellow</span></i><br /><br /><a 
href="http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Chen_Semantic_Image_Segmentation_CVPR_2016_paper.pdf">Semantic Image Segmentation With Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain Transform</a><br /><i>Liang-Chieh Chen, <span style="color: #3d85c6;">Jonathan T. Barron</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;George Papandreou</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Kevin Murphy</span>, Alan L. Yuille</i><br /><br /><b><u>Tutorial</u></b><br /><a href="http://www.ccs.neu.edu/home/eelhami/tutorial_cvpr2016.htm">Optimization Algorithms for Subset Selection and Summarization in Large Data Sets</a><br /><i>Ehsan Elhamifar, Jeff Bilmes, <span style="color: #3d85c6;">Alex Kulesza</span>, Michael Gygli</i><br /><br /><b><u>Workshops</u></b><br /><a href="http://pocv16.eecs.berkeley.edu/">Perceptual Organization in Computer Vision: The Role of Feedback in Recognition and Reorganization</a><br />Organizers: <i><span style="color: #3d85c6;">Katerina Fragkiadaki</span>, Phillip Isola, <span style="color: #3d85c6;">Joao Carreira</span></i><br />Invited talks: <i><span style="color: #3d85c6;">Viren Jain</span>, <span style="color: #3d85c6;">Jitendra Malik</span></i><br /><br /><a href="http://visualqa.org/workshop.html">VQA Challenge Workshop</a><br />Invited talks: <i><span style="color: #3d85c6;">Jitendra Malik</span>, <span style="color: #3d85c6;">Kevin Murphy</span></i><br /><br /><a href="https://sites.google.com/site/wicv2016/home">Women in Computer Vision</a><br />Invited talk: <i><span style="color: #3d85c6;">Caroline Pantofaru</span></i><br /><br /><a href="http://www.cs.ucf.edu/~smkhan/CMLA2016/">Computational Models for Learning Systems and Educational Assessment</a><br />Invited talk: <i><span style="color: #3d85c6;">Jonathan Huang</span></i><br /><br /><a href="http://lsun.cs.princeton.edu/2016/">Large-Scale Scene Understanding (LSUN) Challenge</a><br />Invited talk: <i><span 
style="color: #3d85c6;">Jitendra Malik</span></i><br /><br /><a href="http://www.cs.virginia.edu/~vicente/bigvision2016/">Large Scale Visual Recognition and Retrieval: BigVision 2016</a><br />General Chairs: <i>Jason Corso, Fei-Fei Li, <span style="color: #3d85c6;">Samy Bengio</span></i><br /><br /><a href="http://gesture.chalearn.org/2016-looking-at-people-cvpr-challenge/workshop">ChaLearn Looking at People</a><br />Invited talk: <i><span style="color: #3d85c6;">Florian Schroff</span></i><br /><br /><a href="https://sites.google.com/site/cvprmcv16/">Medical Computer Vision</a><br />Invited talk: <span style="background-color: white;"><span style="color: #3d85c6;"><i>Ramin Zabih</i></span></span>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/cvpr-2016-research-at-google/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Project Bloks: Making code physical for kids</title>
		<link>https://googledata.org/google-research/project-bloks-making-code-physical-for-kids/</link>
		<comments>https://googledata.org/google-research/project-bloks-making-code-physical-for-kids/#comments</comments>
		<pubDate>Mon, 27 Jun 2016 15:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[education]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=06ed262edc0d0d086edc30f0d564b09b</guid>
		<description><![CDATA[<span>Posted by Steve Vranakis and Jayme Goldstein, Executive Creative Director and Project Lead, Google Creative Lab</span><br /><br />At Google, we&#8217;re passionate about empowering children to create and explore with technology. We believe that when children learn to code, they&#8217;re not just learning how to program a computer&#8212;they&#8217;re learning a new language for creative expression and are developing computational thinking: a skillset for solving problems of all kinds. <br /><br />In fact, it&#8217;s a skillset whose importance is being recognised around the world&#8212;from President Obama&#8217;s <a href="https://chooseyourfuture.cps.edu/computer-science-for-all/what-is-cs4all/">CS4All program</a> to the inclusion of <a href="https://www.gov.uk/government/publications/national-curriculum-in-england-computing-programmes-of-study/national-curriculum-in-england-computing-programmes-of-study">Computer Science in the UK National Curriculum</a>. We&#8217;ve long supported and advocated the furthering of CS education through programs and platforms such as <a href="https://developers.google.com/blockly/">Blockly</a>, <a href="https://developers.googleblog.com/2016/05/scratch-and-google-introduce-scratch-blocks.html">Scratch Blocks</a>, <a href="https://www.cs-first.com/">CS First</a> and <a href="https://www.madewithcode.com/">Made w/ Code</a>.<br /><br />Today, we&#8217;re happy to announce <a href="http://g.co/projectbloks">Project Bloks</a>, a research collaboration between Google, <a href="https://tltl.stanford.edu/people/paulo-blikstein">Paulo Blikstein</a> (Stanford University) and <a href="http://www.ideo.com/">IDEO</a> with the goal of creating an open hardware platform that researchers, developers and designers can use to build physical coding experiences. As a first step, we&#8217;ve created a system for tangible programming and built a working prototype with it. 
We&#8217;re sharing our progress before conducting more research over the summer to inform what comes next.<br /><br /><b>Physical coding</b><br />Kids are inherently playful and social. They naturally play and learn by using their hands, building stuff and doing things together. Making code physical - known as tangible programming - offers a unique way to combine the way children innately play and learn with computational thinking.<br /><br />Project Bloks is preceded and shaped by a long history of educational theory and research in the area of hands-on learning. From <a href="https://en.wikipedia.org/wiki/Friedrich_Fr%C3%B6bel">Friedrich Froebel</a>, <a href="https://en.wikipedia.org/wiki/Maria_Montessori">Maria Montessori</a>&#160;and <a href="https://en.wikipedia.org/wiki/Jean_Piaget">Jean Piaget&#8217;s</a> pioneering work in the area of learning by experience, exploration and manipulation, to the research started in the 1970s by Seymour Papert and Radia Perlman with <a href="http://cyberneticzoo.com/cyberneticanimals/1969-the-logo-turtle-seymour-papert-marvin-minsky-et-al-american/">LOGO and TORTIS</a>. This exploration has continued to grow and includes a <a href="http://cosmo.nyu.edu/hogg/lego/braitenberg_vehicles.pdf">wide</a> <a href="https://www.researchgate.net/publication/242383829_Algoblock_a_tangible_programming_language_a_tool_for_collaborative_learning">range</a> <a href="http://link.springer.com/article/10.1007%2Fs00779-004-0295-6">of</a> <a href="http://hci.cs.tufts.edu/tern/">research</a> <a href="http://tmg-trackr.media.mit.edu/publishedmedia/Papers/187-Topobo%20A%20Constructive%20Assembly/Published/PDF">and</a> <a href="https://www.media.mit.edu/sponsorship/getting-value/collaborations/mindstorms">platforms</a>.<br /><br />However, designing kits for tangible programming is challenging&#8212;requiring the resources and time to develop both the software and the hardware. Our goal is to remove those barriers. 
By creating an open platform, Project Bloks will allow designers, developers and researchers to focus on innovating, experimenting and creating new ways to help kids develop computational thinking. Our vision is that, one day, the Project Bloks platform becomes for tangible programming what <a href="https://developers.google.com/blockly/">Blockly</a> is for on-screen programming.<br /><div></div><b>The Project Bloks system</b><br />We&#8217;ve designed a system that developers can customise, reconfigure and rearrange to create all kinds of different tangible programming experiences.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-q0wtBwn_WI0/V23HNIu56FI/AAAAAAAABFE/BPZ8GeEceNM8pivB7M0Bx4-ybqEmXX8DgCLcB/s1600/image02.jpg"><img border="0" height="426" src="https://3.bp.blogspot.com/-q0wtBwn_WI0/V23HNIu56FI/AAAAAAAABFE/BPZ8GeEceNM8pivB7M0Bx4-ybqEmXX8DgCLcB/s640/image02.jpg" width="640"></a></td></tr><tr><td>A bird&#8217;s-eye view of the customisable and reconfigurable Project Bloks system</td></tr></tbody></table>The Project Bloks system is made up of three core components: the &#8220;Brain Board&#8221;, &#8220;Base Boards&#8221; and &#8220;Pucks&#8221;. When connected together, they create a set of instructions that can be sent to connected devices, such as toys or tablets, over WiFi or Bluetooth. <br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-SWpg16IKOeI/V23HjiIyjcI/AAAAAAAABFM/3XIywyXghfMiXVfYYhH0Axes8dt4BdCkQCLcB/s1600/image11.jpg"><img border="0" height="426" src="https://4.bp.blogspot.com/-SWpg16IKOeI/V23HjiIyjcI/AAAAAAAABFM/3XIywyXghfMiXVfYYhH0Axes8dt4BdCkQCLcB/s640/image11.jpg" width="640"></a></td></tr><tr><td>The three core components of the Project Bloks system</td></tr></tbody></table><b>Pucks: abundant, inexpensive, customisable physical instructions</b><br />Pucks are what make the Project Bloks system so versatile. 
They help bring the infinite flexibility of software programming commands to tangible programming experiences. Pucks can be programmed with different instructions, such as &#8216;turn on or off&#8217;, &#8216;move left&#8217; or &#8216;jump&#8217;. They can also take the shape of many different interactive forms&#8212;like switches, dials or buttons. With no active electronic components, they&#8217;re also incredibly cheap and easy to make. At a minimum, all you'd need to make a puck is a piece of paper and some <a href="https://en.wikipedia.org/wiki/Conductive_ink">conductive ink</a>. <br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-bY1H0QnZ-go/V23H2KV7vlI/AAAAAAAABFY/j46AMqQmxjs1ZeatEQ0oh4VPLZGGnVOIACLcB/s1600/image12.jpg"><img border="0" height="426" src="https://1.bp.blogspot.com/-bY1H0QnZ-go/V23H2KV7vlI/AAAAAAAABFY/j46AMqQmxjs1ZeatEQ0oh4VPLZGGnVOIACLcB/s640/image12.jpg" width="640"></a></td></tr><tr><td>Pucks allow for the creation and customisation of an endless number of domain-specific physical instructions cheaply and easily.</td></tr></tbody></table><b>Base Boards: a modular design for diverse tangible programming experiences</b><br />Base Boards read a Puck&#8217;s instruction through a capacitive sensor. They act as a conduit for a Puck&#8217;s command to the Brain Board. 
Base Boards are modular and can be connected in sequence and in different orientations to create different programming flows and experiences.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-g-g4hfL3OjE/V3E58dnMPSI/AAAAAAAABGU/RVojHxOteV8WTdBy0_-D6LKsMf_NFO28gCLcB/s1600/image08.gif"><img border="0" height="360" src="https://4.bp.blogspot.com/-g-g4hfL3OjE/V3E58dnMPSI/AAAAAAAABGU/RVojHxOteV8WTdBy0_-D6LKsMf_NFO28gCLcB/s640/image08.gif" width="640"></a></td></tr><tr><td>The modularity of the Base Boards means they can be arranged in different configurations and flows</td></tr></tbody></table>Each Base Board is fitted with a haptic motor and LEDs that can be used to give end-users real time feedback on their programming experience. The Base Boards can also trigger audio feedback from the Brain Board&#8217;s built-in speaker.<br /><br /><b>Brain Board: control any device that has an API over WiFi or Bluetooth</b><br />The Brain Board is the processing unit of the system, built on a <a href="https://www.raspberrypi.org/products/pi-zero/">Raspberry Pi Zero</a>. It also provides the other boards with power, and contains an API to receive and send data to the Base Boards. It sends the Base Boards&#8217; instructions to any device with WiFi or Bluetooth connectivity and an API.<br /><br />As a whole, the Project Bloks system can take on different form factors and be made out of different materials. 
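The instruction flow just described (Pucks read by Base Boards, compiled in connection order by the Brain Board, and relayed to a connected device) can be sketched in a few lines of Python. This is purely illustrative: the class and method names below are invented for this sketch, not a published Project Bloks API, and the WiFi/Bluetooth link is stood in for by a plain method call.

```python
# Hypothetical model of the Project Bloks instruction flow.
# All names here are invented for illustration.

class Puck:
    """A physical instruction, e.g. 'turn on', 'move left' or 'jump'."""
    def __init__(self, instruction):
        self.instruction = instruction

class BaseBoard:
    """Reads the Puck placed on it (via a capacitive sensor in hardware)."""
    def __init__(self, puck):
        self.puck = puck
    def read(self):
        return self.puck.instruction

class BrainBoard:
    """Processing unit: compiles Base Board readings, in connection order,
    into a program and sends it to a connected device."""
    def __init__(self, base_boards):
        self.base_boards = base_boards
    def compile_program(self):
        return [board.read() for board in self.base_boards]
    def send_to(self, device):
        # In hardware this would travel over WiFi or Bluetooth to any
        # device exposing an API; here it is a direct call.
        for instruction in self.compile_program():
            device.execute(instruction)

class ToyRobot:
    """Stand-in for a connected toy or tablet."""
    def __init__(self):
        self.received = []
    def execute(self, instruction):
        self.received.append(instruction)

brain = BrainBoard([BaseBoard(Puck("turn on")),
                    BaseBoard(Puck("move left")),
                    BaseBoard(Puck("jump"))])
robot = ToyRobot()
brain.send_to(robot)
print(robot.received)  # ['turn on', 'move left', 'jump']
```

Rearranging the list of Base Boards changes the compiled program, mirroring how reordering or reorienting the physical boards changes the programming flow.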
This means developers have the flexibility to create diverse experiences that can help kids develop computational thinking: from composing music using functions to playing around with sensors or anything else they care to invent.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-s0y2qud5fZk/V23IbBVwKJI/AAAAAAAABFo/vWtklTyGROY3WxKqbImrWeOgniqTc3ozwCLcB/s1600/image04.gif"><img border="0" height="452" src="https://1.bp.blogspot.com/-s0y2qud5fZk/V23IbBVwKJI/AAAAAAAABFo/vWtklTyGROY3WxKqbImrWeOgniqTc3ozwCLcB/s640/image04.gif" width="640"></a></td></tr><tr><td>The Project Bloks system can be used to create all sorts of different physical programming experiences for kids</td></tr></tbody></table><b>The Coding Kit</b><br />To show how designers, developers, and researchers might make use of the system, the Project Bloks team worked with IDEO to create a reference device, called the Coding Kit. It lets kids learn basic concepts of programming by allowing them to put code bricks together to create a set of instructions that can be sent to control connected toys and devices&#8212;anything from a tablet to a <a href="http://mirobot.io/">drawing robot</a> or educational tools for exploring science like <a href="https://education.lego.com/en-us/elementary/explore/science">LEGO&#174; Education WeDo 2.0</a>.<br /><div></div><b>What&#8217;s next?</b><br />We are looking for participants (educators, developers, parents and researchers) from around the world who would like to help shape the future of Computer Science education by remotely taking part in our research studies later in the year. 
If you would like to be part of our research study or simply receive updates on the project, please <a href="http://projectbloks.withgoogle.com/register-interest/">sign up</a>.<br /><br />If you want more context and detail on Project Bloks, you can read our <a href="http://projectbloks.withgoogle.com/static/Project_Bloks_position_paper_June_2016.pdf">position paper</a>.<br /><br />Finally, a big thank you to the team beyond Google who&#8217;ve helped us get this far&#8212;including the pioneers of tangible learning and programming who&#8217;ve inspired us and informed so much of our thinking. <br /><br /><br />]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Steve Vranakis and Jayme Goldstein, Executive Creative Director and Project Lead, Google Creative Lab</span><br /><br />At Google, we’re passionate about empowering children to create and explore with technology. We believe that when children learn to code, they’re not just learning how to program a computer—they’re learning a new language for creative expression and are developing computational thinking: a skillset for solving problems of all kinds. <br /><br />In fact, it’s a skillset whose importance is being recognised around the world—from President Obama’s <a href="https://chooseyourfuture.cps.edu/computer-science-for-all/what-is-cs4all/">CS4All program</a> to the inclusion of <a href="https://www.gov.uk/government/publications/national-curriculum-in-england-computing-programmes-of-study/national-curriculum-in-england-computing-programmes-of-study">Computer Science in the UK National Curriculum</a>. We’ve long supported and advocated the furthering of CS education through programs and platforms such as <a href="https://developers.google.com/blockly/">Blockly</a>, <a href="https://developers.googleblog.com/2016/05/scratch-and-google-introduce-scratch-blocks.html">Scratch Blocks</a>, <a href="https://www.cs-first.com/">CS First</a> and <a href="https://www.madewithcode.com/">Made w/ Code</a>.<br /><br />Today, we’re happy to announce <a href="http://g.co/projectbloks">Project Bloks</a>, a research collaboration between Google, <a href="https://tltl.stanford.edu/people/paulo-blikstein">Paulo Blikstein</a> (Stanford University) and <a href="http://www.ideo.com/">IDEO</a> with the goal of creating an open hardware platform that researchers, developers and designers can use to build physical coding experiences. As a first step, we’ve created a system for tangible programming and built a working prototype with it. 
We’re sharing our progress before conducting more research over the summer to inform what comes next.<br /><br /><b>Physical coding</b><br />Kids are inherently playful and social. They naturally play and learn by using their hands, building stuff and doing things together. Making code physical—known as tangible programming—offers a unique way to combine the way children innately play and learn with computational thinking.<br /><br />Project Bloks is preceded and shaped by a long history of educational theory and research in the area of hands-on learning, from <a href="https://en.wikipedia.org/wiki/Friedrich_Fr%C3%B6bel">Friedrich Froebel</a>, <a href="https://en.wikipedia.org/wiki/Maria_Montessori">Maria Montessori</a>&nbsp;and <a href="https://en.wikipedia.org/wiki/Jean_Piaget">Jean Piaget’s</a> pioneering work in the area of learning by experience, exploration and manipulation, to the research started in the 1970s by Seymour Papert and Radia Perlman with <a href="http://cyberneticzoo.com/cyberneticanimals/1969-the-logo-turtle-seymour-papert-marvin-minsky-et-al-american/">LOGO and TORTIS</a>. This exploration has continued to grow and includes a <a href="http://cosmo.nyu.edu/hogg/lego/braitenberg_vehicles.pdf">wide</a> <a href="https://www.researchgate.net/publication/242383829_Algoblock_a_tangible_programming_language_a_tool_for_collaborative_learning">range</a> <a href="http://link.springer.com/article/10.1007%2Fs00779-004-0295-6">of</a> <a href="http://hci.cs.tufts.edu/tern/">research</a> <a href="http://tmg-trackr.media.mit.edu/publishedmedia/Papers/187-Topobo%20A%20Constructive%20Assembly/Published/PDF">and</a> <a href="https://www.media.mit.edu/sponsorship/getting-value/collaborations/mindstorms">platforms</a>.<br /><br />However, designing kits for tangible programming is challenging—requiring the resources and time to develop both the software and the hardware. Our goal is to remove those barriers. 
By creating an open platform, Project Bloks will allow designers, developers and researchers to focus on innovating, experimenting and creating new ways to help kids develop computational thinking. Our vision is that, one day, the Project Bloks platform becomes for tangible programming what <a href="https://developers.google.com/blockly/">Blockly</a> is for on-screen programming.<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/AuRTS35ouTs/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/AuRTS35ouTs?rel=0&amp;feature=player_embedded" width="640"></iframe></div><b>The Project Bloks system</b><br />We’ve designed a system that developers can customise, reconfigure and rearrange to create all kinds of different tangible programming experiences.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-q0wtBwn_WI0/V23HNIu56FI/AAAAAAAABFE/BPZ8GeEceNM8pivB7M0Bx4-ybqEmXX8DgCLcB/s1600/image02.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="426" src="https://3.bp.blogspot.com/-q0wtBwn_WI0/V23HNIu56FI/AAAAAAAABFE/BPZ8GeEceNM8pivB7M0Bx4-ybqEmXX8DgCLcB/s640/image02.jpg" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">A bird’s-eye view of the customisable and reconfigurable Project Bloks system</td></tr></tbody></table>The Project Bloks system is made up of three core components: the “Brain Board”, “Base Boards” and “Pucks”. When connected together, they create a set of instructions that can be sent to connected devices, such as toys or tablets, over WiFi or Bluetooth. 
<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-SWpg16IKOeI/V23HjiIyjcI/AAAAAAAABFM/3XIywyXghfMiXVfYYhH0Axes8dt4BdCkQCLcB/s1600/image11.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="426" src="https://4.bp.blogspot.com/-SWpg16IKOeI/V23HjiIyjcI/AAAAAAAABFM/3XIywyXghfMiXVfYYhH0Axes8dt4BdCkQCLcB/s640/image11.jpg" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The three core components of the Project Bloks system</td></tr></tbody></table><b>Pucks: abundant, inexpensive, customisable physical instructions</b><br />Pucks are what make the Project Bloks system so versatile. They help bring the infinite flexibility of software programming commands to tangible programming experiences. Pucks can be programmed with different instructions, such as ‘turn on or off’, ‘move left’ or ‘jump’. They can also take the shape of many different interactive forms—like switches, dials or buttons. With no active electronic components, they’re also incredibly cheap and easy to make. At a minimum, all you'd need to make a puck is a piece of paper and some <a href="https://en.wikipedia.org/wiki/Conductive_ink">conductive ink</a>. 
<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-bY1H0QnZ-go/V23H2KV7vlI/AAAAAAAABFY/j46AMqQmxjs1ZeatEQ0oh4VPLZGGnVOIACLcB/s1600/image12.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="426" src="https://1.bp.blogspot.com/-bY1H0QnZ-go/V23H2KV7vlI/AAAAAAAABFY/j46AMqQmxjs1ZeatEQ0oh4VPLZGGnVOIACLcB/s640/image12.jpg" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Pucks allow for the creation and customisation of an endless number of domain-specific physical instructions cheaply and easily.</td></tr></tbody></table><b>Base Boards: a modular design for diverse tangible programming experiences</b><br />Base Boards read a Puck’s instruction through a capacitive sensor. They act as a conduit for a Puck’s command to the Brain Board. 
Base Boards are modular and can be connected in sequence and in different orientations to create different programming flows and experiences.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-g-g4hfL3OjE/V3E58dnMPSI/AAAAAAAABGU/RVojHxOteV8WTdBy0_-D6LKsMf_NFO28gCLcB/s1600/image08.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="360" src="https://4.bp.blogspot.com/-g-g4hfL3OjE/V3E58dnMPSI/AAAAAAAABGU/RVojHxOteV8WTdBy0_-D6LKsMf_NFO28gCLcB/s640/image08.gif" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The modularity of the Base Boards means they can be arranged in different configurations and flows</td></tr></tbody></table>Each Base Board is fitted with a haptic motor and LEDs that can be used to give end-users real time feedback on their programming experience. The Base Boards can also trigger audio feedback from the Brain Board’s built-in speaker.<br /><br /><b>Brain Board: control any device that has an API over WiFi or Bluetooth</b><br />The Brain Board is the processing unit of the system, built on a <a href="https://www.raspberrypi.org/products/pi-zero/">Raspberry Pi Zero</a>. It also provides the other boards with power, and contains an API to receive and send data to the Base Boards. It sends the Base Boards’ instructions to any device with WiFi or Bluetooth connectivity and an API.<br /><br />As a whole, the Project Bloks system can take on different form factors and be made out of different materials. 
This means developers have the flexibility to create diverse experiences that can help kids develop computational thinking: from composing music using functions to playing around with sensors or anything else they care to invent.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-s0y2qud5fZk/V23IbBVwKJI/AAAAAAAABFo/vWtklTyGROY3WxKqbImrWeOgniqTc3ozwCLcB/s1600/image04.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="452" src="https://1.bp.blogspot.com/-s0y2qud5fZk/V23IbBVwKJI/AAAAAAAABFo/vWtklTyGROY3WxKqbImrWeOgniqTc3ozwCLcB/s640/image04.gif" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The Project Bloks system can be used to create all sorts of different physical programming experiences for kids</td></tr></tbody></table><b>The Coding Kit</b><br />To show how designers, developers, and researchers might make use of the system, the Project Bloks team worked with IDEO to create a reference device, called the Coding Kit. 
It lets kids learn basic concepts of programming by allowing them to put code bricks together to create a set of instructions that can be sent to control connected toys and devices—anything from a tablet to a <a href="http://mirobot.io/">drawing robot</a> or educational tools for exploring science like <a href="https://education.lego.com/en-us/elementary/explore/science">LEGO® Education WeDo 2.0</a>.<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/bi9lpZUGvZw/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/bi9lpZUGvZw?rel=0&amp;feature=player_embedded" width="640"></iframe></div><b>What’s next?</b><br />We are looking for participants (educators, developers, parents and researchers) from around the world who would like to help shape the future of Computer Science education by remotely taking part in our research studies later in the year. If you would like to be part of our research study or simply receive updates on the project, please <a href="http://projectbloks.withgoogle.com/register-interest/">sign up</a>.<br /><br />If you want more context and detail on Project Bloks, you can read our <a href="http://projectbloks.withgoogle.com/static/Project_Bloks_position_paper_June_2016.pdf">position paper</a>.<br /><br />Finally, a big thank you to the team beyond Google who’ve helped us get this far—including the pioneers of tangible learning and programming who’ve inspired us and informed so much of our thinking. <br /><br /><br />]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/project-bloks-making-code-physical-for-kids/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bringing Precision to the AI Safety Discussion</title>
		<link>https://googledata.org/google-research/bringing-precision-to-the-ai-safety-discussion/</link>
		<comments>https://googledata.org/google-research/bringing-precision-to-the-ai-safety-discussion/#comments</comments>
		<pubDate>Wed, 22 Jun 2016 00:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=8d9874602bef11fc07cade07227abc89</guid>
		<description><![CDATA[<span>Posted by Chris Olah, Google Research</span><br /><br />We believe that AI technologies are likely to be overwhelmingly useful and beneficial for humanity. But part of being a responsible steward of any new technology is thinking through potential challenges and how best to address any associated risks. So today we&#8217;re publishing a technical paper, <i><a href="https://arxiv.org/abs/1606.06565">Concrete Problems in AI Safety</a></i>, a collaboration among scientists at Google, OpenAI, Stanford and Berkeley.<br /><br />While possible AI safety risks have received a lot of public attention, most previous discussion has been very hypothetical and speculative. We believe it&#8217;s essential to ground concerns in real machine learning research, and to start developing practical approaches for engineering AI systems that operate safely and reliably.<br /><br />We&#8217;ve outlined five problems we think will be very important as we apply AI in more general circumstances. These are all forward thinking, long-term research questions -- minor issues today, but important to address for future systems:<br /><br /><ul><li><b>Avoiding Negative Side Effects:</b> How can we ensure that an AI system will not disturb its environment in negative ways while pursuing its goals, e.g. a cleaning robot knocking over a vase because it can clean faster by doing so?</li><li><b>Avoiding Reward Hacking:</b> How can we avoid gaming of the reward function? For example, we don&#8217;t want this cleaning robot simply covering over messes with materials it can&#8217;t see through.</li><li><b>Scalable Oversight:</b> How can we efficiently ensure that a given AI system respects aspects of the objective that are too expensive to be frequently evaluated during training? 
For example, if an AI system gets human feedback as it performs a task, it needs to use that feedback efficiently because asking too often would be annoying.</li><li><b>Safe Exploration:</b> How do we ensure that an AI system doesn&#8217;t make exploratory moves with very negative repercussions? For example, maybe a cleaning robot should experiment with mopping strategies, but clearly it shouldn&#8217;t try putting a wet mop in an electrical outlet.</li><li><b>Robustness to Distributional Shift:</b> How do we ensure that an AI system recognizes, and behaves robustly, when it&#8217;s in an environment very different from its training environment? For example, heuristics learned for a factory floor may not be safe enough for an office.</li></ul><br />We go into more technical detail <a href="https://arxiv.org/abs/1606.06565">in the paper</a>. The machine learning research community has already thought quite a bit about most of these problems and many related issues, but we think there&#8217;s a lot more work to be done.<br /><br />We believe in rigorous, open, cross-institution work on how to build machine learning systems that work as intended. We&#8217;re eager to continue our collaborations with other research groups to make positive progress on AI.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Chris Olah, Google Research</span><br /><br />We believe that AI technologies are likely to be overwhelmingly useful and beneficial for humanity. But part of being a responsible steward of any new technology is thinking through potential challenges and how best to address any associated risks. So today we’re publishing a technical paper, <i><a href="https://arxiv.org/abs/1606.06565">Concrete Problems in AI Safety</a></i>, a collaboration among scientists at Google, OpenAI, Stanford and Berkeley.<br /><br />While possible AI safety risks have received a lot of public attention, most previous discussion has been very hypothetical and speculative. We believe it’s essential to ground concerns in real machine learning research, and to start developing practical approaches for engineering AI systems that operate safely and reliably.<br /><br />We’ve outlined five problems we think will be very important as we apply AI in more general circumstances. These are all forward thinking, long-term research questions -- minor issues today, but important to address for future systems:<br /><br /><ul><li><b>Avoiding Negative Side Effects:</b> How can we ensure that an AI system will not disturb its environment in negative ways while pursuing its goals, e.g. a cleaning robot knocking over a vase because it can clean faster by doing so?</li><li><b>Avoiding Reward Hacking:</b> How can we avoid gaming of the reward function? For example, we don’t want this cleaning robot simply covering over messes with materials it can’t see through.</li><li><b>Scalable Oversight:</b> How can we efficiently ensure that a given AI system respects aspects of the objective that are too expensive to be frequently evaluated during training? 
For example, if an AI system gets human feedback as it performs a task, it needs to use that feedback efficiently because asking too often would be annoying.</li><li><b>Safe Exploration:</b> How do we ensure that an AI system doesn’t make exploratory moves with very negative repercussions? For example, maybe a cleaning robot should experiment with mopping strategies, but clearly it shouldn’t try putting a wet mop in an electrical outlet.</li><li><b>Robustness to Distributional Shift:</b> How do we ensure that an AI system recognizes, and behaves robustly, when it’s in an environment very different from its training environment? For example, heuristics learned for a factory floor may not be safe enough for an office.</li></ul><br />We go into more technical detail <a href="https://arxiv.org/abs/1606.06565">in the paper</a>. The machine learning research community has already thought quite a bit about most of these problems and many related issues, but we think there’s a lot more work to be done.<br /><br />We believe in rigorous, open, cross-institution work on how to build machine learning systems that work as intended. We’re eager to continue our collaborations with other research groups to make positive progress on AI.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/bringing-precision-to-the-ai-safety-discussion/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ICML 2016 &amp; Research at Google</title>
		<link>https://googledata.org/google-research/icml-2016-research-at-google/</link>
		<comments>https://googledata.org/google-research/icml-2016-research-at-google/#comments</comments>
		<pubDate>Mon, 20 Jun 2016 12:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=a415bb83dfe2d526946759fd44613550</guid>
		<description><![CDATA[<span>Posted by Afshin Rostamizadeh, Research Scientist</span><br /><br />This week, New York hosts the <a href="http://icml.cc/2016/">2016 International Conference on Machine Learning</a> (ICML 2016), a premier annual Machine Learning event supported by the <a href="http://www.machinelearning.org/">International Machine Learning Society</a> (IMLS). Machine Learning is a key focus area at Google, with highly active research groups exploring virtually all aspects of the field, including deep learning and more classical algorithms. <br /><br />We work on an extremely wide variety of machine learning problems that arise from a broad range of applications at Google. One particularly important setting is that of large-scale learning, where we utilize scalable tools and architectures to build machine learning systems that work with large volumes of data that often preclude the use of standard single-machine training algorithms. In doing so, we are able to solve deep scientific problems and engineering challenges, exploring theory as well as application, in areas of language, speech, translation, music, visual processing and more.<br /><br />As Gold Sponsor, Google has a strong presence at ICML 2016 with many Googlers publishing their research and hosting workshops. If you&#8217;re attending, we hope you&#8217;ll visit the Google booth and talk with our researchers to learn more about the exciting work, creativity and fun that goes into solving interesting ML problems that impact millions of people. 
You can also learn more about our research being presented at ICML 2016 in the list below (Googlers highlighted in <span>blue</span>).<br /><br /><b><u>ICML 2016 Organizing Committee</u></b><br />Area Chairs include: <i><span>Corinna Cortes</span>, <span>John Blitzer</span>, <span>Maya Gupta</span>, <span>Moritz Hardt</span>, <span>Samy Bengio</span></i><br /><br /><b><u>IMLS</u></b><br />Board Members include: <i><span>Corinna Cortes</span></i><br /><br /><b><u>Accepted Papers</u></b><br /><a href="http://jmlr.org/proceedings/papers/v48/cisse16.html">ADIOS: Architectures Deep In Output Space</a><br /><i>Moustapha Cisse, Maruan Al-Shedivat, <span>Samy Bengio</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/danihelka16.html">Associative Long Short-Term Memory</a><br /><i><span>Ivo Danihelka (Google DeepMind)</span>, <span>Greg Wayne&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, <span>Benigno Uria&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, <span>Nal Kalchbrenner&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, <span>Alex Graves&#160;</span></i><i><span>(Google DeepMind)</span></i><br /><a href="http://jmlr.org/proceedings/papers/v48/mniha16.html"><br /></a> <a href="http://jmlr.org/proceedings/papers/v48/mniha16.html">Asynchronous Methods for Deep Reinforcement Learning</a><br /><i><span>Volodymyr Mnih</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, <span>Adria Puigdomenech Badia</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, Mehdi Mirza, <span>Alex Graves</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, <span>Timothy Lillicrap</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, <span>Tim Harley</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, <span>David Silver</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, <span>Koray 
Kavukcuoglu</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/choromanska16.html">Binary embeddings with structured hashed projections</a><br /><i>Anna Choromanska, <span>Krzysztof Choromanski</span>, Mariusz Bojarski, Tony Jebara, <span>Sanjiv Kumar</span>, Yann LeCun</i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/kairouz16.html">Discrete Distribution Estimation Under Local Privacy</a><br /><i>Peter Kairouz, <span>Keith Bonawitz</span>, <span>Daniel Ramage</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/wangf16.html">Dueling Network Architectures for Deep Reinforcement Learning</a> <b><i>(Best Paper Award recipient)</i></b><br /><i><span>Ziyu Wang</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Nando de Freitas</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Tom Schaul</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Matteo Hessel</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Hado van Hasselt</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Marc Lanctot</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/dieleman16.html">Exploiting Cyclic Symmetry in Convolutional Neural Networks</a><br /><i><span>Sander Dieleman</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Jeffrey De Fauw</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Koray Kavukcuoglu</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/mirzasoleiman16.html">Fast Constrained Submodular Maximization: 
Personalized Data Summarization</a><br /><i>Baharan Mirzasoleiman, <span>Ashwinkumar Badanidiyuru</span>, Amin Karbasi</i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/altschuler16.html">Greedy Column Subset Selection: New Bounds and Distributed Algorithms</a><br /><i>Jason Altschuler, Aditya Bhaskara, <span>Gang Fu</span></i><i>,</i><i><span>&#160;Vahab Mirrokni</span></i><i>,</i><i><span>&#160;Afshin Rostamizadeh</span></i><i>,</i><i><span>&#160;Morteza Zadimoghaddam</span></i><br /><a href="http://jmlr.org/proceedings/papers/v48/lucic16.html"><br /></a> <a href="http://jmlr.org/proceedings/papers/v48/lucic16.html">Horizontally Scalable Submodular Maximization</a><br /><i>Mario Lucic, Olivier Bachem, <span>Morteza Zadimoghaddam</span>, Andreas Krause</i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/gu16.html">Continuous Deep Q-Learning with Model-based Acceleration</a><br /><i>Shixiang Gu, <span>Timothy Lillicrap</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Ilya Sutskever</span></i><i>,</i><i><span>&#160;Sergey Levine</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/santoro16.html">Meta-Learning with Memory-Augmented Neural Networks</a><br /><i><span>Adam Santoro</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, Sergey Bartunov,<span> Matthew Botvinick</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Daan Wierstra</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Timothy Lillicrap</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/rezende16.html">One-Shot Generalization in Deep Generative Models</a><br /><i><span>Danilo Rezende</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Shakir 
Mohamed</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Daan Wierstra</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/oord16.html">Pixel Recurrent Neural Networks</a> <b><i>(Best Paper Award recipient)</i></b><br /><i><span>Aaron Van den Oord</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Nal Kalchbrenner</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>,</i><i><span>&#160;Koray Kavukcuoglu</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/heidari16.html">Pricing a low-regret seller</a><br /><i>Hoda Heidari, <span>Mohammad Mahdian</span></i><i>,</i><i><span>&#160;Umar Syed</span></i><i>,</i><i><span>&#160;Sergei Vassilvitskii</span>, Sadra Yazdanbod</i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/dunner16.html">Primal-Dual Rates and Certificates</a><br /><i>Celestine D&#252;nner, <span>Simone Forte</span>, Martin Takac, Martin Jaggi</i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/schnabel16.html">Recommendations as Treatments: Debiasing Learning and Evaluation</a><br /><i>Tobias Schnabel, Thorsten Joachims, Adith Swaminathan, Ashudeep Singh, <span>Navin Chandak</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/choromanski16.html">Recycling Randomness with Structure for Sublinear Time Kernel Expansions</a><br /><i><span>Krzysztof Choromanski</span>, <span>Vikas Sindhwani</span></i><br /><a href="http://jmlr.org/proceedings/papers/v48/hardt16.html"><br /></a> <a href="http://jmlr.org/proceedings/papers/v48/hardt16.html">Train faster, generalize better: Stability of stochastic gradient descent</a><br /><i><span>Moritz Hardt</span>, Ben Recht, <span>Yoram Singer</span></i><br /><br /><a 
href="http://jmlr.org/proceedings/papers/v48/mnihb16.html">Variational Inference for Monte Carlo Objectives</a><br /><i><span>Andriy Mnih</span>&#160;</i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, <span>Danilo Rezende</span>&#160;</i><i><span>(Google DeepMind)</span></i><br /><br /><b><u>Workshops</u></b><br /><a href="http://rlabstraction2016.wix.com/icml">Abstraction in Reinforcement Learning</a><br />Organizing Committee: <i>Daniel Mankowitz, <span>Timothy Mann</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, Shie Mannor</i><br />Invited Speaker: <i><span>David Silver</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><br /><br /><a href="https://sites.google.com/site/dlworkshop16/">Deep Learning Workshop</a><br />Organizers: <i>Antoine Bordes, Kyunghyun Cho, Emily Denton, <span>Nando de Freitas</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, Rob Fergus</i><br />Invited Speaker: <i><span>Raia Hadsell</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><br /><br /><a href="https://sites.google.com/site/nnb2tf/">Neural Networks Back To The Future</a><br />Organizers: <i>L&#233;on Bottou, David Grangier, Tomas Mikolov, <span>John Platt</span></i><br /><br /><a href="https://sites.google.com/site/dataefficientml/">Data-Efficient Machine Learning</a><br />Organizers: <i>Marc Deisenroth, <span>Shakir Mohamed</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><i>, Finale Doshi-Velez, Andreas Krause, Max Welling</i><br /><br /><a href="https://sites.google.com/site/ondeviceintelligence/icml2016">On-Device Intelligence</a><br />Organizers: <i><span>Vikas Sindhwani</span></i><i>,</i><i><span>&#160;Daniel Ramage</span></i><i>,</i><i><span>&#160;Keith Bonawitz</span>, Suyog Gupta, Sachin Talathi</i><br />Invited Speakers: <i><span>Hartwig Adam</span>, <span>H. 
Brendan McMahan</span></i><br /><br /><a href="https://sites.google.com/site/admlsystemsworkshop/">Online Advertising Systems</a><br />Organizing Committee: <i>Sharat Chikkerur, <span>Hossein Azari</span>, Edoardo Airoldi</i><br />Opening Remarks: <i><span>Hossein Azari</span></i><br />Invited Speakers: <i><span>Martin P&#225;l</span>, <span>Todd Phillips</span></i><br /><br /><a href="https://sites.google.com/site/icmlworkshoponanomalydetection/">Anomaly Detection 2016</a><br />Organizing Committee: <i>Nico Goernitz, Marius Kloft, <span>Vitaly Kuznetsov</span></i><br /><br /><b><u>Tutorials</u></b><br /><a href="http://icml.cc/2016/?page_id=97">Deep Reinforcement Learning</a><br /><i><span>David Silver</span></i><i><span>&#160;</span></i><i><span>(Google DeepMind)</span></i><br /><a href="http://icml.cc/2016/?page_id=97"><br /></a> <a href="http://icml.cc/2016/?page_id=97">Rigorous Data Dredging: Theory and Tools for Adaptive Data Analysis</a><br /><i><span>Moritz Hardt</span>, Aaron Roth</i>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Afshin Rostamizadeh, Research Scientist</span><br /><br />This week, New York hosts the <a href="http://icml.cc/2016/">2016 International Conference on Machine Learning</a> (ICML 2016), a premier annual Machine Learning event supported by the <a href="http://www.machinelearning.org/">International Machine Learning Society</a> (IMLS). Machine Learning is a key focus area at Google, with highly active research groups exploring virtually all aspects of the field, including deep learning and more classical algorithms. <br /><br />We work on an extremely wide variety of machine learning problems that arise from a broad range of applications at Google. One particularly important setting is that of large-scale learning, where we utilize scalable tools and architectures to build machine learning systems that work with large volumes of data that often preclude the use of standard single-machine training algorithms. In doing so, we are able to solve deep scientific problems and engineering challenges, exploring theory as well as application, in areas of language, speech, translation, music, visual processing and more.<br /><br />As Gold Sponsor, Google has a strong presence at ICML 2016 with many Googlers publishing their research and hosting workshops. If you’re attending, we hope you’ll visit the Google booth and talk with our researchers to learn more about the exciting work, creativity and fun that goes into solving interesting ML problems that impact millions of people. 
You can also learn more about our research being presented at ICML 2016 in the list below (Googlers highlighted in <span style="color: #3d85c6;">blue</span>).<br /><br /><b><u>ICML 2016 Organizing Committee</u></b><br />Area Chairs include: <i><span style="color: #3d85c6;">Corinna Cortes</span>, <span style="color: #3d85c6;">John Blitzer</span>, <span style="color: #3d85c6;">Maya Gupta</span>, <span style="color: #3d85c6;">Moritz Hardt</span>, <span style="color: #3d85c6;">Samy Bengio</span></i><br /><br /><b><u>IMLS</u></b><br />Board Members include: <i><span style="color: #3d85c6;">Corinna Cortes</span></i><br /><br /><b><u>Accepted Papers</u></b><br /><a href="http://jmlr.org/proceedings/papers/v48/cisse16.html">ADIOS: Architectures Deep In Output Space</a><br /><i>Moustapha Cisse, Maruan Al-Shedivat, <span style="color: #3d85c6;">Samy Bengio</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/danihelka16.html">Associative Long Short-Term Memory</a><br /><i><span style="color: #3d85c6;">Ivo Danihelka (Google DeepMind)</span>, <span style="color: #3d85c6;">Greg Wayne&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, <span style="color: #3d85c6;">Benigno Uria&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, <span style="color: #3d85c6;">Nal Kalchbrenner&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, <span style="color: #3d85c6;">Alex Graves&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><br /><a href="http://jmlr.org/proceedings/papers/v48/mniha16.html"><br /></a> <a href="http://jmlr.org/proceedings/papers/v48/mniha16.html">Asynchronous Methods for Deep Reinforcement Learning</a><br /><i><span style="color: #3d85c6;">Volodymyr Mnih</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, <span style="color: #3d85c6;">Adria Puigdomenech 
Badia</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, Mehdi Mirza, <span style="color: #3d85c6;">Alex Graves</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, <span style="color: #3d85c6;">Timothy Lillicrap</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, <span style="color: #3d85c6;">Tim Harley</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, <span style="color: #3d85c6;">David Silver</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, <span style="color: #3d85c6;">Koray Kavukcuoglu</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/choromanska16.html">Binary embeddings with structured hashed projections</a><br /><i>Anna Choromanska, <span style="color: #3d85c6;">Krzysztof Choromanski</span>, Mariusz Bojarski, Tony Jebara, <span style="color: #3d85c6;">Sanjiv Kumar</span>, Yann LeCun</i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/kairouz16.html">Discrete Distribution Estimation Under Local Privacy</a><br /><i>Peter Kairouz, <span style="color: #3d85c6;">Keith Bonawitz</span>, <span style="color: #3d85c6;">Daniel Ramage</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/wangf16.html">Dueling Network Architectures for Deep Reinforcement Learning</a> <b><i>(Best Paper Award recipient)</i></b><br /><i><span style="color: #3d85c6;">Ziyu Wang</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Nando de 
Freitas</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Tom Schaul</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Matteo Hessel</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Hado van Hasselt</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Marc Lanctot</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/dieleman16.html">Exploiting Cyclic Symmetry in Convolutional Neural Networks</a><br /><i><span style="color: #3d85c6;">Sander Dieleman</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Jeffrey De Fauw</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Koray Kavukcuoglu</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/mirzasoleiman16.html">Fast Constrained Submodular Maximization: Personalized Data Summarization</a><br /><i>Baharan Mirzasoleiman, <span style="color: #3d85c6;">Ashwinkumar Badanidiyuru</span>, Amin Karbasi</i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/altschuler16.html">Greedy Column Subset Selection: New Bounds and Distributed Algorithms</a><br /><i>Jason Altschuler, Aditya 
Bhaskara, <span style="color: #3d85c6;">Gang Fu</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Vahab Mirrokni</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Afshin Rostamizadeh</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Morteza Zadimoghaddam</span></i><br /><a href="http://jmlr.org/proceedings/papers/v48/lucic16.html"><br /></a> <a href="http://jmlr.org/proceedings/papers/v48/lucic16.html">Horizontally Scalable Submodular Maximization</a><br /><i>Mario Lucic, Olivier Bachem, <span style="color: #3d85c6;">Morteza Zadimoghaddam</span>, Andreas Krause</i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/gu16.html">Continuous Deep Q-Learning with Model-based Acceleration</a><br /><i>Shixiang Gu, <span style="color: #3d85c6;">Timothy Lillicrap</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Ilya Sutskever</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Sergey Levine</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/santoro16.html">Meta-Learning with Memory-Augmented Neural Networks</a><br /><i><span style="color: #3d85c6;">Adam Santoro</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, Sergey Bartunov,<span style="color: #3d85c6;"> Matthew Botvinick</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Daan Wierstra</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Timothy Lillicrap</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><br /><br /><a 
href="http://jmlr.org/proceedings/papers/v48/rezende16.html">One-Shot Generalization in Deep Generative Models</a><br /><i><span style="color: #3d85c6;">Danilo Rezende</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Shakir Mohamed</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Daan Wierstra</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/oord16.html">Pixel Recurrent Neural Networks</a> <b><i>(Best Paper Award recipient)</i></b><br /><i><span style="color: #3d85c6;">Aaron Van den Oord</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Nal Kalchbrenner</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Koray Kavukcuoglu</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/heidari16.html">Pricing a low-regret seller</a><br /><i>Hoda Heidari, <span style="color: #3d85c6;">Mohammad Mahdian</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Umar Syed</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Sergei Vassilvitskii</span>, Sadra Yazdanbod</i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/dunner16.html">Primal-Dual Rates and Certificates</a><br /><i>Celestine Dünner, <span style="color: #3d85c6;">Simone Forte</span>, Martin Takac, Martin Jaggi</i><br /><br /><a 
href="http://jmlr.org/proceedings/papers/v48/schnabel16.html">Recommendations as Treatments: Debiasing Learning and Evaluation</a><br /><i>Tobias Schnabel, Thorsten Joachims, Adith Swaminathan, Ashudeep Singh, <span style="color: #3d85c6;">Navin Chandak</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/choromanski16.html">Recycling Randomness with Structure for Sublinear Time Kernel Expansions</a><br /><i><span style="color: #3d85c6;">Krzysztof Choromanski</span>, <span style="color: #3d85c6;">Vikas Sindhwani</span></i><br /><a href="http://jmlr.org/proceedings/papers/v48/hardt16.html"><br /></a> <a href="http://jmlr.org/proceedings/papers/v48/hardt16.html">Train faster, generalize better: Stability of stochastic gradient descent</a><br /><i><span style="color: #3d85c6;">Moritz Hardt</span>, Ben Recht, <span style="color: #3d85c6;">Yoram Singer</span></i><br /><br /><a href="http://jmlr.org/proceedings/papers/v48/mnihb16.html">Variational Inference for Monte Carlo Objectives</a><br /><i><span style="color: #3d85c6;">Andriy Mnih</span>&nbsp;</i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, <span style="color: #3d85c6;">Danilo Rezende</span>&nbsp;</i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><br /><br /><b><u>Workshops</u></b><br /><a href="http://rlabstraction2016.wix.com/icml">Abstraction in Reinforcement Learning</a><br />Organizing Committee: <i>Daniel Mankowitz, <span style="color: #3d85c6;">Timothy Mann</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, Shie Mannor</i><br />Invited Speaker: <i><span style="color: #3d85c6;">David Silver</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><br /><br /><a href="https://sites.google.com/site/dlworkshop16/">Deep Learning Workshop</a><br />Organizers: <i>Antoine Bordes, 
Kyunghyun Cho, Emily Denton, <span style="color: #3d85c6;">Nando de Freitas</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, Rob Fergus</i><br />Invited Speaker: <i><span style="color: #3d85c6;">Raia Hadsell</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><br /><br /><a href="https://sites.google.com/site/nnb2tf/">Neural Networks Back To The Future</a><br />Organizers: <i>Léon Bottou, David Grangier, Tomas Mikolov, <span style="color: #3d85c6;">John Platt</span></i><br /><br /><a href="https://sites.google.com/site/dataefficientml/">Data-Efficient Machine Learning</a><br />Organizers: <i>Marc Deisenroth, <span style="color: #3d85c6;">Shakir Mohamed</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><i>, Finale Doshi-Velez, Andreas Krause, Max Welling</i><br /><br /><a href="https://sites.google.com/site/ondeviceintelligence/icml2016">On-Device Intelligence</a><br />Organizers: <i><span style="color: #3d85c6;">Vikas Sindhwani</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Daniel Ramage</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Keith Bonawitz</span>, Suyog Gupta, Sachin Talathi</i><br />Invited Speakers: <i><span style="color: #3d85c6;">Hartwig Adam</span>, <span style="color: #3d85c6;">H. 
Brendan McMahan</span></i><br /><br /><a href="https://sites.google.com/site/admlsystemsworkshop/">Online Advertising Systems</a><br />Organizing Committee: <i>Sharat Chikkerur, <span style="color: #3d85c6;">Hossein Azari</span>, Edoardo Airoldi</i><br />Opening Remarks: <i><span style="color: #3d85c6;">Hossein Azari</span></i><br />Invited Speakers: <i><span style="color: #3d85c6;">Martin Pál</span>, <span style="color: #3d85c6;">Todd Phillips</span></i><br /><br /><a href="https://sites.google.com/site/icmlworkshoponanomalydetection/">Anomaly Detection 2016</a><br />Organizing Committee: <i>Nico Goernitz, Marius Kloft, <span style="color: #3d85c6;">Vitaly Kuznetsov</span></i><br /><br /><b><u>Tutorials</u></b><br /><a href="http://icml.cc/2016/?page_id=97">Deep Reinforcement Learning</a><br /><i><span style="color: #3d85c6;">David Silver</span></i><i><span style="color: #3d85c6;">&nbsp;</span></i><i><span style="color: #3d85c6;">(Google DeepMind)</span></i><br /><a href="http://icml.cc/2016/?page_id=97"><br /></a> <a href="http://icml.cc/2016/?page_id=97">Rigorous Data Dredging: Theory and Tools for Adaptive Data Analysis</a><br /><i><span style="color: #3d85c6;">Moritz Hardt</span>, Aaron Roth</i>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/icml-2016-research-at-google/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Announcing Google Research, Europe</title>
		<link>https://googledata.org/google-research/announcing-google-research-europe/</link>
		<comments>https://googledata.org/google-research/announcing-google-research-europe/#comments</comments>
		<pubDate>Thu, 16 Jun 2016 09:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=dd343867a31ef414b59133c5c30cdefb</guid>
		<description><![CDATA[<span>Posted by Emmanuel Mogenet, Head of Google Research, Europe</span><br /><br />Google&#8217;s ongoing research in Machine Intelligence is what powers many of the products being used by hundreds of millions of people a day - from <a href="https://research.googleblog.com/2015/07/how-google-translate-squeezes-deep.html">Translate</a> to <a href="https://research.googleblog.com/2013/06/improving-photo-search-step-across.html">Photo Search</a> to <a href="https://research.googleblog.com/2015/11/computer-respond-to-this-email.html">Smart Reply for Inbox</a>. One of the things that enables these advances is the extensive collaboration between the Google researchers in our offices across the world, all contributing their unique knowledge and disseminating ideas in state-of-the-art Machine Learning (ML) technologies and techniques in order to develop useful tools and products.<br /><br />Today, we&#8217;re excited to announce a dedicated Machine Learning research group in Europe, based in our <a href="https://picasaweb.google.com/115256683569729121957/ZurichOfficePhotos">Zurich office</a>. 
Google Research, Europe, will foster an environment where software engineers and researchers specialising in ML will have the opportunity to develop products and conduct research right here in Europe, as part of the wider efforts at Google.<br /><div><a href="https://2.bp.blogspot.com/-VwZx6LYgkCw/V2GCpt6Q2OI/AAAAAAAABEc/HH9OBN25gDINnNdmLE2R1vAOSxFo9YKSgCLcB/s1600/image00.png"><img border="0" height="360" src="https://2.bp.blogspot.com/-VwZx6LYgkCw/V2GCpt6Q2OI/AAAAAAAABEc/HH9OBN25gDINnNdmLE2R1vAOSxFo9YKSgCLcB/s640/image00.png" width="640"></a></div>Zurich is already the home of Google&#8217;s largest engineering office outside the US, and is responsible for developing the engine that powers <a href="https://www.google.com/intl/es419/insidesearch/features/search/knowledge.html">Knowledge Graph</a>, as well as the conversation engine that powers the <a href="https://googleblog.blogspot.com/2016/05/allo-duo-apps-messaging-video.html">Google Assistant in Allo</a>. In addition to continued collaboration with Google&#8217;s various research teams, Google Research, Europe will be focused on three key areas:<br /><ul><li><a href="http://research.google.com/pubs/MachineIntelligence.html">Machine Intelligence</a></li><li><a href="http://research.google.com/pubs/NaturalLanguageProcessing.html">Natural Language Processing &#38; Understanding</a></li><li><a href="http://research.google.com/pubs/MachinePerception.html">Machine Perception</a></li></ul>In pursuit of these areas, the team will actively research ways in which to improve ML infrastructure, broadly facilitating research for the community, and enabling it to be put to practical use. Furthermore, researchers in the Zurich office will be uniquely able to work closely with team linguists, advancing Natural Language Understanding in collaboration with Google Research groups across the world, all while enjoying Mountain Views of a different kind. 
<br /><div><a href="https://4.bp.blogspot.com/-9o1md_jJjxw/V2GCurvn8yI/AAAAAAAABEk/3PMBKsbjIRMIjktqxLBmw-GCjQmyFiDaACLcB/s1600/image01.png"><img border="0" height="425" src="https://4.bp.blogspot.com/-9o1md_jJjxw/V2GCurvn8yI/AAAAAAAABEk/3PMBKsbjIRMIjktqxLBmw-GCjQmyFiDaACLcB/s640/image01.png" width="640"></a></div>Europe is home to some of the world&#8217;s premier technical universities, making it an ideal place to build a top-notch research team. We look forward to collaborating with all the excellent Computer Science research that is coming from the region, and hope to contribute towards the wider academic community through <a href="http://research.google.com/pubs/papers.html">our publications</a> and <a href="http://research.google.com/research-outreach.html#/research-outreach">academic support</a>.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Emmanuel Mogenet, Head of Google Research, Europe</span><br /><br />Google’s ongoing research in Machine Intelligence is what powers many of the products being used by hundreds of millions of people a day - from <a href="https://research.googleblog.com/2015/07/how-google-translate-squeezes-deep.html">Translate</a> to <a href="https://research.googleblog.com/2013/06/improving-photo-search-step-across.html">Photo Search</a> to <a href="https://research.googleblog.com/2015/11/computer-respond-to-this-email.html">Smart Reply for Inbox</a>. One of the things that enables these advances is the extensive collaboration between the Google researchers in our offices across the world, all contributing their unique knowledge and disseminating ideas in state-of-the-art Machine Learning (ML) technologies and techniques in order to develop useful tools and products.<br /><br />Today, we’re excited to announce a dedicated Machine Learning research group in Europe, based in our <a href="https://picasaweb.google.com/115256683569729121957/ZurichOfficePhotos">Zurich office</a>. 
Google Research, Europe, will foster an environment where software engineers and researchers specialising in ML will have the opportunity to develop products and conduct research right here in Europe, as part of the wider efforts at Google.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-VwZx6LYgkCw/V2GCpt6Q2OI/AAAAAAAABEc/HH9OBN25gDINnNdmLE2R1vAOSxFo9YKSgCLcB/s1600/image00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="360" src="https://2.bp.blogspot.com/-VwZx6LYgkCw/V2GCpt6Q2OI/AAAAAAAABEc/HH9OBN25gDINnNdmLE2R1vAOSxFo9YKSgCLcB/s640/image00.png" width="640" /></a></div>Zurich is already the home of Google’s largest engineering office outside the US, and is responsible for developing the engine that powers <a href="https://www.google.com/intl/es419/insidesearch/features/search/knowledge.html">Knowledge Graph</a>, as well as the conversation engine that powers the <a href="https://googleblog.blogspot.com/2016/05/allo-duo-apps-messaging-video.html">Google Assistant in Allo</a>. In addition to continued collaboration with Google’s various research teams, Google Research, Europe will be focused on three key areas:<br /><ul><li><a href="http://research.google.com/pubs/MachineIntelligence.html">Machine Intelligence</a></li><li><a href="http://research.google.com/pubs/NaturalLanguageProcessing.html">Natural Language Processing &amp; Understanding</a></li><li><a href="http://research.google.com/pubs/MachinePerception.html">Machine Perception</a></li></ul>In pursuit of these areas, the team will actively research ways in which to improve ML infrastructure, broadly facilitating research for the community, and enabling it to be put to practical use. 
Furthermore, researchers in the Zurich office will be uniquely able to work closely with team linguists, advancing Natural Language Understanding in collaboration with Google Research groups across the world, all while enjoying Mountain Views of a different kind. <br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-9o1md_jJjxw/V2GCurvn8yI/AAAAAAAABEk/3PMBKsbjIRMIjktqxLBmw-GCjQmyFiDaACLcB/s1600/image01.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="425" src="https://4.bp.blogspot.com/-9o1md_jJjxw/V2GCurvn8yI/AAAAAAAABEk/3PMBKsbjIRMIjktqxLBmw-GCjQmyFiDaACLcB/s640/image01.png" width="640" /></a></div>Europe is home to some of the world’s premier technical universities, making it an ideal place to build a top-notch research team. We look forward to collaborating with all the excellent Computer Science research that is coming from the region, and hope to contribute towards the wider academic community through <a href="http://research.google.com/pubs/papers.html">our publications</a> and <a href="http://research.google.com/research-outreach.html#/research-outreach">academic support</a>.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/announcing-google-research-europe/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Quantum annealing with a digital twist</title>
		<link>https://googledata.org/google-research/quantum-annealing-with-a-digital-twist/</link>
		<comments>https://googledata.org/google-research/quantum-annealing-with-a-digital-twist/#comments</comments>
		<pubDate>Wed, 08 Jun 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=5702367a12adc5edc82f6ec92cab80bd</guid>
		<description><![CDATA[<span>Posted by Rami Barends and Alireza Shabani, Quantum Electronics Engineers</span><br /><br />One of the key benefits of quantum computing is that it has the potential to solve some of the most complex problems in nature, from physics to chemistry to biology. For example, when attempting to <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2443096/">calculate protein folding</a>, or when exploring <a href="https://www.youtube.com/watch?v=98wILB5sZ5w">reaction catalysts and &#8220;designer&#8221; molecules</a>, one can look at computational challenges as optimization problems, and represent the different configurations of a molecule as an energy landscape in a quantum computer.  By letting the system cool, or &#8220;<a href="https://en.wikipedia.org/wiki/Quantum_annealing">anneal</a>&#8221;,  one finds the lowest energy state in the landscape - the most stable form of the molecule. Thanks to the peculiarities of quantum mechanics, the correct answer simply drops out at the end of the quantum computation. In fact, many tough problems can be dealt with this way; this combination of simplicity and generality makes it appealing.<br /><br />But finding the lowest energy state in a system is like being put in the Alps, and being told to find the lowest elevation - it&#8217;s easy to get stuck in a &#8220;local&#8221; valley, and not know that there is an even lower point elsewhere. Therefore, we use a different approach: We start with a very simple energy landscape - a flat meadow - and initialize the system of quantum bits (qubits) to represent the known lowest energy point, or &#8220;ground state&#8221;, in that landscape. We then begin to adjust the simple landscape towards one that represents the problem we are trying to solve - from the smooth meadow to the highly uneven terrain of the Alps. 
Here&#8217;s the fun part: if one evolves the landscape very slowly, the ground state of the qubits also evolves, so that they stay in the ground state of the changing system. This is called &#8220;<a href="https://en.wikipedia.org/wiki/Adiabatic_quantum_computation">adiabatic quantum computing</a>&#8221;, and qubits exploit quantum tunneling to ensure they always find the lowest energy "valley" in the changing system. <br /><br />While this is great in theory, getting this to work in practice is challenging, as you have to set up the energy landscape using the available qubit interactions. Ideally you&#8217;d have multiple interactions going on between all of the qubits, but for a large-scale solver the requirements to accurately keep track of these interactions become enormous. Realistically, the connectivity has to be reduced, but this presents a major limitation for the computational possibilities.<br /><br />In "<a href="http://www.nature.com/nature/journal/v534/n7606/full/nature17658.html">Digitized adiabatic quantum computing with a superconducting circuit</a>", published in <a href="http://www.nature.com/index.html">Nature</a>, we&#8217;ve overcome this obstacle by giving quantum annealing a digital twist. With a limited connectivity between qubits you can still construct any of the desired interactions: Whether the interaction is ferromagnetic (the quantum bits prefer an aligned orientation) or antiferromagnetic (an anti-aligned orientation), or even defined along an arbitrary different direction, you can make it happen using easy-to-combine discrete building blocks. In this case, the blocks we use are the logic gates that we've been developing with <a href="http://googleresearch.blogspot.com/2015/03/a-step-closer-to-quantum-computation.html">our superconducting architecture</a>. 
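The digitized evolution described above can be illustrated with a minimal single-qubit toy in Python (this is only a sketch of the idea, not the paper's nine-qubit circuit; the schedule, step count, and total time are invented for illustration). The landscape interpolates from a simple driver Hamiltonian -X, whose ground state we can prepare, to a target -Z, and each digital step applies the exact short-time propagator of the instantaneous Hamiltonian:

```python
import math

def step(c0, c1, ax, az, dt):
    # One digital step: apply exp(-i*(ax*X + az*Z)*dt) to the state (c0, c1),
    # using exp(-i*t*(n.sigma)) = cos(|n|t)*I - i*sin(|n|t)*(n.sigma)/|n|.
    a = math.sqrt(ax * ax + az * az)
    cos_t, sin_t = math.cos(a * dt), math.sin(a * dt)
    h0 = (az * c0 + ax * c1) / a   # component 0 of the normalized H|psi>
    h1 = (ax * c0 - az * c1) / a   # component 1 of the normalized H|psi>
    return cos_t * c0 - 1j * sin_t * h0, cos_t * c1 - 1j * sin_t * h1

def digitized_anneal(steps=5000, total_time=50.0):
    # Interpolate H(s) = -(1-s)*X - s*Z from the "flat meadow" (-X, ground
    # state |+>) to the target "landscape" (-Z, ground state |0>).
    dt = total_time / steps
    c0 = c1 = 1 / math.sqrt(2)          # start in |+>, the ground state of -X
    for k in range(steps):
        s = (k + 0.5) / steps           # annealing schedule, s goes 0 -> 1
        c0, c1 = step(c0, c1, -(1 - s), -s, dt)
    return abs(c0) ** 2                 # overlap with the target ground state |0>
```

Evolving slowly (large `total_time`) keeps the qubit in the instantaneous ground state, so the final overlap is close to 1; shrinking `total_time` toward zero breaks adiabaticity and the overlap drops toward the initial value of one half.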
<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-FT9Wm1KLUQs/V1c0dBvWYFI/AAAAAAAABEA/TL82c7WzUUYe0j6tSqL44mJM-uFXUr3ugCLcB/s1600/image03.jpg"><img border="0" height="406" src="https://3.bp.blogspot.com/-FT9Wm1KLUQs/V1c0dBvWYFI/AAAAAAAABEA/TL82c7WzUUYe0j6tSqL44mJM-uFXUr3ugCLcB/s640/image03.jpg" width="640"></a></td></tr><tr><td>Superconducting quantum chip with nine qubits. Each qubit (cross-shaped structures in the center) is connected to its neighbors and individually controlled. Photo credit: Julian Kelly.</td></tr></tbody></table>The key is controllability. Qubits, like other physical objects in nature, have a <a href="https://en.wikipedia.org/wiki/Resonance">resonance frequency</a>, and can be addressed individually with short voltage and current pulses. In our architecture we can steer this frequency, much like you would tune a radio to a broadcast.  We can even tune one qubit to the frequency of another one. By moving qubit frequencies to or away from each other, interactions can be turned on or off. The exchange of quantum information resembles a relay race, where the baton can be handed down when the runners meet.<br /><br />You can see the algorithm in action below. Any problem is encoded as local &#8220;directions&#8221; we want qubits to point to - like a weathervane pointing into the wind - and interactions, depicted here as links between the balls. We start by aligning all qubits into the same direction, and the interactions between the qubits turned off - this is the simplest ground state of the system. Next, we turn on interactions and change qubit directions to start evolving towards the energy landscape we wish to solve. 
The algorithmic steps are implemented with many control pulses, illustrating how the problem gets solved in a giant dance of <a href="https://en.wikipedia.org/wiki/Quantum_entanglement">quantum entanglement</a>.<br /><div></div><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td>Top: Depiction of the problem, with the gold arrows in the blue balls representing the directions we&#8217;d like each qubit to align to, like a weathervane pointing to the wind. The thickness of the link between the balls indicates the strength of the interaction - red denotes a ferromagnetic link, and blue an antiferromagnetic link. Middle: Implementation with qubits (yellow crosses) with control pulses (red) and steering the frequency (vertical direction). Qubits turn blue when there is interaction. The qubits turn green when they are being measured. Bottom: Zoom in of the physical device, showing the corresponding nine qubits (cross-shaped).</td></tr></tbody></table>To run the adiabatic quantum computation efficiently and design a set of test experiments we teamed up with the <a href="http://www.qutisgroup.com/">QUTIS group</a> at the University of the Basque Country in Bilbao, Spain, led by <a href="http://www.qutisgroup.com/prof-enrique-solano/">Prof. E. Solano</a> and <a href="http://www.qutisgroup.com/dr-lucas-lamata">Dr. L. Lamata</a>, who are experts in synthesizing digital algorithms. It&#8217;s the largest digital algorithm to date, with up to nine qubits and using over one thousand logic gates.<br /><br />The crucial advantage for the future is that this digital implementation is fully compatible with known <a href="http://googleresearch.blogspot.com/2015/03/a-step-closer-to-quantum-computation.html">quantum error correction techniques</a>, and can therefore be protected from the effects of noise. Otherwise, the noise will set a hard limit, as even the slightest amount can derail the state from following the fragile path to the solution. 
Since each quantum bit and interaction element can add noise to the system, some of the most important problems are well beyond reach, as they have many degrees of freedom and need a high connectivity. But with error correction, this approach becomes a general-purpose algorithm which can be scaled to an arbitrarily large quantum computer.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Rami Barends and Alireza Shabani, Quantum Electronics Engineers</span><br /><br />One of the key benefits of quantum computing is that it has the potential to solve some of the most complex problems in nature, from physics to chemistry to biology. For example, when attempting to <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2443096/">calculate protein folding</a>, or when exploring <a href="https://www.youtube.com/watch?v=98wILB5sZ5w">reaction catalysts and “designer” molecules</a>, one can look at computational challenges as optimization problems, and represent the different configurations of a molecule as an energy landscape in a quantum computer.  By letting the system cool, or “<a href="https://en.wikipedia.org/wiki/Quantum_annealing">anneal</a>”,  one finds the lowest energy state in the landscape - the most stable form of the molecule. Thanks to the peculiarities of quantum mechanics, the correct answer simply drops out at the end of the quantum computation. In fact, many tough problems can be dealt with this way; this combination of simplicity and generality makes it appealing.<br /><br />But finding the lowest energy state in a system is like being put in the Alps, and being told to find the lowest elevation - it’s easy to get stuck in a “local” valley, and not know that there is an even lower point elsewhere. Therefore, we use a different approach: We start with a very simple energy landscape - a flat meadow - and initialize the system of quantum bits (qubits) to represent the known lowest energy point, or “ground state”, in that landscape. We then begin to adjust the simple landscape towards one that represents the problem we are trying to solve - from the smooth meadow to the highly uneven terrain of the Alps. Here’s the fun part: if one evolves the landscape very slowly, the ground state of the qubits also evolves, so that they stay in the ground state of the changing system. 
This is called “<a href="https://en.wikipedia.org/wiki/Adiabatic_quantum_computation">adiabatic quantum computing</a>”, and qubits exploit quantum tunneling to ensure they always find the lowest energy "valley" in the changing system. <br /><br />While this is great in theory, getting this to work in practice is challenging, as you have to set up the energy landscape using the available qubit interactions. Ideally you’d have multiple interactions going on between all of the qubits, but for a large-scale solver the requirements to accurately keep track of these interactions become enormous. Realistically, the connectivity has to be reduced, but this presents a major limitation for the computational possibilities.<br /><br />In "<a href="http://www.nature.com/nature/journal/v534/n7606/full/nature17658.html">Digitized adiabatic quantum computing with a superconducting circuit</a>", published in <a href="http://www.nature.com/index.html">Nature</a>, we’ve overcome this obstacle by giving quantum annealing a digital twist. With a limited connectivity between qubits you can still construct any of the desired interactions: Whether the interaction is ferromagnetic (the quantum bits prefer an aligned orientation) or antiferromagnetic (an anti-aligned orientation), or even defined along an arbitrary different direction, you can make it happen using easy-to-combine discrete building blocks. In this case, the blocks we use are the logic gates that we've been developing with <a href="http://googleresearch.blogspot.com/2015/03/a-step-closer-to-quantum-computation.html">our superconducting architecture</a>. 
<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-FT9Wm1KLUQs/V1c0dBvWYFI/AAAAAAAABEA/TL82c7WzUUYe0j6tSqL44mJM-uFXUr3ugCLcB/s1600/image03.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="406" src="https://3.bp.blogspot.com/-FT9Wm1KLUQs/V1c0dBvWYFI/AAAAAAAABEA/TL82c7WzUUYe0j6tSqL44mJM-uFXUr3ugCLcB/s640/image03.jpg" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Superconducting quantum chip with nine qubits. Each qubit (cross-shaped structures in the center) is connected to its neighbors and individually controlled. Photo credit: Julian Kelly.</td></tr></tbody></table>The key is controllability. Qubits, like other physical objects in nature, have a <a href="https://en.wikipedia.org/wiki/Resonance">resonance frequency</a>, and can be addressed individually with short voltage and current pulses. In our architecture we can steer this frequency, much like you would tune a radio to a broadcast.  We can even tune one qubit to the frequency of another one. By moving qubit frequencies to or away from each other, interactions can be turned on or off. The exchange of quantum information resembles a relay race, where the baton can be handed down when the runners meet.<br /><br />You can see the algorithm in action below. Any problem is encoded as local “directions” we want qubits to point to - like a weathervane pointing into the wind - and interactions, depicted here as links between the balls. We start by aligning all qubits into the same direction, and the interactions between the qubits turned off - this is the simplest ground state of the system. Next, we turn on interactions and change qubit directions to start evolving towards the energy landscape we wish to solve. 
The algorithmic steps are implemented with many control pulses, illustrating how the problem gets solved in a giant dance of <a href="https://en.wikipedia.org/wiki/Quantum_entanglement">quantum entanglement</a>.<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/qpKy6jIhRCo/0.jpg" frameborder="0" height="320" src="https://www.youtube.com/embed/qpKy6jIhRCo?rel=0&amp;feature=player_embedded" width="640"></iframe></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td class="tr-caption" style="text-align: center;">Top: Depiction of the problem, with the gold arrows in the blue balls representing the directions we’d like each qubit to align to, like a weathervane pointing to the wind. The thickness of the link between the balls indicates the strength of the interaction - red denotes a ferromagnetic link, and blue an antiferromagnetic link. Middle: Implementation with qubits (yellow crosses) with control pulses (red) and steering the frequency (vertical direction). Qubits turn blue when there is interaction. The qubits turn green when they are being measured. Bottom: Zoom in of the physical device, showing the corresponding nine qubits (cross-shaped).</td></tr></tbody></table>To run the adiabatic quantum computation efficiently and design a set of test experiments we teamed up with the <a href="http://www.qutisgroup.com/">QUTIS group</a> at the University of the Basque Country in Bilbao, Spain, led by <a href="http://www.qutisgroup.com/prof-enrique-solano/">Prof. E. Solano</a> and <a href="http://www.qutisgroup.com/dr-lucas-lamata">Dr. L. Lamata</a>, who are experts in synthesizing digital algorithms. 
It’s the largest digital algorithm to date, with up to nine qubits and using over one thousand logic gates.<br /><br />The crucial advantage for the future is that this digital implementation is fully compatible with known <a href="http://googleresearch.blogspot.com/2015/03/a-step-closer-to-quantum-computation.html">quantum error correction techniques</a>, and can therefore be protected from the effects of noise. Otherwise, the noise will set a hard limit, as even the slightest amount can derail the state from following the fragile path to the solution. Since each quantum bit and interaction element can add noise to the system, some of the most important problems are well beyond reach, as they have many degrees of freedom and need a high connectivity. But with error correction, this approach becomes a general-purpose algorithm which can be scaled to an arbitrarily large quantum computer.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/quantum-annealing-with-a-digital-twist/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Motion Stills – Create beautiful GIFs from Live Photos</title>
		<link>https://googledata.org/google-research/motion-stills-create-beautiful-gifs-from-live-photos/</link>
		<comments>https://googledata.org/google-research/motion-stills-create-beautiful-gifs-from-live-photos/#comments</comments>
		<pubDate>Tue, 07 Jun 2016 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=9277fd67e8960677f3aee8e7fe74252f</guid>
		<description><![CDATA[<span>Posted by Ken Conley and Matthias Grundmann, Machine Perception</span><br /><br />Today we are releasing <a href="https://itunes.apple.com/us/app/motion-stills-create-live/id1086172168?ls=1&#38;mt=8">Motion Stills</a>, an iOS app from Google Research that acts as a virtual camera operator for your <a href="http://www.apple.com/ios/photos/">Apple Live Photos</a>. We use our <a href="https://research.googleblog.com/2012/05/video-stabilization-on-youtube.html">video stabilization</a> technology to freeze the background into a still photo or create sweeping cinematic pans. The resulting looping GIFs and movies come alive, and can easily be shared via messaging or on social media.<br /><div><a href="https://4.bp.blogspot.com/-kcbkeaAqbfQ/V1bwLbBXX2I/AAAAAAAABCc/OnA4JVjUVRkr1wQVNo9pIVXPnDp9E2fLwCLcB/s1600/image03.gif"><img border="0" height="640" src="https://4.bp.blogspot.com/-kcbkeaAqbfQ/V1bwLbBXX2I/AAAAAAAABCc/OnA4JVjUVRkr1wQVNo9pIVXPnDp9E2fLwCLcB/s640/image03.gif" width="312"></a></div>With Motion Stills, we provide an immersive stream experience that makes your clips fun to watch and share. You can also tell stories of your adventures by combining multiple clips into a movie montage. 
All of this works right on your phone, no Internet connection needed.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-2lfr9mlSznk/V1bwR3DmA-I/AAAAAAAABCk/fLLmm_lUWPsDoMJrcPbkVHEqxvfwEZqVgCLcB/s1600/image00.gif"><img border="0" height="480" src="https://2.bp.blogspot.com/-2lfr9mlSznk/V1bwR3DmA-I/AAAAAAAABCk/fLLmm_lUWPsDoMJrcPbkVHEqxvfwEZqVgCLcB/s640/image00.gif" width="640"></a></td></tr><tr><td>A Live Photo before and after stabilization with Motion Stills</td></tr></tbody></table><b>How does it work?</b><br />We pioneered this technology by stabilizing <a href="http://googleresearch.blogspot.com/2012/05/video-stabilization-on-youtube.html">hundreds of millions of videos</a> and creating <a href="https://googleblog.blogspot.com/2015/10/11-things-to-know-about-google-photos.html">GIF animations from photo bursts</a>. Our algorithm uses <a href="https://en.wikipedia.org/wiki/Linear_programming">linear programming</a> to compute a virtual camera path that is optimized to recast videos and bursts as if they were filmed using stabilization equipment, yielding a still background or creating cinematic pans to remove shakiness.<br /><br />Our challenge was to take technology designed to run distributed in a data center and shrink it down to run even faster on your mobile phone. We achieved a 40x speedup by using techniques such as temporal subsampling, decoupling of motion parameters, and using Google Research&#8217;s <a href="https://developers.google.com/optimization/lp/glop">custom linear solver, GLOP</a>. 
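To give a feel for the stabilization idea only: the real pipeline computes an optimal virtual camera path by solving a linear program with GLOP, but the effect of trading a shaky per-frame path for a smooth one can be sketched with a simple moving-average stand-in (not our LP method; the path data, window size, and shakiness measure below are invented for illustration):

```python
def smooth_path(raw_path, window=4):
    """Toy stabilization: replace each per-frame camera position with a
    centered moving average over nearby frames. (Motion Stills instead
    solves a linear program for an optimal path; this is only a sketch.)"""
    n = len(raw_path)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        smoothed.append(sum(raw_path[lo:hi]) / (hi - lo))
    return smoothed

def roughness(path):
    # Sum of absolute second differences: a crude shakiness measure.
    return sum(abs(path[i + 1] - 2 * path[i] + path[i - 1])
               for i in range(1, len(path) - 1))

# A jittery pan: steady rightward motion plus frame-to-frame shake.
raw = [0.5 * i + (0.4 if i % 2 else -0.4) for i in range(30)]
stab = smooth_path(raw)
```

After smoothing, the second-difference roughness of `stab` is far below that of `raw`, which is the qualitative effect stabilization is after: the steady pan survives while the shake is suppressed.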
We obtain further speedup and conserve storage by computing low-resolution warp textures to perform real-time GPU rendering, just like in a videogame.<br /><div><a href="https://3.bp.blogspot.com/-flKgnxgTuvo/V1bwhEY1KTI/AAAAAAAABCs/wNqY3EfpYygawlBe31_a1Wb8Dem4_RtxQCLcB/s1600/image01.gif"><img border="0" height="480" src="https://3.bp.blogspot.com/-flKgnxgTuvo/V1bwhEY1KTI/AAAAAAAABCs/wNqY3EfpYygawlBe31_a1Wb8Dem4_RtxQCLcB/s640/image01.gif" width="640"></a></div><b>Making it loop</b><br />Short videos are perfect for creating loops, so we added <i>loop optimization</i> to bring out the best in your captures. Our approach identifies optimal start and end points, and also discards blurry frames. As an added benefit, this fixes &#8220;pocket shots&#8221; (footage of the phone being put back into the pocket).<br /><br />To keep the background steady while looping, Motion Stills has to separate the background from the rest of the scene. This is a difficult task when foreground elements occlude significant portions of the video, as in the example below. Our novel method classifies motion vectors into foreground (red) and background (green) in a temporally consistent manner. We use a cascade of motion models, moving our motion estimation from simple to more complex models and biasing our results along the way.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-azSVp0WlrHo/V1bxQWmCMRI/AAAAAAAABC4/LK4jZoYdWFYN8ijr_UVB8vQq00e_YoCxgCLcB/s1600/image04.gif"><img border="0" height="426" src="https://1.bp.blogspot.com/-azSVp0WlrHo/V1bxQWmCMRI/AAAAAAAABC4/LK4jZoYdWFYN8ijr_UVB8vQq00e_YoCxgCLcB/s640/image04.gif" width="640"></a></td></tr><tr><td>Left: Original with virtual camera path (red rectangle) and motion classification; foreground(red) vs. background(green) Right: Motion Stills result</td></tr></tbody></table><b>Try it out</b><br />We&#8217;re excited to see what you can create with this app. 
From fun family moments to exciting adventures with friends, try it out and let us know what you think. Motion Stills is an on-device experience with no sign-in: even if you&#8217;re on top of a glacier without signal, you can see your results immediately. You can show us your favorite clips by using <a href="https://www.instagram.com/explore/tags/motionstills/">#motionstills</a>&#160;on social media. <br /><br />This app is a way for us to experiment and iterate quickly on the technology needed for short video creation. Based on the feedback we receive, we hope to integrate this feature into existing products like Google Photos. <br /><br />Motion Stills is available on the <a href="https://itunes.apple.com/us/app/motion-stills-create-live/id1086172168?ls=1&#38;mt=8">App Store</a>. <br /><div><a href="https://4.bp.blogspot.com/-Elm2TIB6DLA/V1bxdFXrHoI/AAAAAAAABDI/F-nbOpDBW3s8SP4y_GCJAAsURzrnUp4IQCLcB/s1600/image02.gif"><img border="0" height="300" src="https://4.bp.blogspot.com/-Elm2TIB6DLA/V1bxdFXrHoI/AAAAAAAABDI/F-nbOpDBW3s8SP4y_GCJAAsURzrnUp4IQCLcB/s400/image02.gif" width="400"></a></div><br />]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Ken Conley and Matthias Grundmann, Machine Perception</span><br /><br />Today we are releasing <a href="https://itunes.apple.com/us/app/motion-stills-create-live/id1086172168?ls=1&amp;mt=8">Motion Stills</a>, an iOS app from Google Research that acts as a virtual camera operator for your <a href="http://www.apple.com/ios/photos/">Apple Live Photos</a>. We use our <a href="https://research.googleblog.com/2012/05/video-stabilization-on-youtube.html">video stabilization</a> technology to freeze the background into a still photo or create sweeping cinematic pans. The resulting looping GIFs and movies come alive, and can easily be shared via messaging or on social media.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-kcbkeaAqbfQ/V1bwLbBXX2I/AAAAAAAABCc/OnA4JVjUVRkr1wQVNo9pIVXPnDp9E2fLwCLcB/s1600/image03.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://4.bp.blogspot.com/-kcbkeaAqbfQ/V1bwLbBXX2I/AAAAAAAABCc/OnA4JVjUVRkr1wQVNo9pIVXPnDp9E2fLwCLcB/s640/image03.gif" width="312" /></a></div>With Motion Stills, we provide an immersive stream experience that makes your clips fun to watch and share. You can also tell stories of your adventures by combining multiple clips into a movie montage. 
All of this works right on your phone, no Internet connection needed.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-2lfr9mlSznk/V1bwR3DmA-I/AAAAAAAABCk/fLLmm_lUWPsDoMJrcPbkVHEqxvfwEZqVgCLcB/s1600/image00.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="480" src="https://2.bp.blogspot.com/-2lfr9mlSznk/V1bwR3DmA-I/AAAAAAAABCk/fLLmm_lUWPsDoMJrcPbkVHEqxvfwEZqVgCLcB/s640/image00.gif" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">A Live Photo before and after stabilization with Motion Stills</td></tr></tbody></table><b>How does it work?</b><br />We pioneered this technology by stabilizing <a href="http://googleresearch.blogspot.com/2012/05/video-stabilization-on-youtube.html">hundreds of millions of videos</a> and creating <a href="https://googleblog.blogspot.com/2015/10/11-things-to-know-about-google-photos.html">GIF animations from photo bursts</a>. Our algorithm uses <a href="https://en.wikipedia.org/wiki/Linear_programming">linear programming</a> to compute a virtual camera path that is optimized to recast videos and bursts as if they were filmed using stabilization equipment, yielding a still background or creating cinematic pans to remove shakiness.<br /><br />Our challenge was to take technology designed to run distributed in a data center and shrink it down to run even faster on your mobile phone. We achieved a 40x speedup by using techniques such as temporal subsampling, decoupling of motion parameters, and using Google Research’s <a href="https://developers.google.com/optimization/lp/glop">custom linear solver, GLOP</a>. 
We obtain further speedup and conserve storage by computing low-resolution warp textures to perform real-time GPU rendering, just like in a videogame.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-flKgnxgTuvo/V1bwhEY1KTI/AAAAAAAABCs/wNqY3EfpYygawlBe31_a1Wb8Dem4_RtxQCLcB/s1600/image01.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="480" src="https://3.bp.blogspot.com/-flKgnxgTuvo/V1bwhEY1KTI/AAAAAAAABCs/wNqY3EfpYygawlBe31_a1Wb8Dem4_RtxQCLcB/s640/image01.gif" width="640" /></a></div><b>Making it loop</b><br />Short videos are perfect for creating loops, so we added <i>loop optimization</i> to bring out the best in your captures. Our approach identifies optimal start and end points, and also discards blurry frames. As an added benefit, this fixes “pocket shots” (footage of the phone being put back into the pocket).<br /><br />To keep the background steady while looping, Motion Stills has to separate the background from the rest of the scene. This is a difficult task when foreground elements occlude significant portions of the video, as in the example below. Our novel method classifies motion vectors into foreground (red) and background (green) in a temporally consistent manner. 
We use a cascade of motion models, moving our motion estimation from simple to more complex models and biasing our results along the way.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-azSVp0WlrHo/V1bxQWmCMRI/AAAAAAAABC4/LK4jZoYdWFYN8ijr_UVB8vQq00e_YoCxgCLcB/s1600/image04.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="426" src="https://1.bp.blogspot.com/-azSVp0WlrHo/V1bxQWmCMRI/AAAAAAAABC4/LK4jZoYdWFYN8ijr_UVB8vQq00e_YoCxgCLcB/s640/image04.gif" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Left: Original with virtual camera path (red rectangle) and motion classification; foreground(red) vs. background(green) Right: Motion Stills result</td></tr></tbody></table><b>Try it out</b><br />We’re excited to see what you can create with this app. From fun family moments to exciting adventures with friends, try it out and let us know what you think. Motion Stills is an on-device experience with no sign-in: even if you’re on top of a glacier without signal, you can see your results immediately. You can show us your favorite clips by using <a href="https://www.instagram.com/explore/tags/motionstills/">#motionstills</a>&nbsp;on social media. <br /><br />This app is a way for us to experiment and iterate quickly on the technology needed for short video creation. Based on the feedback we receive, we hope to integrate this feature into existing products like Google Photos. <br /><br />Motion Stills is available on the <a href="https://itunes.apple.com/us/app/motion-stills-create-live/id1086172168?ls=1&amp;mt=8">App Store</a>. 
<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-Elm2TIB6DLA/V1bxdFXrHoI/AAAAAAAABDI/F-nbOpDBW3s8SP4y_GCJAAsURzrnUp4IQCLcB/s1600/image02.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="https://4.bp.blogspot.com/-Elm2TIB6DLA/V1bxdFXrHoI/AAAAAAAABDI/F-nbOpDBW3s8SP4y_GCJAAsURzrnUp4IQCLcB/s400/image02.gif" width="400" /></a></div><br />]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/motion-stills-create-beautiful-gifs-from-live-photos/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>&quot;Aw, so cute!&quot;: Allo helps you respond to shared photos</title>
		<link>https://googledata.org/google-research/aw-so-cute-allo-helps-you-respond-to-shared-photos/</link>
		<comments>https://googledata.org/google-research/aw-so-cute-allo-helps-you-respond-to-shared-photos/#comments</comments>
		<pubDate>Wed, 18 May 2016 22:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=5d5d08b2d7ce4465da5b6bdc4d851da8</guid>
		<description><![CDATA[<span>by Ariel Fuxman, Research Scientist</span><br /><br />Today, Google <a href="https://googleblog.blogspot.com/2016/05/allo-duo-apps-messaging-video.html">announced Allo</a> &#8212; our new mobile messaging app.  From day one of the Allo development effort, we set out to build a truly special product that is powered by Google&#8217;s strengths in machine intelligence to make messaging easier, more efficient, and more expressive.  Photo Reply is a unique feature of Allo that does just that!  We use machine learning to understand what a shared photo depicts and to suggest rich natural language replies that the user can tap to send.  This makes it easier for users to sustain meaningful conversations while using small mobile keyboards.     <br /><br />Here is an example of the responses that Allo suggests when a friend shares a photo of his child.<br /><div></div><div><a href="https://4.bp.blogspot.com/-cAKNObZ8x_U/VzzkFEnGbSI/AAAAAAAABCE/juPRebiyXgUqFbEChenzOOVcFaQehFLqgCLcB/s1600/Screen%2BShot%2B2016-05-18%2Bat%2B2.50.07%2BPM.png"><img border="0" height="540" src="https://4.bp.blogspot.com/-cAKNObZ8x_U/VzzkFEnGbSI/AAAAAAAABCE/juPRebiyXgUqFbEChenzOOVcFaQehFLqgCLcB/s640/Screen%2BShot%2B2016-05-18%2Bat%2B2.50.07%2BPM.png" width="640"></a></div><b>Photo Reply &#8212; Under the Hood</b><br /><br />During the winter, our product managers, Patrick McGregor and Ryan Cassidy, challenged us to develop new approaches to simplify media sharing in messaging while simultaneously delighting users with Google insights.  
With my colleagues Vivek Ramavajjala, Sergey Nazarov, and Sujith Ravi, we set out to build Photo Reply.<br /><br />We utilize Google's <a href="http://googleresearch.blogspot.com/2014/09/building-deeper-understanding-of-images.html">image recognition technology</a>, developed by our <a href="http://research.google.com/pubs/MachinePerception.html">Machine Perception</a> team, to associate images with <i>semantic entities</i> &#8212; people, animals, cars, etc.  We then apply a machine learned model that maps those recognized entities to actual natural language responses. Our system produces replies for thousands of entity types that  are drawn from a taxonomy that is a subset of Google's <a href="https://www.google.com/intl/es419/insidesearch/features/search/knowledge.html">Knowledge Graph</a> and may be at different granularity levels. For example, when you receive a photo of a dog, the system may detect that the dog is actually a labrador and suggest "Love that lab!". Or given a photo of a pasta dish, it may detect the type of pasta ("Yum linguine!") and even the cuisine ("I love Italian food!").<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-OTFHk4Z_bZs/VzzLk8J7CsI/AAAAAAAABBA/tLRFP9qyrZo4zoPSNi8k6NrCoy_W-1WvQCLcB/s1600/Screen%2BShot%2B2016-05-18%2Bat%2B12.57.01%2BPM.png"><img border="0" height="260" src="https://2.bp.blogspot.com/-OTFHk4Z_bZs/VzzLk8J7CsI/AAAAAAAABBA/tLRFP9qyrZo4zoPSNi8k6NrCoy_W-1WvQCLcB/s640/Screen%2BShot%2B2016-05-18%2Bat%2B12.57.01%2BPM.png" width="640"></a></td></tr><tr><td>Examples of response suggestions reflecting fine-grained object classes</td></tr></tbody></table>One aspect of the system that we find very useful is that it can suggest responses not just for physical objects but also for abstract concepts. 
It can produce suggestions for events (birthday parties, weddings, etc.), nature (sunrises, mountains, etc.), recreational activities (hiking, camping, etc.), and many more categories.  Also, the system can generate responses that reflect the emotions that might be associated with an image, such as &#8220;happiness&#8221;.  Here are some examples of responses for abstract concepts:<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-rNsycTKcWLc/VzzL0_3zwyI/AAAAAAAABBE/V5crK_ktmbMAI9IupZ-crPyjj4NzvTz9gCLcB/s1600/Screen%2BShot%2B2016-05-18%2Bat%2B12.57.25%2BPM.png"><img border="0" height="260" src="https://1.bp.blogspot.com/-rNsycTKcWLc/VzzL0_3zwyI/AAAAAAAABBE/V5crK_ktmbMAI9IupZ-crPyjj4NzvTz9gCLcB/s640/Screen%2BShot%2B2016-05-18%2Bat%2B12.57.25%2BPM.png" width="640"></a></td></tr><tr><td>Response suggestions reflecting abstract concepts</td></tr></tbody></table><b>Learning entity-response associations</b><br /><br />At runtime, Photo Reply recognizes entities in the shared photo and triggers responses for the entities. The model that maps entities to natural language responses is learned offline using <a href="http://arxiv.org/abs/1512.01752">Expander</a>, which is a large-scale graph-based <a href="https://en.wikipedia.org/wiki/Semi-supervised_learning">semi-supervised learning</a> platform at Google. We built a massive graph where nodes correspond to photos, semantic entities, and textual responses.  Edges in the graph indicate when an entity was recognized for a photo, when a specific response was given for a photo, and visual similarities between photos. Some of the nodes are "labeled" and we learn associations for the unlabeled nodes by propagating label information across the graph. <br /><br />To illustrate this, consider the graph below. There are two labels: the red label corresponds to the response "yummy" and the blue label corresponds to "delicious". 
The nodes for "spaghetti" and "linguine" are unlabeled, but from the fact that they are close to the red and blue nodes, the algorithm can learn that they should be associated with the "yummy" and "delicious" responses. Notice that in this way, we are associating the entity "linguine" with the response "yummy" even though none of the linguine photos in the graph are directly connected to this answer. Expander can perform this kind of learning at very large scale, for graphs containing billions of nodes and hundreds of billions of edges.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-d9yM95LtOq8/VzzM5kN3dWI/AAAAAAAABBY/GIQPQGMZtKsyPBN7UhPLS2PD9uyckoCYQCLcB/s1600/blog%2Bdiagrams-04.png"><img border="0" height="424" src="https://4.bp.blogspot.com/-d9yM95LtOq8/VzzM5kN3dWI/AAAAAAAABBY/GIQPQGMZtKsyPBN7UhPLS2PD9uyckoCYQCLcB/s640/blog%2Bdiagrams-04.png" width="640"></a></td></tr><tr><td>Graph of entities, photos, and responses</td></tr></tbody></table><div></div>Photo Reply is an exciting example of <a href="https://en.wikipedia.org/wiki/Multimodal_learning">multimodal learning</a>, where computer vision and natural language processing come together in order to create a compelling user experience.  Allo will be available on Android and iOS later this summer.  Be sure to check out what Allo sees in your beautiful photos!]]></description>
				<content:encoded><![CDATA[<span class="byline-author">by Ariel Fuxman, Research Scientist</span><br /><br />Today, Google <a href="https://googleblog.blogspot.com/2016/05/allo-duo-apps-messaging-video.html">announced Allo</a> — our new mobile messaging app.  From day one of the Allo development effort, we set out to build a truly special product that is powered by Google’s strengths in machine intelligence to make messaging easier, more efficient, and more expressive.  Photo Reply is a unique feature of Allo that does just that!  We use machine learning to understand what a shared photo depicts and to suggest rich natural language replies that the user can tap to send.  This makes it easier for users to sustain meaningful conversations while using small mobile keyboards.<br /><br />Here is an example of the responses that Allo suggests when a friend shares a photo of his child.<br /><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-cAKNObZ8x_U/VzzkFEnGbSI/AAAAAAAABCE/juPRebiyXgUqFbEChenzOOVcFaQehFLqgCLcB/s1600/Screen%2BShot%2B2016-05-18%2Bat%2B2.50.07%2BPM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="540" src="https://4.bp.blogspot.com/-cAKNObZ8x_U/VzzkFEnGbSI/AAAAAAAABCE/juPRebiyXgUqFbEChenzOOVcFaQehFLqgCLcB/s640/Screen%2BShot%2B2016-05-18%2Bat%2B2.50.07%2BPM.png" width="640" /></a></div><b>Photo Reply — Under the Hood</b><br /><br />During the winter, our product managers, Patrick McGregor and Ryan Cassidy, challenged us to develop new approaches to simplify media sharing in messaging while simultaneously delighting users with Google insights.  
With my colleagues Vivek Ramavajjala, Sergey Nazarov, and Sujith Ravi, we set out to build Photo Reply.<br /><br />We utilize Google's <a href="http://googleresearch.blogspot.com/2014/09/building-deeper-understanding-of-images.html">image recognition technology</a>, developed by our <a href="http://research.google.com/pubs/MachinePerception.html">Machine Perception</a> team, to associate images with <i>semantic entities</i> — people, animals, cars, etc.  We then apply a machine learned model that maps those recognized entities to actual natural language responses. Our system produces replies for thousands of entity types that  are drawn from a taxonomy that is a subset of Google's <a href="https://www.google.com/intl/es419/insidesearch/features/search/knowledge.html">Knowledge Graph</a> and may be at different granularity levels. For example, when you receive a photo of a dog, the system may detect that the dog is actually a labrador and suggest "Love that lab!". Or given a photo of a pasta dish, it may detect the type of pasta ("Yum linguine!") and even the cuisine ("I love Italian food!").<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-OTFHk4Z_bZs/VzzLk8J7CsI/AAAAAAAABBA/tLRFP9qyrZo4zoPSNi8k6NrCoy_W-1WvQCLcB/s1600/Screen%2BShot%2B2016-05-18%2Bat%2B12.57.01%2BPM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="260" src="https://2.bp.blogspot.com/-OTFHk4Z_bZs/VzzLk8J7CsI/AAAAAAAABBA/tLRFP9qyrZo4zoPSNi8k6NrCoy_W-1WvQCLcB/s640/Screen%2BShot%2B2016-05-18%2Bat%2B12.57.01%2BPM.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Examples of response suggestions reflecting fine-grained object classes</td></tr></tbody></table>One aspect of the system that we find very useful is that it can suggest responses not 
just for physical objects but also for abstract concepts. It can produce suggestions for events (birthday parties, weddings, etc.), nature (sunrises, mountains, etc.), recreational activities (hiking, camping, etc.), and many more categories.  Also, the system can generate responses that reflect the emotions that might be associated with an image, such as “happiness”.  Here are some examples of responses for abstract concepts:<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-rNsycTKcWLc/VzzL0_3zwyI/AAAAAAAABBE/V5crK_ktmbMAI9IupZ-crPyjj4NzvTz9gCLcB/s1600/Screen%2BShot%2B2016-05-18%2Bat%2B12.57.25%2BPM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="260" src="https://1.bp.blogspot.com/-rNsycTKcWLc/VzzL0_3zwyI/AAAAAAAABBE/V5crK_ktmbMAI9IupZ-crPyjj4NzvTz9gCLcB/s640/Screen%2BShot%2B2016-05-18%2Bat%2B12.57.25%2BPM.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Response suggestions reflecting abstract concepts</td></tr></tbody></table><b>Learning entity-response associations</b><br /><br />At runtime, Photo Reply recognizes entities in the shared photo and triggers responses for the entities. The model that maps entities to natural language responses is learned offline using <a href="http://arxiv.org/abs/1512.01752">Expander</a>, which is a large-scale graph-based <a href="https://en.wikipedia.org/wiki/Semi-supervised_learning">semi-supervised learning</a> platform at Google. We built a massive graph where nodes correspond to photos, semantic entities, and textual responses.  Edges in the graph indicate when an entity was recognized for a photo, when a specific response was given for a photo, and visual similarities between photos. 
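A toy version of such a graph, together with a simple iterative label-propagation pass, can be sketched in a few lines of plain Python. The nodes, edges, and responses below are invented for illustration, and this is not Expander's actual algorithm, only the general idea of spreading labels from seed nodes to their neighbors:

```python
# Toy label propagation over an entity/photo/response graph.
# Each node's label distribution is repeatedly averaged with its
# neighbors' distributions; seed nodes keep their labels fixed.

def propagate(edges, seeds, iterations=20):
    # Build an undirected adjacency list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    labels = {n: dict(d) for n, d in seeds.items()}
    for _ in range(iterations):
        updated = {}
        for node in neighbors:
            if node in seeds:  # seed labels are clamped
                updated[node] = dict(seeds[node])
                continue
            scores = {}
            for nb in neighbors[node]:
                for label, w in labels.get(nb, {}).items():
                    scores[label] = scores.get(label, 0.0) + w
            total = sum(scores.values())
            if total:
                updated[node] = {l: w / total for l, w in scores.items()}
        labels = updated
    return labels

# Photos connect entities to responses; "linguine" is never directly
# linked to a labeled response, yet still picks up both labels.
edges = [
    ("photo1", "spaghetti"), ("photo1", "yummy"),
    ("photo2", "spaghetti"), ("photo2", "delicious"),
    ("photo3", "linguine"),  ("photo2", "photo3"),  # visually similar photos
]
seeds = {"yummy": {"yummy": 1.0}, "delicious": {"delicious": 1.0}}
result = propagate(edges, seeds)
print(sorted(result["linguine"]))  # -> ['delicious', 'yummy']
```

Note how the "linguine" node acquires the "yummy" label only through multi-hop paths in the graph, which is exactly the behavior described in the text.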
Some of the nodes are "labeled" and we learn associations for the unlabeled nodes by propagating label information across the graph. <br /><br />To illustrate this, consider the graph below. There are two labels: the red label corresponds to the response "yummy" and the blue label corresponds to "delicious". The nodes for "spaghetti" and "linguine" are unlabeled, but from the fact that they are close to the red and blue nodes, the algorithm can learn that they should be associated with the "yummy" and "delicious" responses. Notice that in this way, we are associating the entity "linguine" with the response "yummy" even though none of the linguine photos in the graph are directly connected to this answer. Expander can perform this kind of learning at very large scale, for graphs containing billions of nodes and hundreds of billions of edges.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-d9yM95LtOq8/VzzM5kN3dWI/AAAAAAAABBY/GIQPQGMZtKsyPBN7UhPLS2PD9uyckoCYQCLcB/s1600/blog%2Bdiagrams-04.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="424" src="https://4.bp.blogspot.com/-d9yM95LtOq8/VzzM5kN3dWI/AAAAAAAABBY/GIQPQGMZtKsyPBN7UhPLS2PD9uyckoCYQCLcB/s640/blog%2Bdiagrams-04.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Graph of entities, photos, and responses</td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"></div>Photo Reply is an exciting example of <a href="https://en.wikipedia.org/wiki/Multimodal_learning">multimodal learning</a>, where computer vision and natural language processing come together in order to create a compelling user experience.  Allo will be available on Android and iOS later this summer.  
Be sure to check out what Allo sees in your beautiful photos!]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/aw-so-cute-allo-helps-you-respond-to-shared-photos/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Chat Smarter with Allo</title>
		<link>https://googledata.org/google-research/chat-smarter-with-allo/</link>
		<comments>https://googledata.org/google-research/chat-smarter-with-allo/#comments</comments>
		<pubDate>Wed, 18 May 2016 17:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=901359cf26d06c1874f2e185f6ac412b</guid>
		<description><![CDATA[<span>Posted by Pranav Khaitan, Google Research</span><br /><br />At Google, we are continuously building products powered by <a href="https://en.wikipedia.org/wiki/Machine_learning">Machine Learning</a> to delight our users and simplify their lives. Today, we are excited to talk about the technology behind <a href="https://googleblog.blogspot.com/2016/05/allo-duo-apps-messaging-video.html">Allo</a>, a new smart messaging app that uses the power of <a href="https://en.wikipedia.org/wiki/Artificial_neural_network">neural networks</a> and Google Search to make your text conversations easier and more productive.<br /><br />Just like <a href="http://googleresearch.blogspot.com/2015/11/computer-respond-to-this-email.html">Smart Reply for Inbox</a>, Allo understands the conversation history to generate a set of suggestions that the user will likely want to respond with. In addition to understanding the context of your conversation, Allo learns your individual style, so the responses are personalized for you.<br /><div><a href="https://1.bp.blogspot.com/-alT-I567nTc/Vzyev853krI/AAAAAAAABAQ/KMxYIR15-2kLTrqioP5-C_gVUYhnwYHxgCLcB/s1600/Screen%2BShot%2B2016-05-18%2Bat%2B9.38.19%2BAM.png"><img border="0" height="540" src="https://1.bp.blogspot.com/-alT-I567nTc/Vzyev853krI/AAAAAAAABAQ/KMxYIR15-2kLTrqioP5-C_gVUYhnwYHxgCLcB/s640/Screen%2BShot%2B2016-05-18%2Bat%2B9.38.19%2BAM.png" width="640"></a></div><b>How does it work?</b><br /><br />About a year ago, we started exploring how we can make communication easier and more fun. The idea of Smart Reply for Allo came up in a brainstorming session with my teammates Sushant Prakash and Ori Gershony who then helped me lead our team to build this technology. We began by experimenting with neural network based model architectures which had proven to be successful for sequence prediction, including the encoder-decoder model used in Smart Reply for Inbox. 
<br /><br />One challenge we faced was that response generation in online conversations has very strict latency requirements.  To address this, Pavel Sountsov and Sushant came up with an innovative two-stage model that works as follows.  First, a <a href="https://en.wikipedia.org/wiki/Recurrent_neural_network">recurrent neural network</a> looks at the conversation context one word at a time and encodes it in the hidden state of a <a href="http://colah.github.io/posts/2015-08-Understanding-LSTMs/">long short-term memory</a> (LSTM).  Below, we show an example with a context &#8216;Where are you?&#8217;.  The context has three tokens, each of which is embedded into a continuous space and input to the LSTM.   The LSTM state now encodes the context as a continuous vector.   This vector is used to generate the response as a discretized semantic class.  <br /><div><a href="https://2.bp.blogspot.com/-Y5VjTvnaez4/Vzye4y7uxrI/AAAAAAAABAU/MzHJolDQnVU5siTWDX2jH7VGMZWI4B8RgCLcB/s1600/image00.png"><img border="0" height="400" src="https://2.bp.blogspot.com/-Y5VjTvnaez4/Vzye4y7uxrI/AAAAAAAABAU/MzHJolDQnVU5siTWDX2jH7VGMZWI4B8RgCLcB/s400/image00.png" width="362"></a></div>Each semantic class is associated with a set of possible messages that belong to it. We use a second recurrent network to generate a specific message from that set.  This network also converts the context into a hidden LSTM state but this time the hidden state is used to generate the full message of the reply one token at a time.  For example, now the LSTM after seeing the context &#8220;Where are you?&#8221; generates the tokens in the response: &#8220;I&#8217;m at work&#8221;.  
<br /><div><a href="https://3.bp.blogspot.com/-Yq1TRVEqKgw/VzyfB4lsqVI/AAAAAAAABAY/9STO-dzVWDUCL-9jzv3nJAKJ1BjmxzZYACLcB/s1600/image01.png"><img border="0" height="412" src="https://3.bp.blogspot.com/-Yq1TRVEqKgw/VzyfB4lsqVI/AAAAAAAABAY/9STO-dzVWDUCL-9jzv3nJAKJ1BjmxzZYACLcB/s640/image01.png" width="640"></a></div>A <a href="https://en.wikipedia.org/wiki/Beam_search">beam search</a> is used to efficiently select the top-N highest scoring responses from among the very large set of possible messages that an LSTM can generate. A snippet of the search space explored by such a beam-search technique is shown below.<br /><div><a href="https://3.bp.blogspot.com/-8P950NVeWIM/VzyfKG91BpI/AAAAAAAABAc/koVPi5wZvpQsdvAqHTPEQe2B-29PqZDOwCLcB/s1600/image04.png"><img border="0" height="360" src="https://3.bp.blogspot.com/-8P950NVeWIM/VzyfKG91BpI/AAAAAAAABAc/koVPi5wZvpQsdvAqHTPEQe2B-29PqZDOwCLcB/s400/image04.png" width="400"></a></div>As with any large-scale product, there were several engineering challenges we had to solve in generating a set of high-quality responses efficiently. For example, in spite of the two-stage architecture, our first few networks were very slow and required about half a second to generate a response. This was obviously a deal breaker when we are talking about real-time communication apps! So we had to evolve our neural network architecture further to reduce the latency to less than 200ms. We moved from using a softmax layer to a hierarchical softmax layer which traverses a tree of words instead of traversing a list of words thus making it more efficient.   <br /><br />Another interesting challenge we had to solve when generating predictions is controlling for message length. Sometimes none of the most probable responses are appropriate: if the model predicts too short a message, it might not be useful to the user, and if we predict something too long, it might not fit on the phone screen. 
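The beam search described above can be sketched in a few lines of Python. The toy next-token model and all of its probabilities here are invented purely for illustration; a real system would score tokens with the LSTM instead:

```python
import heapq
import math

# Toy next-token model: maps a token prefix to a distribution over the
# next token. "</s>" ends a message. Values are made up for this sketch.
MODEL = {
    (): {"I'm": 0.7, "At": 0.3},
    ("I'm",): {"at": 0.9, "here": 0.1},
    ("I'm", "at"): {"work": 0.9, "home": 0.1},
    ("I'm", "at", "work"): {"</s>": 1.0},
    ("I'm", "at", "home"): {"</s>": 1.0},
    ("I'm", "here"): {"</s>": 1.0},
    ("At",): {"work": 1.0},
    ("At", "work"): {"</s>": 1.0},
}

def beam_search(model, beam_width=2, max_len=5):
    beams = [(0.0, ())]  # (log-probability, partial message)
    finished = []
    for _ in range(max_len):
        candidates = []
        for logp, prefix in beams:
            for token, p in model.get(prefix, {}).items():
                if token == "</s>":
                    finished.append((logp + math.log(p), prefix))
                else:
                    candidates.append((logp + math.log(p), prefix + (token,)))
        if not candidates:
            break
        # Keep only the best `beam_width` partial hypotheses at each step.
        beams = heapq.nlargest(beam_width, candidates)
    return [" ".join(tokens) for _, tokens in sorted(finished, reverse=True)]

print(beam_search(MODEL))  # completed responses, highest-scoring first
```

Keeping only a fixed number of hypotheses per step is what makes the search over an exponentially large space of messages tractable; the text that follows describes how the scores can additionally be biased toward responses of useful length.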
We solved this by biasing the beam search to follow paths that lead to higher utility responses instead of favoring just the responses that are most probable. That way, we can efficiently generate response predictions of appropriate length that are useful to our users.<br /><br /><b>Personalized for you</b><br /><br />The best part about these suggestions is that over time they are personalized to you so that your individual style is reflected in your conversations. For example, if you often reply to &#8220;How are you?&#8221; with &#8220;Fine.&#8221; instead of &#8220;I am good.&#8221;, it will learn your preference and your future suggestions will take that into account. This was accomplished by incorporating a user's "style" as one of the features in a neural network that is used to predict the next word in a response, resulting in suggestions that are customized for your personality and individual preferences. The user's style is captured in a sequence of numbers that we call the user embedding. These embeddings can be generated as part of the regular model training, but this approach requires waiting many days for training to complete and it cannot handle more than a few million users. To solve this issue, Alon Shafrir implemented an <a href="https://en.wikipedia.org/wiki/Limited-memory_BFGS">L-BFGS</a>-based technique to generate user embeddings quickly and at scale. Now, you'll be able to enjoy personalized suggestions after only a short time of using Allo. <br /><br /><b>More than just English</b><br /><br />The neural network model described above is language agnostic so building separate prediction models for each language works quite well. To make sure that responses for each language benefit from our semantic understanding of other languages, Sujith Ravi came up with a graph-based machine learning technique that can connect possible responses across languages. 
Dana Movshovitz-Attias and Peter Young applied this technique to build a graph that connects responses to incoming messages and to other responses that have similar word embeddings and syntactic relationships. It also connects responses with similar meaning across languages based on the <a href="https://en.wikipedia.org/wiki/Machine_translation">machine translation</a> models developed by our <a href="http://research.google.com/pubs/MachineTranslation.html">Translate team</a>. <br /><br />With this graph, we use <a href="https://en.wikipedia.org/wiki/Semi-supervised_learning">semi-supervised learning</a>, as described in this <a href="http://arxiv.org/abs/1512.01752">paper</a>, to learn the semantic meaning of responses and determine which are the most useful clusters of possible responses. As a result, we can allow the LSTM to score many possible variants of each possible response meaning, allowing the personalization routines to select the best response for the user in the context of the conversation. This also helps enforce diversity as we can now pick the final set of responses from different semantic clusters.<br /><br />Here&#8217;s an example of how the graph might look for a set of messages related to greetings:<br /><div><a href="https://4.bp.blogspot.com/-S9T56cHYgMg/VzyfTP6XCPI/AAAAAAAABAw/268jVv02G6cglepQ2dVVHEmjuhsd2xXDACKgB/s1600/image02.png"><img border="0" height="230" src="https://4.bp.blogspot.com/-S9T56cHYgMg/VzyfTP6XCPI/AAAAAAAABAw/268jVv02G6cglepQ2dVVHEmjuhsd2xXDACKgB/s400/image02.png" width="400"></a></div><b>Beyond Smart Reply</b><br /><br />I am also very excited about the Google assistant in Allo with which you can converse and get information about anything that Google Search knows about. It understands your sentences and helps you accomplish tasks directly from the conversation. For example, the Google assistant can help you discover a restaurant and reserve a table from within the Allo app when chatting with your friends. 
This has been made possible because of the cutting-edge research in natural language understanding that we have been doing at Google. More details to follow soon!<br /><br />These smart features will be part of the Android and iOS apps for Allo that will be available later this summer. We can&#8217;t wait for you to try and enjoy it!<br /><br />We wish to acknowledge the hard work of the following  in building Smart Reply:<br /><i>Pranav Khaitan, Sushant Prakash, Pavel Sountsov,  Alon Shafrir, Max Gubin, Shu Zhang, Sunita Sarawagi, Ori Gershony, Sergey Nazarov, Hung Pham, Harini Krishnamurthy, Ryan Cassidy, Dave Citron, Patrick McGregor, Sujith Ravi, Dana Movshovitz-Attias, Peter Young, Vivek Ramavajjala</i>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Pranav Khaitan, Google Research</span><br /><br />At Google, we are continuously building products powered by <a href="https://en.wikipedia.org/wiki/Machine_learning">Machine Learning</a> to delight our users and simplify their lives. Today, we are excited to talk about the technology behind <a href="https://googleblog.blogspot.com/2016/05/allo-duo-apps-messaging-video.html">Allo</a>, a new smart messaging app that uses the power of <a href="https://en.wikipedia.org/wiki/Artificial_neural_network">neural networks</a> and Google Search to make your text conversations easier and more productive.<br /><br />Just like <a href="http://googleresearch.blogspot.com/2015/11/computer-respond-to-this-email.html">Smart Reply for Inbox</a>, Allo understands the conversation history to generate a set of suggestions that the user will likely want to respond with. In addition to understanding the context of your conversation, Allo learns your individual style, so the responses are personalized for you.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-alT-I567nTc/Vzyev853krI/AAAAAAAABAQ/KMxYIR15-2kLTrqioP5-C_gVUYhnwYHxgCLcB/s1600/Screen%2BShot%2B2016-05-18%2Bat%2B9.38.19%2BAM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="540" src="https://1.bp.blogspot.com/-alT-I567nTc/Vzyev853krI/AAAAAAAABAQ/KMxYIR15-2kLTrqioP5-C_gVUYhnwYHxgCLcB/s640/Screen%2BShot%2B2016-05-18%2Bat%2B9.38.19%2BAM.png" width="640" /></a></div><b>How does it work?</b><br /><br />About a year ago, we started exploring how we can make communication easier and more fun. The idea of Smart Reply for Allo came up in a brainstorming session with my teammates Sushant Prakash and Ori Gershony who then helped me lead our team to build this technology. 
We began by experimenting with neural network based model architectures which had proven to be successful for sequence prediction, including the encoder-decoder model used in Smart Reply for Inbox. <br /><br />One challenge we faced was that response generation in online conversations has very strict latency requirements.  To address this, Pavel Sountsov and Sushant came up with an innovative two-stage model that works as follows.  First, a <a href="https://en.wikipedia.org/wiki/Recurrent_neural_network">recurrent neural network</a> looks at the conversation context one word at a time and encodes it in the hidden state of a <a href="http://colah.github.io/posts/2015-08-Understanding-LSTMs/">long short-term memory</a> (LSTM).  Below, we show an example with a context ‘Where are you?’.  The context has three tokens, each of which is embedded into a continuous space and input to the LSTM.   The LSTM state now encodes the context as a continuous vector.   This vector is used to generate the response as a discretized semantic class.  <br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-Y5VjTvnaez4/Vzye4y7uxrI/AAAAAAAABAU/MzHJolDQnVU5siTWDX2jH7VGMZWI4B8RgCLcB/s1600/image00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="https://2.bp.blogspot.com/-Y5VjTvnaez4/Vzye4y7uxrI/AAAAAAAABAU/MzHJolDQnVU5siTWDX2jH7VGMZWI4B8RgCLcB/s400/image00.png" width="362" /></a></div>Each semantic class is associated with a set of possible messages that belong to it. We use a second recurrent network to generate a specific message from that set.  This network also converts the context into a hidden LSTM state but this time the hidden state is used to generate the full message of the reply one token at a time.  For example, now the LSTM after seeing the context “Where are you?” generates the tokens in the response: “I’m at work”.  
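The two-stage split (context → semantic class, then class → concrete message) can be made concrete with a deliberately trivial sketch of the data flow. The LSTMs are replaced here by stand-ins (a keyword rule for stage one and canned per-class token lists for stage two); every class name and message below is invented for illustration:

```python
# Stage 1 stand-in for the context LSTM: map a conversation context
# to a discrete semantic class.
def classify_context(context):
    ctx = context.lower()
    if "where" in ctx:
        return "location_query"
    if "how are you" in ctx:
        return "wellbeing_query"
    return "other"

# Stage 2 stand-in for the generation LSTM: emit a concrete message,
# token by token, from the predicted semantic class.
CLASS_TO_MESSAGES = {
    "location_query": [["I'm", "at", "work"], ["On", "my", "way"]],
    "wellbeing_query": [["I'm", "good"], ["Fine,", "thanks!"]],
    "other": [["OK"]],
}

def suggest_replies(context, n=2):
    semantic_class = classify_context(context)
    return [" ".join(tokens) for tokens in CLASS_TO_MESSAGES[semantic_class][:n]]

print(suggest_replies("Where are you?"))  # first suggestion: "I'm at work"
```

The point of the split is that the expensive, open-ended generation problem is reduced to a cheap classification followed by generation within a much smaller set, which is what makes the latency budget achievable.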
<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-Yq1TRVEqKgw/VzyfB4lsqVI/AAAAAAAABAY/9STO-dzVWDUCL-9jzv3nJAKJ1BjmxzZYACLcB/s1600/image01.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="412" src="https://3.bp.blogspot.com/-Yq1TRVEqKgw/VzyfB4lsqVI/AAAAAAAABAY/9STO-dzVWDUCL-9jzv3nJAKJ1BjmxzZYACLcB/s640/image01.png" width="640" /></a></div>A <a href="https://en.wikipedia.org/wiki/Beam_search">beam search</a> is used to efficiently select the top-N highest scoring responses from among the very large set of possible messages that an LSTM can generate. A snippet of the search space explored by such a beam-search technique is shown below.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-8P950NVeWIM/VzyfKG91BpI/AAAAAAAABAc/koVPi5wZvpQsdvAqHTPEQe2B-29PqZDOwCLcB/s1600/image04.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="360" src="https://3.bp.blogspot.com/-8P950NVeWIM/VzyfKG91BpI/AAAAAAAABAc/koVPi5wZvpQsdvAqHTPEQe2B-29PqZDOwCLcB/s400/image04.png" width="400" /></a></div>As with any large-scale product, there were several engineering challenges we had to solve in generating a set of high-quality responses efficiently. For example, in spite of the two-stage architecture, our first few networks were very slow and required about half a second to generate a response. This was obviously a deal breaker when we are talking about real-time communication apps! So we had to evolve our neural network architecture further to reduce the latency to less than 200ms. We moved from using a softmax layer to a hierarchical softmax layer which traverses a tree of words instead of traversing a list of words thus making it more efficient.   <br /><br />Another interesting challenge we had to solve when generating predictions is controlling for message length. 
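The hierarchical softmax mentioned above replaces one normalization over the whole vocabulary with a walk down a binary tree of words, so scoring a word costs O(log V) branch decisions rather than O(V). Here is a minimal sketch over an invented four-word vocabulary; in the real model each inner-node score would come from the LSTM hidden state, whereas fixed constants are used here purely for illustration:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Each word is reached by a path of left/right decisions from the root.
# A path entry is (inner-node score, go_left); the scores and tree shape
# below are invented for this sketch.
PATHS = {
    "work":  [(0.8, True),  (1.2, True)],
    "home":  [(0.8, True),  (1.2, False)],
    "here":  [(0.8, False), (-0.3, True)],
    "there": [(0.8, False), (-0.3, False)],
}

def word_probability(word):
    # P(word) is the product of branch probabilities along its path:
    # sigmoid(score) for a left turn, 1 - sigmoid(score) for a right turn.
    p = 1.0
    for score, go_left in PATHS[word]:
        branch = sigmoid(score)
        p *= branch if go_left else 1.0 - branch
    return p

# The tree yields a valid distribution (leaf probabilities sum to 1),
# yet scoring one word touches only two inner nodes, not all four words.
total = sum(word_probability(w) for w in PATHS)
print(round(total, 6))  # 1.0
```

With a realistic vocabulary of tens of thousands of words, the tree depth stays in the teens, which is where the latency win over a flat softmax comes from.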
Sometimes none of the most probable responses are appropriate: if the model predicts too short a message, it might not be useful to the user, and if we predict something too long, it might not fit on the phone screen. We solved this by biasing the beam search to follow paths that lead to higher utility responses instead of favoring just the responses that are most probable. That way, we can efficiently generate response predictions of appropriate length that are useful to our users.<br /><br /><b>Personalized for you</b><br /><br />The best part about these suggestions is that over time they are personalized to you so that your individual style is reflected in your conversations. For example, if you often reply to “How are you?” with “Fine.” instead of “I am good.”, it will learn your preference and your future suggestions will take that into account. This was accomplished by incorporating a user's "style" as one of the features in a neural network that is used to predict the next word in a response, resulting in suggestions that are customized for your personality and individual preferences. The user's style is captured in a sequence of numbers that we call the user embedding. These embeddings can be generated as part of the regular model training, but this approach requires waiting many days for training to complete and it cannot handle more than a few million users. To solve this issue, Alon Shafrir implemented an <a href="https://en.wikipedia.org/wiki/Limited-memory_BFGS">L-BFGS</a>-based technique to generate user embeddings quickly and at scale. Now, you'll be able to enjoy personalized suggestions after only a short time of using Allo. <br /><br /><b>More than just English</b><br /><br />The neural network model described above is language agnostic so building separate prediction models for each language works quite well. 
To make sure that responses for each language benefit from our semantic understanding of other languages, Sujith Ravi came up with a graph-based machine learning technique that can connect possible responses across languages. Dana Movshovitz-Attias and Peter Young applied this technique to build a graph that connects responses to incoming messages and to other responses that have similar word embeddings and syntactic relationships. It also connects responses with similar meaning across languages based on the <a href="https://en.wikipedia.org/wiki/Machine_translation">machine translation</a> models developed by our <a href="http://research.google.com/pubs/MachineTranslation.html">Translate team</a>. <br /><br />With this graph, we use <a href="https://en.wikipedia.org/wiki/Semi-supervised_learning">semi-supervised learning</a>, as described in this <a href="http://arxiv.org/abs/1512.01752">paper</a>, to learn the semantic meaning of responses and determine which are the most useful clusters of possible responses. As a result, we can allow the LSTM to score many possible variants of each possible response meaning, allowing the personalization routines to select the best response for the user in the context of the conversation. 
This also helps enforce diversity as we can now pick the final set of responses from different semantic clusters.<br /><br />Here’s an example of how the graph might look for a set of messages related to greetings:<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-S9T56cHYgMg/VzyfTP6XCPI/AAAAAAAABAw/268jVv02G6cglepQ2dVVHEmjuhsd2xXDACKgB/s1600/image02.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="230" src="https://4.bp.blogspot.com/-S9T56cHYgMg/VzyfTP6XCPI/AAAAAAAABAw/268jVv02G6cglepQ2dVVHEmjuhsd2xXDACKgB/s400/image02.png" width="400" /></a></div><b>Beyond Smart Reply</b><br /><br />I am also very excited about the Google assistant in Allo, which you can converse with to get information about anything that Google Search knows. It understands your sentences and helps you accomplish tasks directly from the conversation. For example, the Google assistant can help you discover a restaurant and reserve a table from within the Allo app when chatting with your friends. This has been made possible by the cutting-edge research in natural language understanding that we have been doing at Google. More details to follow soon!<br /><br />These smart features will be part of the Android and iOS apps for Allo that will be available later this summer. We can’t wait for you to try them!<br /><br />We wish to acknowledge the hard work of the following people in building Smart Reply:<br /><i>Pranav Khaitan, Sushant Prakash, Pavel Sountsov, Alon Shafrir, Max Gubin, Shu Zhang, Sunita Sarawagi, Ori Gershony, Sergey Nazarov, Hung Pham, Harini Krishnamurthy, Ryan Cassidy, Dave Citron, Patrick McGregor, Sujith Ravi, Dana Movshovitz-Attias, Peter Young, Vivek Ramavajjala</i>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/chat-smarter-with-allo/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Announcing SyntaxNet: The World’s Most Accurate Parser Goes Open Source</title>
		<link>https://googledata.org/google-research/announcing-syntaxnet-the-worlds-most-accurate-parser-goes-open-source/</link>
		<comments>https://googledata.org/google-research/announcing-syntaxnet-the-worlds-most-accurate-parser-goes-open-source/#comments</comments>
		<pubDate>Thu, 12 May 2016 19:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=037cca4f4103252832fc0db1d8575394</guid>
		<description><![CDATA[<span>Posted by Slav Petrov, Senior Staff Research Scientist</span><br /><br />At Google, we spend a lot of time thinking about how <a href="http://googleresearch.blogspot.com/2013/12/free-language-lessons-for-computers.html">computer systems</a> can <a href="http://googleresearch.blogspot.com/2015/06/a-multilingual-corpus-of-automatically.html">read</a> and <a href="http://googleresearch.blogspot.com/2014/11/a-picture-is-worth-thousand-coherent.html">understand</a> <a href="http://googleresearch.blogspot.com/2014/08/teaching-machines-to-read-between-lines.html">human language</a> in order <a href="http://googleresearch.blogspot.com/2014/04/a-billion-words-because-todays-language.html">to process it</a> in <a href="http://googleresearch.blogspot.com/2015/11/computer-respond-to-this-email.html">intelligent ways</a>. Today, we are excited to share the fruits of our research with the broader community by releasing <a href="https://github.com/tensorflow/models/tree/master/syntaxnet">SyntaxNet</a>, an open-source neural network framework implemented in <a href="https://www.tensorflow.org/">TensorFlow</a> that provides a foundation for <a href="https://en.wikipedia.org/wiki/Natural_language_understanding">Natural Language Understanding</a> (NLU) systems. Our release includes all the code needed to train new SyntaxNet models on your own data, as well as <i>Parsey McParseface</i>, an English parser that we have trained for you and that you can use to analyze English text.<br /><br />Parsey McParseface is built on powerful machine learning algorithms that learn to analyze the linguistic structure of language, and that can explain the functional role of each word in a given sentence. 
Because Parsey McParseface is the <a href="http://arxiv.org/abs/1603.06042">most accurate such model in the world</a>, we hope that it will be useful to developers and researchers interested in automatic extraction of information, translation, and other core applications of NLU. <br /><br /><b>How does SyntaxNet work?</b><br /><br />SyntaxNet is a framework for what&#8217;s known in academic circles as a <a href="https://en.wikipedia.org/wiki/Parsing"><i>syntactic parser</i></a>, which is a key first component in many NLU systems. Given a sentence as input, it tags each word with a part-of-speech (POS) tag that describes the word's syntactic function, and it determines the syntactic relationships between words in the sentence, represented in the dependency parse tree. These syntactic relationships are directly related to the underlying meaning of the sentence in question. To take a very simple example, consider the following dependency tree for <i>Alice saw Bob</i>:<br /><br /><div></div><div><a href="https://3.bp.blogspot.com/-M-7PIST2hq8/VzTFhESMeuI/AAAAAAAAA_s/k4wOQe0UlnwmoVnZtuU6CNHw6xLQRN7egCLcB/s1600/asawb.png"><img border="0" height="262" src="https://3.bp.blogspot.com/-M-7PIST2hq8/VzTFhESMeuI/AAAAAAAAA_s/k4wOQe0UlnwmoVnZtuU6CNHw6xLQRN7egCLcB/s320/asawb.png" width="320"></a></div><br />This structure encodes that <i>Alice</i> and <i>Bob</i> are nouns and <i>saw</i> is a verb. The main verb <i>saw</i> is the root of the sentence and <i>Alice</i> is the subject (nsubj) of <i>saw</i>, while <i>Bob</i> is its direct object (dobj). 
As expected, Parsey McParseface analyzes this sentence correctly, but also understands the following more complex example:<br /><br /><div></div><div><a href="https://4.bp.blogspot.com/-1Ntx47T1WvU/VzTF2HgbqrI/AAAAAAAAA_w/UWofRQPhqU0ITD5HPQmEVCrwsEroCN8PQCLcB/s1600/long.png"><img border="0" height="202" src="https://4.bp.blogspot.com/-1Ntx47T1WvU/VzTF2HgbqrI/AAAAAAAAA_w/UWofRQPhqU0ITD5HPQmEVCrwsEroCN8PQCLcB/s640/long.png" width="640"></a></div><br />This structure again encodes the fact that <i>Alice</i> and <i>Bob</i> are the subject and object respectively of <i>saw</i>, as well as that <i>Alice</i> is modified by a relative clause with the verb <i>reading</i>, that <i>saw</i> is modified by the temporal modifier <i>yesterday</i>, and so on. The grammatical relationships encoded in dependency structures allow us to easily recover the answers to various questions, for example <i>whom did Alice see?</i>, <i>who saw Bob?</i>, <i>what had Alice been reading about?</i> or <i>when did Alice see Bob?</i>. <br /><br /><b>Why is Parsing So Hard For Computers to Get Right?</b><br /><br />One of the main problems that make parsing so challenging is that human languages show remarkable levels of ambiguity. It is not uncommon for moderate-length sentences, say 20 or 30 words, to have hundreds, thousands, or even tens of thousands of possible syntactic structures. A natural language parser must somehow search through all of these alternatives, and find the most plausible structure given the context. 
As a very simple example, the sentence <i>Alice drove down the street in her car</i> has at least two possible dependency parses:<br /><br /><div><a href="https://2.bp.blogspot.com/-cXYL6RGkV_g/VzTDzbh6yEI/AAAAAAAAA_Q/1c-76sGQ124oE9njB2E6QzU6KcxDCn0KgCLcB/s1600/drovedown.png"><img border="0" height="136" src="https://2.bp.blogspot.com/-cXYL6RGkV_g/VzTDzbh6yEI/AAAAAAAAA_Q/1c-76sGQ124oE9njB2E6QzU6KcxDCn0KgCLcB/s640/drovedown.png" width="640"></a></div><br />The first corresponds to the (correct) interpretation where Alice is driving in her car; the second corresponds to the (absurd, but possible) interpretation where the street is located in her car. The ambiguity arises because the preposition <i>in</i> can either modify <i>drove</i> or <i>street</i>; this example is an instance of what is called <i>prepositional phrase attachment ambiguity</i>. <br /><br />Humans do a remarkable job of dealing with ambiguity, almost to the point where the problem is unnoticeable; the challenge is for computers to do the same. Multiple ambiguities such as these in longer sentences conspire to give a combinatorial explosion in the number of possible structures for a sentence. Usually the vast majority of these structures are wildly implausible, but are nevertheless possible and must be somehow discarded by a parser. <br /><br />SyntaxNet applies neural networks to the ambiguity problem. An input sentence is processed from left to right, with dependencies between words being incrementally added as each word in the sentence is considered. At each point in processing many decisions may be possible&#8212;due to ambiguity&#8212;and a neural network gives scores for competing decisions based on their plausibility. For this reason, it is very important to use <i><a href="https://en.wikipedia.org/wiki/Beam_search">beam search</a></i> in the model. 
Instead of simply taking the first-best decision at each point, multiple partial hypotheses are kept at each step, with hypotheses only being discarded when there are several other higher-ranked hypotheses under consideration. An example of a left-to-right sequence of decisions that produces a simple parse is shown below for the sentence <i>I booked a ticket to Google</i>.<br /><div><a href="https://2.bp.blogspot.com/-fqtmVS97tOs/VzTEAI9BQ8I/AAAAAAAAA_U/xPj0Av64sGseS0rF4Z1BbhmS77J-HuEvwCLcB/s1600/image04.gif"><img border="0" height="336" src="https://2.bp.blogspot.com/-fqtmVS97tOs/VzTEAI9BQ8I/AAAAAAAAA_U/xPj0Av64sGseS0rF4Z1BbhmS77J-HuEvwCLcB/s640/image04.gif" width="640"></a></div>Furthermore, as described in our <a href="http://arxiv.org/abs/1603.06042">paper</a>, it is critical to tightly <i>integrate learning and search</i> in order to achieve the highest prediction accuracy. Parsey McParseface and other <a href="https://github.com/tensorflow/models/tree/master/syntaxnet">SyntaxNet</a> models are some of the most complex networks that we have trained with the <a href="https://www.tensorflow.org/">TensorFlow</a> framework at Google. Given some data from the Google-supported <a href="http://universaldependencies.org/">Universal Dependencies</a> project, you can train a parsing model on your own machine.<br /><br /><b>So How Accurate is Parsey McParseface?</b><br /><br />On a standard benchmark consisting of randomly drawn English newswire sentences (the 20-year-old <a href="https://www.cis.upenn.edu/~treebank/">Penn Treebank</a>), Parsey McParseface recovers individual dependencies between words with over 94% accuracy, beating our own previous state-of-the-art results, which were already <a href="http://arxiv.org/abs/1603.06042">better than any previous approach</a>. While there are no explicit studies in the literature about human performance, we know from our in-house annotation projects that linguists trained for this task agree in 96-97% of the cases. 
This suggests that we are approaching human performance&#8212;but only on well-formed text. Sentences drawn from the web are a lot harder to analyze, as we learned from the <a href="http://googleresearch.blogspot.com/2011/03/building-resources-to-syntactically.html">Google WebTreebank</a> (released in 2011). Parsey McParseface achieves just over 90% parse accuracy on this dataset. <br /><br />While the accuracy is not perfect, it&#8217;s certainly high enough to be useful in many applications. The major sources of error at this point are examples such as the prepositional phrase attachment ambiguity described above, which require real-world knowledge (e.g. that a street is not likely to be located in a car) and deep contextual reasoning. Machine learning (and in particular, neural networks) has made significant progress in resolving these ambiguities. But we still have our work cut out for us: we would like to develop methods that can learn world knowledge and enable equal understanding of natural language across <i>all</i> languages and contexts.<br /><br />To get started, see the <a href="https://github.com/tensorflow/models/tree/master/syntaxnet">SyntaxNet</a> code and download the Parsey McParseface parser model. Happy parsing from the main developers, Chris Alberti, David Weiss, Daniel Andor, Michael Collins &#38; Slav Petrov.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Slav Petrov, Senior Staff Research Scientist</span><br /><br />At Google, we spend a lot of time thinking about how <a href="http://googleresearch.blogspot.com/2013/12/free-language-lessons-for-computers.html">computer systems</a> can <a href="http://googleresearch.blogspot.com/2015/06/a-multilingual-corpus-of-automatically.html">read</a> and <a href="http://googleresearch.blogspot.com/2014/11/a-picture-is-worth-thousand-coherent.html">understand</a> <a href="http://googleresearch.blogspot.com/2014/08/teaching-machines-to-read-between-lines.html">human language</a> in order <a href="http://googleresearch.blogspot.com/2014/04/a-billion-words-because-todays-language.html">to process it</a> in <a href="http://googleresearch.blogspot.com/2015/11/computer-respond-to-this-email.html">intelligent ways</a>. Today, we are excited to share the fruits of our research with the broader community by releasing <a href="https://github.com/tensorflow/models/tree/master/syntaxnet">SyntaxNet</a>, an open-source neural network framework implemented in <a href="https://www.tensorflow.org/">TensorFlow</a> that provides a foundation for <a href="https://en.wikipedia.org/wiki/Natural_language_understanding">Natural Language Understanding</a> (NLU) systems. Our release includes all the code needed to train new SyntaxNet models on your own data, as well as <i>Parsey McParseface</i>, an English parser that we have trained for you and that you can use to analyze English text.<br /><br />Parsey McParseface is built on powerful machine learning algorithms that learn to analyze the linguistic structure of language, and that can explain the functional role of each word in a given sentence. 
Because Parsey McParseface is the <a href="http://arxiv.org/abs/1603.06042">most accurate such model in the world</a>, we hope that it will be useful to developers and researchers interested in automatic extraction of information, translation, and other core applications of NLU. <br /><br /><b>How does SyntaxNet work?</b><br /><br />SyntaxNet is a framework for what’s known in academic circles as a <a href="https://en.wikipedia.org/wiki/Parsing"><i>syntactic parser</i></a>, which is a key first component in many NLU systems. Given a sentence as input, it tags each word with a part-of-speech (POS) tag that describes the word's syntactic function, and it determines the syntactic relationships between words in the sentence, represented in the dependency parse tree. These syntactic relationships are directly related to the underlying meaning of the sentence in question. To take a very simple example, consider the following dependency tree for <i>Alice saw Bob</i>:<br /><br /><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-M-7PIST2hq8/VzTFhESMeuI/AAAAAAAAA_s/k4wOQe0UlnwmoVnZtuU6CNHw6xLQRN7egCLcB/s1600/asawb.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="262" src="https://3.bp.blogspot.com/-M-7PIST2hq8/VzTFhESMeuI/AAAAAAAAA_s/k4wOQe0UlnwmoVnZtuU6CNHw6xLQRN7egCLcB/s320/asawb.png" width="320" /></a></div><br />This structure encodes that <i>Alice</i> and <i>Bob</i> are nouns and <i>saw</i> is a verb. The main verb <i>saw</i> is the root of the sentence and <i>Alice</i> is the subject (nsubj) of <i>saw</i>, while <i>Bob</i> is its direct object (dobj). 
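To make this concrete, a tree like the one above can be written down as a small table of (word, POS tag, head index, dependency relation) rows and queried directly to recover who did what to whom. This compact tuple encoding is our own illustration, not SyntaxNet's actual output format.

```python
# One row per word: (word, part-of-speech, index of head word, relation).
# A head index of -1 marks the root. This encodes the tree for "Alice saw Bob".
PARSE = [
    ("Alice", "NOUN", 1, "nsubj"),   # subject of "saw"
    ("saw",   "VERB", -1, "root"),
    ("Bob",   "NOUN", 1, "dobj"),    # direct object of "saw"
]

def dependents(parse, relation):
    """Return the words attached to their head by the given relation."""
    return [word for word, pos, head, rel in parse if rel == relation]

subject = dependents(PARSE, "nsubj")        # answers "who saw Bob?"
direct_object = dependents(PARSE, "dobj")   # answers "whom did Alice see?"
```

Reading the nsubj and dobj rows back out is exactly the kind of question answering over grammatical relationships discussed below.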
As expected, Parsey McParseface analyzes this sentence correctly, but also understands the following more complex example:<br /><br /><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-1Ntx47T1WvU/VzTF2HgbqrI/AAAAAAAAA_w/UWofRQPhqU0ITD5HPQmEVCrwsEroCN8PQCLcB/s1600/long.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="202" src="https://4.bp.blogspot.com/-1Ntx47T1WvU/VzTF2HgbqrI/AAAAAAAAA_w/UWofRQPhqU0ITD5HPQmEVCrwsEroCN8PQCLcB/s640/long.png" width="640" /></a></div><br />This structure again encodes the fact that <i>Alice</i> and <i>Bob</i> are the subject and object respectively of <i>saw</i>, as well as that <i>Alice</i> is modified by a relative clause with the verb <i>reading</i>, that <i>saw</i> is modified by the temporal modifier <i>yesterday</i>, and so on. The grammatical relationships encoded in dependency structures allow us to easily recover the answers to various questions, for example <i>whom did Alice see?</i>, <i>who saw Bob?</i>, <i>what had Alice been reading about?</i> or <i>when did Alice see Bob?</i>. <br /><br /><b>Why is Parsing So Hard For Computers to Get Right?</b><br /><br />One of the main problems that make parsing so challenging is that human languages show remarkable levels of ambiguity. It is not uncommon for moderate-length sentences, say 20 or 30 words, to have hundreds, thousands, or even tens of thousands of possible syntactic structures. A natural language parser must somehow search through all of these alternatives, and find the most plausible structure given the context. 
As a very simple example, the sentence <i>Alice drove down the street in her car</i> has at least two possible dependency parses:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-cXYL6RGkV_g/VzTDzbh6yEI/AAAAAAAAA_Q/1c-76sGQ124oE9njB2E6QzU6KcxDCn0KgCLcB/s1600/drovedown.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="136" src="https://2.bp.blogspot.com/-cXYL6RGkV_g/VzTDzbh6yEI/AAAAAAAAA_Q/1c-76sGQ124oE9njB2E6QzU6KcxDCn0KgCLcB/s640/drovedown.png" width="640" /></a></div><br />The first corresponds to the (correct) interpretation where Alice is driving in her car; the second corresponds to the (absurd, but possible) interpretation where the street is located in her car. The ambiguity arises because the preposition <i>in</i> can either modify <i>drove</i> or <i>street</i>; this example is an instance of what is called <i>prepositional phrase attachment ambiguity</i>. <br /><br />Humans do a remarkable job of dealing with ambiguity, almost to the point where the problem is unnoticeable; the challenge is for computers to do the same. Multiple ambiguities such as these in longer sentences conspire to give a combinatorial explosion in the number of possible structures for a sentence. Usually the vast majority of these structures are wildly implausible, but are nevertheless possible and must be somehow discarded by a parser. <br /><br />SyntaxNet applies neural networks to the ambiguity problem. An input sentence is processed from left to right, with dependencies between words being incrementally added as each word in the sentence is considered. At each point in processing many decisions may be possible—due to ambiguity—and a neural network gives scores for competing decisions based on their plausibility. For this reason, it is very important to use <i><a href="https://en.wikipedia.org/wiki/Beam_search">beam search</a></i> in the model. 
Instead of simply taking the first-best decision at each point, multiple partial hypotheses are kept at each step, with hypotheses only being discarded when there are several other higher-ranked hypotheses under consideration. An example of a left-to-right sequence of decisions that produces a simple parse is shown below for the sentence <i>I booked a ticket to Google</i>.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-fqtmVS97tOs/VzTEAI9BQ8I/AAAAAAAAA_U/xPj0Av64sGseS0rF4Z1BbhmS77J-HuEvwCLcB/s1600/image04.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="336" src="https://2.bp.blogspot.com/-fqtmVS97tOs/VzTEAI9BQ8I/AAAAAAAAA_U/xPj0Av64sGseS0rF4Z1BbhmS77J-HuEvwCLcB/s640/image04.gif" width="640" /></a></div>Furthermore, as described in our <a href="http://arxiv.org/abs/1603.06042">paper</a>, it is critical to tightly <i>integrate learning and search</i> in order to achieve the highest prediction accuracy. Parsey McParseface and other <a href="https://github.com/tensorflow/models/tree/master/syntaxnet">SyntaxNet</a> models are some of the most complex networks that we have trained with the <a href="https://www.tensorflow.org/">TensorFlow</a> framework at Google. Given some data from the Google-supported <a href="http://universaldependencies.org/">Universal Dependencies</a> project, you can train a parsing model on your own machine.<br /><br /><b>So How Accurate is Parsey McParseface?</b><br /><br />On a standard benchmark consisting of randomly drawn English newswire sentences (the 20-year-old <a href="https://www.cis.upenn.edu/~treebank/">Penn Treebank</a>), Parsey McParseface recovers individual dependencies between words with over 94% accuracy, beating our own previous state-of-the-art results, which were already <a href="http://arxiv.org/abs/1603.06042">better than any previous approach</a>. 
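To recap the left-to-right decision process described above, the toy shift-reduce sketch below replays one hand-picked action sequence for <i>Alice saw Bob</i>. The action sequence here is chosen by hand purely for illustration; in SyntaxNet a neural network scores the candidate actions at every step, and beam search keeps several competing hypotheses alive.

```python
def run_transitions(words, actions):
    """Apply SHIFT / LEFT:rel / RIGHT:rel actions and collect
    (head_index, relation, dependent_index) arcs."""
    stack, buffer, arcs = [], list(range(len(words))), []
    for action in actions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))      # move next word onto the stack
        elif action.startswith("LEFT:"):     # second-from-top depends on top
            dependent = stack.pop(-2)
            arcs.append((stack[-1], action.split(":")[1], dependent))
        elif action.startswith("RIGHT:"):    # top depends on second-from-top
            dependent = stack.pop()
            arcs.append((stack[-1], action.split(":")[1], dependent))
    return arcs

words = ["Alice", "saw", "Bob"]
# One decision per step; a real parser would score alternatives at each
# step and keep multiple partial hypotheses in the beam.
arcs = run_transitions(words, ["SHIFT", "SHIFT", "LEFT:nsubj",
                               "SHIFT", "RIGHT:dobj"])
# arcs now records saw -> Alice (nsubj) and saw -> Bob (dobj)
```

Each ambiguous step, e.g. whether a prepositional phrase attaches LEFT or RIGHT, is exactly where the network's plausibility scores and the beam matter.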
While there are no explicit studies in the literature about human performance, we know from our in-house annotation projects that linguists trained for this task agree in 96-97% of the cases. This suggests that we are approaching human performance—but only on well-formed text. Sentences drawn from the web are a lot harder to analyze, as we learned from the <a href="http://googleresearch.blogspot.com/2011/03/building-resources-to-syntactically.html">Google WebTreebank</a> (released in 2011). Parsey McParseface achieves just over 90% parse accuracy on this dataset. <br /><br />While the accuracy is not perfect, it’s certainly high enough to be useful in many applications. The major sources of error at this point are examples such as the prepositional phrase attachment ambiguity described above, which require real-world knowledge (e.g. that a street is not likely to be located in a car) and deep contextual reasoning. Machine learning (and in particular, neural networks) has made significant progress in resolving these ambiguities. But we still have our work cut out for us: we would like to develop methods that can learn world knowledge and enable equal understanding of natural language across <i>all</i> languages and contexts.<br /><br />To get started, see the <a href="https://github.com/tensorflow/models/tree/master/syntaxnet">SyntaxNet</a> code and download the Parsey McParseface parser model. Happy parsing from the main developers, Chris Alberti, David Weiss, Daniel Andor, Michael Collins &amp; Slav Petrov.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/announcing-syntaxnet-the-worlds-most-accurate-parser-goes-open-source/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Research at Google and ICLR 2016</title>
		<link>https://googledata.org/google-research/research-at-google-and-iclr-2016/</link>
		<comments>https://googledata.org/google-research/research-at-google-and-iclr-2016/#comments</comments>
		<pubDate>Sun, 01 May 2016 18:49:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=608993a5737f006c216c7109379ff4e1</guid>
		<description><![CDATA[<span>Posted by Dumitru Erhan, Gentleman Scientist</span><br /><br />This week, San Juan, Puerto Rico hosts the <a href="http://www.iclr.cc/doku.php?id=iclr2016:main">4th International Conference on Learning Representations</a> (ICLR 2016), a conference focused on how one can learn meaningful and useful representations of data for <a href="https://en.wikipedia.org/wiki/Machine_learning">Machine Learning</a>. ICLR includes conference and workshop tracks, with invited talks along with oral and poster presentations of some of the latest research on deep learning, metric learning, kernel learning, compositional models, non-linear structured prediction, and issues regarding non-convex optimization. <br /><br />At the forefront of innovation in cutting-edge technology in <a href="https://en.wikipedia.org/wiki/Artificial_neural_network">Neural Networks</a> and <a href="https://en.wikipedia.org/wiki/Deep_learning">Deep Learning</a>, Google focuses on both theory and application, developing learning approaches to understand and generalize. As Platinum Sponsor of ICLR 2016, Google will have a strong presence with over 40 researchers attending (many from the <a href="https://research.google.com/teams/brain/">Google Brain team</a> and <a href="https://deepmind.com/">Google DeepMind</a>), contributing to and learning from the broader academic research community by presenting papers and posters, in addition to participating on organizing committees and in workshops. <br /><br />If you are attending ICLR 2016, we hope you&#8217;ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for billions of people. 
You can also learn more about our research being presented at ICLR 2016 in the list below (Googlers highlighted in <span>blue</span>).<br /><br /><b><u>Organizing Committee</u></b><br /><br /><b>Program Chairs</b><br /><i><span>Samy Bengio</span>, Brian Kingsbury</i><br /><br /><b>Area Chairs include:</b><br /><i><span>John Platt</span>, <span>Tara Sainath</span></i><br /><br /><b><u>Oral Sessions</u></b><br /><br /><a href="http://arxiv.org/abs/1511.06279">Neural Programmer-Interpreters</a>&#160;<i>(Best Paper Award Recipient)</i><br /><i>Scott Reed, <span>Nando de Freitas</span></i><br /><br /><a href="http://arxiv.org/abs/1511.05641">Net2Net: Accelerating Learning via Knowledge Transfer </a><br /><i>Tianqi Chen, <span>Ian Goodfellow</span>, <span>Jon Shlens</span></i><br /><br /><b><u>Conference Track Posters</u></b><br /><a href="http://arxiv.org/abs/1511.05952"><br /></a> <a href="http://arxiv.org/abs/1511.05952">Prioritized Experience Replay </a><br /><i><span>Tom Schaul</span></i><i>,</i><i><span>&#160;John Quan</span></i><i>,</i><i><span>&#160;Ioannis Antonoglou</span></i><i>,</i><i><span>&#160;David Silver</span></i><br /><a href="http://arxiv.org/abs/1509.06664"><br /></a> <a href="http://arxiv.org/abs/1509.06664">Reasoning about Entailment with Neural Attention </a><br /><i>Tim Rockt&#228;schel, <span>Edward Grefenstette</span></i><i>,&#160;</i><i><span>Karl Moritz Hermann</span></i><i>,</i><i><span>&#160;Tom&#225;&#353; Ko&#269;isk&#253;</span></i><i>,</i><i><span>&#160;Phil Blunsom</span></i><br /><br /><a href="http://arxiv.org/abs/1511.04834">Neural Programmer: Inducing Latent Programs With Gradient Descent </a><br /><i>Arvind Neelakantan, <span>Quoc Le</span></i><i>,</i><i><span>&#160;Ilya Sutskever</span></i><br /><a href="http://arxiv.org/abs/1511.05176"><br /></a> <a href="http://arxiv.org/abs/1511.05176">MuProp: Unbiased Backpropagation For Stochastic Neural Networks</a><br /><i>Shixiang Gu, <span>Sergey 
Levine</span></i><i>,</i><i><span>&#160;Ilya Sutskever</span></i><i>,</i><i><span>&#160;Andriy Mnih</span></i><br /><a href="http://arxiv.org/abs/1511.06114"><br /></a> <a href="http://arxiv.org/abs/1511.06114">Multi-Task Sequence to Sequence Learning </a><br /><i>Minh-Thang Luong, <span>Quoc Le</span></i><i>,&#160;</i><i><span>Ilya Sutskever</span></i><i>,</i><i><span>&#160;Oriol Vinyals</span></i><i>,</i><i><span>&#160;Lukasz Kaiser</span></i><br /><br /><a href="http://arxiv.org/abs/1511.04581">A Test of Relative Similarity for Model Selection in Generative Models </a><br /><i>Eugene Belilovsky, Wacha Bounliphone, Matthew Blaschko, <span>Ioannis Antonoglou</span>, Arthur Gretton</i><br /><br /><a href="http://arxiv.org/abs/1509.02971">Continuous control with deep reinforcement learning</a><br /><i><span>Timothy Lillicrap</span></i><i>,</i><i><span>&#160;Jonathan Hunt</span></i><i>,&#160;</i><i><span>Alexander Pritzel</span></i><i>,</i><i><span>&#160;Nicolas Heess</span></i><i>,</i><i><span>&#160;Tom Erez</span></i><i>,</i><i><span>&#160;Yuval Tassa</span></i><i>,</i><i><span>&#160;David Silver</span></i><i>,</i><i><span>&#160;Daan Wierstra</span></i><br /><br /><a href="http://arxiv.org/abs/1511.06295">Policy Distillation</a><br /><i><span>Andrei Rusu</span></i><i>,</i><i><span>&#160;Sergio Gomez</span></i><i>,</i><i><span>&#160;</span>Caglar Gulcehre,<span> Guillaume Desjardins</span></i><i>,</i><i><span>&#160;James Kirkpatrick</span></i><i>,</i><i><span>&#160;Razvan Pascanu</span></i><i>,</i><i><span>&#160;Volodymyr Mnih</span></i><i>,</i><i><span>&#160;Koray Kavukcuoglu</span></i><i>,</i><i><span>&#160;Raia Hadsell</span></i><br /><br /><a href="http://arxiv.org/abs/1511.06392">Neural Random-Access Machines</a><br /><i><span>Karol Kurach</span></i><i>,</i><i><span>&#160;Marcin Andrychowicz</span></i><i>,</i><i><span>&#160;Ilya Sutskever</span></i><br /><br /><a href="http://arxiv.org/abs/1511.06085">Variable Rate Image Compression with Recurrent Neural 
Networks </a><br /><i><span>George Toderici</span></i><i>,</i><i><span>&#160;Sean O'Malley</span></i><i>,</i><i><span>&#160;Damien Vincent</span></i><i>,</i><i><span>&#160;Sung Jin Hwang</span></i><i>,</i><i><span>&#160;Michele Covell</span></i><i>,</i><i><span>&#160;Shumeet Baluja</span></i><i>,</i><i><span>&#160;Rahul Sukthankar</span></i><i>,</i><i><span>&#160;David Minnen</span></i><br /><br /><a href="http://arxiv.org/abs/1511.06391">Order Matters: Sequence to Sequence for Sets</a><br /><i><span>Oriol Vinyals</span></i><i>,</i><i><span>&#160;Samy Bengio</span></i><i>,</i><i><span>&#160;Manjunath Kudlur</span></i><br /><a href="http://arxiv.org/abs/1507.01526"><br /></a> <a href="http://arxiv.org/abs/1507.01526">Grid Long Short-Term Memory</a><br /><i><span>Nal Kalchbrenner</span></i><i>,</i><i><span>&#160;Alex Graves</span></i><i>,</i><i><span>&#160;Ivo Danihelka</span></i><br /><a href="http://arxiv.org/abs/1511.08228"><br /></a> <a href="http://arxiv.org/abs/1511.08228">Neural GPUs Learn Algorithms</a><br /><i><span>Lukasz Kaiser</span></i><i>,</i><i><span>&#160;Ilya Sutskever</span></i><br /><br /><a href="http://arxiv.org/abs/1511.05946">ACDC: A Structured Efficient Linear Layer</a><br /><i>Marcin Moczulski, <span>Misha Denil</span>, Jeremy Appleyard, <span>Nando de Freitas</span></i><br /><br /><b><u>Workshop Track Posters</u></b><br /><a href="http://beta.openreview.net/forum?id=D1VDZ5kMAu5jEJ1zfEWL"><br /></a> <a href="http://beta.openreview.net/forum?id=D1VDZ5kMAu5jEJ1zfEWL">Revisiting Distributed Synchronous SGD </a><br /><i><span>Jianmin Chen</span></i><i>,</i><i><span>&#160;Rajat Monga</span></i><i>,</i><i><span>&#160;Samy Bengio</span></i><i>,</i><i><span>&#160;Rafal Jozefowicz</span></i><br /><br /><a href="http://beta.openreview.net/forum?id=P7q1lVQQvSKvjNORtJZL">Black Box Variational Inference for State Space Models </a><br /><i>Evan Archer, Il Memming Park, <span>Lars Buesing</span>, John Cunningham, Liam Paninski</i><br /><a 
href="http://beta.openreview.net/forum?id=BNYYGWVA1F7PwR1riED4"><br /></a> <a href="http://beta.openreview.net/forum?id=BNYYGWVA1F7PwR1riED4">A Minimalistic Approach to Sum-Product Network Learning for Real Applications </a><br /><i>Viktoriya Krakovna, <span>Moshe Looks</span></i><br /><br /><a href="http://beta.openreview.net/forum?id=E8Vg037q7f31v0m2iqn3">Efficient Inference in Occlusion-Aware Generative Models of Images </a><br /><i><span>Jonathan Huang</span>, <span>Kevin Murphy</span></i><br /><a href="http://beta.openreview.net/forum?id=q7kqBkL33f8LEkD3t7X9"><br /></a> <a href="http://beta.openreview.net/forum?id=q7kqBkL33f8LEkD3t7X9">Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning </a><br /><i><span>Christian Szegedy</span></i><i>,</i><i><span>&#160;Sergey Ioffe</span></i><i>,</i><i><span>&#160;Vincent Vanhoucke</span></i><br /><br /><a href="http://beta.openreview.net/forum?id=E8VEozRYyi31v0m2iDwy">Deep Autoresolution Networks </a><br /><i><span>Gabriel Pereyra</span>, <span>Christian Szegedy</span></i><br /><br /><a href="http://beta.openreview.net/forum?id=oVgo4M4RRIrlgPMRsBz5">Learning visual groups from co-occurrences in space and time </a><br /><i>Phillip Isola, Daniel Zoran, <span>Dilip Krishnan</span>, Edward H. Adelson</i><br /><br /><a href="http://beta.openreview.net/forum?id=ZY9xxQDMMu5Pk8ELfEz4">Adding Gradient Noise Improves Learning For Very Deep Networks </a><br /><i>Arvind Neelakantan, Luke Vilnis, <span>Quoc V. 
Le</span></i><i>,</i><i><span>&#160;Ilya Sutskever</span></i><i>,</i><i><span>&#160;Lukasz Kaiser</span></i><i>,</i><i><span>&#160;Karol Kurach</span>, James Martens</i><br /><br /><a href="http://beta.openreview.net/forum?id=2xwp4Zwr3TpKBZvXtWoj">Adversarial Autoencoders </a><br /><i>Alireza Makhzani, <span>Jonathon Shlens</span></i><i>,</i><i><span>&#160;Navdeep Jaitly</span></i><i>,</i><i><span>&#160;Ian Goodfellow</span></i><br /><br /><a href="http://beta.openreview.net/forum?id=D1VVBv7BKS5jEJ1zfxJg">Generating Sentences from a Continuous Space </a><br /><i>Samuel R. Bowman, Luke Vilnis, <span>Oriol Vinyals</span></i><i>,</i><i><span>&#160;Andrew M. Dai</span></i><i>,</i><i><span>&#160;Rafal Jozefowicz</span></i><i>,</i><i><span>&#160;Samy Bengio</span></i>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Dumitru Erhan, Gentleman Scientist</span><br /><br />This week, San Juan, Puerto Rico hosts the <a href="http://www.iclr.cc/doku.php?id=iclr2016:main">4th International Conference on Learning Representations</a> (ICLR 2016), a conference focused on how one can learn meaningful and useful representations of data for <a href="https://en.wikipedia.org/wiki/Machine_learning">Machine Learning</a>. ICLR includes conference and workshop tracks, with invited talks along with oral and poster presentations of some of the latest research on deep learning, metric learning, kernel learning, compositional models, non-linear structured prediction, and issues regarding non-convex optimization. <br /><br />At the forefront of innovation in cutting-edge technology in <a href="https://en.wikipedia.org/wiki/Artificial_neural_network">Neural Networks</a> and <a href="https://en.wikipedia.org/wiki/Deep_learning">Deep Learning</a>, Google focuses on both theory and application, developing learning approaches to understand and generalize. As Platinum Sponsor of ICLR 2016, Google will have a strong presence with over 40 researchers attending (many from the <a href="https://research.google.com/teams/brain/">Google Brain team</a> and <a href="https://deepmind.com/">Google DeepMind</a>), contributing to and learning from the broader academic research community by presenting papers and posters, in addition to participating on organizing committees and in workshops. <br /><br />If you are attending ICLR 2016, we hope you’ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for billions of people. 
You can also learn more about our research being presented at ICLR 2016 in the list below (Googlers highlighted in <span style="color: #3d85c6;">blue</span>).<br /><br /><b><u>Organizing Committee</u></b><br /><br /><b>Program Chairs</b><br /><i><span style="color: #3d85c6;">Samy Bengio</span>, Brian Kingsbury</i><br /><br /><b>Area Chairs include:</b><br /><i><span style="color: #3d85c6;">John Platt</span>, <span style="color: #3d85c6;">Tara Sainath</span></i><br /><br /><b><u>Oral Sessions</u></b><br /><br /><a href="http://arxiv.org/abs/1511.06279">Neural Programmer-Interpreters</a>&nbsp;<i>(Best Paper Award Recipient)</i><br /><i>Scott Reed, <span style="color: #3d85c6;">Nando de Freitas</span></i><br /><br /><a href="http://arxiv.org/abs/1511.05641">Net2Net: Accelerating Learning via Knowledge Transfer </a><br /><i>Tianqi Chen, <span style="color: #3d85c6;">Ian Goodfellow</span>, <span style="color: #3d85c6;">Jon Shlens</span></i><br /><br /><b><u>Conference Track Posters</u></b><br /><a href="http://arxiv.org/abs/1511.05952"><br /></a> <a href="http://arxiv.org/abs/1511.05952">Prioritized Experience Replay </a><br /><i><span style="color: #3d85c6;">Tom Schaul</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;John Quan</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Ioannis Antonoglou</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;David Silver</span></i><br /><a href="http://arxiv.org/abs/1509.06664"><br /></a> <a href="http://arxiv.org/abs/1509.06664">Reasoning about Entailment with Neural Attention </a><br /><i>Tim Rocktäschel, <span style="color: #3d85c6;">Edward Grefenstette</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">Karl Moritz Hermann</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Tomáš Kočiský</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Phil Blunsom</span></i><br /><br /><a href="http://arxiv.org/abs/1511.04834">Neural Programmer: Inducing Latent Programs With Gradient Descent </a><br 
/><i>Arvind Neelakantan, <span style="color: #3d85c6;">Quoc Le</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Ilya Sutskever</span></i><br /><a href="http://arxiv.org/abs/1511.05176"><br /></a> <a href="http://arxiv.org/abs/1511.05176">MuProp: Unbiased Backpropagation For Stochastic Neural Networks</a><br /><i>Shixiang Gu, <span style="background-color: white; color: #3d85c6;">Sergey Levine</span></i><i>,</i><i><span style="background-color: white; color: #3d85c6;">&nbsp;Ilya Sutskever</span></i><i>,</i><i><span style="background-color: white; color: #3d85c6;">&nbsp;Andriy Mnih</span></i><br /><a href="http://arxiv.org/abs/1511.06114"><br /></a> <a href="http://arxiv.org/abs/1511.06114">Multi-Task Sequence to Sequence Learning </a><br /><i>Minh-Thang Luong, <span style="color: #3d85c6;">Quoc Le</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">Ilya Sutskever</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Oriol Vinyals</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Lukasz Kaiser</span></i><br /><br /><a href="http://arxiv.org/abs/1511.04581">A Test of Relative Similarity for Model Selection in Generative Models </a><br /><i>Eugene Belilovsky, Wacha Bounliphone, Matthew Blaschko, <span style="color: #3d85c6;">Ioannis Antonoglou</span>, Arthur Gretton</i><br /><br /><a href="http://arxiv.org/abs/1509.02971">Continuous control with deep reinforcement learning</a><br /><i><span style="color: #3d85c6;">Timothy Lillicrap</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Jonathan Hunt</span></i><i>,&nbsp;</i><i><span style="color: #3d85c6;">Alexander Pritzel</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Nicolas Heess</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Tom Erez</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Yuval Tassa</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;David Silver</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Daan Wierstra</span></i><br /><br /><a 
href="http://arxiv.org/abs/1511.06295">Policy Distillation</a><br /><i><span style="color: #3d85c6;">Andrei Rusu</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Sergio Gomez</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;</span>Caglar Gulcehre,<span style="color: #3d85c6;"> Guillaume Desjardins</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;James Kirkpatrick</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Razvan Pascanu</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Volodymyr Mnih</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Koray Kavukcuoglu</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Raia Hadsell</span></i><br /><br /><a href="http://arxiv.org/abs/1511.06392">Neural Random-Access Machines</a><br /><i><span style="color: #3d85c6;">Karol Kurach</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Marcin Andrychowicz</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Ilya Sutskever</span></i><br /><br /><a href="http://arxiv.org/abs/1511.06085">Variable Rate Image Compression with Recurrent Neural Networks </a><br /><i><span style="color: #3d85c6;">George Toderici</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Sean O'Malley</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Damien Vincent</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Sung Jin Hwang</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Michele Covell</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Shumeet Baluja</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Rahul Sukthankar</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;David Minnen</span></i><br /><br /><a href="http://arxiv.org/abs/1511.06391">Order Matters: Sequence to Sequence for Sets</a><br /><i><span style="color: #3d85c6;">Oriol Vinyals</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Samy Bengio</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Manjunath Kudlur</span></i><br /><a 
href="http://arxiv.org/abs/1507.01526"><br /></a> <a href="http://arxiv.org/abs/1507.01526">Grid Long Short-Term Memory</a><br /><i><span style="color: #3d85c6;">Nal Kalchbrenner</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Alex Graves</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Ivo Danihelka</span></i><br /><a href="http://arxiv.org/abs/1511.08228"><br /></a> <a href="http://arxiv.org/abs/1511.08228">Neural GPUs Learn Algorithms</a><br /><i><span style="color: #3d85c6;">Lukasz Kaiser</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Ilya Sutskever</span></i><br /><br /><a href="http://arxiv.org/abs/1511.05946">ACDC: A Structured Efficient Linear Layer</a><br /><i>Marcin Moczulski, <span style="color: #3d85c6;">Misha Denil</span>, Jeremy Appleyard, <span style="color: #3d85c6;">Nando de Freitas</span></i><br /><br /><b><u>Workshop Track Posters</u></b><br /><a href="http://beta.openreview.net/forum?id=D1VDZ5kMAu5jEJ1zfEWL"><br /></a> <a href="http://beta.openreview.net/forum?id=D1VDZ5kMAu5jEJ1zfEWL">Revisiting Distributed Synchronous SGD </a><br /><i><span style="color: #3d85c6;">Jianmin Chen</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Rajat Monga</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Samy Bengio</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Rafal Jozefowicz</span></i><br /><br /><a href="http://beta.openreview.net/forum?id=P7q1lVQQvSKvjNORtJZL">Black Box Variational Inference for State Space Models </a><br /><i>Evan Archer, Il Memming Park, <span style="color: #3d85c6;">Lars Buesing</span>, John Cunningham, Liam Paninski</i><br /><a href="http://beta.openreview.net/forum?id=BNYYGWVA1F7PwR1riED4"><br /></a> <a href="http://beta.openreview.net/forum?id=BNYYGWVA1F7PwR1riED4">A Minimalistic Approach to Sum-Product Network Learning for Real Applications </a><br /><i>Viktoriya Krakovna, <span style="color: #3d85c6;">Moshe Looks</span></i><br /><br /><a 
href="http://beta.openreview.net/forum?id=E8Vg037q7f31v0m2iqn3">Efficient Inference in Occlusion-Aware Generative Models of Images </a><br /><i><span style="color: #3d85c6;">Jonathan Huang</span>, <span style="color: #3d85c6;">Kevin Murphy</span></i><br /><a href="http://beta.openreview.net/forum?id=q7kqBkL33f8LEkD3t7X9"><br /></a> <a href="http://beta.openreview.net/forum?id=q7kqBkL33f8LEkD3t7X9">Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning </a><br /><i><span style="color: #3d85c6;">Christian Szegedy</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Sergey Ioffe</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Vincent Vanhoucke</span></i><br /><br /><a href="http://beta.openreview.net/forum?id=E8VEozRYyi31v0m2iDwy">Deep Autoresolution Networks </a><br /><i><span style="color: #3d85c6;">Gabriel Pereyra</span>, <span style="color: #3d85c6;">Christian Szegedy</span></i><br /><br /><a href="http://beta.openreview.net/forum?id=oVgo4M4RRIrlgPMRsBz5">Learning visual groups from co-occurrences in space and time </a><br /><i>Phillip Isola, Daniel Zoran, <span style="color: #3d85c6;">Dilip Krishnan</span>, Edward H. Adelson</i><br /><br /><a href="http://beta.openreview.net/forum?id=ZY9xxQDMMu5Pk8ELfEz4">Adding Gradient Noise Improves Learning For Very Deep Networks </a><br /><i>Arvind Neelakantan, Luke Vilnis, <span style="color: #3d85c6;">Quoc V. 
Le</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Ilya Sutskever</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Lukasz Kaiser</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Karol Kurach</span>, James Martens</i><br /><br /><a href="http://beta.openreview.net/forum?id=2xwp4Zwr3TpKBZvXtWoj">Adversarial Autoencoders </a><br /><i>Alireza Makhzani, <span style="color: #3d85c6;">Jonathon Shlens</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Navdeep Jaitly</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Ian Goodfellow</span></i><br /><br /><a href="http://beta.openreview.net/forum?id=D1VVBv7BKS5jEJ1zfxJg">Generating Sentences from a Continuous Space </a><br /><i>Samuel R. Bowman, Luke Vilnis, <span style="color: #3d85c6;">Oriol Vinyals</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Andrew M. Dai</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Rafal Jozefowicz</span></i><i>,</i><i><span style="color: #3d85c6;">&nbsp;Samy Bengio</span></i>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/research-at-google-and-iclr-2016/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DeepMind moves to TensorFlow</title>
		<link>https://googledata.org/google-research/deepmind-moves-to-tensorflow/</link>
		<comments>https://googledata.org/google-research/deepmind-moves-to-tensorflow/#comments</comments>
		<pubDate>Fri, 29 Apr 2016 16:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=4c5a3433f7e9cc00a08cf8a0ef7dff09</guid>
		<description><![CDATA[<span>Posted by Koray Kavukcuoglu, Research Scientist, Google DeepMind</span><br /><br />At <a href="http://deepmind.com/">DeepMind</a>, we conduct state-of-the-art <a href="https://deepmind.com/publications.html">research</a> on a wide range of algorithms, from deep learning and reinforcement learning to systems neuroscience, towards the goal of building <a href="https://en.wikipedia.org/wiki/Artificial_general_intelligence">Artificial General Intelligence</a>. A key factor in facilitating rapid progress is the software environment used for research. For nearly four years, the open source <a href="http://torch.ch/">Torch7</a> machine learning library has served as our primary research platform, combining excellent flexibility with very fast runtime execution, enabling rapid prototyping. Our team has been proud to contribute to the open source project in capacities ranging from occasional bug fixes to being core maintainers of several crucial components.<br /><br />With Google&#8217;s recent open source release of <a href="http://tensorflow.org/">TensorFlow</a>, we initiated a project to test its suitability for our research environment. Over the last six months, we have re-implemented more than a dozen different projects in TensorFlow to develop a deeper understanding of its potential use cases and the tradeoffs for research. Today we are excited to announce that DeepMind will start using TensorFlow for all our future research. We believe that TensorFlow will enable us to execute our ambitious research goals at much larger scale and an even faster pace, providing us with a unique opportunity to further accelerate our research programme.<br /><br />As one of the core contributors of Torch7, I have had the pleasure of working closely with an excellent community of developers and researchers, and it has been amazing to see all the great work that has been built on top of the platform and the impact this has had on the field. 
Torch7 is currently being used by Facebook, Twitter, and many start-ups and academic labs as well as DeepMind, and I&#8217;m proud of the significant contribution it has made to a large community in both research and industry. Our transition to TensorFlow represents a new chapter, and I feel very excited about the prospect of DeepMind contributing heavily to another great open source machine learning platform that everyone can use to advance the state-of-the-art.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Koray Kavukcuoglu, Research Scientist, Google DeepMind</span><br /><br />At <a href="http://deepmind.com/">DeepMind</a>, we conduct state-of-the-art <a href="https://deepmind.com/publications.html">research</a> on a wide range of algorithms, from deep learning and reinforcement learning to systems neuroscience, towards the goal of building <a href="https://en.wikipedia.org/wiki/Artificial_general_intelligence">Artificial General Intelligence</a>. A key factor in facilitating rapid progress is the software environment used for research. For nearly four years, the open source <a href="http://torch.ch/">Torch7</a> machine learning library has served as our primary research platform, combining excellent flexibility with very fast runtime execution, enabling rapid prototyping. Our team has been proud to contribute to the open source project in capacities ranging from occasional bug fixes to being core maintainers of several crucial components.<br /><br />With Google’s recent open source release of <a href="http://tensorflow.org/">TensorFlow</a>, we initiated a project to test its suitability for our research environment. Over the last six months, we have re-implemented more than a dozen different projects in TensorFlow to develop a deeper understanding of its potential use cases and the tradeoffs for research. Today we are excited to announce that DeepMind will start using TensorFlow for all our future research. 
We believe that TensorFlow will enable us to execute our ambitious research goals at much larger scale and an even faster pace, providing us with a unique opportunity to further accelerate our research programme.<br /><br />As one of the core contributors of Torch7, I have had the pleasure of working closely with an excellent community of developers and researchers, and it has been amazing to see all the great work that has been built on top of the platform and the impact this has had on the field. Torch7 is currently being used by Facebook, Twitter, and many start-ups and academic labs as well as DeepMind, and I’m proud of the significant contribution it has made to a large community in both research and industry. Our transition to TensorFlow represents a new chapter, and I feel very excited about the prospect of DeepMind contributing heavily to another great open source machine learning platform that everyone can use to advance the state-of-the-art.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/deepmind-moves-to-tensorflow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computer Science Education for All Students</title>
		<link>https://googledata.org/google-research/computer-science-education-for-all-students/</link>
		<comments>https://googledata.org/google-research/computer-science-education-for-all-students/#comments</comments>
		<pubDate>Tue, 26 Apr 2016 13:15:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[education]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=8d612ec342e421069472750fe00313b0</guid>
		<description><![CDATA[<span>Posted by Maggie Johnson, Director of Education and University Relations</span><br /><br /><i>(Cross-posted on the <a href="http://googleforeducation.blogspot.com/2016/04/computer-science-education-for-all.html">Google for Education Blog</a>)</i><br /><br />Computer science education is a pathway to innovation, to creativity, and to exciting career prospects. No longer considered an optional skill, CS is quickly becoming a &#8220;new basic&#8221;, foundational for learning. In order for our students to be equipped for the world of tomorrow, we need to provide them with access to computer science education today. <br /><br />At Google, we believe that all students deserve these opportunities. Today we <a href="http://www.csecoalition.org/">join</a> some of America&#8217;s leading companies, governors, and educators to support an <a href="https://www.washingtonpost.com/local/education/top-business-leaders-27-governors-urge-congress-to-boost-computer-science-education/2016/04/25/f161cbde-0ae7-11e6-bfa1-4efa856caf2a_story.html">open letter to Congress</a>, asking for funding to provide every student in every school the opportunity to learn computer science. Google has long been committed to <a href="https://www.google.com/edu/cs/">developing programs, resources, tools and community partnerships</a> that make computer science engaging and accessible for all students. <br /><br />We are strengthening that commitment today by announcing an additional investment of $10 million towards computer science education for 2017, along with the <a href="http://googleforeducation.blogspot.com/2016/01/CS4All.html">$23.5 million</a> that we have allocated for 2016. This funding will allow us to build more resources, scale our programs, and provide additional support to our partners, with a goal of reaching an additional 5 million students.<br /><br />With Congress&#8217; help, we can ensure that every child has access to computer science education. 
Please join us by signing our online petition at <a href="http://www.change.org/computerscience">www.change.org/computerscience</a>.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Maggie Johnson, Director of Education and University Relations</span><br /><br /><i>(Cross-posted on the <a href="http://googleforeducation.blogspot.com/2016/04/computer-science-education-for-all.html">Google for Education Blog</a>)</i><br /><br />Computer science education is a pathway to innovation, to creativity, and to exciting career prospects. No longer considered an optional skill, CS is quickly becoming a “new basic”, foundational for learning. In order for our students to be equipped for the world of tomorrow, we need to provide them with access to computer science education today. <br /><br />At Google, we believe that all students deserve these opportunities. Today we <a href="http://www.csecoalition.org/">join</a> some of America’s leading companies, governors, and educators to support an <a href="https://www.washingtonpost.com/local/education/top-business-leaders-27-governors-urge-congress-to-boost-computer-science-education/2016/04/25/f161cbde-0ae7-11e6-bfa1-4efa856caf2a_story.html">open letter to Congress</a>, asking for funding to provide every student in every school the opportunity to learn computer science. Google has long been committed to <a href="https://www.google.com/edu/cs/">developing programs, resources, tools and community partnerships</a> that make computer science engaging and accessible for all students. <br /><br />We are strengthening that commitment today by announcing an additional investment of $10 million towards computer science education for 2017, along with the <a href="http://googleforeducation.blogspot.com/2016/01/CS4All.html">$23.5 million</a> that we have allocated for 2016. This funding will allow us to build more resources, scale our programs, and provide additional support to our partners, with a goal of reaching an additional 5 million students.<br /><br />With Congress’ help, we can ensure that every child has access to computer science education. 
Please join us by signing our online petition at <a href="http://www.change.org/computerscience">www.change.org/computerscience</a>. ]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/computer-science-education-for-all-students/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Helping webmasters re-secure their sites</title>
		<link>https://googledata.org/google-research/helping-webmasters-re-secure-their-sites/</link>
		<comments>https://googledata.org/google-research/helping-webmasters-re-secure-their-sites/#comments</comments>
		<pubDate>Mon, 18 Apr 2016 16:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=308d8b3be4b5ebe17ffddc3167673b0f</guid>
		<description><![CDATA[<span>Posted by Kurt Thomas and Yuan Niu, Spam &#38; Abuse Research</span><br /><br />Every week, over <a href="https://www.google.com/transparencyreport/safebrowsing/">10 million users encounter harmful websites</a> that deliver malware and scams. Many of these sites are compromised personal blogs or small business pages that have fallen victim due to a weak password or outdated software. Safe Browsing and Google Search protect visitors from dangerous content by displaying browser warnings and labeling search results with <a href="https://support.google.com/websearch/answer/45449?hl=en">&#8216;this site may harm your computer&#8217;</a>. While this helps keep users safe in the moment, the compromised site remains a problem that needs to be fixed.<br /><br />Unfortunately, many webmasters for compromised sites are unaware anything is amiss. Worse yet, even when they learn of an incident, they may lack the security expertise to take action and address the root cause of compromise. Quoting one webmaster from a survey we conducted, &#8220;our daily and weekly backups were both infected&#8221; and even after seeking the help of a specialist, after &#8220;lots of wasted hours/days&#8221; the webmaster abandoned all attempts to restore the site and instead refocused his efforts on &#8220;rebuilding the site from scratch&#8221;.<br /><br />In order to find the best way to help webmasters clean-up from compromise, we recently teamed up with the University of California, Berkeley to explore how to quickly contact webmasters and expedite recovery while minimizing the distress involved. We&#8217;ve summarized our key lessons below. 
The full study, which you can read <a href="http://research.google.com/pubs/pub44924.html">here</a>, was recently presented at the <a href="http://www2016.ca/">International World Wide Web Conference</a>.<br /><br />When Google works directly with webmasters during critical moments like security breaches, we can help 75% of webmasters re-secure their content. The whole process takes a median of 3 days. This is a better experience for webmasters and their audience.<br /><br /><b>How many sites get compromised?</b><br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-_IvVxz441jY/VxT-AjqgQSI/AAAAAAAAA-4/H0RFXx70EEwNmEvL85YibMKKYG3jYHW8wCLcB/s1600/image00.png"><img border="0" height="312" src="https://1.bp.blogspot.com/-_IvVxz441jY/VxT-AjqgQSI/AAAAAAAAA-4/H0RFXx70EEwNmEvL85YibMKKYG3jYHW8wCLcB/s640/image00.png" width="640"></a></td></tr><tr><td>Number of freshly compromised sites Google detects every week.</td></tr></tbody></table>Over the last year Google detected nearly 800,000 compromised websites&#8212;roughly 16,500 new sites every week from around the globe. Visitors to these sites are exposed to low-quality scam content and malware via <a href="https://security.googleblog.com/2008/02/all-your-iframe-are-point-to-us.html">drive-by downloads</a>. While browser and search warnings help protect visitors from harm, these warnings can at times feel punitive to webmasters who learn only after-the-fact that their site was compromised. To balance the safety of our users with the experience of webmasters, we set out to find the best approach to help webmasters recover from security breaches and ultimately reconnect websites with their audience.<br /><b><br /></b> <b>Finding the most effective ways to aid webmasters</b><br /><ol><li><b>Getting in touch with webmasters:</b> One of the hardest steps on the road to recovery is first getting in contact with webmasters. 
We tried three notification channels: email, browser warnings, and search warnings. For webmasters who proactively registered their site with <a href="https://www.google.com/webmaster">Search Console</a>, we found that email communication led to 75% of webmasters re-securing their pages. When we didn&#8217;t know a webmaster&#8217;s email address, browser warnings and search warnings helped 54% and 43% of sites clean up respectively.</li><li><b>Providing tips on cleaning up harmful content:</b> Attackers rely on hidden files, easy-to-miss redirects, and remote inclusions to serve scams and malware. This makes clean-up increasingly tricky. When we emailed webmasters, we included tips and samples of exactly which pages contained harmful content. This, combined with expedited notification, helped webmasters clean up 62% faster compared to no tips&#8212;usually within 3 days.</li><li><b>Making sure sites stay clean:</b> Once a site is no longer serving harmful content, it&#8217;s important to make sure attackers don&#8217;t reassert control. We monitored recently cleaned websites and found 12% were compromised again in 30 days. This illustrates the challenge involved in identifying the root cause of a breach versus dealing with the side-effects.</li></ol><b>Making security issues less painful for webmasters&#8212;and everyone</b><br /><br />We hope that webmasters never have to deal with a security incident. If you are a webmaster, there are some quick steps you can take to reduce your risk. We&#8217;ve made it <a href="https://security.googleblog.com/2015/02/safe-browsing-and-google-analytics.html">easier to receive security notifications through Google Analytics</a> as well as through <a href="https://www.google.com/webmaster">Search Console</a>. Make sure to register for both services. 
Also, we have laid out helpful tips for <a href="https://webmasters.googleblog.com/2015/07/nohacked-how-to-avoid-being-target-of.html">updating your site&#8217;s software</a> and <a href="https://webmasters.googleblog.com/2015/08/nohacked-using-two-factor.html">adding additional authentication</a> that will make your site safer.<br /><br />If you&#8217;re a hosting provider or building a service that needs to notify victims of compromise, understand that the entire process is distressing for users. Establish a reliable communication channel before a security incident occurs, make sure to provide victims with clear recovery steps, and promptly reply to inquiries so the process feels helpful, not punitive.<br /><br />As we work to make the web a safer place, we think it&#8217;s critical to empower webmasters and users to make good security decisions. It&#8217;s easy for the security community to be pessimistic about incident response being &#8216;too complex&#8217; for victims, but as our findings demonstrate, even just starting a dialogue can significantly expedite recovery.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Kurt Thomas and Yuan Niu, Spam &amp; Abuse Research</span><br /><br />Every week, over <a href="https://www.google.com/transparencyreport/safebrowsing/">10 million users encounter harmful websites</a> that deliver malware and scams. Many of these sites are compromised personal blogs or small business pages that have fallen victim due to a weak password or outdated software. Safe Browsing and Google Search protect visitors from dangerous content by displaying browser warnings and labeling search results with <a href="https://support.google.com/websearch/answer/45449?hl=en">‘this site may harm your computer’</a>. While this helps keep users safe in the moment, the compromised site remains a problem that needs to be fixed.<br /><br />Unfortunately, many webmasters for compromised sites are unaware anything is amiss. Worse yet, even when they learn of an incident, they may lack the security expertise to take action and address the root cause of compromise. Quoting one webmaster from a survey we conducted, “our daily and weekly backups were both infected” and even after seeking the help of a specialist, after “lots of wasted hours/days” the webmaster abandoned all attempts to restore the site and instead refocused his efforts on “rebuilding the site from scratch”.<br /><br />In order to find the best way to help webmasters clean-up from compromise, we recently teamed up with the University of California, Berkeley to explore how to quickly contact webmasters and expedite recovery while minimizing the distress involved. We’ve summarized our key lessons below. 
The full study, which you can read <a href="http://research.google.com/pubs/pub44924.html">here</a>, was recently presented at the <a href="http://www2016.ca/">International World Wide Web Conference</a>.<br /><br />When Google works directly with webmasters during critical moments like security breaches, we can help 75% of webmasters re-secure their content. The whole process takes a median of 3 days. This is a better experience for webmasters and their audience.<br /><br /><b>How many sites get compromised?</b><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-_IvVxz441jY/VxT-AjqgQSI/AAAAAAAAA-4/H0RFXx70EEwNmEvL85YibMKKYG3jYHW8wCLcB/s1600/image00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="312" src="https://1.bp.blogspot.com/-_IvVxz441jY/VxT-AjqgQSI/AAAAAAAAA-4/H0RFXx70EEwNmEvL85YibMKKYG3jYHW8wCLcB/s640/image00.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Number of freshly compromised sites Google detects every week.</td></tr></tbody></table>Over the last year Google detected nearly 800,000 compromised websites—roughly 16,500 new sites every week from around the globe. Visitors to these sites are exposed to low-quality scam content and malware via <a href="https://security.googleblog.com/2008/02/all-your-iframe-are-point-to-us.html">drive-by downloads</a>. While browser and search warnings help protect visitors from harm, these warnings can at times feel punitive to webmasters who learn only after-the-fact that their site was compromised. 
To balance the safety of our users with the experience of webmasters, we set out to find the best approach to help webmasters recover from security breaches and ultimately reconnect websites with their audience.<br /><b><br /></b> <b>Finding the most effective ways to aid webmasters</b><br /><ol><li><b>Getting in touch with webmasters:</b> One of the hardest steps on the road to recovery is first getting in contact with webmasters. We tried three notification channels: email, browser warnings, and search warnings. For webmasters who proactively registered their site with <a href="https://www.google.com/webmaster">Search Console</a>, we found that email communication led to 75% of webmasters re-securing their pages. When we didn’t know a webmaster’s email address, browser warnings and search warnings helped 54% and 43% of sites clean up respectively.</li><li><b>Providing tips on cleaning up harmful content:</b> Attackers rely on hidden files, easy-to-miss redirects, and remote inclusions to serve scams and malware. This makes clean-up increasingly tricky. When we emailed webmasters, we included tips and samples of exactly which pages contained harmful content. This, combined with expedited notification, helped webmasters clean up 62% faster compared to no tips—usually within 3 days.</li><li><b>Making sure sites stay clean:</b> Once a site is no longer serving harmful content, it’s important to make sure attackers don’t reassert control. We monitored recently cleaned websites and found 12% were compromised again in 30 days. This illustrates the challenge involved in identifying the root cause of a breach versus dealing with the side-effects.</li></ol><b>Making security issues less painful for webmasters—and everyone</b><br /><br />We hope that webmasters never have to deal with a security incident. If you are a webmaster, there are some quick steps you can take to reduce your risk. 
We’ve made it <a href="https://security.googleblog.com/2015/02/safe-browsing-and-google-analytics.html">easier to receive security notifications through Google Analytics</a> as well as through <a href="https://www.google.com/webmaster">Search Console</a>. Make sure to register for both services. Also, we have laid out helpful tips for <a href="https://webmasters.googleblog.com/2015/07/nohacked-how-to-avoid-being-target-of.html">updating your site’s software</a> and <a href="https://webmasters.googleblog.com/2015/08/nohacked-using-two-factor.html">adding additional authentication</a> that will make your site safer.<br /><br />If you’re a hosting provider or building a service that needs to notify victims of compromise, understand that the entire process is distressing for users. Establish a reliable communication channel before a security incident occurs, make sure to provide victims with clear recovery steps, and promptly reply to inquiries so the process feels helpful, not punitive.<br /><br />As we work to make the web a safer place, we think it’s critical to empower webmasters and users to make good security decisions. It’s easy for the security community to be pessimistic about incident response being ‘too complex’ for victims, but as our findings demonstrate, even just starting a dialogue can significantly expedite recovery.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/helping-webmasters-re-secure-their-sites/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Announcing TensorFlow 0.8 – now with distributed computing support!</title>
		<link>https://googledata.org/google-research/announcing-tensorflow-0-8-now-with-distributed-computing-support/</link>
		<comments>https://googledata.org/google-research/announcing-tensorflow-0-8-now-with-distributed-computing-support/#comments</comments>
		<pubDate>Wed, 13 Apr 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=deace4aecc44b8a561fbd49e261e5d20</guid>
		<description><![CDATA[<span>Posted by Derek Murray, Software Engineer</span><br /><br />Google uses machine learning across a wide range of its products. In order to continually improve our models, it's crucial that the training process be as fast as possible. One way to do this is to run <a href="https://www.tensorflow.org/">TensorFlow</a> across hundreds of machines, which shortens the training process for some models from weeks to hours, and allows us to experiment with models of increasing size and sophistication. Ever since we released TensorFlow as an open-source project, distributed training support has been one of the most requested features. Now the wait is over. <br /><br />Today, we're excited to release TensorFlow 0.8 with distributed computing support, including everything you need to train distributed models on your own infrastructure. Distributed TensorFlow is powered by the high-performance <a href="http://www.grpc.io/">gRPC</a> library, which supports training on hundreds of machines in parallel. It complements our recent announcement of <a href="http://googleresearch.blogspot.com/2016/03/machine-learning-in-cloud-with.html">Google Cloud Machine Learning</a>, which enables you to train and serve your TensorFlow models using the power of the Google Cloud Platform.<br /><br />To coincide with the TensorFlow 0.8 release, we have published a <a href="https://github.com/tensorflow/models/tree/master/inception">distributed trainer</a> for the <a href="http://googleresearch.blogspot.com/2016/03/train-your-own-image-classifier-with.html">Inception image classification</a> neural network in the TensorFlow models repository. Using the distributed trainer, we trained the Inception network to 78% accuracy in less than 65 hours using 100 GPUs. 
Even small clusters&#8212;or a couple of machines under your desk&#8212;can benefit from distributed TensorFlow, since adding more GPUs improves the overall throughput, and produces accurate results sooner.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-3e8ukpeMD5g/Vw5r8lIuUQI/AAAAAAAAA-g/UEHdwqD1YKwVrUKDzKpy356yxo_0xJW6ACLcB/s1600/image00.png"><img border="0" height="376" src="https://1.bp.blogspot.com/-3e8ukpeMD5g/Vw5r8lIuUQI/AAAAAAAAA-g/UEHdwqD1YKwVrUKDzKpy356yxo_0xJW6ACLcB/s640/image00.png" width="640"></a></td></tr><tr><td>TensorFlow can speed up Inception training by a factor of 56, using 100 GPUs.</td></tr></tbody></table>The distributed trainer also enables you to scale out training using a cluster management system like <a href="http://kubernetes.io/">Kubernetes</a>. Furthermore, once you have trained your model, you can deploy to production and <a href="http://blog.kubernetes.io/2016/03/scaling-neural-network-image-classification-using-Kubernetes-with-TensorFlow-Serving.html">speed up inference using TensorFlow Serving on Kubernetes</a>.<br /><br />Beyond distributed Inception, the 0.8 release includes <a href="https://www.tensorflow.org/api_docs/python/train.html#distributed-execution">new libraries</a> for defining your own distributed models. TensorFlow's distributed architecture permits a great deal of flexibility in defining your model, because every process in the cluster can perform general-purpose computation. Our previous system <a href="http://research.google.com/archive/large_deep_networks_nips2012.html">DistBelief </a>(like many systems that have followed it) used special "parameter servers" to manage the shared model parameters, where the parameter servers had a simple read/write interface for fetching and updating shared parameters. 
In TensorFlow, all computation&#8212;including parameter management&#8212;is represented in the dataflow graph, and the system maps the graph onto heterogeneous devices (like multi-core CPUs, general-purpose GPUs, and mobile processors) in the available processes. To make TensorFlow easier to use, we have included Python libraries that make it easy to write a model that runs on a single process and scales to use multiple replicas for training. <br /><br />This architecture makes it easier to scale a single-process job up to use a cluster, and also to experiment with novel architectures for distributed training. As an example, my colleagues have recently shown that <a href="http://arxiv.org/abs/1604.00981">synchronous SGD with backup workers</a>, implemented in the TensorFlow graph, achieves improved time-to-accuracy for image model training.<br /><br />The current version of distributed computing support in TensorFlow is just the start. We are continuing to research ways of improving the performance of distributed training&#8212;both through engineering and algorithmic improvements&#8212;and will share these improvements with the community <a href="https://github.com/tensorflow/tensorflow">on GitHub</a>. However, getting to this point would not have been possible without help from the following people:<br /><ul><li><b>TensorFlow training libraries</b> - Jianmin Chen, Matthieu Devin, Sherry Moore and Sergio Guadarrama</li><li><b>TensorFlow core</b> - Zhifeng Chen, Manjunath Kudlur and Vijay Vasudevan</li><li><b>Testing</b> - Shanqing Cai</li><li><b>Inception model architecture </b>- Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Jonathon Shlens and Zbigniew Wojna</li><li><b>Project management</b> - Amy McDonald Sandjideh</li><li><b>Engineering leadership</b> - Jeff Dean and Rajat Monga</li></ul>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Derek Murray, Software Engineer</span><br /><br />Google uses machine learning across a wide range of its products. In order to continually improve our models, it's crucial that the training process be as fast as possible. One way to do this is to run <a href="https://www.tensorflow.org/">TensorFlow</a> across hundreds of machines, which shortens the training process for some models from weeks to hours, and allows us to experiment with models of increasing size and sophistication. Ever since we released TensorFlow as an open-source project, distributed training support has been one of the most requested features. Now the wait is over. <br /><br />Today, we're excited to release TensorFlow 0.8 with distributed computing support, including everything you need to train distributed models on your own infrastructure. Distributed TensorFlow is powered by the high-performance <a href="http://www.grpc.io/">gRPC</a> library, which supports training on hundreds of machines in parallel. It complements our recent announcement of <a href="http://googleresearch.blogspot.com/2016/03/machine-learning-in-cloud-with.html">Google Cloud Machine Learning</a>, which enables you to train and serve your TensorFlow models using the power of the Google Cloud Platform.<br /><br />To coincide with the TensorFlow 0.8 release, we have published a <a href="https://github.com/tensorflow/models/tree/master/inception">distributed trainer</a> for the <a href="http://googleresearch.blogspot.com/2016/03/train-your-own-image-classifier-with.html">Inception image classification</a> neural network in the TensorFlow models repository. Using the distributed trainer, we trained the Inception network to 78% accuracy in less than 65 hours using 100 GPUs. 
Even small clusters—or a couple of machines under your desk—can benefit from distributed TensorFlow, since adding more GPUs improves the overall throughput, and produces accurate results sooner.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-3e8ukpeMD5g/Vw5r8lIuUQI/AAAAAAAAA-g/UEHdwqD1YKwVrUKDzKpy356yxo_0xJW6ACLcB/s1600/image00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="376" src="https://1.bp.blogspot.com/-3e8ukpeMD5g/Vw5r8lIuUQI/AAAAAAAAA-g/UEHdwqD1YKwVrUKDzKpy356yxo_0xJW6ACLcB/s640/image00.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">TensorFlow can speed up Inception training by a factor of 56, using 100 GPUs.</td></tr></tbody></table>The distributed trainer also enables you to scale out training using a cluster management system like <a href="http://kubernetes.io/">Kubernetes</a>. Furthermore, once you have trained your model, you can deploy to production and <a href="http://blog.kubernetes.io/2016/03/scaling-neural-network-image-classification-using-Kubernetes-with-TensorFlow-Serving.html">speed up inference using TensorFlow Serving on Kubernetes</a>.<br /><br />Beyond distributed Inception, the 0.8 release includes <a href="https://www.tensorflow.org/api_docs/python/train.html#distributed-execution">new libraries</a> for defining your own distributed models. TensorFlow's distributed architecture permits a great deal of flexibility in defining your model, because every process in the cluster can perform general-purpose computation. 
Our previous system <a href="http://research.google.com/archive/large_deep_networks_nips2012.html">DistBelief </a>(like many systems that have followed it) used special "parameter servers" to manage the shared model parameters, where the parameter servers had a simple read/write interface for fetching and updating shared parameters. In TensorFlow, all computation—including parameter management—is represented in the dataflow graph, and the system maps the graph onto heterogeneous devices (like multi-core CPUs, general-purpose GPUs, and mobile processors) in the available processes. To make TensorFlow easier to use, we have included Python libraries that make it easy to write a model that runs on a single process and scales to use multiple replicas for training. <br /><br />This architecture makes it easier to scale a single-process job up to use a cluster, and also to experiment with novel architectures for distributed training. As an example, my colleagues have recently shown that <a href="http://arxiv.org/abs/1604.00981">synchronous SGD with backup workers</a>, implemented in the TensorFlow graph, achieves improved time-to-accuracy for image model training.<br /><br />The current version of distributed computing support in TensorFlow is just the start. We are continuing to research ways of improving the performance of distributed training—both through engineering and algorithmic improvements—and will share these improvements with the community <a href="https://github.com/tensorflow/tensorflow">on GitHub</a>. 
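<br /><br />As a rough illustration of the parameter-server pattern contrasted above, here is a hypothetical sketch in plain Python. This is not the TensorFlow API; the names <code>ParameterServer</code>, <code>worker_grad</code>, and <code>train</code> are invented for this example. Workers read the shared parameter, compute gradients on their own data shards, and the server applies the averaged update:

```python
# Hypothetical sketch of a DistBelief-style parameter server (not the
# TensorFlow 0.8 API): a shared parameter behind a simple read/write
# interface, plus workers that compute gradients on their own shards.

class ParameterServer:
    def __init__(self, w):
        self.w = w                      # shared model parameter

    def read(self):
        return self.w

    def apply(self, grad, lr=0.1):
        self.w -= lr * grad             # simple SGD write-back

def worker_grad(w, shard):
    # Gradient of mean squared error for the 1-D model y = w * x,
    # computed locally on one worker's shard of the data.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def train(shards, steps=200):
    ps = ParameterServer(0.0)
    for _ in range(steps):
        w = ps.read()                        # workers fetch parameters
        grads = [worker_grad(w, s) for s in shards]
        ps.apply(sum(grads) / len(grads))    # synchronous averaged update
    return ps.w

# Two workers, each holding a shard of data generated from y = 3 * x.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
```

Here <code>train(shards)</code> converges toward the true slope of 3. In distributed TensorFlow, by contrast, this read/apply cycle is expressed as ordinary operations in the dataflow graph, so parameter management can be placed on any device rather than requiring a special-purpose server.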
However, getting to this point would not have been possible without help from the following people:<br /><ul><li><b>TensorFlow training libraries</b> - Jianmin Chen, Matthieu Devin, Sherry Moore and Sergio Guadarrama</li><li><b>TensorFlow core</b> - Zhifeng Chen, Manjunath Kudlur and Vijay Vasudevan</li><li><b>Testing</b> - Shanqing Cai</li><li><b>Inception model architecture </b>- Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Jonathon Shlens and Zbigniew Wojna</li><li><b>Project management</b> - Amy McDonald Sandjideh</li><li><b>Engineering leadership</b> - Jeff Dean and Rajat Monga</li></ul>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/announcing-tensorflow-0-8-now-with-distributed-computing-support/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>All of Google’s CS Education Programs and Tools in One Place</title>
		<link>https://googledata.org/google-research/all-of-googles-cs-education-programs-and-tools-in-one-place/</link>
		<comments>https://googledata.org/google-research/all-of-googles-cs-education-programs-and-tools-in-one-place/#comments</comments>
		<pubDate>Tue, 12 Apr 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[education]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=bab49dbeffc5de64a1e78f798743076d</guid>
		<description><![CDATA[<span>Posted by Chris Stephenson, Head of Computer Science Education Programs</span><br /><br /><i>(Cross-posted on the <a href="http://googleforeducation.blogspot.com/2016/04/all-of-googles-cs-education-programs.html">Google for Education Blog</a>) </i><br /><br />Interest in computer science education is growing rapidly; even the President of the United States has spoken of the importance of <a href="https://www.youtube.com/watch?v=8sthaV8ddJ4">giving every student an opportunity to learn computer science</a>. Google has been a supportive partner in these efforts by developing high-quality learning programs, educational tools and resources to advance new approaches in computer science education. To make it easier for all students and educators to access this information, today we&#8217;re launching a <a href="http://www.google.com/edu/cs">CS EDU website</a> that specifically outlines our initiatives in CS education. <br /><div><a href="https://1.bp.blogspot.com/-EGwX2Fh5JaM/Vwwp04OM4RI/AAAAAAAAA-M/znj7rp_yxgYh1te5_Cbf8JiRtw16b7cOQ/s1600/image00.png"><img border="0" height="496" src="https://1.bp.blogspot.com/-EGwX2Fh5JaM/Vwwp04OM4RI/AAAAAAAAA-M/znj7rp_yxgYh1te5_Cbf8JiRtw16b7cOQ/s640/image00.png" width="640"></a></div>The President&#8217;s call to action is grounded in economic realities coupled with a lack of access and ongoing system inequities. There is an increasing need for computer science skills in the workforce, with the <a href="http://www.bls.gov/opub/mlr/2013/article/occupational-employment-projections-to-2022.htm">Bureau of Labor Statistics</a> estimating that there will be more than 1.3 million job openings in computer and mathematical occupations by 2022. The majority of these jobs will require at least a Bachelor&#8217;s degree in Computer Science or in Information Technology, yet the U.S. 
is only producing 16,000 CS undergraduates per year.<br /><br />One of the reasons there are so few computer science graduates is that too few students have the opportunity to study computer science in high school. <a href="https://services.google.com/fh/files/misc/searching-for-computer-science_report.pdf">Google&#8217;s research</a> shows that only 25% of U.S. schools currently offer CS with programming or coding, despite the fact that 91% of parents want their children to learn computer science. In addition, schools with higher percentages of students living in households below the poverty line are even less likely to offer rigorous computer science courses.<br /><br />Increasing access to computer science for all learners requires tremendous commitment from a wide range of stakeholders, and we strive to be a strong supportive partner of these efforts. Our new <a href="http://www.google.com/edu/cs">CS EDU</a> website shows all the ways Google is working to address the need for improved access to high quality computer science learning in formal and informal education. 
Some current programs you&#8217;ll find there include:<br /><br /><ul><li><a href="https://www.cs-first.com/?utm_source=blog&#38;utm_medium=blog&#38;utm_campaign=csedublog0416">CS First</a>: providing more than 360,000 middle school students with an opportunity to create technology through free computer science clubs</li><li><a href="https://www.google.com/edu/resources/programs/exploring-computational-thinking/">Exploring Computational Thinking</a>: sharing more than 130 lesson plans aligned to international standards for students aged 8 to 18</li><li><a href="https://ignitecs.withgoogle.com/">igniteCS</a>: offering support and mentoring to address the retention problem in diverse student populations at the undergraduate level in more than 40 universities and counting</li><li><a href="https://developers.google.com/blockly/">Blockly</a> and other programming tools powering Code.org&#8217;s <a href="https://hourofcode.com/us">Hour of Code</a> (2 million users)</li><li><a href="https://www.madewithcode.com/">Google&#8217;s Made with Code</a>: movement that inspires millions of girls to learn to code and to see it as a means to pursue their dream careers (more than 10 million unique visitors)</li><li>...and many more!</li></ul><br />Computer science education is a pathway to innovation, to creativity and to exciting career opportunities, and Google believes that all students deserve these opportunities. That is why we are committed to developing programs, resources, tools and community partnerships that make computer science engaging and accessible for all students. With the launch of our <a href="http://www.google.com/edu/cs">CS EDU website</a>, all of these programs are at your fingertips.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Chris Stephenson, Head of Computer Science Education Programs</span><br /><br /><i>(Cross-posted on the <a href="http://googleforeducation.blogspot.com/2016/04/all-of-googles-cs-education-programs.html">Google for Education Blog</a>) </i><br /><br />Interest in computer science education is growing rapidly; even the President of the United States has spoken of the importance of <a href="https://www.youtube.com/watch?v=8sthaV8ddJ4">giving every student an opportunity to learn computer science</a>. Google has been a supportive partner in these efforts by developing high-quality learning programs, educational tools and resources to advance new approaches in computer science education. To make it easier for all students and educators to access this information, today we’re launching a <a href="http://www.google.com/edu/cs">CS EDU website</a> that specifically outlines our initiatives in CS education. <br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-EGwX2Fh5JaM/Vwwp04OM4RI/AAAAAAAAA-M/znj7rp_yxgYh1te5_Cbf8JiRtw16b7cOQ/s1600/image00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="496" src="https://1.bp.blogspot.com/-EGwX2Fh5JaM/Vwwp04OM4RI/AAAAAAAAA-M/znj7rp_yxgYh1te5_Cbf8JiRtw16b7cOQ/s640/image00.png" width="640" /></a></div>The President’s call to action is grounded in economic realities coupled with a lack of access and ongoing system inequities. There is an increasing need for computer science skills in the workforce, with the <a href="http://www.bls.gov/opub/mlr/2013/article/occupational-employment-projections-to-2022.htm">Bureau of Labor Statistics</a> estimating that there will be more than 1.3 million job openings in computer and mathematical occupations by 2022. The majority of these jobs will require at least a Bachelor’s degree in Computer Science or in Information Technology, yet the U.S. 
is only producing 16,000 CS undergraduates per year.<br /><br />One of the reasons there are so few computer science graduates is that too few students have the opportunity to study computer science in high school. <a href="https://services.google.com/fh/files/misc/searching-for-computer-science_report.pdf">Google’s research</a> shows that only 25% of U.S. schools currently offer CS with programming or coding, despite the fact that 91% of parents want their children to learn computer science. In addition, schools with higher percentages of students living in households below the poverty line are even less likely to offer rigorous computer science courses.<br /><br />Increasing access to computer science for all learners requires tremendous commitment from a wide range of stakeholders, and we strive to be a strong supportive partner of these efforts. Our new <a href="http://www.google.com/edu/cs">CS EDU</a> website shows all the ways Google is working to address the need for improved access to high quality computer science learning in formal and informal education. 
Some current programs you’ll find there include:<br /><br /><ul><li><a href="https://www.cs-first.com/?utm_source=blog&amp;utm_medium=blog&amp;utm_campaign=csedublog0416">CS First</a>: providing more than 360,000 middle school students with an opportunity to create technology through free computer science clubs</li><li><a href="https://www.google.com/edu/resources/programs/exploring-computational-thinking/">Exploring Computational Thinking</a>: sharing more than 130 lesson plans aligned to international standards for students aged 8 to 18</li><li><a href="https://ignitecs.withgoogle.com/">igniteCS</a>: offering support and mentoring to address the retention problem in diverse student populations at the undergraduate level in more than 40 universities and counting</li><li><a href="https://developers.google.com/blockly/">Blockly</a> and other programming tools powering Code.org’s <a href="https://hourofcode.com/us">Hour of Code</a> (2 million users)</li><li><a href="https://www.madewithcode.com/">Google’s Made with Code</a>: movement that inspires millions of girls to learn to code and to see it as a means to pursue their dream careers (more than 10 million unique visitors)</li><li>...and many more!</li></ul><br />Computer science education is a pathway to innovation, to creativity and to exciting career opportunities, and Google believes that all students deserve these opportunities. That is why we are committed to developing programs, resources, tools and community partnerships that make computer science engaging and accessible for all students. With the launch of our <a href="http://www.google.com/edu/cs">CS EDU website</a>, all of these programs are at your fingertips.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/all-of-googles-cs-education-programs-and-tools-in-one-place/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="" length="" type="" />
		</item>
		<item>
		<title>Genomic Data Processing on Google Cloud Platform</title>
		<link>https://googledata.org/google-research/genomic-data-processing-on-google-cloud-platform/</link>
		<comments>https://googledata.org/google-research/genomic-data-processing-on-google-cloud-platform/#comments</comments>
		<pubDate>Wed, 06 Apr 2016 06:30:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=cccf52dc4b41c3a2493bc6a539a7249f</guid>
		<description><![CDATA[<span>Posted by Dr. Stacey Gabriel, Director of the Genomics Platform at the Broad Institute of MIT and Harvard</span><br /><br /><i>Today we hear from Broad Institute of MIT and Harvard about how their researchers and software engineers are collaborating closely with the Google Genomics team on large-scale genomic data analysis. They&#8217;ve already reduced the time and cost for whole genome processing by several fold, helping researchers think even bigger. Broad&#8217;s open source tools, developed in close <a href="https://cloudplatform.googleblog.com/2015/06/Google-Genomics-and-Broad-Institute-Team-Up-to-Tackle-Genomic-Data.html">collaboration with Google Genomics</a>, will also be made available to the wider research community. <br />&#8211; Jonathan Bingham, Product Manager, Google Genomics</i><br /><br /><table cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-xIlCwubRzLQ/VwRBiP-L1xI/AAAAAAAAA9o/GBuokDiS1aQ1tlX4cYoVeh1V0zfoV-7jA/s1600/image02.jpg"><img border="0" height="265" src="https://4.bp.blogspot.com/-xIlCwubRzLQ/VwRBiP-L1xI/AAAAAAAAA9o/GBuokDiS1aQ1tlX4cYoVeh1V0zfoV-7jA/s400/image02.jpg" width="400"></a></td></tr><tr><td>Dr. Stacey Gabriel, Director of the <br />Genomics Platform at the Broad Institute</td></tr></tbody></table>As one of the largest genome sequencing centers in the world, the <a href="http://www.broadinstitute.org/">Broad Institute</a> of MIT and Harvard generates a lot of data. Our DNA sequencers produce more than 20 Terabytes (TB) of genomic data per day, and they run 365 days a year. Moreover, our rate of data generation is not only growing, but accelerating &#8211; our output increased more than two-fold last year, and nearly two-fold the previous year. 
We are not alone in facing this embarrassment of riches; across the whole genomics community, the rate of data production is doubling about every eight months with no end in sight.<br /><br />Here at  Broad, our team of software engineers and methods developers have spent the last year working to re-architect our production sequencing environment for the cloud. This has been no small feat, especially as we had to build the plane while we flew it! It required an entirely new system for developing and deploying pipelines (which we call <a href="https://github.com/broadinstitute/cromwell">Cromwell</a>), as well as a new framework for wet lab quality control that uncouples data generation from data processing.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://1.bp.blogspot.com/-0jAq6VN1y50/VwRCSEIaKyI/AAAAAAAAA9w/VukpNxJPgDcHuqUmcb8U85GAhbvgfKFRA/s1600/image00.png"><img border="0" height="168" src="https://1.bp.blogspot.com/-0jAq6VN1y50/VwRCSEIaKyI/AAAAAAAAA9w/VukpNxJPgDcHuqUmcb8U85GAhbvgfKFRA/s640/image00.png" width="640"></a></td></tr><tr><td>Courtesy: Broad Institute of MIT and Harvard</td></tr></tbody></table>Last summer Broad and Google <a href="http://www.broadinstitute.org/google">announced a collaboration</a> to develop a safe, secure and  scalable cloud computing infrastructure capable of storing and processing enormous datasets. We also set out to build cloud-supported tools to analyze such data and unravel long-standing mysteries about human health. Our engineers collaborate closely; we teach them about genomic data science and genomic data engineering, and they teach us about cloud computing and distributed systems. To us, this is a wonderful model for how a basic research institute can productively collaborate with industry to advance science and medicine. 
Both groups move faster and go further by working together.<br /><br />As of today, the largest and most important of our production pipelines, the <a href="https://www.broadinstitute.org/gatk/guide/bp_step.php?p=1">Whole Genome Sequencing Pipeline</a>, has been completely ported to the <a href="https://cloud.google.com/">Google Cloud Platform </a>(GCP).  We are now beginning to run production jobs on GCP and will be switching over entirely this month. This switch has proved to be a very cost-effective decision. While the conventional wisdom is that public clouds can be more expensive, our experience is that cloud is dramatically cheaper. Consider the curve below that my colleague Kristian Cibulskis recently showed at <a href="https://youtu.be/M_G_1SWVHgw?t=500">GCP NEXT</a>:<br /><div><a href="https://3.bp.blogspot.com/-4JxZeKaF5-0/VwRCixg6baI/AAAAAAAAA90/H7-gTVFWneAX6irAuGHqqKhCJaO2lgNIw/s1600/image01.png"><img border="0" height="380" src="https://3.bp.blogspot.com/-4JxZeKaF5-0/VwRCixg6baI/AAAAAAAAA90/H7-gTVFWneAX6irAuGHqqKhCJaO2lgNIw/s640/image01.png" width="640"></a></div>Out of the box, the cost of running the <a href="https://www.broadinstitute.org/gatk/">Genome Analysis Toolkit</a> (GATK) best practices pipeline on a 30X-coverage whole genome was roughly the same as the cost of our on-premise infrastructure. Over a period of a few months, however, we developed techniques that allowed us to <i>really</i> reduce costs: We learned how to parallelize the computationally intensive steps like aligning DNA sequences against a reference genome. We also optimized for GCP&#8217;s infrastructure to lower costs by using features such as <a href="https://cloud.google.com/preemptible-vms/">Preemptible VMs</a>. 
After doing these optimizations, our production whole genome pipeline cost about 20% of what it did when we started, saving our researchers millions of dollars, all while reducing processing turnaround time eight-fold.<br /><br />There is a similar story to be told on storage of the input and output data. <a href="https://cloud.google.com/storage/docs/nearline">Google Cloud Storage Nearline</a> is a medium for storing DNA sequence alignments and raw data. Like most people in genomics, we access genetic variant data every day, but raw DNA sequences only a few times per year, such as when there is a new algorithm that requires raw data or a new assembly of the human genome. Nearline&#8217;s price/performance tradeoff is well-suited to data that&#8217;s infrequently accessed. By using Nearline, along with some compression tricks, we were able to reduce our storage costs by more than 50%. <br /><br />Altogether, we estimate that, by using GCP services for both compute and storage, we will be able to lower the total cost of ownership for storing and processing genomic data significantly relative to our on-premises costs. Looking forward, we also see advantages for data sharing, particularly for large multi-group genome projects. An environment where the data can be securely stored and analyzed will spare multiple groups from separately copying, transmitting, and storing the same data. <br /><br />Porting the GATK whole genome pipeline to the cloud is just the starting point. During the coming year, we plan to migrate the bulk of our production pipelines to the cloud, including tools for arrays, exomes, cancer genomes, and RNA-seq. Moreover, our non-exclusive relationship with Google is founded on the principle that our groups can leverage complementary skills to make products that can not only serve the needs of Broad, but also help serve the needs of researchers around the world. 
Therefore, as we migrate each of our pipelines to the cloud to meet our own needs, we also plan to make them available to the greater genomics community through a Software-as-a-Service model. <br /><br />This is an exciting time for us at Broad. For more than a decade we have served the genomics community by acting as a hub for data generation; now, we are extending this mission to encompass not only sequencing services, but also data services. We believe that by expanding access to our tools and optimizing our pipelines for the cloud, we will enable the community to benefit from the enormous effort we have invested. We look forward to expanding the scope of this mission in the years to come.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Dr. Stacey Gabriel, Director of the Genomics Platform at the Broad Institute of MIT and Harvard</span><br /><br /><i>Today we hear from Broad Institute of MIT and Harvard about how their researchers and software engineers are collaborating closely with the Google Genomics team on large-scale genomic data analysis. They’ve already reduced the time and cost for whole genome processing by several fold, helping researchers think even bigger. Broad’s open source tools, developed in close <a href="https://cloudplatform.googleblog.com/2015/06/Google-Genomics-and-Broad-Institute-Team-Up-to-Tackle-Genomic-Data.html">collaboration with Google Genomics</a>, will also be made available to the wider research community. <br />– Jonathan Bingham, Product Manager, Google Genomics</i><br /><br /><table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-xIlCwubRzLQ/VwRBiP-L1xI/AAAAAAAAA9o/GBuokDiS1aQ1tlX4cYoVeh1V0zfoV-7jA/s1600/image02.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="265" src="https://4.bp.blogspot.com/-xIlCwubRzLQ/VwRBiP-L1xI/AAAAAAAAA9o/GBuokDiS1aQ1tlX4cYoVeh1V0zfoV-7jA/s400/image02.jpg" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Dr. Stacey Gabriel, Director of the <br />Genomics Platform at the Broad Institute</td></tr></tbody></table>As one of the largest genome sequencing centers in the world, the <a href="http://www.broadinstitute.org/">Broad Institute</a> of MIT and Harvard generates a lot of data. Our DNA sequencers produce more than 20 Terabytes (TB) of genomic data per day, and they run 365 days a year. 
Moreover, our rate of data generation is not only growing, but accelerating – our output increased more than two-fold last year, and nearly two-fold the previous year. We are not alone in facing this embarrassment of riches; across the whole genomics community, the rate of data production is doubling about every eight months with no end in sight.<br /><br />Here at Broad, our team of software engineers and methods developers has spent the last year working to re-architect our production sequencing environment for the cloud. This has been no small feat, especially as we had to build the plane while we flew it! It required an entirely new system for developing and deploying pipelines (which we call <a href="https://github.com/broadinstitute/cromwell">Cromwell</a>), as well as a new framework for wet lab quality control that uncouples data generation from data processing.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-0jAq6VN1y50/VwRCSEIaKyI/AAAAAAAAA9w/VukpNxJPgDcHuqUmcb8U85GAhbvgfKFRA/s1600/image00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="168" src="https://1.bp.blogspot.com/-0jAq6VN1y50/VwRCSEIaKyI/AAAAAAAAA9w/VukpNxJPgDcHuqUmcb8U85GAhbvgfKFRA/s640/image00.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Courtesy: Broad Institute of MIT and Harvard</td></tr></tbody></table>Last summer Broad and Google <a href="http://www.broadinstitute.org/google">announced a collaboration</a> to develop a safe, secure and scalable cloud computing infrastructure capable of storing and processing enormous datasets. We also set out to build cloud-supported tools to analyze such data and unravel long-standing mysteries about human health. 
Our engineers collaborate closely; we teach them about genomic data science and genomic data engineering, and they teach us about cloud computing and distributed systems. To us, this is a wonderful model for how a basic research institute can productively collaborate with industry to advance science and medicine. Both groups move faster and go further by working together.<br /><br />As of today, the largest and most important of our production pipelines, the <a href="https://www.broadinstitute.org/gatk/guide/bp_step.php?p=1">Whole Genome Sequencing Pipeline</a>, has been completely ported to the <a href="https://cloud.google.com/">Google Cloud Platform </a>(GCP).  We are now beginning to run production jobs on GCP and will be switching over entirely this month. This switch has proved to be a very cost-effective decision. While the conventional wisdom is that public clouds can be more expensive, our experience is that cloud is dramatically cheaper. Consider the curve below that my colleague Kristian Cibulskis recently showed at <a href="https://youtu.be/M_G_1SWVHgw?t=500">GCP NEXT</a>:<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-4JxZeKaF5-0/VwRCixg6baI/AAAAAAAAA90/H7-gTVFWneAX6irAuGHqqKhCJaO2lgNIw/s1600/image01.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="380" src="https://3.bp.blogspot.com/-4JxZeKaF5-0/VwRCixg6baI/AAAAAAAAA90/H7-gTVFWneAX6irAuGHqqKhCJaO2lgNIw/s640/image01.png" width="640" /></a></div>Out of the box, the cost of running the <a href="https://www.broadinstitute.org/gatk/">Genome Analysis Toolkit</a> (GATK) best practices pipeline on a 30X-coverage whole genome was roughly the same as the cost of our on-premise infrastructure. 
Over a period of a few months, however, we developed techniques that allowed us to <i>really</i> reduce costs: We learned how to parallelize the computationally intensive steps like aligning DNA sequences against a reference genome. We also optimized for GCP’s infrastructure to lower costs by using features such as <a href="https://cloud.google.com/preemptible-vms/">Preemptible VMs</a>. After doing these optimizations, our production whole genome pipeline cost about 20% of what it did when we started, saving our researchers millions of dollars, all while reducing processing turnaround time eight-fold.<br /><br />There is a similar story to be told on storage of the input and output data. <a href="https://cloud.google.com/storage/docs/nearline">Google Cloud Storage Nearline</a> is a medium for storing DNA sequence alignments and raw data. Like most people in genomics, we access genetic variant data every day, but raw DNA sequences only a few times per year, such as when there is a new algorithm that requires raw data or a new assembly of the human genome. Nearline’s price/performance tradeoff is well-suited to data that’s infrequently accessed. By using Nearline, along with some compression tricks, we were able to reduce our storage costs by more than 50%. <br /><br />Altogether, we estimate that, by using GCP services for both compute and storage, we will be able to lower the total cost of ownership for storing and processing genomic data significantly relative to our on-premises costs. Looking forward, we also see advantages for data sharing, particularly for large multi-group genome projects. An environment where the data can be securely stored and analyzed will spare multiple groups from separately copying, transmitting, and storing the same data. <br /><br />Porting the GATK whole genome pipeline to the cloud is just the starting point. 
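To make the storage arithmetic above concrete, here is a back-of-envelope sketch in Python. The prices, data volume, and compression ratio below are invented placeholders for illustration, not actual GCP rates:

```python
# Hypothetical model of the storage savings described above.
# All prices and volumes are made-up placeholders, not quoted GCP rates.

def monthly_storage_cost(tb, price_per_gb_month, compression_ratio=1.0):
    """Cost of storing `tb` terabytes for one month, after compression."""
    gb = tb * 1024
    return gb / compression_ratio * price_per_gb_month

STANDARD_PRICE = 0.026  # $/GB-month, illustrative only
NEARLINE_PRICE = 0.010  # $/GB-month, illustrative only

# Roughly a year of output at ~20 TB/day.
baseline = monthly_storage_cost(7300, STANDARD_PRICE)
optimized = monthly_storage_cost(7300, NEARLINE_PRICE, compression_ratio=1.3)

savings = 1 - optimized / baseline
print(f"storage savings: {savings:.0%}")
```

Even with made-up numbers, the shape of the calculation shows why a cheaper infrequent-access tier plus compression compounds: each factor multiplies the savings.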
During the coming year, we plan to migrate the bulk of our production pipelines to the cloud, including tools for arrays, exomes, cancer genomes, and RNA-seq. Moreover, our non-exclusive relationship with Google is founded on the principle that our groups can leverage complementary skills to make products that can not only serve the needs of Broad, but also help serve the needs of researchers around the world. Therefore, as we migrate each of our pipelines to the cloud to meet our own needs, we also plan to make them available to the greater genomics community through a Software-as-a-Service model. <br /><br />This is an exciting time for us at Broad. For more than a decade we have served the genomics community by acting as a hub for data generation; now, we are extending this mission to encompass not only sequencing services, but also data services. We believe that by expanding access to our tools and optimizing our pipelines for the cloud, we will enable the community to benefit from the enormous effort we have invested. We look forward to expanding the scope of this mission in the years to come.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/genomic-data-processing-on-google-cloud-platform/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lessons learned while protecting Gmail</title>
		<link>https://googledata.org/google-research/lessons-learned-while-protecting-gmail/</link>
		<comments>https://googledata.org/google-research/lessons-learned-while-protecting-gmail/#comments</comments>
		<pubDate>Tue, 29 Mar 2016 20:08:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[Gmail]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=4582d249b0114a78bd14dd32dc82a9ab</guid>
		<description><![CDATA[<span>Posted by Elie Bursztein - anti-abuse &#38; security research,  Nicolas Lidzborski - Gmail security engineering, and Vijay Eranti - Gmail anti-abuse engineering</span><br /><br />Earlier this year in San Francisco, <a href="https://www.usenix.org/">USENIX</a> hosted their inaugural <a href="https://www.usenix.org/conference/enigma2016">Enigma Conference</a>, which focused on security, privacy and electronic crime through the lens of emerging threats and novel attacks. We were <a href="http://googleresearch.blogspot.com/2015/08/say-hello-to-enigma-conference.html">excited to help make this conference happen</a> and to participate in it. <br /><br />At the conference, we heard from a variety of terrific speakers including:<br /><ul><li><a href="https://en.wikipedia.org/wiki/Ron_Rivest">Ron Rivest</a>, Professor at MIT and co-inventor of RSA, who spoke about <a href="https://youtu.be/hqacHM6Wm0Q">the consequences of backdooring encryption</a></li><li><a href="http://iccs.fordham.edu/program/iccs2015/robert-joyce/">Rob Joyce</a>, Chief of the NSA Tailored Access Operations organization, who spoke about <a href="https://youtu.be/bDJb8WOJYdA">defending against state attackers</a></li><li><a href="https://en.wikipedia.org/wiki/George_Hotz">George &#8220;Geohot&#8221; Hotz</a>, hacker extraordinaire, who discussed <a href="https://youtu.be/eGl6kpSajag">state of the art software debugging</a></li></ul>In addition, we were able to <a href="https://goo.gl/nqGcpg">share the lessons we&#8217;ve learned</a> about protecting Gmail users since it was launched over a decade ago. 
Those lessons are summarized in the infographic below (the talk slides are <a href="https://goo.gl/59Sfqp">also available</a>).<br /><div><a href="https://4.bp.blogspot.com/-3nynTyfHkf4/Vvsiwhs6wRI/AAAAAAAAA9I/Onxz7rgVoZANQ7pHMhnSZ0K44Oz5S_Yyg/s1600/Lesson-learned-while-protecting-gmail.png"><img border="0" src="https://4.bp.blogspot.com/-3nynTyfHkf4/Vvsiwhs6wRI/AAAAAAAAA9I/Onxz7rgVoZANQ7pHMhnSZ0K44Oz5S_Yyg/s1600/Lesson-learned-while-protecting-gmail.png"></a></div>We were proud to sponsor this year's inaugural Enigma conference, and it is our hope that the core lessons that we have learned over the years can benefit other online products and services. We're looking forward to participating again next year when <a href="https://www.usenix.org/conference/enigma2017">Enigma returns in 2017</a>. We hope to see you there!]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Elie Bursztein - anti-abuse &amp; security research,  Nicolas Lidzborski - Gmail security engineering, and Vijay Eranti - Gmail anti-abuse engineering</span><br /><br />Earlier this year in San Francisco, <a href="https://www.usenix.org/">USENIX</a> hosted their inaugural <a href="https://www.usenix.org/conference/enigma2016">Enigma Conference</a>, which focused on security, privacy and electronic crime through the lens of emerging threats and novel attacks. We were <a href="http://googleresearch.blogspot.com/2015/08/say-hello-to-enigma-conference.html">excited to help make this conference happen</a> and to participate in it. <br /><br />At the conference, we heard from a variety of terrific speakers including:<br /><ul><li><a href="https://en.wikipedia.org/wiki/Ron_Rivest">Ron Rivest</a>, Professor at MIT and co-inventor of RSA, who spoke about <a href="https://youtu.be/hqacHM6Wm0Q">the consequences of backdooring encryption</a></li><li><a href="http://iccs.fordham.edu/program/iccs2015/robert-joyce/">Rob Joyce</a>, Chief of the NSA Tailored Access Operations organization, who spoke about <a href="https://youtu.be/bDJb8WOJYdA">defending against state attackers</a></li><li><a href="https://en.wikipedia.org/wiki/George_Hotz">George “Geohot” Hotz</a>, hacker extraordinaire, who discussed <a href="https://youtu.be/eGl6kpSajag">state of the art software debugging</a></li></ul>In addition, we were able to <a href="https://goo.gl/nqGcpg">share the lessons we’ve learned</a> about protecting Gmail users since it was launched over a decade ago. 
Those lessons are summarized in the infographic below (the talk slides are <a href="https://goo.gl/59Sfqp">also available</a>).<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-3nynTyfHkf4/Vvsiwhs6wRI/AAAAAAAAA9I/Onxz7rgVoZANQ7pHMhnSZ0K44Oz5S_Yyg/s1600/Lesson-learned-while-protecting-gmail.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-3nynTyfHkf4/Vvsiwhs6wRI/AAAAAAAAA9I/Onxz7rgVoZANQ7pHMhnSZ0K44Oz5S_Yyg/s1600/Lesson-learned-while-protecting-gmail.png" /></a></div>We were proud to sponsor this year's inaugural Enigma conference, and it is our hope that the core lessons that we have learned over the years can benefit other online products and services. We're looking forward to participating again next year when <a href="https://www.usenix.org/conference/enigma2017">Enigma returns in 2017</a>. We hope to see you there!]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/lessons-learned-while-protecting-gmail/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Machine Learning in the Cloud, with TensorFlow</title>
		<link>https://googledata.org/google-research/machine-learning-in-the-cloud-with-tensorflow/</link>
		<comments>https://googledata.org/google-research/machine-learning-in-the-cloud-with-tensorflow/#comments</comments>
		<pubDate>Wed, 23 Mar 2016 17:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=54b03e7c07b0fb26886322c1e4131ea3</guid>
		<description><![CDATA[<span>Posted by Slaven Bilac, Software Engineer, Google Research</span><br /><br />At Google, researchers collaborate closely with product teams, applying the latest advances in Machine Learning to existing products and services - such as <a href="http://googleresearch.blogspot.com/2015/09/google-voice-search-faster-and-more.html">speech recognition in the Google app</a>, <a href="http://googleresearch.blogspot.com/2013/06/improving-photo-search-step-across.html">search in Google Photos</a> and the <a href="http://googleresearch.blogspot.com/2015/11/computer-respond-to-this-email.html">Smart Reply feature in Inbox by Gmail</a> - in order to make them more useful.  A growing number of Google products are using <a href="https://www.tensorflow.org/">TensorFlow</a>, our open source Machine Learning system, to tackle ML challenges and we would like to enable others to do the same. <br /><br />Today, at <a href="https://cloudplatformonline.com/NEXT2016.html">GCP NEXT 2016</a>, we <a href="https://cloudplatform.googleblog.com/2016/03/Google-takes-Cloud-Machine-Learning-service-mainstream.html">announced the alpha release</a> of <a href="https://cloud.google.com/products/machine-learning">Cloud Machine Learning</a>, a framework for building and training custom models to be used in intelligent applications.  <br /><div><a href="https://cloud.google.com/ml/"><img border="0" height="320" src="https://3.bp.blogspot.com/-ySmp6NANwB4/VvLGkyVAQJI/AAAAAAAAA8c/iMeTQoMEM70YwLuUpqpYN100L85506N7Q/s320/image00.png" width="320"></a></div>Machine Learning projects can come in many sizes, and as we&#8217;ve seen with our open source offering <a href="https://www.tensorflow.org/">TensorFlow</a>, projects often need to scale up. Some small tasks are best handled with a local solution running on one&#8217;s desktop, while large scale applications require both the scale and dependability of a hosted solution. 
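A "small task on one's desktop" can be as little as a dozen lines of code. As a framework-free illustration (the data and hyperparameters below are invented, and the actual Cloud Machine Learning samples are TensorFlow-based), here is linear regression fit by batch gradient descent:

```python
# Tiny linear regression fit by gradient descent -- the kind of "small task"
# that runs locally before a model is scaled up to a hosted service.
# Data and hyperparameters are made up for illustration.

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 9.0]  # roughly y = 2x + 1 with noise

w, b = 0.0, 0.0
lr = 0.02
for _ in range(2000):
    # Mean-squared-error gradients over the full batch.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b

print(f"fitted slope w={w:.2f}, intercept b={b:.2f}")
```

The same objective, expressed as a TensorFlow graph, can be trained unchanged on a laptop or handed to a hosted service; only the execution environment scales.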
Google <a href="https://cloud.google.com/products/machine-learning">Cloud Machine Learning</a> aims to support the full range and provide a seamless transition from a local to a cloud environment.<br /><br />The <a href="https://cloud.google.com/products/machine-learning">Cloud Machine Learning</a> offering allows users to run custom distributed learning algorithms based on <a href="https://www.tensorflow.org/">TensorFlow</a>. In addition to the <a href="https://en.wikipedia.org/wiki/Deep_learning">deep learning</a> capabilities that power <a href="http://cloud.google.com/translate">Cloud Translate API</a>, <a href="https://cloud.google.com/vision/">Cloud Vision API</a>, and <a href="https://cloud.google.com/speech/">Cloud Speech API</a>, we provide easy-to-adopt samples for common tasks like linear regression/classification with very fast convergence properties (based on the <a href="http://arxiv.org/abs/1211.2717">SDCA</a> algorithm) and building a custom image classification model with a few hundred training examples (based on the <a href="http://arxiv.org/pdf/1310.1531v1.pdf">DeCAF</a> algorithm).<br /><br />We are excited to bring the best of <a href="https://research.google.com/">Google Research</a> to <a href="https://cloud.google.com/">Google Cloud Platform</a>. Learn more about this release and more from GCP Next 2016 on the <a href="https://cloudplatform.googleblog.com/2016/03/Google-takes-Cloud-Machine-Learning-service-mainstream.html">Google Cloud Platform blog</a>.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Slaven Bilac, Software Engineer, Google Research</span><br /><br />At Google, researchers collaborate closely with product teams, applying the latest advances in Machine Learning to existing products and services - such as <a href="http://googleresearch.blogspot.com/2015/09/google-voice-search-faster-and-more.html">speech recognition in the Google app</a>, <a href="http://googleresearch.blogspot.com/2013/06/improving-photo-search-step-across.html">search in Google Photos</a> and the <a href="http://googleresearch.blogspot.com/2015/11/computer-respond-to-this-email.html">Smart Reply feature in Inbox by Gmail</a> - in order to make them more useful.  A growing number of Google products are using <a href="https://www.tensorflow.org/">TensorFlow</a>, our open source Machine Learning system, to tackle ML challenges and we would like to enable others to do the same. <br /><br />Today, at <a href="https://cloudplatformonline.com/NEXT2016.html">GCP NEXT 2016</a>, we <a href="https://cloudplatform.googleblog.com/2016/03/Google-takes-Cloud-Machine-Learning-service-mainstream.html">announced the alpha release</a> of <a href="https://cloud.google.com/products/machine-learning">Cloud Machine Learning</a>, a framework for building and training custom models to be used in intelligent applications.  <br /><div class="separator" style="clear: both; text-align: center;"><a href="https://cloud.google.com/ml/" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://3.bp.blogspot.com/-ySmp6NANwB4/VvLGkyVAQJI/AAAAAAAAA8c/iMeTQoMEM70YwLuUpqpYN100L85506N7Q/s320/image00.png" width="320" /></a></div>Machine Learning projects can come in many sizes, and as we’ve seen with our open source offering <a href="https://www.tensorflow.org/">TensorFlow</a>, projects often need to scale up. 
Some small tasks are best handled with a local solution running on one’s desktop, while large scale applications require both the scale and dependability of a hosted solution. Google <a href="https://cloud.google.com/products/machine-learning">Cloud Machine Learning</a> aims to support the full range and provide a seamless transition from a local to a cloud environment.<br /><br />The <a href="https://cloud.google.com/products/machine-learning">Cloud Machine Learning</a> offering allows users to run custom distributed learning algorithms based on <a href="https://www.tensorflow.org/">TensorFlow</a>. In addition to the <a href="https://en.wikipedia.org/wiki/Deep_learning">deep learning</a> capabilities that power <a href="http://cloud.google.com/translate">Cloud Translate API</a>, <a href="https://cloud.google.com/vision/">Cloud Vision API</a>, and <a href="https://cloud.google.com/speech/">Cloud Speech API</a>, we provide easy-to-adopt samples for common tasks like linear regression/classification with very fast convergence properties (based on the <a href="http://arxiv.org/abs/1211.2717">SDCA</a> algorithm) and building a custom image classification model with a few hundred training examples (based on the <a href="http://arxiv.org/pdf/1310.1531v1.pdf">DeCAF</a> algorithm).<br /><br />We are excited to bring the best of <a href="https://research.google.com/">Google Research</a> to <a href="https://cloud.google.com/">Google Cloud Platform</a>. Learn more about this release and more from GCP Next 2016 on the <a href="https://cloudplatform.googleblog.com/2016/03/Google-takes-Cloud-Machine-Learning-service-mainstream.html">Google Cloud Platform blog</a>.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/machine-learning-in-the-cloud-with-tensorflow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Announcing the 2016 Google PhD Fellows for North America, Europe and the Middle East</title>
		<link>https://googledata.org/google-research/announcing-the-2016-google-phd-fellows-for-north-america-europe-and-the-middle-east/</link>
		<comments>https://googledata.org/google-research/announcing-the-2016-google-phd-fellows-for-north-america-europe-and-the-middle-east/#comments</comments>
		<pubDate>Thu, 10 Mar 2016 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=2a896b139730596c86fc17ff1d08b75d</guid>
		<description><![CDATA[<span>Posted by Michael Rennaker, Google PhD Fellowships Lead</span><br /><br />Google created the <a href="http://research.google.com/research-outreach.html#/research-outreach/graduate-fellowships">PhD Fellowship program</a> in 2009 to recognize and support outstanding graduate students doing exceptional research in Computer Science and related disciplines. Now in its eighth year, our fellowship program has supported hundreds of future faculty, industry researchers, innovators and entrepreneurs.<br /><br />Reflecting our continuing commitment to supporting and building relationships with the academic community, we are excited to announce the 39 recipients from North America, Europe and the Middle East. We offer our sincere congratulations to Google&#8217;s 2016 Class of PhD Fellows.<br /><br /><b><u>Computational Neuroscience</u></b><br />Cameron (Po-Hsuan) Chen, <i>Princeton University</i><br />Grace Lindsay, <i>Columbia University</i><br />Martino Sorbaro Sindaci, <i>The University of Edinburgh</i><br /><br /><b><u>Human-Computer Interaction</u></b><br />Koki Nagano, <i>University of Southern California</i><br />Arvind Satyanarayan, <i>Stanford University</i><br />Amy Xian Zhang, <i>Massachusetts Institute of Technology</i><br /><br /><b><u>Machine Learning</u></b><br />Olivier Bachem, <i>Swiss Federal Institute of Technology Zurich</i><br />Tianqi Chen, <i>University of Washington</i><br />Emily Denton, <i>New York University</i><br />Yves-Laurent Kom Samo, <i>University of Oxford</i><br />Daniel Jaymin Mankowitz, <i>Technion - Israel Institute of Technology</i><br />Lucas Maystre, <i>&#201;cole Polytechnique F&#233;d&#233;rale de Lausanne</i><br />Arvind Neelakantan, <i>University of Massachusetts, Amherst</i><br />Ludwig Schmidt, <i>Massachusetts Institute of Technology</i><br />Shandian Zhe, <i>Purdue University, West Lafayette</i><br /><br /><b><u>Machine Perception, Speech Technology and Computer Vision</u></b><br />Eugen Beck, 
<i>RWTH Aachen University</i><br />Yu-Wei Chao, <i>University of Michigan, Ann Arbor</i><br />Wei Liu, <i>University of North Carolina at Chapel Hill</i><br />Aron Monszpart, <i>University College London</i><br />Thomas Schoeps, <i>Swiss Federal Institute of Technology Zurich</i><br />Chia-Yin Tsai, <i>Carnegie Mellon University</i><br /><br /><b><u>Market Algorithms</u></b><br />Hossein Esfandiari, <i>University of Maryland, College Park</i><br />Sandy Heydrich, <i>Saarland University - Saarbrucken GSCS</i><br />Rad Niazadeh, <i>Cornell University</i><br />Sadra Yazdanbod, <i>Georgia Institute of Technology</i><br /><br /><b><u>Mobile Computing</u></b><br />Lei Kang, <i>University of Wisconsin</i><br />Tauhidur Rahman, <i>Cornell University</i><br />Yuhao Zhu, <i>University of Texas, Austin</i><br /><br /><b><u>Natural Language Processing</u></b><br />Tamer Alkhouli, <i>RWTH Aachen University</i><br />Jose Camacho Collados, <i>Sapienza - Universit&#224; di Roma </i><br /><br /><b><u>Privacy and Security</u></b><br />Kartik Nayak, <i>University of Maryland, College Park</i><br />Nicolas Papernot, <i>Pennsylvania State University</i><br />Damian Vizar, <i>&#201;cole Polytechnique F&#233;d&#233;rale de Lausanne</i><br />Xi Wu, <i>University of Wisconsin</i><br /><u><br /></u> <b><u>Programming Languages and Software Engineering</u></b><br />Marcelo Sousa, <i>University of Oxford</i><br /><br /><b><u>Structured Data and Database Management</u></b><br />Xiang Ren, <i>University of Illinois, Urbana-Champaign</i><br /><br /><b><u>Systems and Networking</u></b><br />Andrew Crotty, <i>Brown University</i><br />Ilias Marinos, <i>University of Cambridge</i><br />Kay Ousterhout, <i>University of California, Berkeley</i>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Michael Rennaker, Google PhD Fellowships Lead</span><br /><br />Google created the <a href="http://research.google.com/research-outreach.html#/research-outreach/graduate-fellowships">PhD Fellowship program</a> in 2009 to recognize and support outstanding graduate students doing exceptional research in Computer Science and related disciplines. Now in its eighth year, our fellowship program has supported hundreds of future faculty, industry researchers, innovators and entrepreneurs.<br /><br />Reflecting our continuing commitment to supporting and building relationships with the academic community, we are excited to announce the 39 recipients from North America, Europe and the Middle East. We offer our sincere congratulations to Google’s 2016 Class of PhD Fellows.<br /><br /><b><u>Computational Neuroscience</u></b><br />Cameron (Po-Hsuan) Chen, <i>Princeton University</i><br />Grace Lindsay, <i>Columbia University</i><br />Martino Sorbaro Sindaci, <i>The University of Edinburgh</i><br /><br /><b><u>Human-Computer Interaction</u></b><br />Koki Nagano, <i>University of Southern California</i><br />Arvind Satyanarayan, <i>Stanford University</i><br />Amy Xian Zhang, <i>Massachusetts Institute of Technology</i><br /><br /><b><u>Machine Learning</u></b><br />Olivier Bachem, <i>Swiss Federal Institute of Technology Zurich</i><br />Tianqi Chen, <i>University of Washington</i><br />Emily Denton, <i>New York University</i><br />Yves-Laurent Kom Samo, <i>University of Oxford</i><br />Daniel Jaymin Mankowitz, <i>Technion - Israel Institute of Technology</i><br />Lucas Maystre, <i>École Polytechnique Fédérale de Lausanne</i><br />Arvind Neelakantan, <i>University of Massachusetts, Amherst</i><br />Ludwig Schmidt, <i>Massachusetts Institute of Technology</i><br />Shandian Zhe, <i>Purdue University, West Lafayette</i><br /><br /><b><u>Machine Perception, Speech Technology and Computer Vision</u></b><br />Eugen 
Beck, <i>RWTH Aachen University</i><br />Yu-Wei Chao, <i>University of Michigan, Ann Arbor</i><br />Wei Liu, <i>University of North Carolina at Chapel Hill</i><br />Aron Monszpart, <i>University College London</i><br />Thomas Schoeps, <i>Swiss Federal Institute of Technology Zurich</i><br />Chia-Yin Tsai, <i>Carnegie Mellon University</i><br /><br /><b><u>Market Algorithms</u></b><br />Hossein Esfandiari, <i>University of Maryland, College Park</i><br />Sandy Heydrich, <i>Saarland University - Saarbrucken GSCS</i><br />Rad Niazadeh, <i>Cornell University</i><br />Sadra Yazdanbod, <i>Georgia Institute of Technology</i><br /><br /><b><u>Mobile Computing</u></b><br />Lei Kang, <i>University of Wisconsin</i><br />Tauhidur Rahman, <i>Cornell University</i><br />Yuhao Zhu, <i>University of Texas, Austin</i><br /><br /><b><u>Natural Language Processing</u></b><br />Tamer Alkhouli, <i>RWTH Aachen University</i><br />Jose Camacho Collados, <i>Sapienza - Università di Roma </i><br /><br /><b><u>Privacy and Security</u></b><br />Kartik Nayak, <i>University of Maryland, College Park</i><br />Nicolas Papernot, <i>Pennsylvania State University</i><br />Damian Vizar, <i>École Polytechnique Fédérale de Lausanne</i><br />Xi Wu, <i>University of Wisconsin</i><br /><u><br /></u> <b><u>Programming Languages and Software Engineering</u></b><br />Marcelo Sousa, <i>University of Oxford</i><br /><br /><b><u>Structured Data and Database Management</u></b><br />Xiang Ren, <i>University of Illinois, Urbana-Champaign</i><br /><br /><b><u>Systems and Networking</u></b><br />Andrew Crotty, <i>Brown University</i><br />Ilias Marinos, <i>University of Cambridge</i><br />Kay Ousterhout, <i>University of California, Berkeley</i>]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/announcing-the-2016-google-phd-fellows-for-north-america-europe-and-the-middle-east/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Train your own image classifier with Inception in TensorFlow</title>
		<link>https://googledata.org/google-research/train-your-own-image-classifier-with-inception-in-tensorflow/</link>
		<comments>https://googledata.org/google-research/train-your-own-image-classifier-with-inception-in-tensorflow/#comments</comments>
		<pubDate>Wed, 09 Mar 2016 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=57950f9d8ab62117811e509fdd1bc90a</guid>
		<description><![CDATA[<span>Posted by Jon Shlens, Senior Research Scientist</span><br /><br />At the end of last year we released code that allows a user <a href="http://googleresearch.blogspot.com/2015/12/how-to-classify-images-with-tensorflow.html">to classify images with TensorFlow</a> models. This code demonstrated how to build an image classification system by employing a deep learning model that we had previously trained. This model was known to classify an image across 1000 categories supplied by the <a href="http://www.image-net.org/">ImageNet</a> academic competition with an error rate that approached <a href="http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/">human performance</a>. After all, what self-respecting computer vision system would fail to recognize a cute puppy?<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-W__wiaHUjwI/Vt3Grd8df0I/AAAAAAAAA78/7xqUNj8ujtY/s1600/image02.png"><img border="0" height="400" src="https://3.bp.blogspot.com/-W__wiaHUjwI/Vt3Grd8df0I/AAAAAAAAA78/7xqUNj8ujtY/s400/image02.png" width="331"></a></td></tr><tr><td>Image via <a href="https://commons.wikimedia.org/wiki/File:Golde33443.jpg">Wikipedia</a></td></tr></tbody></table>Well, thankfully the image classification model would recognize this image as a <i>retriever</i> with 79.3% confidence. But, more spectacularly, it would also be able to distinguish between a <a href="https://en.wikipedia.org/wiki/Spotted_salamander">spotted salamander</a> and a <a href="https://en.wikipedia.org/wiki/Fire_salamander">fire salamander</a> with high confidence &#8211; a task that might be quite difficult for those who are not experts in <a href="https://en.wikipedia.org/wiki/Herpetology">herpetology</a>. Can <i>you</i> tell the difference? 
<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://3.bp.blogspot.com/-FRQzMrAIr_g/Vt3G7-9oSkI/AAAAAAAAA8A/mdnBJBm3oM0/s1600/salamander.png"><img border="0" height="198" src="https://3.bp.blogspot.com/-FRQzMrAIr_g/Vt3G7-9oSkI/AAAAAAAAA8A/mdnBJBm3oM0/s640/salamander.png" width="640"></a></td></tr><tr><td>Images via <a href="https://en.wikipedia.org/wiki/Fire_salamander#/media/File:Salamandra_salamandra_MHNT_1.jpg">Wikipedia</a></td></tr></tbody></table>The deep learning model we released, <b>Inception-v3</b>, is described in our arXiv preprint &#8220;<a href="http://arxiv.org/abs/1512.00567"><i>Rethinking the Inception Architecture for Computer Vision</i></a>&#8221; and can be visualized with this schematic diagram:<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-TMOLlkJBxms/Vt3HQXpE2cI/AAAAAAAAA8E/7X7XRFOY6Xo/s1600/image03.png"><img border="0" src="https://4.bp.blogspot.com/-TMOLlkJBxms/Vt3HQXpE2cI/AAAAAAAAA8E/7X7XRFOY6Xo/s1600/image03.png"></a></td></tr><tr><td>Schematic diagram of Inception-v3</td></tr></tbody></table>As described in the preprint, this model achieves 5.64% top-5 error while an ensemble of four of these models achieves 3.58% top-5 error on the validation set of the ImageNet whole image <a href="http://www.image-net.org/challenges/LSVRC/2012/">ILSVRC 2012</a> classification task. Furthermore, in the <a href="http://image-net.org/challenges/LSVRC/2015/results">2015 ImageNet Challenge</a>, an ensemble of 4 of these models came in 2nd in the image classification task.<br /><br />After the release of this model, many people in the TensorFlow community voiced their preference for having an Inception-v3 model that they can train themselves, rather than using our pre-trained model. 
We could not agree more, since a system for training an Inception-v3 model provides many opportunities, including:<br /><ul><li>Exploration of different variants of this model architecture in order to improve the image classification system.</li><li>Comparison of optimization algorithms and hardware setups for training this model faster or to a higher degree of predictive performance.</li><li>Retraining/fine-tuning the Inception-v3 model on a distinct image classification task or as a component of a larger network tasked with object detection or multi-modal learning.</li></ul>The last topic is often referred to as <a href="https://en.wikipedia.org/wiki/Inductive_transfer"><i>transfer learning</i></a>, and has been an area of particular excitement in the field of deep networks in the context of vision. A common prescription for a computer vision problem is to first train an image classification model with the ImageNet Challenge data set, and then transfer this model&#8217;s knowledge to a distinct task. This has been done for <a href="http://arxiv.org/abs/1311.2524">object detection</a>, <a href="http://arxiv.org/abs/1312.5650">zero-shot learning</a>, <a href="http://googleresearch.blogspot.com/2014/11/a-picture-is-worth-thousand-coherent.html">image captioning</a>, <a href="http://ieeexplore.ieee.org/xpl/login.jsp?tp=&#38;arnumber=6751448">video analysis</a> and a multitude of other applications.<br /><br />Today we are happy to announce that <a href="https://github.com/tensorflow/models/tree/master/inception">we are releasing libraries and code for training <b>Inception-v3</b></a> on one or multiple GPUs. 
Some features of this code include:<br /><ul><li>Training an Inception-v3 model with synchronous updates across multiple GPUs.</li><li>Employing batch normalization to speed up training of the model.</li><li>Leveraging many distortions of the image to augment model training.</li><li>Releasing a new (still experimental) high-level language for specifying complex model architectures, which we call <a href="https://github.com/tensorflow/models/blob/master/inception/inception/slim/README.md"><b>TensorFlow-Slim</b></a>.</li><li>Demonstrating how to perform transfer learning by taking a pre-trained Inception-v3 model and fine-tuning it for another task.</li></ul>We can train a model from scratch to its best performance on a desktop with 8 NVIDIA Tesla K40s in about 2 weeks. In order to make research progress faster, we are additionally supplying a new version of a pre-trained Inception-v3 model that is ready to be fine-tuned or adapted to a new task. We demonstrate how to use this model for transfer learning on a simple flower classification task. Hopefully, this provides a useful didactic example for employing this Inception model on a wide range of vision tasks.<br /><br />Want to get started? See the accompanying <b><a href="https://github.com/tensorflow/models/tree/master/inception/README.md">instructions</a></b> on how to <a href="https://github.com/tensorflow/models/tree/master/inception/README.md#how-to-train-from-scratch">train</a>, <a href="https://github.com/tensorflow/models/tree/master/inception/README.md#how-to-evaluate">evaluate</a> or <a href="https://github.com/tensorflow/models/tree/master/inception/README.md#how-to-fine-tune-a-pre-trained-model-on-a-new-task">fine-tune</a> a network.<br /><br />Releasing this code has been a huge team effort. These efforts have taken several months with contributions from many individuals spanning research at Google. 
We wish to especially acknowledge the following people who contributed to this project:<br /><ul><li><b>Model Architecture</b> &#8211; Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Jon Shlens and Zbigniew Wojna</li><li><b>Systems Infrastructure</b> &#8211; Sherry Moore, Martin Wicke, David Andersen, Matthieu Devin, Manjunath Kudlur and Nishant Patil</li><li><b>TensorFlow-Slim</b> &#8211; Sergio Guadarrama and Nathan Silberman</li><li><b>Model Visualization</b> &#8211; Fernanda Vi&#233;gas, Martin Wattenberg and James Wexler</li></ul>]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Jon Shlens, Senior Research Scientist</span><br /><br />At the end of last year we released code that allows a user <a href="http://googleresearch.blogspot.com/2015/12/how-to-classify-images-with-tensorflow.html">to classify images with TensorFlow</a> models. This code demonstrated how to build an image classification system by employing a deep learning model that we had previously trained. This model was known to classify an image across 1000 categories supplied by the <a href="http://www.image-net.org/">ImageNet</a> academic competition with an error rate that approached <a href="http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/">human performance</a>. After all, what self-respecting computer vision system would fail to recognize a cute puppy?<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-W__wiaHUjwI/Vt3Grd8df0I/AAAAAAAAA78/7xqUNj8ujtY/s1600/image02.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="400" src="https://3.bp.blogspot.com/-W__wiaHUjwI/Vt3Grd8df0I/AAAAAAAAA78/7xqUNj8ujtY/s400/image02.png" width="331" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Image via <a href="https://commons.wikimedia.org/wiki/File:Golde33443.jpg">Wikipedia</a></td></tr></tbody></table>Well, thankfully the image classification model would recognize this image as a <i>retriever</i> with 79.3% confidence. 
But, more spectacularly, it would also be able to distinguish between a <a href="https://en.wikipedia.org/wiki/Spotted_salamander">spotted salamander</a> and a <a href="https://en.wikipedia.org/wiki/Fire_salamander">fire salamander</a> with high confidence – a task that might be quite difficult for those who are not experts in <a href="https://en.wikipedia.org/wiki/Herpetology">herpetology</a>. Can <i>you</i> tell the difference? <br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-FRQzMrAIr_g/Vt3G7-9oSkI/AAAAAAAAA8A/mdnBJBm3oM0/s1600/salamander.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="198" src="https://3.bp.blogspot.com/-FRQzMrAIr_g/Vt3G7-9oSkI/AAAAAAAAA8A/mdnBJBm3oM0/s640/salamander.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Images via <a href="https://en.wikipedia.org/wiki/Fire_salamander#/media/File:Salamandra_salamandra_MHNT_1.jpg">Wikipedia</a></td></tr></tbody></table>The deep learning model we released, <b>Inception-v3</b>, is described in our arXiv preprint “<a href="http://arxiv.org/abs/1512.00567"><i>Rethinking the Inception Architecture for Computer Vision</i></a>” and can be visualized with this schematic diagram:<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-TMOLlkJBxms/Vt3HQXpE2cI/AAAAAAAAA8E/7X7XRFOY6Xo/s1600/image03.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://4.bp.blogspot.com/-TMOLlkJBxms/Vt3HQXpE2cI/AAAAAAAAA8E/7X7XRFOY6Xo/s1600/image03.png" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Schematic diagram of 
Inception-v3</td></tr></tbody></table>As described in the preprint, this model achieves 5.64% top-5 error while an ensemble of four of these models achieves 3.58% top-5 error on the validation set of the ImageNet whole image <a href="http://www.image-net.org/challenges/LSVRC/2012/">ILSVRC 2012</a> classification task. Furthermore, in the <a href="http://image-net.org/challenges/LSVRC/2015/results">2015 ImageNet Challenge</a>, an ensemble of 4 of these models came in 2nd in the image classification task.<br /><br />After the release of this model, many people in the TensorFlow community voiced their preference for having an Inception-v3 model that they can train themselves, rather than using our pre-trained model. We could not agree more, since a system for training an Inception-v3 model provides many opportunities, including:<br /><ul><li>Exploration of different variants of this model architecture in order to improve the image classification system.</li><li>Comparison of optimization algorithms and hardware setups for training this model faster or to a higher degree of predictive performance.</li><li>Retraining/fine-tuning the Inception-v3 model on a distinct image classification task or as a component of a larger network tasked with object detection or multi-modal learning.</li></ul>The last topic is often referred to as <a href="https://en.wikipedia.org/wiki/Inductive_transfer"><i>transfer learning</i></a>, and has been an area of particular excitement in the field of deep networks in the context of vision. A common prescription for a computer vision problem is to first train an image classification model with the ImageNet Challenge data set, and then transfer this model’s knowledge to a distinct task. 
This has been done for <a href="http://arxiv.org/abs/1311.2524">object detection</a>, <a href="http://arxiv.org/abs/1312.5650">zero-shot learning</a>, <a href="http://googleresearch.blogspot.com/2014/11/a-picture-is-worth-thousand-coherent.html">image captioning</a>, <a href="http://ieeexplore.ieee.org/xpl/login.jsp?tp=&amp;arnumber=6751448">video analysis</a> and a multitude of other applications.<br /><br />Today we are happy to announce that <a href="https://github.com/tensorflow/models/tree/master/inception">we are releasing libraries and code for training <b>Inception-v3</b></a> on one or multiple GPUs. Some features of this code include:<br /><ul><li>Training an Inception-v3 model with synchronous updates across multiple GPUs.</li><li>Employing batch normalization to speed up training of the model.</li><li>Leveraging many distortions of the image to augment model training.</li><li>Releasing a new (still experimental) high-level language for specifying complex model architectures, which we call <a href="https://github.com/tensorflow/models/blob/master/inception/inception/slim/README.md"><b>TensorFlow-Slim</b></a>.</li><li>Demonstrating how to perform transfer learning by taking a pre-trained Inception-v3 model and fine-tuning it for another task.</li></ul>We can train a model from scratch to its best performance on a desktop with 8 NVIDIA Tesla K40s in about 2 weeks. In order to make research progress faster, we are additionally supplying a new version of a pre-trained Inception-v3 model that is ready to be fine-tuned or adapted to a new task. We demonstrate how to use this model for transfer learning on a simple flower classification task. Hopefully, this provides a useful didactic example for employing this Inception model on a wide range of vision tasks.<br /><br />Want to get started? 
See the accompanying <b><a href="https://github.com/tensorflow/models/tree/master/inception/README.md">instructions</a></b> on how to <a href="https://github.com/tensorflow/models/tree/master/inception/README.md#how-to-train-from-scratch">train</a>, <a href="https://github.com/tensorflow/models/tree/master/inception/README.md#how-to-evaluate">evaluate</a> or <a href="https://github.com/tensorflow/models/tree/master/inception/README.md#how-to-fine-tune-a-pre-trained-model-on-a-new-task">fine-tune</a> a network.<br /><br />Releasing this code has been a huge team effort. These efforts have taken several months with contributions from many individuals spanning research at Google. We wish to especially acknowledge the following people who contributed to this project:<br /><ul><li><b>Model Architecture</b> – Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Jon Shlens and Zbigniew Wojna</li><li><b>Systems Infrastructure</b> – Sherry Moore, Martin Wicke, David Andersen, Matthieu Devin, Manjunath Kudlur and Nishant Patil</li><li><b>TensorFlow-Slim</b> – Sergio Guadarrama and Nathan Silberman</li><li><b>Model Visualization</b> – Fernanda Viégas, Martin Wattenberg and James Wexler</li></ul>]]></content:encoded>
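The transfer-learning recipe the post describes (reuse a pre-trained network as a fixed feature extractor, then train a new classification head for the target task, such as flower classification) can be sketched framework-agnostically. In the minimal NumPy sketch below, a fixed random projection stands in for the frozen Inception-v3 trunk; all names and shapes are illustrative assumptions, not taken from the released code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained feature extractor (in the post: Inception-v3 up
# to its final pooling layer). Its weights are frozen and never updated.
W_frozen = rng.normal(size=(64, 16))

def extract_features(x):
    """Frozen projection + ReLU; a toy substitute for a pre-trained CNN."""
    return np.maximum(x @ W_frozen, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy "new task": 300 samples, 3 classes whose labels are a linear function
# of the frozen features, so a linear head can in principle fit them.
n, n_classes = 300, 3
x = rng.normal(size=(n, 64))
feats = extract_features(x)
feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)  # normalize
y = np.argmax(feats @ rng.normal(size=(16, n_classes)), axis=1)

# Transfer learning reduced to its simplest form: gradient descent on the
# cross-entropy loss of ONLY the new softmax head; the extractor stays fixed.
W_head = np.zeros((16, n_classes))
one_hot = np.eye(n_classes)[y]
for _ in range(1500):
    probs = softmax(feats @ W_head)
    W_head -= 0.2 * (feats.T @ (probs - one_hot) / n)

accuracy = float(np.mean(np.argmax(feats @ W_head, axis=1) == y))
```

In the released code the same idea is driven by the fine-tuning instructions linked above: the new head is trained on the target data set while the Inception-v3 weights are restored from the supplied checkpoint (and optionally updated at a lower learning rate).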
			<wfw:commentRss>https://googledata.org/google-research/train-your-own-image-classifier-with-inception-in-tensorflow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Deep Learning for Robots: Learning from Large-Scale Interaction</title>
		<link>https://googledata.org/google-research/deep-learning-for-robots-learning-from-large-scale-interaction/</link>
		<comments>https://googledata.org/google-research/deep-learning-for-robots-learning-from-large-scale-interaction/#comments</comments>
		<pubDate>Tue, 08 Mar 2016 18:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=33ed22a2fcaf86a6d2746154bcfdf750</guid>
		<description><![CDATA[<span>Posted by Sergey Levine, Research Scientist</span><br /><br />While we&#8217;ve recently seen great strides in robotic capability, the gap between human and robot motor skills remains vast. Machines still have a very long way to go to match human proficiency even at basic sensorimotor skills like grasping. However, by linking learning with continuous feedback and control, we might begin to bridge that gap, and in so doing make it possible for robots to intelligently and reliably handle the complexities of the real world.<br /><br />Consider for example <a href="https://www.youtube.com/watch?v=PomkJ4l9CMU">this robot</a> from <a href="http://www.kaist.edu/html/en/index.html">KAIST</a>, which won last year&#8217;s <a href="http://www.theroboticschallenge.org/">DARPA robotics challenge</a>. The remarkably precise and deliberate motions are deeply impressive. But they are also quite&#8230; robotic. Why is that? What makes robot behavior so distinctly robotic compared to human behavior? At a high level, current robots typically follow a sense-plan-act paradigm, where the robot observes the world around it, formulates an internal model, constructs a plan of action, and then executes this plan. This approach is modular and often effective, but tends to break down in the kinds of cluttered natural environments that are typical of the real world. Here, perception is imprecise, all models are wrong in some way, and no plan survives first contact with reality.<br /><br />In contrast, humans and animals move quickly, reflexively, and often with remarkably little advance planning, by relying on highly developed and intelligent feedback mechanisms that use sensory cues to correct mistakes and compensate for perturbations. For example, when serving a tennis ball, the player continually observes the ball and the racket, adjusting the motion of his hand so that they meet in the air. 
This kind of feedback is fast, efficient, and, crucially, can correct for mistakes or unexpected perturbations. Can we train robots to reliably handle complex real-world situations by using similar feedback mechanisms to handle perturbations and correct mistakes?<br /><br />While servoing and feedback control have been studied extensively in robotics, the question of how to define the right sensory cue remains exceptionally challenging, especially for rich modalities such as vision. So instead of choosing the cues by hand, we can program a robot to acquire them on its own from scratch, by learning from extensive experience in the real world. In our first experiments with real physical robots, we decided to tackle robotic grasping in clutter.<br /><br />A human child is able to reliably grasp objects after one year, and takes around four years to acquire more sophisticated precision grasps. However, networked robots can instantaneously share their experience with one another, so if we dedicate 14 separate robots to the job of learning grasping in parallel, we can acquire the necessary experience much faster. Below is a video of our robots practicing grasping a range of common office and household objects:<br /><div></div>While initially the grasps are executed at random and succeed only rarely, each day the latest experiences are used to train a deep <a href="https://en.wikipedia.org/wiki/Convolutional_neural_network">convolutional neural network</a> (CNN) to learn to predict the outcome of a grasp, given a camera image and a potential motor command. This CNN is then deployed on the robots the following day, in the inner loop of a servoing mechanism that continually adjusts the robot&#8217;s motion to maximize the predicted chance of a successful grasp. In essence, the robot is constantly predicting, by observing the motion of its own hand, which kind of subsequent motion will maximize its chances of success. 
The result is continuous feedback: what we might call hand-eye coordination. Observing the behavior of the robot after over 800,000 grasp attempts, which is equivalent to about 3000 robot-hours of practice, we can see the beginnings of intelligent reactive behaviors. The robot observes its own gripper and corrects its motions in real time. It also exhibits interesting pre-grasp behaviors, like isolating a single object from a group. All of these behaviors emerged naturally from learning, rather than being programmed into the system.<br /><div></div>To evaluate whether the system achieves measurable benefit from continuous feedback, we can compare its performance to an open-loop baseline that more closely resembles the perception-planning-action loop described previously, albeit with both the open-loop grasps and the closed-loop servoing determined by a learned CNN trained on the same data. This approach is most similar to <a href="http://arxiv.org/abs/1509.06825">recent work by Pinto and Gupta</a>. With open-loop grasp selection, the robot chooses a single grasp pose from a single image, and then blindly executes this grasp. This method has a 34% average failure rate on the first 30 picking attempts for this set of office objects:<br /><div></div>Incorporating continuous feedback into the system reduces the failures by nearly half, down to 18% from 34%, and produces interesting corrections and adjustments:<br /><div></div>Neural networks have made great strides in allowing us to build computer programs that can process images, speech, text, and even draw pictures. However, introducing actions and control adds considerable new challenges, since every decision the network makes will affect what it sees next. Overcoming these challenges will bring us closer to building systems that understand the effects of their actions in the world. 
If we can bring the power of large-scale machine learning to robotic control, perhaps we will come one step closer to solving fundamental problems in robotics and automation.<br /><br /><i>The research on robotic hand-eye coordination and grasping was conducted by Sergey Levine, Peter Pastor, Alex Krizhevsky, and Deirdre Quillen, with special thanks to colleagues at Google Research and X who've contributed their expertise and time to this research. An early preprint is <a href="http://arxiv.org/abs/1603.02199">available on arXiv</a></i>.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Sergey Levine, Research Scientist</span><br /><br />While we’ve recently seen great strides in robotic capability, the gap between human and robot motor skills remains vast. Machines still have a very long way to go to match human proficiency even at basic sensorimotor skills like grasping. However, by linking learning with continuous feedback and control, we might begin to bridge that gap, and in so doing make it possible for robots to intelligently and reliably handle the complexities of the real world.<br /><br />Consider for example <a href="https://www.youtube.com/watch?v=PomkJ4l9CMU">this robot</a> from <a href="http://www.kaist.edu/html/en/index.html">KAIST</a>, which won last year’s <a href="http://www.theroboticschallenge.org/">DARPA robotics challenge</a>. The remarkably precise and deliberate motions are deeply impressive. But they are also quite… robotic. Why is that? What makes robot behavior so distinctly robotic compared to human behavior? At a high level, current robots typically follow a sense-plan-act paradigm, where the robot observes the world around it, formulates an internal model, constructs a plan of action, and then executes this plan. This approach is modular and often effective, but tends to break down in the kinds of cluttered natural environments that are typical of the real world. Here, perception is imprecise, all models are wrong in some way, and no plan survives first contact with reality.<br /><br />In contrast, humans and animals move quickly, reflexively, and often with remarkably little advance planning, by relying on highly developed and intelligent feedback mechanisms that use sensory cues to correct mistakes and compensate for perturbations. For example, when serving a tennis ball, the player continually observes the ball and the racket, adjusting the motion of his hand so that they meet in the air. 
This kind of feedback is fast, efficient, and, crucially, can correct for mistakes or unexpected perturbations. Can we train robots to reliably handle complex real-world situations by using similar feedback mechanisms to handle perturbations and correct mistakes?<br /><br />While servoing and feedback control have been studied extensively in robotics, the question of how to define the right sensory cue remains exceptionally challenging, especially for rich modalities such as vision. So instead of choosing the cues by hand, we can program a robot to acquire them on its own from scratch, by learning from extensive experience in the real world. In our first experiments with real physical robots, we decided to tackle robotic grasping in clutter.<br /><br />A human child is able to reliably grasp objects after one year, and takes around four years to acquire more sophisticated precision grasps. However, networked robots can instantaneously share their experience with one another, so if we dedicate 14 separate robots to the job of learning grasping in parallel, we can acquire the necessary experience much faster. Below is a video of our robots practicing grasping a range of common office and household objects:<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/iaF43Ze1oeI/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/iaF43Ze1oeI?rel=0&amp;feature=player_embedded" width="640"></iframe></div>While initially the grasps are executed at random and succeed only rarely, each day the latest experiences are used to train a deep <a href="https://en.wikipedia.org/wiki/Convolutional_neural_network">convolutional neural network</a> (CNN) to learn to predict the outcome of a grasp, given a camera image and a potential motor command. 
This CNN is then deployed on the robots the following day, in the inner loop of a servoing mechanism that continually adjusts the robot’s motion to maximize the predicted chance of a successful grasp. In essence, the robot is constantly predicting, by observing the motion of its own hand, which kind of subsequent motion will maximize its chances of success. The result is continuous feedback: what we might call hand-eye coordination. Observing the behavior of the robot after over 800,000 grasp attempts, which is equivalent to about 3000 robot-hours of practice, we can see the beginnings of intelligent reactive behaviors. The robot observes its own gripper and corrects its motions in real time. It also exhibits interesting pre-grasp behaviors, like isolating a single object from a group. All of these behaviors emerged naturally from learning, rather than being programmed into the system.<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/l8zKZLqkfII/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/l8zKZLqkfII?rel=0&amp;feature=player_embedded" width="640"></iframe></div>To evaluate whether the system achieves measurable benefit from continuous feedback, we can compare its performance to an open-loop baseline that more closely resembles the perception-planning-action loop described previously, albeit with both the open-loop grasps and the closed-loop servoing determined by a learned CNN trained on the same data. This approach is most similar to <a href="http://arxiv.org/abs/1509.06825">recent work by Pinto and Gupta</a>. With open-loop grasp selection, the robot chooses a single grasp pose from a single image, and then blindly executes this grasp. 
This method has a 34% average failure rate on the first 30 picking attempts for this set of office objects:<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/Q9tDHuidzak/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/Q9tDHuidzak?rel=0&amp;feature=player_embedded" width="640"></iframe></div>Incorporating continuous feedback into the system reduces the failures by nearly half, down to 18% from 34%, and produces interesting corrections and adjustments:<br /><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/H4V6NZLNu-c/0.jpg" frameborder="0" height="360" src="https://www.youtube.com/embed/H4V6NZLNu-c?rel=0&amp;feature=player_embedded" width="640"></iframe></div>Neural networks have made great strides in allowing us to build computer programs that can process images, speech, text, and even draw pictures. However, introducing actions and control adds considerable new challenges, since every decision the network makes will affect what it sees next. Overcoming these challenges will bring us closer to building systems that understand the effects of their actions in the world. If we can bring the power of large-scale machine learning to robotic control, perhaps we will come one step closer to solving fundamental problems in robotics and automation.<br /><br /><i>The research on robotic hand-eye coordination and grasping was conducted by Sergey Levine, Peter Pastor, Alex Krizhevsky, and Deirdre Quillen, with special thanks to colleagues at Google Research and X who've contributed their expertise and time to this research. An early preprint is <a href="http://arxiv.org/abs/1603.02199">available on arXiv</a></i>.]]></content:encoded>
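The closed-loop servoing the post describes (a learned model scores candidate motor commands by predicted grasp success, and the best-scoring small command is executed, over and over) can be illustrated with a toy simulation. The predictor below is a hypothetical analytic stand-in for the grasp-prediction CNN, not the trained network:

```python
import numpy as np

rng = np.random.default_rng(1)

# Ground truth the controller never sees directly: the object's 2-D position.
object_pos = np.array([0.7, -0.2])

def predict_success(gripper_pos, command):
    # Stand-in for the grasp-prediction CNN g(image, motor command): here the
    # predicted success simply decays with the resulting distance to the object.
    return np.exp(-np.linalg.norm(gripper_pos + command - object_pos))

def servo_step(gripper_pos, n_candidates=64, max_step=0.1):
    # Inner loop of the servoing mechanism: sample small candidate motor
    # commands, score each with the model, and execute the best-scoring one.
    candidates = rng.uniform(-max_step, max_step, size=(n_candidates, 2))
    scores = [predict_success(gripper_pos, c) for c in candidates]
    return gripper_pos + candidates[int(np.argmax(scores))]

gripper = np.array([0.0, 0.0])
start_error = float(np.linalg.norm(gripper - object_pos))
for _ in range(30):
    gripper = servo_step(gripper)  # continuous feedback, one step at a time
final_error = float(np.linalg.norm(gripper - object_pos))
```

Because each small command is re-scored from the current state, the loop corrects its own errors along the way; this mirrors the hand-eye-coordination behavior described above, minus the learned perception.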
			<wfw:commentRss>https://googledata.org/google-research/deep-learning-for-robots-learning-from-large-scale-interaction/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>An Update on Fast Transit Routing with Transfer Patterns</title>
		<link>https://googledata.org/google-research/an-update-on-fast-transit-routing-with-transfer-patterns/</link>
		<comments>https://googledata.org/google-research/an-update-on-fast-transit-routing-with-transfer-patterns/#comments</comments>
		<pubDate>Wed, 02 Mar 2016 12:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=c2cd4c080483466b575a8f256d6f0c5a</guid>
		<description><![CDATA[<span>Arno Eigenwillig, Software Engineer on Google Maps Directions</span><br /><br />What is the best way to get from A to B by public transit? Google Maps is answering such queries for over 20,000 cities and towns in over 70 countries around the world, including large metro areas like New York, S&#227;o Paulo or Moscow, and some complete countries, such as Japan or Great Britain.<br /><div><a href="https://1.bp.blogspot.com/-GqQpmehJy0E/VtTDk2077HI/AAAAAAAAA7Y/ThiBUFbufMk/s1600/image00.png"><img border="0" height="368" src="https://1.bp.blogspot.com/-GqQpmehJy0E/VtTDk2077HI/AAAAAAAAA7Y/ThiBUFbufMk/s640/image00.png" width="640"></a></div>Since its <a href="https://googleblog.blogspot.ch/2005/12/public-transit-via-google.html">beginnings in 2005</a> with the single city of Portland, Oregon, the number of cities and countries served by Google&#8217;s public transit directions has been growing rapidly. With more and larger regions, the amount of data we need to search in order to provide optimal directions has grown as well. In 2010, the search speed of transit directions made a leap ahead of that growth and became fast enough to update the result <a href="http://google-latlong.blogspot.ch/2010/03/planning-your-public-transit-ride.html">while you drag the endpoints</a>. The technique behind that speed-up is the Transfer Patterns algorithm [1], which was created at Google&#8217;s engineering office in Zurich, Switzerland, by visiting researcher Hannah Bast and a number of Google engineers.<br /><br />I am happy to report that this research collaboration has continued and expanded with the <a href="http://research.google.com/research-outreach.html#/research-outreach/faculty-engagement/focused-research-awards">Google Focused Research Award</a> on <a href="http://ad.informatik.uni-freiburg.de/projects/google-focused-research-award-next-generation-route-planning">Next-Generation Route Planning</a>. 
Over the past three years, this grant has supported <a href="http://ad.informatik.uni-freiburg.de/staff/bast">Hannah Bast</a>&#8217;s research group at the <a href="http://www.uni-freiburg.de/">University of Freiburg</a>, as well as the research groups of <a href="http://algo2.iti.kit.edu/sanders.php">Peter Sanders</a> and <a href="http://i11www.iti.uni-karlsruhe.de/en/members/dorothea_wagner/index">Dorothea Wagner</a> at the <a href="http://www.kit.edu/index.php">Karlsruhe Institute of Technology</a> (KIT).<br /><br />From the project&#8217;s numerous outcomes, I&#8217;d like to highlight two recent ones that re-examine the Transfer Patterns approach and massively improve it for continent-sized networks: <a href="http://ad-publications.informatik.uni-freiburg.de/ALENEX_scalable_tp_BHS_2016.pdf"><b>Scalable Transfer Patterns</b></a> [2] and <a href="http://ad-publications.informatik.uni-freiburg.de/SIGSPATIAL_frequency_BS_2014.pdf"><b>Frequency-Based Search for Public Transit</b></a> [3] by Hannah Bast, <a href="http://ad.informatik.uni-freiburg.de/staff/storandt">Sabine Storandt</a> and Matthias Hertel. This blog post presents the results from these publications.<br /><br />The notion of a <i>transfer pattern</i> is easy to understand. Suppose you are at a transit stop downtown, call it A, and want to go to some stop B as quickly as possible. Suppose further you brought a printed schedule book but no smartphone. (This sounded plausible only a few years ago!) As a local, you might know that there are only two reasonable options:<br /><ol><li>Take a tram from A to C, then transfer at C to a bus to B.</li><li>Take the direct bus from A to B, which only runs infrequently.</li></ol>We say the first option has transfer pattern A-C-B, and the second option has transfer pattern A-B. Notice that no in-between stops are mentioned. 
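To make the lookup concrete, here is a minimal Python sketch (with made-up stops and departure times, not real schedule data or Google's implementation) of how the two patterns above restrict the schedule search to a handful of direct connections:

```python
from typing import Dict, List, Tuple

# Hypothetical timetable: (departure, arrival) pairs per direct connection,
# in minutes after midnight. Illustrative data only.
timetable: Dict[Tuple[str, str], List[Tuple[int, int]]] = {
    ("A", "C"): [(600, 610), (615, 625)],   # tram A -> C
    ("C", "B"): [(612, 630), (640, 658)],   # bus  C -> B
    ("A", "B"): [(700, 730)],               # infrequent direct bus
}

# Precomputed transfer patterns for the pair (A, B): every optimal trip
# follows one of these stop sequences, so all other lines can be ignored.
patterns = [["A", "C", "B"], ["A", "B"]]

def earliest_arrival(dep: int) -> float:
    """Best arrival time at B when leaving A no earlier than `dep`,
    scanning only the direct connections along each transfer pattern."""
    best = float("inf")
    for pattern in patterns:
        t, feasible = dep, True
        for leg in zip(pattern, pattern[1:]):
            # earliest arrival on this leg, departing at or after time t
            arrivals = [a for d, a in timetable.get(leg, []) if d >= t]
            if not arrivals:
                feasible = False
                break
            t = min(arrivals)
        if feasible:
            best = min(best, t)
    return best

print(earliest_arrival(600))  # via A-C-B: arrives at 630
```

Leaving at 10:00 (600), the A-C-B pattern wins (arrival 630); once the last tram to C is gone, the lookup falls back to the infrequent direct bus.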
This is very compact information, much less than the actual schedules, but it makes looking up the schedules significantly faster: Knowing that all optimal trips follow one of these patterns, you only need to look at those lines in the schedule book that provide direct connections from A to C, C to B and A to B. All other lines can safely be ignored: you know you will not miss a better option.<br /><br />While the basic idea of transfer patterns is indeed that simple, it takes more to make it work in practice. The transfer patterns of all optimal trips have to be computed ahead of time and stored, so that they are available to answer queries. Conceptually, we need transfer patterns for every pair of stops, because any pair could come up in a query. It is perfectly reasonable to compute them for all pairs within one city, or even one metro area that is densely criss-crossed by a transit network comprising, say, a thousand stops, yielding a million pairs to consider.<br /><br />As the scale of the problem increases from one metro area to an entire country or continent, this &#8220;all pairs&#8221; approach rapidly becomes expensive: ten thousand stops (10x more than above) already yield a hundred million pairs (100x more than above), and so on. Also, the individual transfer patterns become quite repetitive: For example, from any stop in Paris, France to any stop in Cologne, Germany, all optimal connections end up using the same few long-distance train lines in the middle; only the local connections to the railway stations depend on the specific pair of stops considered.<br /><br />However, designated long-distance connections are not the only way to travel between different local networks &#8211; they also overlap and connect to each other. 
For mid-range trips, there is no universally correct rule when to choose a long-distance train or intercity bus, short of actually comparing options with local or regional transit, too.<br /><br />The Scalable Transfer Patterns algorithm [2] does just that, but in a smart way. For starters, it uses what is known as <i>graph clustering</i> to cut the network into pieces, called <i>clusters</i>, that have a lot of connections inside but relatively few to the outside. As an example, the figure below (kindly provided by the authors) shows a partitioning of Germany into clusters. The stops highlighted in red are <i>border stops</i>: They connect directly to stops outside the cluster. Notice how they are a small fraction of the total network.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://2.bp.blogspot.com/-DAUmD3ohHlQ/VtTD6bC6DXI/AAAAAAAAA7s/KEa6gBXkEho/s1600/image02.png"><img border="0" src="https://2.bp.blogspot.com/-DAUmD3ohHlQ/VtTD6bC6DXI/AAAAAAAAA7s/KEa6gBXkEho/s1600/image02.png"></a></td></tr><tr><td>The public transit network of Germany (dots and lines), split into clusters (shown in various colors). Of all 251,763 stops, only 10,886 (4.32%) are boundary stops, highlighted as red boxes. <a href="http://ad-publications.informatik.uni-freiburg.de/ALENEX_scalable_tp_BHS_2016.materials/germany.png">Click here</a> to view the full resolution image. [source: S. Storandt, 2016]</td></tr></tbody></table><div></div><div></div>Based on the clustering, the transfer patterns of all optimal connections are computed in two steps.<br /><br /><b>In step 1</b>, transfer patterns are computed for optimal connections inside each cluster. 
They are stored for query processing later on, but they also accelerate the search <i>through</i> a cluster in the following step: between the stops on its border, we only need to consider the connections captured in the transfer patterns.<br /><br />The next figure sketches how the transit network in the cluster around Berlin gets reduced to much fewer connections between border stations. (The central station stands out as a hub, as expected. It is a border station itself, because it has direct connections out of the cluster.)<br /><div></div><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/--Z-F6ZPlhFs/VtTDuXtv0bI/AAAAAAAAA7s/XKK1Etk98Zo/s1600/image01.png"><img border="0" src="https://4.bp.blogspot.com/--Z-F6ZPlhFs/VtTDuXtv0bI/AAAAAAAAA7s/XKK1Etk98Zo/s1600/image01.png"></a></td></tr><tr><td>The cluster of public transit connections around Berlin (shown as dots and lines in light blue), its border stops (highlighted as red boxes), and the transfer patterns of optimal connections between border stops (thick black lines; only the most important 111 of 592 are shown to keep the image legible). This cuts out 96.15% of the network (especially a lot of the high-frequency inner city trips, which makes the time savings even bigger). <a href="http://ad-publications.informatik.uni-freiburg.de/ALENEX_scalable_tp_BHS_2016.materials/berlin.png">Click here</a> to view the full resolution image. [source: S. Storandt, 2016]</td></tr></tbody></table><b>In step 2</b>, transfer patterns can be computed for the entire network, that is, between any pair of clusters. 
This is done with the following twists:<br /><br /><ul><li>It suffices to consider trips from and to boundary stops of any cluster; the local transfer patterns from step 1 will supply the missing pieces later on.</li><li>The per-cluster transfer patterns from step 1 greatly accelerate the search across other clusters.</li><li>The search stops exploring any possible connection between two boundary stops as soon as it gets worse than a connection that sticks to long-distance transit between clusters (which may not always be optimal, but is always quick to compute).</li></ul><br />The results of steps 1 and 2 are stored and used to answer queries. For any given query from some A to some B, one can now easily stitch together a network of transfer patterns that covers all optimal connections from A to B. Looking up the direct connections on that small network (like in the introductory example) and finding the best one for the queried time is very fast, even if A and B are far away.<br /><br />The total storage space needed for this is much smaller than the space that would be needed for all pairs of stops, and the savings grow as the network gets larger. Extrapolating from their experiments, the researchers estimate [2] that Scalable Transfer Patterns for the whole world could be stored in 30 GB, cutting their estimate for the basic Transfer Patterns by a factor of a thousand(!). This is considerably more powerful than the &#8220;hub station&#8221; idea from the original Transfer Patterns paper [1].<br /><br />The time needed to compute Scalable Transfer Patterns is also estimated to shrink by three orders of magnitude: At a high level, the earlier phases of the algorithm accelerate the later ones, as described above. At a low level, a second optimization technique kicks in: exploiting the repetitiveness of schedules in time. 
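The query-time stitching described above boils down to taking the union of pattern edges from the precomputed pieces; here is an illustrative Python sketch (stop names and patterns are invented for illustration, not the production system):

```python
# Hypothetical precomputed patterns (step 1 and step 2), for a query A -> B:
#   - A to the border stops of A's cluster (step 1),
#   - border-to-border patterns between clusters (step 2),
#   - border stops of B's cluster to B (step 1).
local_src = [["A", "x1"], ["A", "x2"]]        # inside A's cluster
inter     = [["x1", "y1"], ["x2", "y2"]]      # across clusters
local_dst = [["y1", "B"], ["y2", "z", "B"]]   # inside B's cluster

def stitch(patterns_lists):
    """Union the direct-connection edges of all patterns into one small
    query graph that covers every optimal A-to-B connection."""
    edges = set()
    for patterns in patterns_lists:
        for p in patterns:
            edges.update(zip(p, p[1:]))       # consecutive stop pairs
    return edges

query_graph = stitch([local_src, inter, local_dst])
# Only these few edges need timetable lookups at query time,
# no matter how large the full network is.
print(sorted(query_graph))
```

The best connection for the queried departure time is then found on this tiny stitched graph, exactly as in the schedule-book example at the start.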
Recall that finding transfer patterns is all about finding the optimal connections between pairs of stops <i>at any possible departure time</i>.<br /><br />Frequency-based schedules (e.g., one bus every 10 minutes) cause a lot of similarity during the day, although it often doesn&#8217;t match up between lines (e.g., said bus runs every 10 minutes before 6pm and every 20 minutes after, and we seek connections to a train that runs every 12 minutes before 8pm and every 30 minutes after). Moreover, this similarity also exists from one day to the next, and we need to consider all permissible departure dates. <br /><br />The Frequency-Based Search for Public Transit [3] is carefully designed to find and take advantage of repetitive schedules while representing all one-off cases exactly. Compared to the set-up from the original Transfer Patterns paper [1], the authors estimate a whopping 60x acceleration of finding transfer patterns from this part alone.<br /><br />I am excited to see that the scalability questions originally left open by [1] have been answered so convincingly as part of this Focused Research Award. Please see the <a href="http://ad.informatik.uni-freiburg.de/projects/google-focused-research-award-next-generation-route-planning#publications">list of publications</a> on the <a href="http://ad.informatik.uni-freiburg.de/projects/google-focused-research-award-next-generation-route-planning">project&#8217;s website</a> for more outcomes of this award. Besides more on transfer patterns, they contain a wealth of other results about routing on road networks, transit networks, and with combinations of travel modes.<br /><br />References:<br /><br />[1] <a href="http://ad-publications.informatik.uni-freiburg.de/ESA_transferpatterns_BCEGHRV_2010.pdf">Fast Routing in Very Large Public Transportation Networks Using Transfer Patterns</a><br />by H. Bast, E. Carlsson, A. Eigenwillig, R. Geisberger, C. Harrelson, V. Raychev and F. Viger<br />(ESA 2010). 
[<a href="http://dx.doi.org/10.1007/978-3-642-15775-2_25">doi</a>]<br /><br />[2] <a href="http://ad-publications.informatik.uni-freiburg.de/ALENEX_scalable_tp_BHS_2016.pdf">Scalable Transfer Patterns</a><br />by H. Bast, M. Hertel and S. Storandt (ALENEX 2016). [<a href="http://dx.doi.org/10.1137/1.9781611974317.2">doi</a>]<br /><br />[3] <a href="http://ad-publications.informatik.uni-freiburg.de/SIGSPATIAL_frequency_BS_2014.pdf">Frequency-based Search for Public Transit</a><br />by H. Bast and S. Storandt (SIGSPATIAL 2014). [<a href="http://doi.acm.org/10.1145/2666310.2666405">doi</a>]]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Arno Eigenwillig, Software Engineer on Google Maps Directions</span><br /><br />What is the best way to get from A to B by public transit? Google Maps is answering such queries for over 20,000 cities and towns in over 70 countries around the world, including large metro areas like New York, São Paulo or Moscow, and some complete countries, such as Japan or Great Britain.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-GqQpmehJy0E/VtTDk2077HI/AAAAAAAAA7Y/ThiBUFbufMk/s1600/image00.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="368" src="https://1.bp.blogspot.com/-GqQpmehJy0E/VtTDk2077HI/AAAAAAAAA7Y/ThiBUFbufMk/s640/image00.png" width="640" /></a></div>Since its <a href="https://googleblog.blogspot.ch/2005/12/public-transit-via-google.html">beginnings in 2005</a> with the single city of Portland, Oregon, the number of cities and countries served by Google’s public transit directions has been growing rapidly. With more and larger regions, the amount of data we need to search in order to provide optimal directions has grown as well. In 2010, the search speed of transit directions made a leap ahead of that growth and became fast enough to update the result <a href="http://google-latlong.blogspot.ch/2010/03/planning-your-public-transit-ride.html">while you drag the endpoints</a>. 
The technique behind that speed-up is the Transfer Patterns algorithm [1], which was created at Google’s engineering office in Zurich, Switzerland, by visiting researcher Hannah Bast and a number of Google engineers.<br /><br />I am happy to report that this research collaboration has continued and expanded with the <a href="http://research.google.com/research-outreach.html#/research-outreach/faculty-engagement/focused-research-awards">Google Focused Research Award</a> on <a href="http://ad.informatik.uni-freiburg.de/projects/google-focused-research-award-next-generation-route-planning">Next-Generation Route Planning</a>. Over the past three years, this grant has supported <a href="http://ad.informatik.uni-freiburg.de/staff/bast">Hannah Bast</a>’s research group at the <a href="http://www.uni-freiburg.de/">University of Freiburg</a>, as well as the research groups of <a href="http://algo2.iti.kit.edu/sanders.php">Peter Sanders</a> and <a href="http://i11www.iti.uni-karlsruhe.de/en/members/dorothea_wagner/index">Dorothea Wagner</a> at the <a href="http://www.kit.edu/index.php">Karlsruhe Institute of Technology</a> (KIT).<br /><br />From the project’s numerous outcomes, I’d like to highlight two recent ones that re-examine the Transfer Patterns approach and massively improve it for continent-sized networks: <a href="http://ad-publications.informatik.uni-freiburg.de/ALENEX_scalable_tp_BHS_2016.pdf"><b>Scalable Transfer Patterns</b></a> [2] and <a href="http://ad-publications.informatik.uni-freiburg.de/SIGSPATIAL_frequency_BS_2014.pdf"><b>Frequency-Based Search for Public Transit</b></a> [3] by Hannah Bast, <a href="http://ad.informatik.uni-freiburg.de/staff/storandt">Sabine Storandt</a> and Matthias Hertel. This blog post presents the results from these publications.<br /><br />The notion of a <i>transfer pattern</i> is easy to understand. Suppose you are at a transit stop downtown, call it A, and want to go to some stop B as quickly as possible. 
Suppose further you brought a printed schedule book but no smartphone. (This sounded plausible only a few years ago!) As a local, you might know that there are only two reasonable options:<br /><ol><li>Take a tram from A to C, then transfer at C to a bus to B.</li><li>Take the direct bus from A to B, which only runs infrequently.</li></ol>We say the first option has transfer pattern A-C-B, and the second option has transfer pattern A-B. Notice that no in-between stops are mentioned. This is very compact information, much less than the actual schedules, but it makes looking up the schedules significantly faster: Knowing that all optimal trips follow one of these patterns, you only need to look at those lines in the schedule book that provide direct connections from A to C, C to B and A to B. All other lines can safely be ignored: you know you will not miss a better option.<br /><br />While the basic idea of transfer patterns is indeed that simple, it takes more to make it work in practice. The transfer patterns of all optimal trips have to be computed ahead of time and stored, so that they are available to answer queries. Conceptually, we need transfer patterns for every pair of stops, because any pair could come up in a query. It is perfectly reasonable to compute them for all pairs within one city, or even one metro area that is densely criss-crossed by a transit network comprising, say, a thousand stops, yielding a million pairs to consider.<br /><br />As the scale of the problem increases from one metro area to an entire country or continent, this “all pairs” approach rapidly becomes expensive: ten thousand stops (10x more than above) already yield a hundred million pairs (100x more than above), and so on. 
Also, the individual transfer patterns become quite repetitive: For example, from any stop in Paris, France to any stop in Cologne, Germany, all optimal connections end up using the same few long-distance train lines in the middle; only the local connections to the railway stations depend on the specific pair of stops considered.<br /><br />However, designated long-distance connections are not the only way to travel between different local networks – they also overlap and connect to each other. For mid-range trips, there is no universally correct rule when to choose a long-distance train or intercity bus, short of actually comparing options with local or regional transit, too.<br /><br />The Scalable Transfer Patterns algorithm [2] does just that, but in a smart way. For starters, it uses what is known as <i>graph clustering</i> to cut the network into pieces, called <i>clusters</i>, that have a lot of connections inside but relatively few to the outside. As an example, the figure below (kindly provided by the authors) shows a partitioning of Germany into clusters. The stops highlighted in red are <i>border stops</i>: They connect directly to stops outside the cluster. Notice how they are a small fraction of the total network.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-DAUmD3ohHlQ/VtTD6bC6DXI/AAAAAAAAA7s/KEa6gBXkEho/s1600/image02.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://2.bp.blogspot.com/-DAUmD3ohHlQ/VtTD6bC6DXI/AAAAAAAAA7s/KEa6gBXkEho/s1600/image02.png" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The public transit network of Germany (dots and lines), split into clusters (shown in various colors). Of all 251,763 stops, only 10,886 (4.32%) are boundary stops, highlighted as red boxes. 
<a href="http://ad-publications.informatik.uni-freiburg.de/ALENEX_scalable_tp_BHS_2016.materials/germany.png">Click here</a> to view the full resolution image. [source: S. Storandt, 2016]</td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"></div>Based on the clustering, the transfer patterns of all optimal connections are computed in two steps.<br /><br /><b>In step 1</b>, transfer patterns are computed for optimal connections inside each cluster. They are stored for query processing later on, but they also accelerate the search <i>through</i> a cluster in the following step: between the stops on its border, we only need to consider the connections captured in the transfer patterns.<br /><br />The next figure sketches how the transit network in the cluster around Berlin gets reduced to much fewer connections between border stations. (The central station stands out as a hub, as expected. 
It is a border station itself, because it has direct connections out of the cluster.)<br /><div class="separator" style="clear: both; text-align: center;"></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/--Z-F6ZPlhFs/VtTDuXtv0bI/AAAAAAAAA7s/XKK1Etk98Zo/s1600/image01.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://4.bp.blogspot.com/--Z-F6ZPlhFs/VtTDuXtv0bI/AAAAAAAAA7s/XKK1Etk98Zo/s1600/image01.png" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The cluster of public transit connections around Berlin (shown as dots and lines in light blue), its border stops (highlighted as red boxes), and the transfer patterns of optimal connections between border stops (thick black lines; only the most important 111 of 592 are shown to keep the image legible). This cuts out 96.15% of the network (especially a lot of the high-frequency inner city trips, which makes the time savings even bigger). <a href="http://ad-publications.informatik.uni-freiburg.de/ALENEX_scalable_tp_BHS_2016.materials/berlin.png">Click here</a> to view the full resolution image. [source: S. Storandt, 2016]</td></tr></tbody></table><b>In step 2</b>, transfer patterns can be computed for the entire network, that is, between any pair of clusters. 
This is done with the following twists:<br /><br /><ul><li>It suffices to consider trips from and to boundary stops of any cluster; the local transfer patterns from step 1 will supply the missing pieces later on.</li><li>The per-cluster transfer patterns from step 1 greatly accelerate the search across other clusters.</li><li>The search stops exploring any possible connection between two boundary stops as soon as it gets worse than a connection that sticks to long-distance transit between clusters (which may not always be optimal, but is always quick to compute).</li></ul><br />The results of steps 1 and 2 are stored and used to answer queries. For any given query from some A to some B, one can now easily stitch together a network of transfer patterns that covers all optimal connections from A to B. Looking up the direct connections on that small network (like in the introductory example) and finding the best one for the queried time is very fast, even if A and B are far away.<br /><br />The total storage space needed for this is much smaller than the space that would be needed for all pairs of stops, and the savings grow as the network gets larger. Extrapolating from their experiments, the researchers estimate [2] that Scalable Transfer Patterns for the whole world could be stored in 30 GB, cutting their estimate for the basic Transfer Patterns by a factor of a thousand(!). This is considerably more powerful than the “hub station” idea from the original Transfer Patterns paper [1].<br /><br />The time needed to compute Scalable Transfer Patterns is also estimated to shrink by three orders of magnitude: At a high level, the earlier phases of the algorithm accelerate the later ones, as described above. At a low level, a second optimization technique kicks in: exploiting the repetitiveness of schedules in time. 
Recall that finding transfer patterns is all about finding the optimal connections between pairs of stops <i>at any possible departure time</i>.<br /><br />Frequency-based schedules (e.g., one bus every 10 minutes) cause a lot of similarity during the day, although it often doesn’t match up between lines (e.g., said bus runs every 10 minutes before 6pm and every 20 minutes after, and we seek connections to a train that runs every 12 minutes before 8pm and every 30 minutes after). Moreover, this similarity also exists from one day to the next, and we need to consider all permissible departure dates. <br /><br />The Frequency-Based Search for Public Transit [3] is carefully designed to find and take advantage of repetitive schedules while representing all one-off cases exactly. Compared to the set-up from the original Transfer Patterns paper [1], the authors estimate a whopping 60x acceleration of finding transfer patterns from this part alone.<br /><br />I am excited to see that the scalability questions originally left open by [1] have been answered so convincingly as part of this Focused Research Award. Please see the <a href="http://ad.informatik.uni-freiburg.de/projects/google-focused-research-award-next-generation-route-planning#publications">list of publications</a> on the <a href="http://ad.informatik.uni-freiburg.de/projects/google-focused-research-award-next-generation-route-planning">project’s website</a> for more outcomes of this award. Besides more on transfer patterns, they contain a wealth of other results about routing on road networks, transit networks, and with combinations of travel modes.<br /><br />References:<br /><br />[1] <a href="http://ad-publications.informatik.uni-freiburg.de/ESA_transferpatterns_BCEGHRV_2010.pdf">Fast Routing in Very Large Public Transportation Networks Using Transfer Patterns</a><br />by H. Bast, E. Carlsson, A. Eigenwillig, R. Geisberger, C. Harrelson, V. Raychev and F. Viger<br />(ESA 2010). 
[<a href="http://dx.doi.org/10.1007/978-3-642-15775-2_25">doi</a>]<br /><br />[2] <a href="http://ad-publications.informatik.uni-freiburg.de/ALENEX_scalable_tp_BHS_2016.pdf">Scalable Transfer Patterns</a><br />by H. Bast, M. Hertel and S. Storandt (ALENEX 2016). [<a href="http://dx.doi.org/10.1137/1.9781611974317.2">doi</a>]<br /><br />[3] <a href="http://ad-publications.informatik.uni-freiburg.de/SIGSPATIAL_frequency_BS_2014.pdf">Frequency-based Search for Public Transit</a><br />by H. Bast and S. Storandt (SIGSPATIAL 2014). [<a href="http://doi.acm.org/10.1145/2666310.2666405">doi</a>]]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/an-update-on-fast-transit-routing-with-transfer-patterns/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>And the winner of the $1 Million Little Box Challenge is…CE+T Power’s Red Electrical Devils</title>
		<link>https://googledata.org/google-research/and-the-winner-of-the-1-million-little-box-challenge-iscet-powers-red-electrical-devils/</link>
		<comments>https://googledata.org/google-research/and-the-winner-of-the-1-million-little-box-challenge-iscet-powers-red-electrical-devils/#comments</comments>
		<pubDate>Mon, 29 Feb 2016 21:00:00 +0000</pubDate>
		<dc:creator><![CDATA[Research Blog]]></dc:creator>
				<category><![CDATA[Google Research]]></category>

		<guid isPermaLink="false">https://googledata.org/?guid=9f62af3c6f982fe2de58883b2c24ce31</guid>
		<description><![CDATA[<span>Posted by Ross Koningstein, Engineering Director Emeritus, Google Research</span><br /><br />In July 2014, Google and the <a href="https://www.ieee.org/index.html">IEEE</a> launched the $1 Million <a href="https://www.littleboxchallenge.com/">Little Box Challenge</a>, an open competition to design and build a small kW-scale inverter with a power density greater than 50 Watts per cubic inch while meeting a number of other specifications related to efficiency, electrical noise and thermal performance. Over 2,000 teams from across the world registered for the competition and more than 80 proposals qualified for review by <a href="http://www.ieee-pels.org/">IEEE Power Electronics Society</a> and Google. In October 2015, <a href="http://googlegreenblog.blogspot.com/2015/10/finalists-announced-for-little-box.html">18 finalists were selected</a> to bring their inverters to the <a href="http://www.nrel.gov/">National Renewable Energy Laboratory</a> (NREL) for testing.<br /><div><a href="https://3.bp.blogspot.com/-yjgE6hFTse0/VtSJM6tTBNI/AAAAAAAAA6c/WSjvxHCMZd4/s1600/Screen%2BShot%2B2016-02-26%2Bat%2B10.43.25%2BAM.png"><img border="0" height="400" src="https://3.bp.blogspot.com/-yjgE6hFTse0/VtSJM6tTBNI/AAAAAAAAA6c/WSjvxHCMZd4/s640/Screen%2BShot%2B2016-02-26%2Bat%2B10.43.25%2BAM.png" width="640"></a></div>Today, Google and the IEEE are proud to announce that the grand prize winner of the $1 Million Little Box Challenge is <a href="http://www.cet-power.com/">CE+T Power</a>&#8217;s Red Electrical Devils. The Red Electrical Devils (named after <a href="http://www.belgianfootball.be/en/red-devils">Belgium&#8217;s national soccer team</a>) were declared the winner by a consensus of judges from Google, IEEE Power Electronics Society and NREL. 
Honorable mentions go to teams from <a href="http://www.schneider-electric.com/ww/en/">Schneider Electric</a> and <a href="http://www.feec.ece.vt.edu/">Virginia Tech&#8217;s Future Energy Electronics Center</a>.<br /><table align="center" cellpadding="0" cellspacing="0"><tbody><tr><td><a href="https://4.bp.blogspot.com/-SdLdMQxrU2o/VtSw0luEJYI/AAAAAAAAA60/r5fAz6C4Pac/s1600/image1.JPG"><img border="0" height="480" src="https://4.bp.blogspot.com/-SdLdMQxrU2o/VtSw0luEJYI/AAAAAAAAA60/r5fAz6C4Pac/s640/image1.JPG" width="640"></a></td></tr><tr><td>CE+T Power&#8217;s Red Electrical Devils receive $1 Million Little Box Challenge Prize</td></tr></tbody></table>Schneider, Virginia Tech and The Red Electrical Devils all built 2kW inverters that passed <a href="http://www.nrel.gov/news/press/2016/23654">100 hours of testing at NREL</a>, adhered to the technical specifications of the competition, and were recognized today in a ceremony at the <a href="http://www.arpae-summit.com/">ARPA-E Energy Innovation Summit</a> in Washington, DC. Among the three finalists, the Red Electrical Devils&#8217; inverter had the highest power density and smallest volume.<br /><div></div><div><a href="https://1.bp.blogspot.com/-_LrbTM5mjmE/VtS0PduuYDI/AAAAAAAAA7A/ovGYrmf8TOU/s1600/correct_image.png"><img border="0" height="168" src="https://1.bp.blogspot.com/-_LrbTM5mjmE/VtS0PduuYDI/AAAAAAAAA7A/ovGYrmf8TOU/s640/correct_image.png" width="640"></a></div><br />Impressively, the winning team exceeded the power density goal for the competition by a factor of 3, <i>which is more than 10 times more compact than commercially available inverters</i>! When we initially brainstormed technical targets for the Little Box Challenge, some of us at Google didn&#8217;t think such audacious goals could be achieved. 
Three teams from around the world proved decisively that it could be done.<br /><br /><b>Our takeaway: Establish a worthy goal and smart people will exceed it!</b><br /><br />Congratulations again to CE+T Power&#8217;s Red Electrical Devils, Schneider Electric and Virginia Tech&#8217;s Future Energy Electronics Center, and sincere thanks to our collaborators at IEEE and NREL. The finalists&#8217; technical approach documents will be posted on the <a href="https://www.littleboxchallenge.com/">Little Box Challenge</a> website until December 31, 2017. We hope this helps advance the state of the art and innovation in kW-scale inverters.]]></description>
				<content:encoded><![CDATA[<span class="byline-author">Posted by Ross Koningstein, Engineering Director Emeritus, Google Research</span><br /><br />In July 2014, Google and the <a href="https://www.ieee.org/index.html">IEEE</a> launched the $1 Million <a href="https://www.littleboxchallenge.com/">Little Box Challenge</a>, an open competition to design and build a small kW-scale inverter with a power density greater than 50 Watts per cubic inch while meeting a number of other specifications related to efficiency, electrical noise and thermal performance. Over 2,000 teams from across the world registered for the competition and more than 80 proposals qualified for review by <a href="http://www.ieee-pels.org/">IEEE Power Electronics Society</a> and Google. In October 2015, <a href="http://googlegreenblog.blogspot.com/2015/10/finalists-announced-for-little-box.html">18 finalists were selected</a> to bring their inverters to the <a href="http://www.nrel.gov/">National Renewable Energy Laboratory</a> (NREL) for testing.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-yjgE6hFTse0/VtSJM6tTBNI/AAAAAAAAA6c/WSjvxHCMZd4/s1600/Screen%2BShot%2B2016-02-26%2Bat%2B10.43.25%2BAM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="https://3.bp.blogspot.com/-yjgE6hFTse0/VtSJM6tTBNI/AAAAAAAAA6c/WSjvxHCMZd4/s640/Screen%2BShot%2B2016-02-26%2Bat%2B10.43.25%2BAM.png" width="640" /></a></div>Today, Google and the IEEE are proud to announce that the grand prize winner of the $1 Million Little Box Challenge is <a href="http://www.cet-power.com/">CE+T Power</a>’s Red Electrical Devils. The Red Electrical Devils (named after <a href="http://www.belgianfootball.be/en/red-devils">Belgium’s national soccer team</a>) were declared the winner by a consensus of judges from Google, IEEE Power Electronics Society and NREL. 
Honorable mentions go to teams from <a href="http://www.schneider-electric.com/ww/en/">Schneider Electric</a> and <a href="http://www.feec.ece.vt.edu/">Virginia Tech’s Future Energy Electronics Center</a>.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-SdLdMQxrU2o/VtSw0luEJYI/AAAAAAAAA60/r5fAz6C4Pac/s1600/image1.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="480" src="https://4.bp.blogspot.com/-SdLdMQxrU2o/VtSw0luEJYI/AAAAAAAAA60/r5fAz6C4Pac/s640/image1.JPG" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CE+T Power’s Red Electrical Devils receive $1 Million Little Box Challenge Prize</td></tr></tbody></table>Schneider, Virginia Tech and The Red Electrical Devils all built 2kW inverters that passed <a href="http://www.nrel.gov/news/press/2016/23654">100 hours of testing at NREL</a>, adhered to the technical specifications of the competition, and were recognized today in a ceremony at the <a href="http://www.arpae-summit.com/">ARPA-E Energy Innovation Summit</a> in Washington, DC. 
Among the three finalists, the Red Electrical Devils’ inverter had the highest power density and smallest volume.<br /><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-_LrbTM5mjmE/VtS0PduuYDI/AAAAAAAAA7A/ovGYrmf8TOU/s1600/correct_image.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="168" src="https://1.bp.blogspot.com/-_LrbTM5mjmE/VtS0PduuYDI/AAAAAAAAA7A/ovGYrmf8TOU/s640/correct_image.png" width="640" /></a></div><br />Impressively, the winning team exceeded the power density goal for the competition by a factor of 3, <i>which is more than 10 times more compact than commercially available inverters</i>! When we initially brainstormed technical targets for the Little Box Challenge, some of us at Google didn’t think such audacious goals could be achieved. Three teams from around the world proved decisively that it could be done.<br /><br /><b>Our takeaway: Establish a worthy goal and smart people will exceed it!</b><br /><br />Congratulations again to CE+T Power’s Red Electrical Devils, Schneider Electric and Virginia Tech’s Future Energy Electronics Center, and sincere thanks to our collaborators at IEEE and NREL. The finalists’ technical approach documents will be posted on the <a href="https://www.littleboxchallenge.com/">Little Box Challenge</a> website until December 31, 2017. We hope this helps advance the state of the art and innovation in kW-scale inverters.]]></content:encoded>
			<wfw:commentRss>https://googledata.org/google-research/and-the-winner-of-the-1-million-little-box-challenge-iscet-powers-red-electrical-devils/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
