How We Tested Google Instant Pages
July 27th, 2011 | Published in Google Testing
By Jason Arbon and Tejas Shah
Google Instant Pages are a cool new way that Google speeds up your search experience. When Google thinks it knows which result you are likely to click, it preloads that page in the background, so when you click the page it renders instantly, saving the user about 5 seconds. 5 seconds is significant when you think of how many searches are performed each day--and especially when you consider that the rest of the search experience is optimized for sub-second performance.
The testing problem here is interesting. This feature requires client and server coordination, and since we are pre-loading and rendering the pages in an invisible background page, we wanted to make sure that nothing major was broken with the page rendering.
The original idea was for developers to test out a few pages as they went.But, this doesn’t scale to a large number of sites and is very expensive to repeat. Also, how do you know what the pages should look like? To write Selenium tests to functionally validate thousands of sites would take forever--the product would ship first. The solution was to perform automated test runs that load these pages from search results with Instant Pages turned on, and another run with Instant Pages turned off. The page renderings from each run were then compared.
How did we compare the two runs? How to compare pages when content and ads on web pages are constantly changing and we don't know what the expected behavior is? We could have used cached versions of these pages, but that wouldn’t be the realworld experience we were testing and would take time setting up, and the timing would have been different. We opted to leverage some other work that compares pages using the Document Object Model (DOM). We automatically scan each page, pixel by pixel, but look at what element is visible at the point on the page, not the color/RGB values. We then do a simple measure of how closely these pixel measurements match. These so-called "quality bots" generate a score of 0-100%, where 100% means all measurements were identical.
When we performed the runs, the vast majority (~95%) of all comparisons were almost identical, like we hoped. Where the pages where different we built a web page that showed the differences between the two pages by rendering both images and highlighting the difference. It was quick and easy for the developers to visually verify that the differences were only due to content or other non-structural differences in the rendering. Anytime test automation scales, is repeatable, quantified, and developers can validate the results without us is a good thing!
How did this testing get organized? As with many things in testing at Google, it came down to people chatting and realizing their work can be helpful for other engineers. This was bottom up, not top down. Tejas Shah was working on a general quality bot solution for compatibility (more on that in later posts) between Chrome and other browsers. He chatted with the Instant Pages developers when he was visiting their building and they agreed his bot might be able to help. He then spend the next couple of weeks pulling it all together and sharing the results with the team.
And now more applications of the quality bot are surfacing. What if we kept the browser version fixed, and only varied the version of the application? Could this help validate web applications independent of a functional spec and without custom validation script development and maintenance? Stay tuned...
Google Instant Pages are a cool new way that Google speeds up your search experience. When Google thinks it knows which result you are likely to click, it preloads that page in the background, so when you click the page it renders instantly, saving the user about 5 seconds. 5 seconds is significant when you think of how many searches are performed each day--and especially when you consider that the rest of the search experience is optimized for sub-second performance.
The testing problem here is interesting. This feature requires client and server coordination, and since we are pre-loading and rendering the pages in an invisible background page, we wanted to make sure that nothing major was broken with the page rendering.
The original idea was for developers to test out a few pages as they went.But, this doesn’t scale to a large number of sites and is very expensive to repeat. Also, how do you know what the pages should look like? To write Selenium tests to functionally validate thousands of sites would take forever--the product would ship first. The solution was to perform automated test runs that load these pages from search results with Instant Pages turned on, and another run with Instant Pages turned off. The page renderings from each run were then compared.
How did we compare the two runs? How to compare pages when content and ads on web pages are constantly changing and we don't know what the expected behavior is? We could have used cached versions of these pages, but that wouldn’t be the realworld experience we were testing and would take time setting up, and the timing would have been different. We opted to leverage some other work that compares pages using the Document Object Model (DOM). We automatically scan each page, pixel by pixel, but look at what element is visible at the point on the page, not the color/RGB values. We then do a simple measure of how closely these pixel measurements match. These so-called "quality bots" generate a score of 0-100%, where 100% means all measurements were identical.
When we performed the runs, the vast majority (~95%) of all comparisons were almost identical, like we hoped. Where the pages where different we built a web page that showed the differences between the two pages by rendering both images and highlighting the difference. It was quick and easy for the developers to visually verify that the differences were only due to content or other non-structural differences in the rendering. Anytime test automation scales, is repeatable, quantified, and developers can validate the results without us is a good thing!
How did this testing get organized? As with many things in testing at Google, it came down to people chatting and realizing their work can be helpful for other engineers. This was bottom up, not top down. Tejas Shah was working on a general quality bot solution for compatibility (more on that in later posts) between Chrome and other browsers. He chatted with the Instant Pages developers when he was visiting their building and they agreed his bot might be able to help. He then spend the next couple of weeks pulling it all together and sharing the results with the team.
And now more applications of the quality bot are surfacing. What if we kept the browser version fixed, and only varied the version of the application? Could this help validate web applications independent of a functional spec and without custom validation script development and maintenance? Stay tuned...