Google goes scraper with Hummingbird update
Google announced today their latest update to their search engine: Hummingbird. Google’s PR team calls the move “significant” and says it will impact 90% of search queries, which, if true, is an insane amount and a massive change.
Google on what will be changing with Hummingbird:
[Hummingbird will] give better answers to the increasingly complex questions posed by Web surfers.
After a look at the search results over the last month, it’s clear what Google is really saying is “We will now scrape more data from 3rd party websites so people never have to leave Google.com.” Once Google starts fully answering search queries, there is no reason for people to leave Google. With Google crawling and caching nearly every page on the web, they can display snippets and data from web pages that can directly answer informational search queries.
This is going to be frustrating for website owners and content producers who are spending countless hours to create content and information that people want to read online, only to have Google crawl their pages, grab the relevant information and display it on Google.com instead of sending visitors to their sites. What Hummingbird is really about is keeping people on Google properties longer.
I spent a few hours trying hundreds of different queries to demonstrate what I’m talking about. Here are a few that stood out, where Google is scraping others’ content and serving it up on their own pages.
Query: Tylenol PM
Observations: Google is adding in a bunch of scraped content from authoritative medicine and health sites along the right side. They also have prominent ad listings and a number of links to other Google searches. As with a lot of the Hummingbird SERPs, Google is utilizing the right side of the page to display scraped content and advertisements.
Query: How To Lose Weight
Observations: This is a massively popular search query that Google has drastically changed in the SERPs. Instead of displaying extra info on the right side, Google decided to use the most valuable real estate on the entire page to scrape information from authoritative sites (seeing a pattern yet?). This is really going to impact health sites that aren’t government or educational sites and that used to rank in positions 3-10, because they are pushed way down in the SERPs post-Hummingbird.
Query: Flight to San Francisco
Observations: Ads ads ads. Google has a huge affiliate Google Flight box right at the top of the SERPs. This has been seen on flight-related queries for a while but has become even more prominent in Hummingbird. A common theme: Google wants more money from transactional queries.
Query: Requiem For A Dream
Observations: Search most media titles (books, movies, etc.) and you’ll see the right side of the screen loaded with tons of scraped content: reviews, summaries, actors and more. There are also other images and graphics that go back to other Google properties. All of the cast and related media links are to other Google search queries. Google also added a call-to-action to “Watch Now” on Google Play. I’m sure users will really appreciate that addition.
Observations: MLB standings show up right at the top and take up the entire screen on smaller laptops and devices. There’s no reason to leave Google if you are looking to keep up to date with the scores. Not sure how MLB.com will feel about Google using their data and taking away traffic, but it’s for users!
I leave you with some words from Google themselves on affiliate sites with thin, scraped content:
Some webmasters use content taken (“scraped”) from other, more reputable sites on the assumption that increasing the volume of pages on their site is a good long-term strategy regardless of the relevance or uniqueness of that content. Purely scraped content, even from high-quality sources, may not provide any added value to your users without additional useful services or content provided by your site; it may also constitute copyright infringement in some cases. It’s worthwhile to take the time to create original content that sets your site apart. This will keep your visitors coming back and will provide more useful results for users searching on Google.
I wanted to add my thoughts on a couple of questions that people keep sending me. Feel free to shoot me an email if you have any questions.
1. Hasn’t Google been showing these types of search results for a while?
Yes, in some cases. The Knowledge Graph has been around for a while and Google has started using it in more and more queries. Hummingbird is meant to better understand what Google calls “conversation” queries and to better answer more complex queries. From what I’ve seen, in practical terms this means that Google is rolling out Knowledge Graph results to more and more queries.
Another example of this is the query “earth vs mars” that has been shared around today. Through the Knowledge Graph, Google is directly answering the query in a different way than they used to. Knowledge Graph results have definitely become more prominent in recent months, and Hummingbird is said to have rolled out over the course of the past few months.
Feel free to disagree, but Knowledge Graph results seem to be a lot more prominent now than they have ever been, and Google is specifically referencing the Knowledge Graph and semantic search as the two biggest areas of change.
2. The SERPs that you are showing look a lot different than mine.
Google SERPs seem to change all of the time depending on location, browser history, data center and other factors. I’ve been seeing similar results on a number of queries from different locations.