Pipeline Column

Due Jan. 14, 2011

Internet @ Schools Mar. / Apr. 2011 Issue

Trust, Credulity and Search Engines: Is Google Over?

by Stephen Abram

Some recent articles and blog postings are finally starting to challenge the Google search religion. I’ve provided a bunch of links for you in the webliography at the end of this column. The real issue is search engine spam and the serving up of questionable results. This is a problem for every kind of searcher: learners, educators, researchers and more. It runs the risk of ruining the usefulness of search engines, and of Google in particular. Several scenarios could play out:

  1. Google could become a massive dark hole of lousy content driven by the needs of advertisers, marketers and special interest groups. Users may or may not notice.
  2. New competitors could arrive and address these weaknesses and create market options for search that drive improvements across the board. Perhaps Bing or Blekko are already starting this.
  3. Search Engine Optimizers (SEO) could become regulated, or could self-regulate, to address the threats to their own interests.
  4. Content reputation management systems (like a Good Housekeeping Seal of Approval) that have been tried over the years may finally come alive.
  5. Recommendation systems that rely on the value of the recommender, your own social connections or respected groups, leaders, or professions, could influence the relevancy of search results. This shows some potential in recommendations tied to your own contacts in such environments as Facebook, LinkedIn, StumbleUpon, Digg, Quora, or even a renewed Delicious. Peer recommendations are already working better in music, movies and recreational reading than they are in the research and question & answer space.

Each of the above scenarios has some chance of occurring. Some are desirable goals, but most also run the risk of being double-edged swords. While you could get better answers under some scenarios, they come at a cost of narrowness, a dependence on group-sourcing answers and/or a reduction in innovative thought and serendipity. So, what to do?

I’d suggest that what is most important, in the near term, is to build skills in learners and researchers that guard against credulity about what’s behind the results they get from web search engines. To do this, we must add a greater dimension to the teaching of searching and information literacies. We must move beyond teaching raw searching skills and information retrieval, simple content quality evaluation, and the narrowly based media literacy training that aims to avoid the dangerous, prurient, and gambling aspects of the web. These skills are important, but more fundamental insights can be gained by understanding the business models behind search engines. Learners and researchers should know to ask themselves who or what chooses to promote each link on the pages of search results they are seeing. Are those links driven by simple mathematical relevancy or a search algorithm? Are special interest groups, political parties, individuals, lobbyists, or commercial advertising interests determining the results searchers are finding?

So, here are some insights into what we need to be teaching, in addition to all the good stuff we’re already doing now:

Content Farms

First, EVERY hour, one million spam pages of content are created. Spammers are out to harm users, steal publisher traffic, and defraud legitimate advertisers. A new search engine, Blekko, has created a spam clock to highlight this issue.

For starters, every searcher should know who creates spam pages, why, and how they influence search results. For instance, did you know that Yahoo! owns one of the largest content and article creation companies, one designed to drive traffic to advertisers? Can you name the other majors? This is an important issue. These so-called content farms are companies like Demand Media and Answers.com. Each creates thousands of pieces of content per day. This content may be correct, or it may not. On surface analysis it seems a bit shallow, but it serves as link bait to attract searchers to information that may be biased or lack perspective. For instance, it may be paid for by a single pharmaceutical company to drive people to review its drug therapy. It may come from a class action cohort attempting to build its numbers for a mesothelioma lawsuit. It might come from an appliance manufacturer attempting to influence your choice of freezer or stove brand. Demand Media and Answers.com are both now firmly inside the top 20 Web properties in the U.S., on a par with the likes of Apple and AOL. Surprised? Google alone makes $1 billion in profit every month or so. It is highly unlikely that any of that money is coming from the pockets of you or your students. The search engines are focused on serving the needs of their real customers, the advertisers, and have many tools and services at their disposal to delight those paying clients.

SEO: Search Engine Optimization

Search Engine Optimization (SEO) and its little brother, Social Media Optimization (SMO), are the big boys of influence in the world of changing search engine results. These techniques are used by any web property with any degree of sophistication, including library websites. There are white hat and black hat search optimizers. Usually for a fee, they work to ensure that your web presence (website, Facebook profile, Twitter feed, etc.) gets the traffic you desire. Sometimes their clients want to sell something, and other times they are promoting a point of view. There are well-known sites from racist organizations like Stormfront that promote their causes and points of view. This is an example of black hat optimization. White hat optimization is that undertaken by charities and commercial interests. Political parties, politicians and PACs have become expert in driving voters to their sites and editorials. In recent years these efforts have become very sophisticated, with the ability to geo-code SEO (GEO) and direct results at the electoral district, area code, zip code, or census tract level. I am told that you can purchase the ability to use localized SEO at the school and college campus level, since young targets are the sweet spot of advertisers.

Google is excellent at providing search results for the big who, what, where, and when questions. SEO, SMO and GEO play a key role in making those search results better. Who would want an answer to a search for <pizza> that listed just the best pizza sites but left out the ones they could actually use, along with a local coupon? The difficult questions, those that start with why and how, are more important, and they are the foundation of an education based on critical thinking. Most of the time we get delightful results because the questions are simple. So we get lured into a sense of comfort and trust, and we fail to notice that the results are heavily influenced when the questions are harder: health issues, politics, business decisions and more. Intelligent searchers will question their search results and dig deeper when the response is important to a decision they are making. We need to teach this deeply and scaffold those skills as learners age and their questions increase in difficulty, importance and impact.

Clutter, Spam, Relevancy

Google search results have become a spammed and cluttered mess. At this point it seems to be a game of whack-a-mole to build a search algorithm that senses spam sites and SEO content. The big engines are notoriously secretive about their algorithms, and that is understandable. They reportedly change them often, maybe even daily. Google has become a search religion, or a bad habit, and that’s dangerous to critical thinking, democracy and the learners and users we care about. Google may have outlived its usefulness, or it may overcome its current deficiencies and problems. At this point, though, the only ethical thing for educators to do is to train learners and researchers for the future and to encourage them to explore options beyond Google. In my opinion, Blekko, Exalead, and Bing are fine choices for a start. Any one of these can suffer from the same issues, so the skills apply broadly. And when you add the alternative models of library sources that do not depend on revenue from advertisers, you have a better toolkit and better skills as a searcher. A library’s licensed database resources and online catalogue results are never influenced by SEO techniques and third party manipulation. That’s a key lesson that everyone should learn in order to succeed.

To learn more about the dangers of trusting search results too much, follow and read the citations below. Many have good examples of sites and searches that show the impact of overly influenced search results, and these could be readily adapted and used for training sessions.

  1. Google’s decreasingly useful, spam-filled web search
  2. Trouble In the House of Google
  3. Why We Desperately Need a New (and Better) Google
  4. Dishwashers, and How Google Eats Its Own Tail
  5. Content Farms: Why Media, Blogs & Google Should Be Worried
  6. On the increasing uselessness of Google
  7. Google’s “Gold Standard” Search Results Take Big Hit In New York Times Story
  8. How The “Focus On First” Helps Hide Google’s Relevancy Problems
  9. What Is Search Engine Spam? The Video Edition
  10. Google’s Search Engine Optimization Starter Guide (32-page PDF)
  11. Google's Search Algorithm Has Been Ruined, Time To Move Back To Curation (GOOG)
  12. Blekko Launches Spam Clock To Keep Pressure On Google

Stephen Abram, MLS, is Vice President, Strategic Partnerships and Markets for Gale Cengage Learning. He is a Past President of SLA, the Ontario Library Association and the Canadian Library Association. He is the author of ALA Editions’ Out Front with Stephen Abram and Stephen’s Lighthouse Blog. Stephen would love to hear from you at .