Search Engine Functioning

When you use Google, you are not running an instant, “live” search of the web. You are searching Google’s database, a stored copy of pages that its crawler has already visited. Different engines store different amounts of each web page they crawl. Results are displayed using proprietary ranking technology. Google’s PageRank algorithm has proven to be the most effective. Its primary breakthrough was treating how often a page is linked to as a ranking criterion. However, the full algorithm is much more complicated than that, and little is publicly known about how it actually works.
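The idea of ranking pages by the links that point to them can be illustrated with a small sketch. This is the simplified textbook form of PageRank (repeatedly passing each page's score along its outgoing links), not Google's production algorithm, which is not public. The function name, the `damping` value of 0.85, and the four-page toy web are assumptions chosen only for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: pages "a", "b", and "d" all link to "c"; "c" links to "a".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
best = max(ranks, key=ranks.get)
print(best, round(ranks[best], 3))
```

In this toy web, page "c" ends up with the highest score because three pages link to it, and page "a" comes second because its single incoming link is from the highly ranked "c".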

Google is the search engine market leader. The significant differences between Google and its major competitors are to be found in the add-on services that surround the primary search technology and in the details of the user interface.

Other Search Options

  • Metasearch – Clusty – these send your query to an assortment of other search engines in the hope of combining the strengths of each.
  • Directories – DMOZ – these are collections of sites arranged by category and constructed by humans.
  • Deep web – Completeplanet – these search resources behind portals: web pages that can only be reached through some other page. Such pages are not included in Google’s index.
  • Semantic Search – Hakia – these rely on site selection by humans (librarians) to provide more focused results from only credible sites. They do not use popularity for ranking purposes.
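To make the metasearch idea concrete, the sketch below merges ranked result lists from several engines into one list. It uses reciprocal rank fusion, a common generic merging technique; how Clusty or any other metasearch engine actually combines results is not described here, so treat this as one possible approach. The engine lists, the URLs, and the constant `k=60` are hypothetical.

```python
from collections import defaultdict


def merge_results(result_lists, k=60):
    """Merge ranked URL lists with reciprocal rank fusion."""
    scores = defaultdict(float)
    for results in result_lists:
        for position, url in enumerate(results, start=1):
            # A URL earns more credit the higher an engine ranks it.
            scores[url] += 1.0 / (k + position)
    # Highest combined score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical ranked results returned by three engines for the same query.
engine_a = ["example.org/x", "example.org/y", "example.org/z"]
engine_b = ["example.org/y", "example.org/x", "example.org/w"]
engine_c = ["example.org/y", "example.org/z"]

merged = merge_results([engine_a, engine_b, engine_c])
print(merged)
```

A page that several engines rank highly rises to the top of the merged list, while a page that only one engine returns falls toward the bottom.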

Constructing an Effective Search Phrase, or Query

If you’re looking for something specific, use as many terms as you think are directly and commonly associated with the subject in question. For a broader overview, stick with fewer words.

Most search engines also allow some logic operators in your search phrase or in an advanced search form. Google accepts “OR” on its main search page as well as “+” and “-.” “AND” is assumed between two words. Google also offers an advanced search page with a form that approximates the operators “AND,” “OR,” and “NOT.” Yahoo offers nearly identical capabilities.
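The effect of these operators can be sketched with a tiny index that maps each word to the documents containing it. This is a simplified model of how AND, OR, and NOT narrow or widen a result set, not a description of how Google or Yahoo evaluate queries. The function names, parameters, and sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, all_docs, required=(), any_of=(), excluded=()):
    """Return ids matching every `required` term (AND), at least one
    `any_of` term (OR), and none of the `excluded` terms (NOT)."""
    results = set(all_docs)
    for term in required:
        results &= index.get(term.lower(), set())
    if any_of:
        results &= set().union(*(index.get(t.lower(), set()) for t in any_of))
    for term in excluded:
        results -= index.get(term.lower(), set())
    return results


# Hypothetical four-document collection.
docs = {
    1: "jaguar car review",
    2: "jaguar cat habitat",
    3: "leopard cat habitat",
    4: "jaguar car price",
}
index = build_index(docs)

# Comparable to typing: jaguar -car
no_cars = search(index, docs, required=["jaguar"], excluded=["car"])
# Comparable to typing: habitat jaguar OR leopard
big_cats = search(index, docs, required=["habitat"], any_of=["jaguar", "leopard"])
print(sorted(no_cars), sorted(big_cats))
```

Each required term shrinks the result set, each OR alternative widens it, and each excluded term removes documents, which is why adding terms makes a search more specific.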

Evaluating Web Search Results

As with any information source, documents published on the World Wide Web must be evaluated for their credibility, currency, and accuracy.

Credibility

  • Who wrote this?
  • What is the author’s motivation?
  • Who paid for this web site?
  • What sites link to and are linked to from this site?
  • What information sources cite or are cited by this site?

Currency

  • When was this information published?
  • What major events have unfolded in the field since this site’s publication?
  • Is this site updated regularly?

Accuracy

  • How closely does this site’s information mirror that of other sources known to be credible and / or accurate?
  • Is this site reviewed or rated?

The Google Suite of Services

Google Web Search: This service searches a massive private database of web pages. Results are ranked with proprietary technology known as PageRank.

Google Book Search: This service searches the full text of more than a million digitized books. It does not employ PageRank, the Google web search relevancy algorithm, so results are not as meaningful or uncluttered.

Google Scholar: Google Scholar is a search service aimed at serious researchers. It distinguishes itself most by indexing the full text of many academic journals, both print and digital.

Google Images: Google Image Search is a service that searches the web for image content. Results are displayed as images. Clicking on a result image directs the user to a split page that shows the image at the top and, at the bottom, the image in its “original context” on the page it came from.

Google News: Google News is an aggregator of online news articles as well as a search engine for current and archived news articles. The sites from which news headlines are pulled are chosen by Google editors and number around 4,500. The page does not display articles, only headlines. Each headline is linked to the article at its site of origin. The placement of headlines is determined by an algorithm without any human input.