Navigating and Searching the Web
Since so many people create web pages, the Web should be chaotic. However, underlying systems specify how pages are organized on the Web and how they are delivered to your computer. These systems involve a unique address for each web page, a unique address for each computer, and browser features for locating and retrieving online content.
IPs and URLs
An Internet Protocol (IP) address is a series of numbers that uniquely identifies a location on the Internet. An IP address consists of four groups of numbers separated by periods; for example: 225.73.110.102. A nonprofit organization called ICANN keeps track of IP numbers around the world.
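To make the four-group format concrete, here is a minimal Python sketch that uses the standard ipaddress module to check the example address and split it into its groups. The helper name describe_ipv4 is an illustrative assumption, and the sketch covers only the four-group form described above.

```python
import ipaddress

def describe_ipv4(text):
    """Return the four numeric groups of a dotted, four-group IP address.

    Raises ValueError if the text is not a valid address of this form.
    """
    address = ipaddress.IPv4Address(text)  # checks the format and the 0-255 range
    return [int(group) for group in str(address).split(".")]

groups = describe_ipv4("225.73.110.102")
print(groups)  # [225, 73, 110, 102]
```

An address with a group outside the range 0 to 255, such as 300.1.1.1, is rejected with a ValueError.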
Because numbers would be difficult to remember for retrieving pages, we use a text-based address referred to as a uniform resource locator (URL) to go to a website. A URL, also called a web address, has several parts separated by a colon (:), slashes (/), and dots (.). The first part of a URL is called a protocol and identifies a certain way of interpreting computer information in the transmission process. HTTP, which stands for hypertext transfer protocol, and FTP, for file transfer protocol, are examples of protocols. Some sites use a secondary identifier for the type of site being contacted, such as www for World Wide Web site, but this is often optional.
The next part of the URL is the domain name, which identifies the group of servers (the domain) to which the site belongs and the particular company or organization name. A suffix, such as .com or .edu, further identifies the domain. For example, the .com in a URL is a top-level domain (TLD). Several TLDs exist, such as .com, .net, .org, .edu, and .gov. Table 1.1 provides a rundown of TLDs being used today.
Table 1.1 Common Top-Level Domain Suffixes Used in URLs
Suffix / Type of Organization / Example
.biz / business site / Billboard
.com / company or commercial institution / Intel
.edu / educational institution / Harvard University
.gov / government site / Internal Revenue Service
.int / international organizations endorsed by treaty / World Health Organization
.mil / military site / U.S. Department of Defense
.net / administrative site for ISPs / Earthlink
.org / nonprofit or private organization / Red Cross
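The URL parts described above (protocol, domain name, and top-level domain suffix) can be pulled apart with Python's standard urllib.parse module. This is a minimal sketch: the function name split_url and the address under example.com (a domain reserved for documentation) are assumptions used only for illustration.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Break a URL into the parts discussed in the text."""
    parts = urlsplit(url)
    host = parts.hostname or ""          # e.g. www.example.com
    labels = host.split(".")             # pieces separated by dots
    return {
        "protocol": parts.scheme,        # text before the colon, e.g. http
        "host": host,
        "top_level_domain": ("." + labels[-1]) if host else "",
        "path": parts.path,              # the specific page on the site
    }

info = split_url("http://www.example.com/products/index.html")
print(info["protocol"])          # http
print(info["top_level_domain"])  # .com
```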
Browsing Web Pages
You may already be quite comfortable with browsing the Internet, but you may not have pondered how browsers move around the Web and retrieve data. Any element of a web page (text, graphic, audio, or video) can be linked to another page using a hyperlink. A hyperlink describes a destination within a web document and can be inserted in text or a graphical object such as a company logo. Text that is linked is called hypertext.
A website is a series of related web pages that are linked together. You get to a website by entering its URL in your browser. Every website has a starting page, called the home page, which is displayed when you enter the site URL. You can also enter a URL to jump to a specific page on a site, such as the Video-On-Demand page at Amazon’s site.
Searching for Content Online
A search engine, such as Google.com, Ask.com, or Yahoo.com, catalogs and indexes web pages for you. A type of search engine, called a search directory, can also catalog pages into topics such as finance, health, news, shopping, and so on. Search engines may seem to be free services, but in reality they are typically financed by selling advertising. Some also make money by selling information about your online activities and interests to advertisers.
The newest wave of search engines, including Microsoft Bing and Google Squared, not only search for content but also make choices among content to deliver more targeted results. Such search engines allow you, for example, to ask for a list of female tennis stars from 1900 on, and they then assemble a table of results for you.
So how do search engines work? You can search for information by going to the search engine’s website and typing your search text, which consists of one or more keywords or keyword phrases. For example, to find information about the International Space Station, you could type space station in the search engine’s search text box and press the Enter key. You can narrow your search by specifying that you want to view links to certain types of results such as images, maps, or videos.
You can get more targeted search results by honing your searching technique. Effective searching is a skill that you gain through practice. For example, typing space station in a search engine’s web page could easily return more than 80 million results. If what you really need is the cost to build the station, consider a more targeted keyword phrase like “space station cost.” Search engines provide advanced search options, which you can use to include or exclude certain results. For example, you can exclude pages with certain domain suffixes (such as .com and .net) to limit your search results to educational and government sites.
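As an illustration of excluding domain suffixes, the following minimal Python sketch filters a list of result links. It is not how any particular search engine implements its advanced options; the function name exclude_suffixes and the example addresses are hypothetical.

```python
from urllib.parse import urlsplit

def exclude_suffixes(urls, excluded):
    """Keep only URLs whose host does not end in an excluded suffix."""
    kept = []
    for url in urls:
        host = urlsplit(url).hostname or ""
        if not any(host.endswith(suffix) for suffix in excluded):
            kept.append(url)
    return kept

# Hypothetical search results for "space station cost"
results = [
    "http://www.example.com/space-station",
    "http://www.example.edu/space-station-cost",
    "http://www.example.net/news",
    "http://www.example.gov/budget",
]

# Dropping .com and .net leaves only the .edu and .gov links
print(exclude_suffixes(results, [".com", ".net"]))
```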
A metasearch engine, such as dogpile.com, searches keywords across several websites at the same time. For example, imagine you need to fly from Atlanta to Seattle. Instead of checking available flights on three different airline websites, you can use a metasearch engine to check all of the airline sites at once.
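The idea behind a metasearch engine can be sketched in a few lines of Python: send one query to several sources and merge what comes back. The three airline functions below are hypothetical stand-ins that return canned listings, and this sketch queries them one after another rather than truly at the same time.

```python
def metasearch(query, engines):
    """Send one query to every engine and merge the results without duplicates."""
    merged = []
    seen = set()
    for engine in engines:
        for result in engine(query):
            if result not in seen:       # skip listings already collected
                seen.add(result)
                merged.append(result)
    return merged

# Hypothetical stand-ins for three airline sites; the listings are made up.
def airline_a(query):
    return ["A101 ATL-SEA 8:00", "A205 ATL-SEA 13:30"]

def airline_b(query):
    return ["B330 ATL-SEA 9:15"]

def airline_c(query):
    return ["C412 ATL-SEA 17:45", "A205 ATL-SEA 13:30"]  # repeats a listing from airline_a

flights = metasearch("Atlanta to Seattle", [airline_a, airline_b, airline_c])
print(flights)
```

The merged list contains four listings, because the repeated one is kept only once.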
Online Content
Calculating exactly how many websites and web pages exist today is difficult, but information from the Netcraft Secure Server Survey in 2009 indicated an increase of over six million websites just between March and April of that year. With that kind of constant activity, it is logical to conclude that not all of the content that is online is of the same quality or accuracy. In addition, some of that content is free for the taking, while other content is protected by copyright, or legal ownership of that content. Learning how to evaluate the quality of content, respect the laws that govern its use, and understand when free exchange of content is allowed is important.
Evaluating Web Content
Though a wealth of accurate and useful information exists online, some people believe that if they read it in the newspaper or online, it must be true. That, however, is not the case. As in the offline world, you have to consider the source of online content. If you trust technology information from Wired magazine in print, you can have a similar level of trust in its online site. If you do not know a source at all, you may have to do some digging to discover if it is reputable by looking at the source’s credentials (which individuals or organizations are involved in the venture?), methods (for example, is the information based on surveys and experiments, or personal opinion?), and reputation (what do other online users say in reviews of the site or the company’s products?).
Because anyone can publish to the Web, to gauge the accuracy of what you read, you have to verify the three Ws (or WWW) of online content.
- WHO is the author or publisher? Is the source credible?
- WHAT is the message? Is the information verifiable? Is there a possibility of bias? Always try to crosscheck the information with other sources. Look for sponsors of a site to determine if they have a bias.
- WHEN was this published? Is this information current? If no date is published, is it possible to figure out how current the information is from the text? Online information can stay put for a very long time. Always look for the most current information on any topic.
Intellectual Property
Some information or works online are placed there to be shared and passed on. Other content falls into the category of intellectual property, much of which is copyrighted. According to the World Intellectual Property Organization (WIPO), intellectual property refers to “the creations of the mind; inventions, literary and artistic works; and symbols, names, images, and designs used in commerce.” Copying or distributing intellectual property without appropriate permission is illegal.
The Internet has brought the issue of illegal treatment of intellectual property front and center. Because copying and pasting content online is so simple, many people who would never dream of stealing a CD from a music store or a book from a bookstore download music illegally or plagiarize by using text or images from a website and representing that content as their own work.
Peer-to-peer (P2P) file sharing programs, such as BearShare, are used by millions of people to share music, video, and other types of files. File sharing allows people to download content from another user’s hard drive. This type of sharing is ripe for copyright abuse because materials that might be downloaded from a legitimate source by paying a fee are instead exchanged freely with no payment going to the copyright owner.
However, some people feel that copyright law in the digital age has gone too far. There’s a strong sense among many Internet activists that laws such as the Digital Millennium Copyright Act distort the balance between fair use (the right to reuse content that is available to all) and intellectual property rights and are a threat to creativity and technological innovation.
The Invisible Web (aka Deep Web)
The content you typically find online is only the tip of the Web content iceberg. There are huge “hidden” collections of information that are collectively known as the invisible Web or deep Web. A typical search engine won’t return links to these databases or documents when you enter a search keyword. To get to some databases on the invisible Web, libraries and companies have to pay for access to them. In the future, you will probably be able to find this content more easily as search engines get more sophisticated. You can try by entering the word “database” after your keyword(s) in services such as Google and you may locate some of this content. You can also try to locate this content in directories such as the Librarians’ Internet Index, free and paid-for databases such as LexisNexis for legal research, and some specialized search engines such as Scirus, a science search engine. If you are willing to pay for help accessing the invisible Web, companies such as BrightPlanet and Kozmix specialize in harvesting information.
E-Commerce
Electronic commerce, or e-commerce, involves using the Internet to transact business. When you are buying downloadable music, shopping for shoes, or paying to access your credit report, for example, you are involved in e-commerce.
Three main types of e-commerce describe how money flows in an online business: business-to-consumer (B2C), business-to-business (B2B), and consumer-to-consumer (C2C). Sometimes more than one of these models occurs on a single site. For example, a consumer on eBay buys a product from another consumer (C2C), but eBay makes money from advertisers (B2B).
B2C E-Commerce
Business-to-consumer (B2C) e-commerce is probably the kind with which you are most familiar. It involves companies that sell products and services to individual consumers, such as Amazon.com, JustHost.com (website hosting service), and Zappos.com. This is the model that most resembles those stores in the mall that you go to when purchasing books, obtaining tax return help, or finding shoes.
B2B E-Commerce
Business-to-business (B2B) e-commerce involves businesses selling to businesses. In some cases, a business provides supplies or services to another business, such as a plumbing supply site that caters to building contractors. In another B2B model, businesses provide a service to consumers but do not charge those consumers directly. Instead, their business model involves making money from selling ad space to advertisers, or selling information about their customers to advertisers. Given that e-commerce models are defined by how money flows, Facebook is an example of this second kind of B2B site because it gets no money from its members, only from advertisers (or other businesses).
C2C E-Commerce
Consumer-to-consumer (C2C) e-commerce activity occurs on sites such as Craigslist or eBay, where consumers buy items from and sell items to each other over the Internet. Though the host site provides the infrastructure, the money flows from one consumer to another. What e-commerce model do you think supports the companies that host C2C sites? If you guessed B2B (they get their money from advertisers), you would be right!
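The three models can be summarized in a small Python sketch that names a transaction by who sells and who buys. The function name ecommerce_model and its two parameters are assumptions made for illustration; the sketch covers only the three models described here.

```python
def ecommerce_model(seller, buyer):
    """Name the e-commerce model for a sale from seller to buyer.

    Each argument must be "business" or "consumer".
    """
    codes = {"business": "B", "consumer": "C"}
    model = codes[seller] + "2" + codes[buyer]
    if model not in {"B2C", "B2B", "C2C"}:
        raise ValueError("combination not covered in this section: " + model)
    return model

print(ecommerce_model("business", "consumer"))  # B2C: an online store sells shoes to a shopper
print(ecommerce_model("business", "business"))  # B2B: a site sells ad space to advertisers
print(ecommerce_model("consumer", "consumer"))  # C2C: one person sells an item to another
```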
E-Commerce and Consumer Safety
In many cases, buying and selling items online is safer than doing so offline. That’s because rather than handing your credit card to a clerk in a store, you are performing a transaction over a secure connection, providing payment information to a system rather than an individual. Of course, every system has its problems, and online stores, banks, and investment sites are hacked into now and then. Still, you can be confident of a safe shopping experience if you use care in choosing trusted shopping sites, pay through a third-party payment service such as PayPal or by credit card (these purchases are protected from theft, while a check or debit card purchase is not), and make sure that the URL prefix reads https (which indicates a secure connection) while you are performing a transaction.
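The https check mentioned above is easy to express in code. This minimal Python sketch looks only at the URL prefix; the function name uses_secure_connection and the shop.example.com addresses are hypothetical, and a matching prefix does not by itself prove that a site is trustworthy.

```python
from urllib.parse import urlsplit

def uses_secure_connection(url):
    """Return True when the URL prefix (its protocol) is https."""
    return urlsplit(url).scheme == "https"

print(uses_secure_connection("https://shop.example.com/checkout"))  # True
print(uses_secure_connection("http://shop.example.com/checkout"))   # False
```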