Web Page Media and Content Classification

Aims

  • describe basic procedures for using a web browser (Netscape 4.0 and higher) in order to determine page layout
  • provide basic procedures for acquiring detailed properties of individual elements
  • describe and explain procedures for using a web authoring application (NetObjects Team Fusion Version 3.0 and higher) to reproduce the layout of web pages classified according to their constituent media

Introduction

To describe a web page completely you need to be able to provide several kinds of information for every page element. The purpose of this document is to introduce simple methods for describing web pages. These methods provide data sufficient to create page mock-ups that show the important attributes of the media used on a specific web page. A web page is described by recording the following for each element:

  • a unique label,
  • its screen location,
  • the type of media used to represent it on the screen,
  • its size, in terms of its screen coordinates, bytes, and area,
  • the number of times the element is repeated on the page (if any), and
  • any other relevant properties, for example whether a repeating element occurs in distinct groups.

Each of these attributes is described along with a method for determining it. The following discussion assumes that you are using Netscape Navigator 4.5 to acquire data about page layouts and elements and NetObjects Team Fusion Version 3.0 to create page mock-ups based on that data.
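
The attributes listed above can be thought of as one record per page element. The following is a minimal sketch of such a record, not part of the original procedure: the class name and field names are illustrative assumptions, and the coordinates given for the banner are hypothetical. The label, pixel dimensions and byte size are the values quoted for the University of Wollongong banner image elsewhere in this document.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ElementRecord:
    """One row of a page description (field names are illustrative only)."""
    label: str                        # unique label, e.g. "0-GIF-1"
    top_left: Tuple[float, float]     # (Xa, Ya), measured from the page origin (0, 0)
    bottom_right: Tuple[float, float] # (Xb, Yb)
    media_type: str                   # media type code, e.g. "GIF"
    width_px: int                     # width reported by Image Properties
    height_px: int                    # height reported by Image Properties
    size_bytes: int                   # storage size
    number: int = 0                   # repetitions on the page; 0 if not reused
    groupings: int = 0                # groups of a repeating element; 0 if not reused


banner = ElementRecord(
    label="0-GIF-1",
    top_left=(0.0, 0.0),        # hypothetical ruler measurement
    bottom_right=(15.0, 1.6),   # hypothetical ruler measurement
    media_type="GIF",
    width_px=567,
    height_px=61,
    size_bytes=19560,
)
```

Keeping every attribute in one record makes it easy to check that nothing has been left out before the element is drawn in the mock-up.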

However, before describing the attributes of page elements we need to understand how the page is organised (its page layout) and how to find out as much as possible about the constituent media used on the page. The University of Wollongong home page at http://www.uow.edu.au is used as an example for the remainder of this exercise.

Determining Page Layout

Arrangement of Elements on the Page

In order to identify the elements that exist on a page, and their respective locations, point your web browser to the appropriate page. To show all elements of the page, select the File | Edit Page option from Netscape Communicator, see Figure 1 (top left image). This invokes a program called Netscape Composer, a simple HTML authoring program whose initial screen is shown in Figure 1 (lower right image). Note that each page element is now surrounded by a dashed box - called a bounding box - which shows how large the element is on the screen. These bounding boxes around each element are used as the basis for determining screen co-ordinates as described below.

Figure 1: Selecting File | Edit Page from Netscape Communicator (top left image) loads a component called Netscape Composer, a simple HTML authoring program (lower right image).



Listing All Page Elements


To make sure that all page elements are identified, and to determine how many of these elements are used more than once (see the section called Number (Repetition)), select the View | Page Info option from either Netscape Communicator or Netscape Composer. The listing of page elements for the University of Wollongong home page is provided in Figure 2. This list is shown in two windows. The top window consists of a list of links for all page elements. The lower window provides information on the selected element. The default element is a description of the page itself, as shown in Figure 2.

Figure 2: Listing of Page Elements on the University of Wollongong Home Page. This information is available from either Netscape Navigator or Netscape Composer by selecting View | Page Info.
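
If Netscape is not to hand, a similar listing of image elements can be approximated with a short script. This is a substitute technique, not the View | Page Info procedure described above: it uses the HTML parser from the Python standard library, and the markup in SAMPLE_HTML is hypothetical rather than the actual University of Wollongong page.

```python
from collections import Counter
from html.parser import HTMLParser


class ImageLister(HTMLParser):
    """Collect the src attribute of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


# Hypothetical markup used only to illustrate the idea.
SAMPLE_HTML = """
<html><body>
  <img src="images/banner.gif">
  <img src="images/littlebullet.jpg"><a href="/about/"><img src="images/about.gif"></a>
  <img src="images/littlebullet.jpg"><a href="/study/"><img src="images/study.gif"></a>
</body></html>
"""

lister = ImageLister()
lister.feed(SAMPLE_HTML)

# How many times each image file is used on the page
usage = Counter(lister.sources)
for name, count in usage.items():
    print(name, count)
```

The list in lister.sources plays the role of the top window in Figure 2, while the counts show which elements are used more than once.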

Detailed Properties of Individual Elements

To acquire detailed properties of an individual element on the screen, select the element by clicking the left mouse button while the cursor is located within its bounding box. A thick-lined box surrounds a selected element. The selected banner image in the University of Wollongong home page is shown in Figure 3 (top image). Once the required element is selected, click the right mouse button to show the floating menu of commands that are available for that element. Figure 3 (middle image) shows the floating menu relevant to the selected banner image. If the selected object is an image, its Image Properties option can be selected, which displays an Image Properties window. Selecting the Image Properties option for the University of Wollongong's banner image displays the Image Properties window in Figure 3 (lower image). Useful information about this object includes its Dimensions (Height and Width), which can be used when writing up the Element Size information (see later).


Figure 3: Selecting an element (top image) and choosing its Image Properties (middle image) from those that are available for an object of its type. The result is the Image Properties window (lower image).



The properties of other types of page elements can be found using the same procedure. For example, the University of Wollongong home page contains a list of images containing text, which are the major options around which the site is organised. Each of these images contains a link to a page location forming a section or weblet in the site. To find detailed properties of this element, select the element and open up the Image Properties as described above. Note that the window contains three tab controls labelled from left to right: ‘Image’, ‘Link’, and ‘Paragraph’, see Figure 4 (upper image). The ‘Image’ information is displayed by default. To show the name of the page linked to image/comm.gif select the Link Tab, see Figure 4 (lower image).

Figure 4: Other types of elements can also be examined using the same procedure (steps) as that of Figure 3. In this example, the image 'Community & Business' located in images/comm.gif (top image) has a link (lower image) pointing to the /about/community/ page. The link information is available from the Link tab in the Image Properties dialogue.



Describing Elements

Unique Element Labels

In almost all cases each screen element should have a unique label. There are many ways of creating unique labels for an element. One way (and it is by no means the only one) is to create a label consisting of a page number, a media type code, and a sequential number that distinguishes one instance of an element from the others on a specific page. This way of labelling a page element is described below; however, the reader is encouraged to attempt his or her own classification schemes.

  • Page Number: Creating a unique label for an element involves being able to locate that element on one of many different pages within the site structure. One way of identifying the page is to know the hierarchical arrangement of pages on the site. Figure 5 shows a section of the University of Wollongong extranet where the home page at the root of the hierarchy is shown as Level 0, the first major content pages are at Level 1, the minor content pages are located at Level 2, and so on. Each of these pages can be given a number based on this information. The number can be stored in NetObjects in the Comments field of the Page tab of the Properties window in Site View. A numbering scheme can be devised which identifies the level at which the page is located and how that page is connected in the site structure. In Figure 5, a page called teaching/Tstrategic_plan is 'coded' as Page 3.1. This code has two numbers, indicating that the page is located at Level 2. The parent of this page is Page 3, and it is the leftmost in a series of at least two (sibling) pages.

Figure 5: A way of numbering pages to indicate the level in the hierarchy at which the page is located. The pages represent a weblet from the University of Wollongong web site. A 'code' is manually stored in the Comments field of the Properties window.


  • Media Type Code: Creating a unique label for an element also involves being able to distinguish between one of a potentially large number of media types (this step is described later in the section 'Media Type'). Using the Netscape Composer Image Properties window, the banner in Figure 3 (top image) is identified as a GIF file.
  • Sequential Number: There is almost always more than one page element of the same type on the same page, so in order to distinguish between them a sequentially assigned number is appended to the Unique Element Label.
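
The page numbering idea can be sketched in a few lines. This is only an illustration under stated assumptions: the home page is coded "0", Level 1 pages carry a single number such as "3", and each deeper level adds one dot-separated number, as in "3.1". The function names page_level and parent_code are hypothetical.

```python
def page_level(page_code: str) -> int:
    """Return the hierarchy level for a page code such as "0", "3" or "3.1"."""
    if page_code == "0":
        return 0  # the home page is the root of the hierarchy
    # One number means Level 1, two numbers mean Level 2, and so on
    return len(page_code.split("."))


def parent_code(page_code: str) -> str:
    """Return the code of the parent page."""
    if page_code == "0":
        raise ValueError("the home page has no parent")
    parts = page_code.split(".")
    # Level 1 pages hang directly off the home page
    return ".".join(parts[:-1]) if len(parts) > 1 else "0"


print(page_level("3.1"))   # level of teaching/Tstrategic_plan in Figure 5
print(parent_code("3.1"))  # its parent page
```

With this scheme the level and the parent can both be read directly from the code stored in the Comments field.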

As it is the first image encountered on the page, the Unique Element Label for the banner image on the home page of the University of Wollongong is 0-GIF-1.
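
A short sketch of how such labels could be generated is given below. It assumes that the sequential number is counted separately for each media type code on a page, which is one reading of the scheme above; the function name label_elements is hypothetical.

```python
from collections import defaultdict


def label_elements(page_code, media_codes):
    """Build labels of the form page-MEDIA-sequence, in order of appearance.

    The sequence number restarts at 1 for each media type code (an assumption).
    """
    counters = defaultdict(int)
    labels = []
    for code in media_codes:
        code = code.upper()
        counters[code] += 1
        labels.append(f"{page_code}-{code}-{counters[code]}")
    return labels


# Media type codes of the elements in the order they are met on the page
print(label_elements("0", ["GIF", "JPG", "GIF"]))
```

The first GIF encountered on the home page receives the label 0-GIF-1, matching the banner example above.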

Screen Coordinates

The screen coordinates of a page element must be measured directly off the screen using a transparent ruler. The top-left hand corner of the page area becomes the origin of the measurements (0,0), see Figure 6. Two sets of coordinates are needed. The first set of coordinates (a) defines the top-left hand corner of the bounding box of the element in question. The second set of coordinates (b) defines the bottom-right hand corner of the bounding box. All horizontal measurements (X) are made from the left hand edge of the screen, while all vertical measurements (Y) are made from the top edge of the screen, to the point of interest on the bounding box.

Figure 6: Measuring the screen coordinates of the bounding box for the banner image viewed from Netscape Composer.


Media type

All page elements should be classified according to their respective media types. You are required to create a classification scheme that includes the kinds of media that you find on the web pages being analysed. A good place to start is the media type classification developed by Gibbs and Tsichritzis (1995), Reading #3 in the BUSS909 Reader. The authors classify media into three distinct groups, shown in Table 1. The 'text' category is reproduced in full in order to assist in the discussion below.

Table 1: Media Type Classification (after Gibbs and Tsichritzis 1995, 15-77). Note that all the 'text' media types are shown.

1 / Nontemporal Media
    1.1 / Text
        ASCII
        ISO Character Sets
        Marked up Text
        Structured Text
        Hypertext
    1.2 / Image
    1.3 / Graphics
2 / Temporal Media
    2.1 / Analog Video
    2.2 / Digital Video
    2.3 / Digital Audio
    2.4 / Music
    2.5 / Animation
3 / Other Media
    3.1 / Extended Images
    3.2 / Digital Ink
    3.3 / Speech Audio
    3.4 / Temporal Sequences
    3.5 / Nontemporal Video and Animation

With reference to Table 1, note that for web applications the ASCII, ISO Character Sets and Marked up Text types are probably irrelevant. Note also that Structured Text could be further classified into HTML and XML. Alternatively you might classify these as Hypertext! Say you have a digital video clip of a Disney classic animation. How would you classify it? I would classify this media as digital video, because how the user interacts with it is determined by the capabilities of the video player. Therefore, classify media from the point of view of the audience, not the web site developers!

The media type can be given a three-letter code, as was illustrated in the section on Unique Element Labels. Once you have established your classification scheme, assign a unique colour to each media type that you find on your site. This colour is used to fill the background of the object that shows the element's location and size on a NetObjects page.
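
One way to keep the codes and colours consistent is to hold the scheme in a single lookup table. The sketch below is an assumption-laden illustration: apart from GIF, the three-letter codes, the category assignments and all the colour values are hypothetical, and you should substitute the scheme you devise for your own site.

```python
# code -> (media category from Table 1, fill colour for the mock-up)
# All entries except the GIF code are hypothetical examples.
MEDIA_SCHEME = {
    "GIF": ("Image", "#FFCC00"),
    "JPG": ("Image", "#FF9900"),
    "HYP": ("Hypertext", "#3366CC"),
    "DVD": ("Digital Video", "#CC0000"),
}


def colour_for(code: str) -> str:
    """Return the fill colour assigned to a media type code."""
    return MEDIA_SCHEME[code.upper()][1]


def colours_are_unique(scheme) -> bool:
    """Check that no two media type codes share a colour."""
    colours = [colour for _, colour in scheme.values()]
    return len(colours) == len(set(colours))


print(colour_for("gif"))
print(colours_are_unique(MEDIA_SCHEME))
```

Checking uniqueness up front avoids two media types becoming indistinguishable in the finished mock-up.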

Element Size

Element size can be interpreted in several ways. For example:

  • Screen Dimensions: the Height and Width Dimensions are available from the Image Properties window of Netscape Composer, as previously mentioned. This information is also available from the View | Page Info option. For example, the University of Wollongong banner image is 61 pixels high by 567 pixels wide.
  • Storage: the size of the page element in bytes is another useful measure. For example, the University of Wollongong banner image occupies 19560 bytes of storage.
  • Screen Area: another useful way of showing the size of an element is to calculate its screen area (A) from the screen coordinates determined above, using the following formula:

A = (Xb - Xa) (Yb - Ya)
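
The formula can be applied directly to the two corner measurements. The sketch below uses hypothetical coordinates in centimetres; the function name screen_area is an assumption, and the result is in square units of whatever ruler was used.

```python
def screen_area(a, b):
    """Area of a bounding box from its top-left (a) and bottom-right (b) corners."""
    xa, ya = a
    xb, yb = b
    if xb < xa or yb < ya:
        raise ValueError("b must lie below and to the right of a")
    return (xb - xa) * (yb - ya)


# Hypothetical measurements: a = (1.0, 0.5), b = (16.0, 2.0)
area = screen_area((1.0, 0.5), (16.0, 2.0))
print(area)  # 15.0 wide by 1.5 high
```

The check on the corner order catches the common slip of recording the two corners the wrong way round.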

Number (Repetition)

It is often the case that elements are used repeatedly on a given page. For example, the littlebullet.jpg image is used as a highlight for the major options on the University of Wollongong home page. Rather than defining all aspects of this image seven times, it is better to describe its properties once, and then simply provide a unique name and location for each subsequent use on the page. A unique number is still needed in this case, as it may be necessary to refer to a specific instance of the littlebullet.jpg image. The number is the number of occurrences on the page, so for littlebullet.jpg the number = 7. If a page element is not reused on the page, then its number = 0.
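
The counting convention above can be expressed compactly. This sketch assumes you already have a list of element file names in page order (the list shown is hypothetical apart from littlebullet.jpg occurring seven times); the function name repetition_numbers is an assumption.

```python
from collections import Counter


def repetition_numbers(sources):
    """Map each element to its number: the count if reused, otherwise 0."""
    counts = Counter(sources)
    return {name: (n if n > 1 else 0) for name, n in counts.items()}


# Hypothetical page listing: one banner and seven bullet images
page_elements = ["banner.gif"] + ["littlebullet.jpg"] * 7
print(repetition_numbers(page_elements))
```

Note that an element appearing exactly once is recorded as 0 rather than 1, following the convention stated above.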

Properties (Groupings)

On occasion a repeating element may be arranged into identifiable groups. For example, the littlebullet.jpg image on the University of Wollongong home page can be thought of as occurring in a group which lies to the left of the options that it highlights. In this case, the littlebullet.jpg images form a single group. Bullets are often a repeating page element that may form one or more groups: the separate lists of items for which the bullets are used. As previously defined, if a page element is not reused on the page, then its number = 0 and by definition its groupings = 0.

Mapping Page Elements

In this section we describe:

  • how to select an appropriately scaled grid in NetObjects so that these measurement units match those used when you measured the page elements from your selected site,
  • the procedures required to place NetObjects in the proper state prior to drawing any page elements, and
  • how to draw and label the representations of page elements on the screen using two available alternatives in NetObjects (the text box method and the labelled polygon method).

Selecting the Units of Measurement

To show the NetObjects measurement rulers in Page View, either type CTRL+U or click on View | Rulers and Guides. Make sure that the option has a tick mark beside it, indicating that it has been selected. Say, for example, that you have measured the elements on your web pages in centimetres, but the rulers and guides are currently set to inches. To change the unit of measure to centimetres, click on Edit | Preferences, which opens a Preferences window (see Figure 7 left hand image). Make sure that the User Tab is showing in this window and find the Size Measurement Units drop-down object. Note that it is currently set to Inches. Click on the down arrow that is part of the drop-down box, and select the Centimetres option. Notice that the rulers and guides immediately change to centimetres. Click OK to close the window.
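
If some measurements were recorded in inches before the rulers were switched, they can be converted rather than re-measured. The following is a minimal sketch using the standard conversion of 2.54 centimetres to the inch; the function name inches_to_cm and the sample values are hypothetical.

```python
CM_PER_INCH = 2.54  # exact by definition


def inches_to_cm(value_in_inches: float) -> float:
    """Convert a ruler measurement from inches to centimetres."""
    return value_in_inches * CM_PER_INCH


# Hypothetical bounding box corner recorded in inches
corner_in = (2.0, 0.5)
corner_cm = tuple(round(inches_to_cm(v), 2) for v in corner_in)
print(corner_cm)
```

Whichever unit you choose, use it for every element so that the mock-up matches the measured page.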

Figure 7: Changing the units of measure from inches (left image) to centimetres (right image)



Preparing to Draw and Label

Prior to doing any drawing and labelling of page elements NetObjects must be in a specific set of states described below:

  • Display Standard Tools: NetObjects must be in Page View and the Standard Tools must be displayed. If the Standard Tools are not shown on the screen (see Figure 8 left image) then click View | Toolbars | Standard Tools to display them (see Figure 8 right image).

Figure 8: NetObjects screen without (left image) and with Standard Tools being displayed (right image)



  • Checkout MasterBorders or Layout: Note that although the Standard Tools are displayed on the screen they are greyed out, indicating that they cannot be selected (see Figure 9 left image). This is because both the MasterBorders (currently selected) and Layout are 'Checked In', as indicated by the locked icon. The following procedure assumes that you are adding page elements rather than navigation elements (which should generally be placed in the MasterBorders area).
    Click anywhere within the Layout area of the current page to select the Layout area. The Layout area label background turns from grey (unselected) to blue (selected). To unlock access to this page, click on Workgroup | Check Out Layout. Note that the icon changes from a locked state to an unlocked state. Also notice that the Standard Tools change from greyed out to coloured indicating that they can now be used (see Figure 9 right image).

Figure 9: Standard Tools are greyed out because both MasterBorders and Layout are Checked In (left image). Once the Layout region has been selected and checked out, all the Standard Tools become available for use (right image).



Drawing and Labelling Page Elements

There are two ways of drawing and labelling page elements in NetObjects. These are provided in increasing order of usefulness:

  • Text Box Method: This method enables labels to be placed inside a text box area on the screen. This method suffers from the disadvantage that an exact area cannot be easily achieved except by adjusting the point size of additional line feeds entered into the box.

To use this method, select the text box option from the Standard Toolbar and draw a box on the Layout section of the Page (see Figure 10 left image). Then enter the appropriate unique label identifier for that object, click anywhere in the Layout and select the text box object (see Figure 10 right image). The text box object is selected when the black handles appear on its edges.