Web Standards - They Are Needed
Aaron Westerdale
Software Engineering Dept.
University of Wisconsin – Platteville
Abstract
Over the years the usefulness of the World Wide Web has expanded
enormously. This expansion has led to a more diverse user base, ranging
from seniors ordering their medication to young children playing games.
These new uses have increased the demand for web sites that rely heavily
on programs, and have caused many sites to grow too large for a small
group to manage. The growing diversity of users and the growing number of
programmers creating web pages have brought the need for standards in web
programming to light. A number of standards have been around for many
years, and as the consequences of ignoring them increase, so does the need
for them to come into wider use. There are also a number of tools
available to help programmers apply these standards.
Introduction
Every software engineer knows standards are a necessity, but not everyone realizes every place they are needed. Web development is commonly overlooked when it comes to standards, even though official standards cover almost every aspect of it. The World Wide Web Consortium (W3C) is a group that creates and updates web development standards, which are detailed but generally easy to understand. HTML standards have come a long way since the first one was created in 1994, and many other standards have been written to cover virtually every other area of web development. Following the standards also has many advantages, including making web pages more accessible and making them display more consistently.
Why Should We Use Web Standards
There are a number of advantages to using web standards. One advantage for a programmer involves cascading style sheets (CSS): the programmer creates the scripts and HTML files that do the processing and display the data, and a designer then writes CSS files to style the data the programmer puts into the HTML files. This limits the number of changes, if any, that the designer has to make to the programmer’s files, especially if the programmer puts appropriate class and id attributes on the appropriate elements in the HTML pages. A class is an attribute that can be added to any number of elements on a page to group them for styling. An id is an attribute whose value must be unique within the page, which gives a way to style one particular element. A programmer can learn where these attributes belong by discussing the page with a designer, and placing them can become fairly automatic over time.
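The difference between class and id can be checked mechanically. The following is a minimal sketch in Python, using only the standard library's html.parser module. The page fragment, the class names, and the helper names (ClassIdCollector, duplicate_ids) are made up for illustration. It counts how many elements share each class and reports any id that is used more than once.

```python
from collections import Counter
from html.parser import HTMLParser


class ClassIdCollector(HTMLParser):
    """Record every class and id attribute seen in a page."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()      # id value -> number of elements using it
        self.classes = Counter()  # class name -> number of elements using it

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            if name == "id":
                self.ids[value] += 1
            elif name == "class":
                # One class attribute may list several space-separated names.
                self.classes.update(value.split())


def duplicate_ids(html_text):
    """Return the id values that appear on more than one element."""
    collector = ClassIdCollector()
    collector.feed(html_text)
    return sorted(i for i, count in collector.ids.items() if count > 1)


# Hypothetical page fragment: "story" is shared on purpose, "header" by mistake.
SAMPLE_PAGE = """
<div id="header" class="banner">Department News</div>
<p class="story lead">First story</p>
<p class="story">Second story</p>
<div id="header">Reused id</div>
"""

if __name__ == "__main__":
    print(duplicate_ids(SAMPLE_PAGE))
```

Sharing the class "story" between two paragraphs is fine, since that is what classes are for. The repeated id "header" is the kind of mistake a designer's style sheet would trip over.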
Another reason a programmer should use web standards is that a number of software engineering guidelines treat standards as a necessity. Standards are one of the thirteen principles of Extreme Programming. The ACM Software Engineering Code of Ethics also mentions standards in a number of places, emphasizing their importance. Section 3.06 says that engineers should follow professional standards.[1] Section 8.05 mentions standards again, saying that engineers should improve themselves by improving their knowledge of relevant standards.[1]
As with programming in general, standards ease maintenance and debugging by making the code look familiar to every programmer who uses them. Web standards also make maintenance easier by moving design elements out of the main document, so you do not have to dig through style markup to find problems in your code. Using the standards also helps a page look more consistent across different web browsers, because many browser developers use these standards as guidelines when building their rendering engines.
Legal Reasons
Using web standards is not required by law, but there have been a few legal cases that would have been avoided if the developers had followed them. As mentioned earlier, the standards help make web pages more accessible, not only to a variety of devices but also to people with disabilities. Accessibility for people with disabilities is what would have kept the two organizations discussed below out of court.
The main thing these organizations did wrong can be illustrated with an example. People with good vision take it for granted that clicking the logo on the Computer Science and Software Engineering department homepage at the University of Wisconsin – Platteville takes them to the school’s homepage. A blind person cannot see the image and can only read text such as the following (the link and image addresses are omitted here):
<a href="…"><img src="…" alt="UWP Homepage" /></a>
The alt text on the image is the only indication that selecting the image leads to the campus homepage, unless the link address itself makes that obvious. There are a number of ways a link can be ambiguous, for example if a page name is too vague or if the link calls a JavaScript function that processes or checks data before submitting an order.
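A check for this problem is easy to automate. The sketch below is one possible approach rather than any official tool. It uses Python's standard html.parser module to list the images in a page that have no usable alt text. The file names in the sample and the names MissingAltFinder and images_missing_alt are hypothetical.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> that lacks usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Assumption: an empty alt is treated as missing. That is stricter
        # than the standards, which allow alt="" on purely decorative images.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src") or "(no src)")


def images_missing_alt(html_text):
    finder = MissingAltFinder()
    finder.feed(html_text)
    return finder.missing


# Hypothetical fragment: the first image link is readable, the second is not.
SAMPLE = """
<a href="home.html"><img src="logo.png" alt="UWP Homepage" /></a>
<a href="order.html"><img src="button.png" /></a>
"""

if __name__ == "__main__":
    for src in images_missing_alt(SAMPLE):
        print("image without alt text:", src)
```

In the sample, only button.png is reported. A reader who cannot see that image has nothing to say where its link leads.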
2000 Sydney Olympics
In 1999 a blind man filed a complaint against the organizers of the Sydney Olympics, claiming he was being unlawfully discriminated against because he could not use the Olympics web site and because a number of paper documents were not available in Braille.[12] The complaint said the web site was inaccessible to a blind person mainly because alt text was not provided on all images and image map links.[12] Alt text matters because blind people cannot see the pictures; as illustrated above, the alt text is what they read to understand what an image shows. This is especially important when an image serves as a link or button, since without the text a blind person would have no idea what the link or button was for. The paper documents were quickly released in Braille, but a ruling on the web site did not come until almost a year later. This ruling awarded $20,000 to the man who filed the complaint and told the developers they had until the beginning of the Olympics (September 15, 2000) to make the web site accessible.[12]
National Federation of the Blind vs. Target
In 2006 Disability Rights Advocates, Brown Goldstein & Levy, and Schneider & Wallace filed a class action lawsuit against Target because its web site was inaccessible to the blind.[7] The Target web site had no alt text for images, had inaccessible image maps and navigation, and required a mouse to complete a transaction.[7] It was even impossible for someone to order medication on Target’s web site without using a mouse. This poses a problem not only for the blind but also for anyone with a disability who cannot use a mouse, and for anyone using a web browsing device that does not have one. Representatives from the National Federation of the Blind pointed out the problems to Target, which replied that it was not legally required to make its web site accessible. After repeated attempts to get Target to fix its web site, the case was filed. Target attempted to get the case thrown out, but after a judge refused, changes began to appear on Target’s web site: some alt text appeared and the mouse requirement for completing a transaction was removed. Even as I write this article, Target’s homepage still has 501 errors against the standards, according to the tidy tool I discuss later, which integrates into the Firefox browser. As of this writing no final judgment has been made in the case, but it has cost Target a lot of money in legal fees that could have been avoided by using the standards.[7]
Standards
HTML
The first official web standard was HTML 2.0, created in 1994 and last updated in August of 1995. HTML 2.0 was simply based on current practice in 1994.[4] HTML 3.2 was released in 1996, and once again it was just the HTML 2.0 standard updated to cover the current practices. HTML 3.2 was last updated in January of 1997, when work began on the HTML 4 standards.[5] HTML 4.0 was originally released in December of 1997 and last updated in April of 1998, when work began on HTML 4.01, which was initially released in December of 1999.[9]
While developing the 4.01 standards, the authors for the first time considered what the standards really should be, not just what the current practices were. In particular, they considered what needed to be added to cover multimedia, scripting languages, and style sheets.[9]
Once the standards supported style sheets, it became possible to separate data from display. This separation was one of the big steps toward increasing accessibility. Someone who reads the actual text of the page rather than the rendered visual page can read through just the data, without having to wade through elements that exist only to make the page look nice. Style sheets also brought better printing, because a developer can supply a separate style sheet that is applied when the page is printed, giving the page different styles for viewing and for printing.[9]
Another advantage these standards brought was the first consideration of internationalization. A developer can now specify what language a page is written in. A developer can also specify a language for a particular element or section of a page. This is useful on a page that teaches a new language, where the page as a whole is in one language and individual sections are in another.[9]
The HTML 4.01 standard also introduced different levels of implementation. There are three sets of standards: strict, transitional, and frameset. The strict set includes only the current elements and attributes, leaving out everything that has been deprecated. The transitional set includes all the current elements and attributes plus the deprecated ones. The idea is to use the transitional set while moving a web page from the old standards to the new ones, so that the page still conforms to some standard until the transition is complete. Unfortunately, many developers take this as an invitation to write to the transitional standard and never move on to the new one. The last set is frameset, which includes everything in transitional plus everything needed to use frames in a web page. The same three sets continue to be used in the newest XHTML standards.[9]
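A page announces which of the three sets it follows in the document type declaration on its first line. As a rough sketch, the Python below reads that declaration with the standard html.parser module and reports the set. The three public identifiers are the ones defined by HTML 4.01; the class and function names are made up for this example.

```python
from html.parser import HTMLParser

# Public identifiers defined by the HTML 4.01 standard.
HTML401_VARIANTS = {
    "-//W3C//DTD HTML 4.01//EN": "strict",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "transitional",
    "-//W3C//DTD HTML 4.01 Frameset//EN": "frameset",
}


class DoctypeReader(HTMLParser):
    """Remember which HTML 4.01 set, if any, the page declares."""

    def __init__(self):
        super().__init__()
        self.variant = None

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">" of the declaration.
        for public_id, name in HTML401_VARIANTS.items():
            if f'"{public_id}"' in decl:
                self.variant = name


def html401_variant(page):
    """Return "strict", "transitional", "frameset", or None."""
    reader = DoctypeReader()
    reader.feed(page)
    return reader.variant


STRICT_PAGE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">\n'
    "<html><head><title>Example</title></head><body></body></html>"
)

if __name__ == "__main__":
    print(html401_variant(STRICT_PAGE))
```

A page with no declaration, or with one this table does not know, yields None. That is a reasonable prompt to check which standard the page is meant to follow.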
The ISO and IEC groups also created a web standard. This standard, numbered 15445, was originally based on the HTML 4.0 standard and was updated in 2003 to be based on HTML 4.01.[6]
XML
All the standards so far were based on the Standard Generalized Markup Language (SGML). This language is very powerful but can quickly become complex. The standards discussed from here on are based on the Extensible Markup Language (XML), which was created out of the desire to keep the power and flexibility of SGML while removing the complexity. The main goals in creating XML were ease of use, compatibility with SGML, and human-readable documents.[3]
The first recommendation for XML was released in February 1998. The 3rd edition was released in 2004, and the current 4th edition was released in the fall of 2006.[3]
There are three main parts to the Extensible Markup Language: the entity, the element, and the attribute. An entity is like a #define in C, or a little like a constant. When an XML tool processes an XML document, it replaces each entity reference in the document with the entity’s value. Elements are the main building blocks of an XML document. Each element has a type, or name, and can have attributes and contain other elements. Each element must have a start tag and an end tag, or a single self-closing tag. Attributes belong to elements, and each must have a name and a value, with the value surrounded by single or double quotes. Each XML document must have a root element that contains all other elements in the document.[3]
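These three parts can be seen by running a small document through an XML parser. The sketch below uses Python's standard xml.etree.ElementTree module. The document itself, including the element names and the entity named uwp, is invented for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up document: one entity, one root element, three child elements.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE page [
  <!ENTITY uwp "University of Wisconsin - Platteville">
]>
<page lang="en">
  <title>Department Home</title>
  <owner>&uwp;</owner>
  <logo src="logo.png" alt="UWP Homepage" />
</page>
"""

root = ET.fromstring(DOCUMENT)


def describe(element, depth=0):
    """Print each element's name and attributes, indented by nesting depth."""
    print("  " * depth + element.tag, dict(element.attrib))
    for child in element:
        describe(child, depth + 1)


def is_well_formed(text):
    """Return True if an XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    describe(root)
    # The parser has already replaced &uwp; with the entity's value.
    print(root.find("owner").text)
    # A start tag without its end tag is rejected.
    print(is_well_formed("<page><title>Unclosed</page>"))
```

The root element page contains everything else. The self-closing logo element carries two attributes, and the entity reference has already been replaced with its value by the time the program sees the text of owner.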
XML documents use a Document Type Definition (DTD) to outline the requirements for a type of XML document. The DTD covers what elements can appear in the XML document. It also lists the attributes each element can or must have, which child elements each element can or must contain, and how many of each. The DTD also covers what kind of data each attribute must have and what type of data, if any, each element must contain. A DTD can be placed at the top of the XML document or kept in a separate file that is referenced at the top of the XML document. An XML document may also reference multiple DTDs, but then all of them together form the requirements for that document, and their requirements may not conflict.[3]
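To make the idea concrete, the sketch below hand-codes the kind of rules a DTD expresses and checks a document against them. It is only a stand-in: the parsers in Python's standard library do not validate against a DTD, so the rules are written as a Python table, and the sketch ignores element order and counts, which a real DTD can also constrain. The element names and the RULES table are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a DTD along these lines:
#   <!ELEMENT page (title, logo?)>
#   <!ELEMENT title (#PCDATA)>
#   <!ELEMENT logo EMPTY>
#   <!ATTLIST page lang CDATA #REQUIRED>
#   <!ATTLIST logo src CDATA #REQUIRED alt CDATA #REQUIRED>
RULES = {
    "page": {"allowed_children": {"title", "logo"}, "required_attrs": {"lang"}},
    "title": {"allowed_children": set(), "required_attrs": set()},
    "logo": {"allowed_children": set(), "required_attrs": {"src", "alt"}},
}


def check(element, problems=None):
    """Return a list of rule violations found in element and its children."""
    problems = [] if problems is None else problems
    rule = RULES.get(element.tag)
    if rule is None:
        problems.append(f"<{element.tag}> is not declared")
        return problems
    for name in sorted(rule["required_attrs"] - set(element.attrib)):
        problems.append(f"<{element.tag}> is missing required attribute '{name}'")
    for child in element:
        if child.tag not in rule["allowed_children"]:
            problems.append(f"<{child.tag}> is not allowed inside <{element.tag}>")
        else:
            check(child, problems)
    return problems


GOOD = '<page lang="en"><title>Home</title><logo src="logo.png" alt="Logo" /></page>'
BAD = '<page><title>Home</title><logo src="logo.png" /><banner /></page>'

if __name__ == "__main__":
    print(check(ET.fromstring(GOOD)))
    for problem in check(ET.fromstring(BAD)):
        print(problem)
```

A validating XML tool does the same job directly from the DTD, so the rules are stated once and any such tool can enforce them.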
XHTML
Extensible Hypertext Markup Language (XHTML) standards are the newest HTML standards and are based on XML rather than SGML. XHTML 1.0 was originally released in January of 2000 and last updated in August of 2002. One advantage of XHTML is that it is modularized: a developer who wants a set of elements that is not part of the standard XHTML set can extend the standard by creating a DTD that adds to the XHTML requirements. Modularization also makes it easy to extend the standards without throwing out the originals. For example, a developer who wants to display mathematical equations in a web page can include the MathML standard, which builds on top of XHTML, instead of creating an entirely new standard to cover the new requirements.[11]
Another advantage of XHTML documents is that, because they are XML documents, a developer can use any XML tool on them. This makes it possible to use XML parsers to dissect an XHTML document. The biggest advantage is that any XML validation tool can validate XHTML documents. Validation is how a developer makes sure an XML document meets the requirements outlined in its DTD.[11]
XHTML standards push hard for the separation of presentation, or design, from structure and data. XHTML again has three sets of standards: strict, transitional, and frameset. Strict is for documents completely converted to the XHTML standards, and transitional is for documents in the process of being converted. Frameset is for documents that still use frames, although frames are rarely needed given the power available through CSS.[11]
The one main disadvantage is that the DTD used to validate an XHTML document cannot express that certain elements are not allowed to nest within themselves. The written XHTML standard does specify which elements may not be nested, but standard XML validation software cannot detect improper nesting of the same element.[11]
Some general requirements are that all element and attribute names must be lower case. Every element must have an end tag, or its start tag must be self-closing. Because the XHTML standards are based on XML, all attribute values must be quoted. The x in a hexadecimal character reference (for example &#x41;) must also be lower case.[11]
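Because XHTML is XML, an ordinary XML parser already enforces several of these requirements. The sketch below is not a full XHTML validator. It uses Python's standard xml.etree.ElementTree module to reject fragments that are not well-formed, then adds a separate lower-case check on names. The function name xhtml_problems is hypothetical, and the sample fragments carry no namespace declaration to keep the example short.

```python
import xml.etree.ElementTree as ET


def xhtml_problems(fragment):
    """Return a list of problems; an empty list means the fragment passed."""
    try:
        root = ET.fromstring(fragment)
    except ET.ParseError as error:
        # Missing end tags and unquoted attribute values are well-formedness
        # errors, so any XML parser rejects them outright.
        return [f"not well-formed: {error}"]
    problems = []
    for element in root.iter():
        if element.tag != element.tag.lower():
            problems.append(f"element name not lower case: {element.tag}")
        for name in element.attrib:
            if name != name.lower():
                problems.append(f"attribute name not lower case: {name}")
    return problems


if __name__ == "__main__":
    samples = [
        '<p class="note">Line one<br />Line two</p>',  # passes
        "<p>Line one<br>Line two</p>",                 # <br> is never closed
        "<p class=note>text</p>",                      # unquoted attribute value
        '<P Class="note">text</P>',                    # upper-case names
    ]
    for sample in samples:
        print(sample, "->", xhtml_problems(sample) or "ok")
```

The second and third samples are acceptable to many HTML browsers but fail here, which is the practical effect of moving from SGML-based HTML to XML-based XHTML.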
CSS
Many advantages of Cascading Style Sheets (CSS) have already been mentioned, but there are a few more. As mentioned before, using CSS separates the HTML and code from the design, which eases maintenance and debugging. CSS also makes it very easy to change the entire look of a page, since the HTML and code do not need to change: the developer simply creates a different CSS file or set of CSS files and changes the reference in the HTML to point to the new file. CSS also helps create consistency across multiple pages, since they can all reference the same CSS file. Because one file can style multiple pages, CSS also helps lower bandwidth costs, since a browser needs to download the file only once. Using CSS files also makes HTML pages more accessible to blind people, since it removes the design from the basic data and structure.[10]