LOGFILE ANALYSIS

Literature Review

Brett Shelton

November 5, 1999 (Final)

OVERVIEW:

This literature review furthers the inquiry into how best to analyze the current orthopedic web site for design modification. The writings reviewed focus on web server log data, its acquisition and inspection, and on improving web design based on a survey of similar studies.

CITE:

Wyman, S. K., et al. (1997). Developing system-based and user-based criteria for assessing federal websites. In Proceedings of the ASIS Annual Meeting, 34, 78-88.

MAIN POINTS:

Wyman et al. outline a method for analyzing federal web sites and suggest ways to improve their design. Their goals include implementing a management tool for identifying evaluative criteria and for continual design improvement resulting in increased usage.

SIGNIFICANCE:

Wyman and colleagues attempt a task similar in scope to that of the current PETTT project in the evaluation and redesign of the orthopedics web site. They use web site mapping and server log analysis, among other qualitative techniques, to obtain the data to fulfill their objectives. The results of this examination are not stated explicitly in the article, but the study provides a good frame of reference for methods of examining web site effectiveness and making recommendations.

CITE:

Ramey, J. (1999). Heuristics: Web data collection for analyzing and interacting with your users. In Web Evaluation Heuristics. University of Washington Department of Technical Communications: unpublished report.

MAIN POINTS:

Dr. Ramey highlights numerous important points about the information gained from web logs while pointing out the dangers of assuming facts about the data. Her outline defines a productive protocol for analyzing the data, describes what useful information can be obtained, and suggests what types of questions the data can be used to answer.

SIGNIFICANCE:

I would recommend the outline contained in the article as a tool for learning about a web site’s audience. Ramey outlines pitfalls in parsing and analyzing the data, which should strongly inform decisions about what the data may suggest. Her tables are most helpful to those who already have a data analysis software package or service, or who know how to manipulate data with a database package (and know SQL), but novices with an interest in web sites can still glean information from them.
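To make the database route concrete, the following is a minimal sketch of asking one question of log data with SQL. It is not taken from Ramey’s report: the table name hits, its columns, the sample rows, and the question itself (which missing pages are requested most often) are all hypothetical, and Python’s built-in sqlite3 module stands in for whatever database package is actually available.

```python
import sqlite3

# In-memory database holding a few invented, already-parsed log records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (host TEXT, path TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?)",
    [
        ("192.0.2.10", "/index.html", 200),
        ("192.0.2.10", "/old/hip.html", 404),
        ("192.0.2.25", "/old/hip.html", 404),
        ("192.0.2.31", "/missing.gif", 404),
        ("192.0.2.31", "/index.html", 200),
    ],
)

# Hypothetical question: which "not found" (404) paths are requested most?
missing_pages = conn.execute(
    "SELECT path, COUNT(*) AS n FROM hits "
    "WHERE status = 404 GROUP BY path ORDER BY n DESC, path"
).fetchall()

for path, count in missing_pages:
    print(path, count)
```

A result like this only shows which requests failed; it says nothing about why a user asked for the page, which is the kind of assumption the report warns against.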

CITE:

Chen, H., Wigand, R. T., & Nilan, M. S. (1999). Optimal experience of web activities. In Computers in Human Behavior, 15(5), 585-608.

Chen, H., Wigand, R. T., & Nilan, M. S. (1998). Effective utilization and management of emerging information technologies. In 1998 Information Resources Management Association International Conference, 633-636. Hershey, PA: Idea Group Publishing.

MAIN POINTS:

This study uses the idea of flow (control, attention focus, curiosity, interest) to investigate the user’s experience in web navigation. The authors use a “pop-up” program that appears in the user’s browser at random intervals and asks a series of questions designed to capture the user’s psychological “web state” at that moment. They collected and analyzed these data to determine what factors were involved in maintaining an optimal web experience for the user. The article does not describe the technical aspects (creation/implementation) of the “pop-up” program.

SIGNIFICANCE:

This study used an alternative to web log analysis for gathering information about users. In essence, the technique provided an automated yet cumbersome method that surely must have become tiring to the respondent. Nevertheless, it seems an appropriate way to capture the user’s emotional state during web activity, perhaps with less direct involvement by the researcher.

CITE:

McNichols, C. (1999). Web-based opinion polling with active server pages. In WEB Techniques, 4(6), 81-85.

MAIN POINTS:

The author of the article has created a downloadable program that simplifies the process of creating and maintaining an on-line poll consisting of radio buttons and/or checkboxes. It has a browser-based interface, but creating a poll requires quite a bit of server interaction, as well as familiarity with web-server scripting and database manipulation. Respondents can view the results of the poll (if the pollster wishes) immediately after responding, and the pollster can get feedback at any time during the survey process.

SIGNIFICANCE:

The program is a valuable idea, as it could be a great benefit to anyone who wants a “quick and dirty” way of gathering poll-type data and giving immediate feedback to those polled. The downsides are the knowledge one must have to create such a poll, the limited type of data collected, the fact that respondents are aware that they are answering specific questions (not optimal in many instances), and the need to access and manipulate many programs and data sources. There is a lot of overhead in computing, resources, and authoring for the pollster.

CITE:

Palmer, J. W. & Griffith, D. A. (1998). An emerging model of web site design for marketing. In Communications of the ACM, 41(3), 44-51.

MAIN POINTS:

The article analyzes the web sites of Fortune 500 companies to see how they are using the technology the web has to offer. The authors analyze and categorize (presumably through observer-based inquiry) the nature of each company and the products it produces or represents, then assign values to the information to determine the characteristics of the sites in terms of their marketing strategies and techniques.

SIGNIFICANCE:

The results of the research establish many important details about the effectiveness of the sites, depending on the nature of the company and the purpose of the marketing. For instance, firms with more information-intensive products use the web more effectively for information translation between suppliers and customers. The bottom line is that managers should look at their audiences carefully and mesh the technology with their marketing strategy in order to use the web in the best possible manner. The authors offer suggestions on which questions to answer in order to do so. The article offers less guidance on ways to measure use of a web site through data investigation.

CITE:

Drott, C. D. (1998). Using web server logs to improve site design. In Sixteenth Annual International Conference of Computer Documentation. Conference Proceedings. Scaling the Heights: Future of Information Technology, 43-50. New York: ACM.

MAIN POINTS:

This article provides a brief outline of web server logs, covering their creation and their content from simple to complex, and gives examples of how to glean valuable content from these logs. Drott offers ways of manipulating the log files during their creation to reduce the file size and to answer specific questions, such as those involved in tracking links, searches, paths, and initial contact by the user. He also points out the limitations of the data within the log files, and ends the paper with basic suggestions on how to view the data.

SIGNIFICANCE:

The information contained in this article is valuable to a novice interpreting server log files to determine who is accessing the site. It offers a few examples of what the data might actually look like before it is parsed into different fields, so that a person viewing the logs “cold” can gather information about the users. It is specific in its examples and provides some helpful tabled information.
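As a rough illustration of what parsing a raw log entry “into different fields” can involve, the sketch below splits one line into named fields. It assumes the NCSA Common Log Format, which this review does not specify; the sample line, the function name parse_clf_line, and the field names are hypothetical and are not taken from Drott’s article.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf_line(line):
    """Split one Common Log Format line into named fields, or return None."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None  # line does not follow the assumed format
    fields = match.groupdict()
    # Convert the text fields that have a natural non-string type.
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields

# Invented sample line, for illustration only.
sample = ('192.0.2.10 - - [05/Nov/1999:13:55:36 -0800] '
          '"GET /ortho/index.html HTTP/1.0" 200 2326')
record = parse_clf_line(sample)
print(record["host"], record["path"], record["status"])
```

Servers configured with an extended format (for example, one that adds referrer and browser fields) would need a longer pattern; the host field identifies a machine or proxy, not necessarily an individual user.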

CITE:

Almeida, V. A. F., et al. (1999). Efficiency analysis of brokers in the electronic marketplace. In Computer Networks, 31(11-16), 1079-1090.

MAIN POINTS:

The article provides an introduction to e-brokering (an e-broker is a site that mediates between buyer and seller on the internet), then offers a quantitative study of the workload characteristics of such a site, with insights into certain cultural nuances among customers, before finally examining the efficiency of a specific e-broker. It gives an overview of the e-brokering process and examines customer usage through server log files to determine the efficiency of one e-broker versus another.

SIGNIFICANCE:

The article contains a fine example of how server log files can be probed to determine user inquiry and data translation. To compare two different e-brokers, the authors specifically used log file time and date, search information input by the user, information returned by the query, and response time. They used “click-through frequency” to relate the amount of information provided by a site to the number of times a user continued navigating the site.

CITE:

Eder, L. B. & Darter, M. E. (1998). Physicians in cyberspace. In Communications of the ACM, 41(3), 52-54.

MAIN POINTS:

The article attempted to answer three questions related to medical practice and web sites: 1) Do physicians use the web for medical information gathering? 2) Do physicians recommend the internet as a source for medical information? 3) Are physicians using the web as a marketing tool, and if so, has it had an impact? The survey found that 50% of the physicians polled use the web for medical information gathering, but only 14% recommended it as a useful tool for patients. Roughly 25% of the physicians had a professional web site (current or in the works), and they reported very little if any positive impact on the success of their practice due to their sites.

SIGNIFICANCE:

While the article does not help with gathering or analyzing log file data, it is helpful to realize how many physicians are using medical information on the web. As medical resources become more commonplace and reliable, we might expect more physicians to recommend specific web sites for information gathering. Thus, increases in hits by physicians and their patients on viable medical web sites can be projected.

CITE:

Garton, L., Haythornthwaite, C. & Wellman, B. (1997). Studying online social networks. In Journal of Computer Mediated Communication, 3(1).

MAIN POINTS:

Garton et al. make an argument for using a social network approach to study computer-mediated communication between groups. The argument includes references to data gathering and to the use of electronic media for gathering and analyzing data. The authors point out the tedious nature of analyzing log file data and raise questions about the ethical limits on identifying the subjects.

SIGNIFICANCE:

This article has more to do with gathering and analyzing data from social network media, such as chat interfaces between workers, than with web log data gathering and analysis. The authors offer interesting insights into methods for gathering data within the workplace and recommend software packages for doing so.

CITE:

Stout, R. (1997). Web Site Stats. Tracking Hits and Analyzing Traffic. Berkeley: Osborne/McGraw-Hill.

MAIN POINTS:

The book teaches novices what information is contained within log files, which parts of the files are important for analysis, and “how to get the most out of them.” Stout devotes entire chapters to analyzing log data (and cookie data) and to the science of summary statistics.

SIGNIFICANCE:

An excellent reference for dissecting log file data, this book also provides suggestions on how to design web sites tailored to specific user audiences. It also recommends numerous data analysis software packages and gives an overview, complete with samples, of many of them.
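The kind of summary statistics such a reference covers can be sketched briefly. The example below is not taken from Stout’s book: the records, the page names, and the summarize function are invented for illustration, and it assumes the log has already been parsed into (host, path) pairs.

```python
from collections import Counter

# Hypothetical, already-parsed log records: (requesting host, requested path).
records = [
    ("192.0.2.10", "/index.html"),
    ("192.0.2.10", "/hip.html"),
    ("192.0.2.25", "/index.html"),
    ("192.0.2.31", "/index.html"),
    ("192.0.2.25", "/knee.html"),
]

def summarize(records):
    """Return a few basic traffic statistics for a list of (host, path) pairs."""
    hits_per_page = Counter(path for _, path in records)
    hits_per_host = Counter(host for host, _ in records)
    distinct_hosts = len(hits_per_host)
    return {
        "total_hits": len(records),
        "distinct_hosts": distinct_hosts,
        "top_pages": hits_per_page.most_common(3),
        # Guard against an empty log to avoid dividing by zero.
        "mean_hits_per_host": len(records) / distinct_hosts if distinct_hosts else 0.0,
    }

summary = summarize(records)
for name, value in summary.items():
    print(name, value)
```

Counts of distinct hosts should be read with care: several people can share one host or proxy, so a host count is not a count of individual visitors.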

CITE:

Buchanan, R. W. & Lukaszewski, C. (1997). Measuring the Impact of Your Web Site. Proven Yardsticks for Evaluating. New York: John Wiley.

MAIN POINTS:

This book covers the entire development of effective web sites and places measurement in consistent phases at appropriate stages of the design process. Buchanan and Lukaszewski discuss planning, site justification, strategies, and features before describing different measures, analysis, and action.

SIGNIFICANCE:

A particularly valuable portion of this book is the chapter on evolving the content of the web site as dictated by analyzed data. It offers suggestions concerning content mix, presentation, navigation, and transmission. This book is an excellent reference when deciding what changes should be made to a site based on the interpretation of server log files.

CITE:

Abels, E. G., White, M. D., & Hahn, K. (1999). A user-based design process for web sites. In OCLC Systems & Services, 15(1), 35-44.

Abels, E. G., White, M. D., & Hahn, K. (1997). Identifying user-based criteria for web pages. In Internet Research: Electronic Networking Applications and Policy, 7(4), 252-262.

MAIN POINTS:

The 1997 article argues for web page design based on user-based criteria, drawn from feedback from a number of focus groups that identified their preferred and least-liked design features. The 1999 article focuses on the development of a sample page designed with the identified criteria as its model.

SIGNIFICANCE:

These articles contain a limited amount of useful information about gathering user information from existing web sites (a process likely to be found in Phase IV of their study). The study does provide an excellent paradigm for designing for a specific user population and describes a method for bridging from information, based upon user-based criteria, to the creation of a web site.

REFLECTION:

This review reflects an emerging field of web site analysis. As web site usage continues to grow, so will the importance of analyzing the who, what, when, where, and why of a specific audience on the internet. Specific articles have highlighted a direction to pursue in determining the best path for building a picture of a scientific web site, its audience, and its redesign. The next step is to implement some of the techniques highlighted by this literature and perform appropriate audience reviews. The process of analyzing a web site for optimal performance and redesigning it is ultimately a continuous one.

BIBLIOGRAPHY:

Abels, E. G., White, M. D., & Hahn, K. (1999). A user-based design process for web sites. In OCLC Systems & Services, 15(1), 35-44.

Abels, E. G., White, M. D., & Hahn, K. (1997). Identifying user-based criteria for web pages. In Internet Research: Electronic Networking Applications and Policy, 7(4), 252-262.

Almeida, V. A. F., et al. (1999). Efficiency analysis of brokers in the electronic marketplace. In Computer Networks, 31(11-16), 1079-1090.

Buchanan, R. W. & Lukaszewski, C. (1997). Measuring the Impact of Your Web Site. Proven Yardsticks for Evaluating. New York: John Wiley.

Chen, H., Wigand, R. T., & Nilan, M. S. (1998). Effective utilization and management of emerging information technologies. In 1998 Information Resources Management Association International Conference, 633-636. Hershey, PA: Idea Group Publishing.

Chen, H., Wigand, R. T., & Nilan, M. S. (1999). Optimal experience of web activities. In Computers in Human Behavior, 15(5), 585-608.

Drott, C. D. (1998). Using web server logs to improve site design. In Sixteenth Annual International Conference of Computer Documentation. Conference Proceedings. Scaling the Heights: Future of Information Technology, 43-50. New York: ACM.

Eder, L. B. & Darter, M. E. (1998). Physicians in cyberspace. In Communications of the ACM, 41(3), 52-54.

Garton, L., Haythornthwaite, C. & Wellman, B. (1997). Studying online social networks. In Journal of Computer Mediated Communication, 3(1).

McNichols, C. (1999). Web-based opinion polling with active server pages. In WEB Techniques, 4(6), 81-85.

Palmer, J. W. & Griffith, D. A. (1998). An emerging model of web site design for marketing. In Communications of the ACM, 41(3), 44-51.

Ramey, J. (1999). Heuristics: Web data collection for analyzing and interacting with your users. In Web Evaluation Heuristics. University of Washington Department of Technical Communications: unpublished report.

Stout, R. (1997). Web Site Stats. Tracking Hits and Analyzing Traffic. Berkeley: Osborne/McGraw-Hill.

Wyman, S. K., et al. (1997). Developing system-based and user-based criteria for assessing federal websites. In Proceedings of the ASIS Annual Meeting, 34, 78-88.
