Errata for Data Mining Introductory and Advanced Topics by Margaret H. Dunham

Updated 3/3/05.

Chapter 1:

Chapter 2:

Chapter 3:

  • Example 3.3, p54: Formula for P(h1 | x4) has an extra “)”. It should read:

.

Chapter 4:

  • Last paragraph on p79: the descriptions of fallout and recall should be corrected to read as follows (Thanks to Nick Street):

“fallout (percentage of irrelevant that are retrieved) versus recall (percentage of relevant that are retrieved).”

  • Last sentence on the bottom of page 79/top of page 80 should be changed to read: “The curve is constructed by examining tuples classified as relevant in a particular order, such as descending order of similarity.”
  • Equation 4.26 on p101 should read .
  • Corrections to calculations in Example 4.9 on p102 can be found at .
  • Exercise 2 on p121 should read “that the Output1 column is the correct classification and Output2 is what is seen.”
  • Exercise 3 on p121 should read “assuming Output2 is the correct assignment”.
  • Exercise 7 on p122 should replace <Jim,M,2.0> with <John,M,2.5>.
  • Exercise 21 on p122: “guideline” should be plural (“guidelines”).
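
As an illustration of the corrected p79 wording, the two measures and the ordering used to build the curve can be sketched in Python. This is only a minimal sketch and is not taken from the book: the function name fallout_recall_curve and the sample similarity scores are hypothetical, and the values are returned as fractions rather than percentages.

```python
def fallout_recall_curve(scored_items):
    """Return one (fallout, recall) point per item examined.

    scored_items is a list of (similarity, is_relevant) pairs
    (hypothetical data). Items are examined in descending order
    of similarity, as in the corrected sentence for p79/p80.
    """
    total_relevant = sum(1 for _, relevant in scored_items if relevant)
    total_irrelevant = len(scored_items) - total_relevant
    if total_relevant == 0 or total_irrelevant == 0:
        raise ValueError("need at least one relevant and one irrelevant item")

    points = []
    relevant_seen = 0
    irrelevant_seen = 0
    ordered = sorted(scored_items, key=lambda pair: pair[0], reverse=True)
    for _, relevant in ordered:
        if relevant:
            relevant_seen += 1
        else:
            irrelevant_seen += 1
        # fallout: fraction of irrelevant items retrieved so far
        fallout = irrelevant_seen / total_irrelevant
        # recall: fraction of relevant items retrieved so far
        recall = relevant_seen / total_relevant
        points.append((fallout, recall))
    return points


# Hypothetical sample: (similarity, is_relevant)
sample = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
curve = fallout_recall_curve(sample)
```

Plotting recall against fallout for the points in curve gives the kind of curve the p79 paragraph describes; once every item has been examined, both measures reach 1.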

Chapter 5:

  • p144: total cost complexity for PAM should be k(n-k)^2 (Thanks to Lars Helge Hass).
  • Example 5.9 on p158 uses a threshold of 0.2 (not 0.6) (Thanks to Aryya Gangopadhyay).
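
One way to see where the corrected PAM figure comes from is to count the work directly. The sketch below assumes the usual accounting for a single PAM iteration: each of the k medoids is paired with each of the n-k non-medoids as a swap candidate, and each candidate swap is evaluated against the n-k non-medoids. The function name pam_cost_terms and the choice of n and k are hypothetical; this is not code from the book.

```python
def pam_cost_terms(n, k):
    """Count, by brute-force enumeration, the cost terms examined in one
    PAM iteration under the assumption stated above."""
    medoids = range(k)          # assumption: objects 0..k-1 are the medoids
    non_medoids = range(k, n)   # the remaining n-k objects
    count = 0
    for _medoid in medoids:              # medoid considered for swapping out
        for _candidate in non_medoids:   # non-medoid considered for swapping in
            for _other in non_medoids:   # non-medoid whose cost change is added
                count += 1
    return count


n, k = 10, 3
terms = pam_cost_terms(n, k)
assert terms == k * (n - k) ** 2
```

For n = 10 and k = 3 the enumeration yields 147 terms, which equals 3 * 7^2.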

Chapter 6:

Chapter 7:

  • Page 198, last sentence prior to section 7.2.1 should read: “An alternative markup language such as extensible markup language (XML), provides structured documents and facilitates easier mining.”
  • Page 212, Table 7.1, 4th row should be labeled as: “Maximal forward references”
  • Page 212, last line should read “in Example 7.4 is shown in Example 7.5.”
  • Page 213, first line after Algorithm 7.2 should read: “… for Example 7.4 is shown …”
  • Page 215, Example 7.7, second line should read: “data in Example 7.4 …”
  • Page 215, paragraph label at bottom of page should read:

“Maximal Frequent Forward References”

  • Page 216, second line should read: “… Looking at Example 7.7 and the …”
  • Page 216, the sequence on lines two and three should be <A, B, C, A, C, B, C, A, C, D, C, E>
  • Page 216, the maximal forward references on line four should be:

<A,B,C>, <A,C,B>, <A,C,D>,<A,C,E>

  • Page 216, the first line in second paragraph should end with: “... mine maximal frequent forward references”
  • Page 216, title of Algorithm 7.3 should be: “Maximal frequent forward references algorithm”
  • Page 218, Exercise 5 should read:

“… indicate the sequential patterns, maximal forward references, and maximal frequent sequences …”

  • Page 219, the last two lines of Exercise 6 should read:

“Identify the maximal frequent forward references when the users can be distinguished as well as when they can not be. Assume that a sequence must occur twice to be large.”
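
The corrected page 216 sequence and its maximal forward references can be checked with a short Python sketch. It follows the common idea of extending a path on each forward reference and, on a backward reference, emitting the path (if the previous move was forward) and truncating it back to the revisited page. This is an illustration under that assumption, not a transcription of Algorithm 7.2 or 7.3; the function name maximal_forward_references is hypothetical.

```python
def maximal_forward_references(pages):
    """Split one traversal sequence into maximal forward references."""
    path = []                # current forward path
    results = []             # maximal forward references found so far
    moving_forward = False   # was the previous move a forward reference?
    for page in pages:
        if path and page == path[-1]:
            continue         # assumption: ignore an immediate repeat (reload)
        if page in path:
            # Backward reference: the path just ended is maximal only if
            # the previous move extended it.
            if moving_forward:
                results.append(list(path))
            # Truncate the path back to the revisited page.
            del path[path.index(page) + 1:]
            moving_forward = False
        else:
            # Forward reference: extend the current path.
            path.append(page)
            moving_forward = True
    if moving_forward and path:
        results.append(list(path))   # the final path is also maximal
    return results


# Corrected sequence from page 216
sequence = ["A", "B", "C", "A", "C", "B", "C", "A", "C", "D", "C", "E"]
references = maximal_forward_references(sequence)
```

Run on the corrected sequence, references contains <A,B,C>, <A,C,B>, <A,C,D> and <A,C,E>, in that order, which agrees with the corrected line four of page 216.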

Chapter 8:

Chapter 9:

Appendix A:

Appendix B:

Please let me know of any additional corrections which should be included. Thanks, Maggie Dunham