A Visualized Product Recommendation System using Fisheye Views and Data Adjacency

1. Introduction

Personalized recommendation enables web stores to gain customer loyalty, sales, and advertising profit, and it also reduces the time and effort consumers spend searching for products. In fact, such recommendation systems have already been introduced by well-known electronic commerce companies such as Yahoo and Amazon.com [1][2][14]. Online users' history, called web log data, has attracted great attention from researchers as a data source for personalized recommendation systems.

In this study, we present a visualized product recommendation system based on data adjacency theory and fisheye views.

2. Literature Review

2.1 Web mining

Web mining is the process of analyzing data on the Internet and using the resulting information [6][13]. Web mining has three categories: web structure mining, web content mining, and web usage mining [6][11][12]. Web structure mining summarizes the structure of the web. Web content mining extracts meaningful and detailed information from web content. Web usage mining supports web page reconstruction, discrimination, and the discovery of navigation patterns.

2.2 Personalized Recommendation Systems

Web personalization means that a web site is tailored to a specific consumer or group. A product recommendation system offers products or related information to a customer based on demographic data, transaction data, and web log data [10]. Eirinaki and Vazirgiannis (2003) classified data analysis techniques into content-based filtering, collaborative filtering, rule-based filtering, and web usage mining [7].

2.3 Visualization

Data visualization is defined as a technique that represents information in a visual form [5]. It helps users understand data easily and quickly. Becker et al. (1997) described multi-dimensional data visualization as a main tool for KDD (Knowledge Discovery in Databases) [3]. Visualization methods help users find outliers and distributions without a detailed analysis of the data. The fisheye view is a method for graphically representing programs, databases, and online text efficiently [8]. A fisheye view represents “local detail” and “global context” differently: the detail around the user's current focus is shown prominently, while the global context is kept in reduced form.

2.4 Data Adjacency and Adjacency Matrix

Data adjacency is a useful form for understanding various relationships among objects in a decision space. It shows whether item i and item j are co-purchased, or whether the purchase of item i leads to that of item j. Data adjacency is based on graph theory, which deals with finite sets of points and lines. In an adjacency matrix, the relationship between two points is represented by 1 if they are linked by a line and by 0 if they are not.
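As a small illustration of the 0/1 encoding described above, the following sketch builds an adjacency matrix for a directed graph. The item names and the edge list are hypothetical and are not taken from the data of this study.

```python
def adjacency_matrix(items, edges):
    """Return a 0/1 matrix whose entry [i][j] is 1 if a line links item i to item j."""
    index = {item: position for position, item in enumerate(items)}
    matrix = [[0] * len(items) for _ in items]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix


# Hypothetical example: three items and two directed links.
items = ["A", "B", "C"]
edges = [("A", "B"), ("B", "C")]
matrix = adjacency_matrix(items, edges)

for item, row in zip(items, matrix):
    print(item, row)
```

For an undirected relationship such as co-purchase, both [i][j] and [j][i] would be set to 1, which makes the matrix symmetric.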

3. Fisheye Views and Data Adjacency

3.1 Data Acquisition and Transformation

The data set in this study is collected from an Internet shopping site. This online company sells computers and computer-related items such as laptop computers, USB HDDs, desktops, and other peripheral devices. Access histories of 3,000 on items in product categories are collected. As web log data contains a lot of information, including date, IP address, server name, and time, it is important to refine the data set for the intended purpose [9]. All products in the company are assigned new serial numbers (e.g., P1, P2, …, Pn).

3.2 CFM (Connection Frequency Matrix)

Internet users navigate web sites in a certain order and direction. Therefore, adjacency matrix theory can be applied to the Internet space [4]. Assume that an online shopping site sells five products. If three users navigate the pages for these five products as in Fig. 1, their web surfing histories can be represented as in Fig. 2.

User 1 = { A B D E C A }
User 2 = { A B C E A }
User 3 = { A D E A B D E C A }

Fig. 1. Three users' navigation histories in the assumed web site

As shown in the left part of Fig. 2, the graph has directed arrows showing the direction of moves from one item to another on the web. The CFM based on this graph is in the right part of Fig. 2. The value of an element in the CFM represents the frequency of visits from item i to item j. For example, the value 3 in row A and column B means that there are three moves from item A to item B.

i j / A / B / C / D / E / Sum
A / - / 3 / 0 / 1 / 0 / 4
B / 0 / - / 1 / 2 / 0 / 3
C / 2 / 0 / - / 0 / 1 / 3
D / 0 / 0 / 0 / - / 3 / 3
E / 2 / 0 / 2 / 0 / - / 4
Sum / 4 / 3 / 3 / 3 / 4 / 17

(Graph) (CFM)

Fig. 2. CFM and Its Representation in a Graph
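The construction of the CFM from the three navigation histories of Fig. 1 can be sketched as follows. This is a minimal illustration rather than the implementation used in the study; the function name and the list-of-lists layout are assumptions made for the example.

```python
def connection_frequency_matrix(paths, items):
    """Count moves from item i to item j over all navigation paths."""
    index = {item: position for position, item in enumerate(items)}
    cfm = [[0] * len(items) for _ in items]
    for path in paths:
        # Each pair of consecutive page views is one move from source to target.
        for source, target in zip(path, path[1:]):
            if source != target:  # the diagonal is left unused, as in Fig. 2
                cfm[index[source]][index[target]] += 1
    return cfm


items = ["A", "B", "C", "D", "E"]
paths = [
    "A B D E C A".split(),        # User 1
    "A B C E A".split(),          # User 2
    "A D E A B D E C A".split(),  # User 3
]
cfm = connection_frequency_matrix(paths, items)

for item, row in zip(items, cfm):
    print(item, row, "Sum =", sum(row))
print("Total moves =", sum(sum(row) for row in cfm))
```

The row for A comes out as three moves to B and one move to D, and the grand total is 17, in agreement with the CFM of Fig. 2.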

For ease of explanation, we select 7 products in one category of this online shopping site. Table 1 shows the CFM derived from the histories of these 7 products (P1, P2, P3, P4, P5, P6, and P7).

Table 1. An Example of CFM

i \ j / P1 / P2 / P3 / P4 / P5 / P6 / P7 / Sum
P1 / - / 45 / 46 / 25 / 9 / 1 / 1 / 127
P2 / 3 / - / 1 / 52 / 47 / 45 / 1 / 149
P3 / 24 / 45 / - / 38 / 22 / 6 / 87 / 222
P4 / 47 / 91 / 44 / - / 106 / 33 / 44 / 365
P5 / 57 / 65 / 37 / 40 / - / 37 / 12 / 248
P6 / 41 / 27 / 45 / 37 / 35 / - / 9 / 194
P7 / 54 / 51 / 16 / 4 / 5 / 4 / - / 134
Sum / 226 / 324 / 189 / 196 / 224 / 126 / 154 / 1439

Table 1 shows that customers moved from P1 to P2 45 times. However, it also shows that customers moved from P2 to P1 only three times. The CFM is therefore asymmetric: the direction of a move matters.

3.3 Implementation of Suggested Methods

Assume a customer is now viewing product P1. In this case, the customer can move to any of the 6 other products: P2, P3, P4, P5, P6, and P7. Recommendations in this situation can be based on the numbers in Fig. 3.

i \ j / P1 / P2 / P3 / P4 / P5 / P6 / P7 / Sum
P1 / - / 45 / 46 / 25 / 9 / 1 / 1 / 127

Fig. 3. A Situation Where a Customer is Visiting P1

The numbers in the P1 row of Fig. 3 represent the frequencies of moves from P1 to the other 6 products. A web design that ignores these weights (i.e., frequencies) might look like Fig. 4. Each sub-area in this figure represents a product. An efficient design, however, should reflect these frequencies by enlarging the areas of frequently visited products and shrinking the areas of rarely visited ones.

[Figure: a page layout divided into sub-areas, one for each of P1 to P7, with no weights applied]

Fig. 4. A Web Design on Products without Weights on Them

Fig. 5 shows a visualized web page that considers these weights for each product. Pages commonly visited from P1, such as P2 and P3, have large areas compared to P6 and P7. This means that users tend to move from P1 to P2 or P3 more often than from P1 to P6 or P7.

[Figure: a page layout with P1 in a reserved area and the other sub-areas sized by weight: P2 (35%), P3 (36%), P4 (20%), P5 (7%), P6 (1%), P7 (1%)]

Fig. 5. A Web Design on Products with Weights on Them
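The percentages of Fig. 5 follow directly from the P1 row of Table 1. The sketch below shows one way to turn the frequencies into area shares; the display size of 600 by 400 pixels is a hypothetical value chosen only for illustration, and the actual page layout of the study is not reproduced here.

```python
def area_shares(frequencies):
    """Convert move frequencies into fractions of the available display area."""
    total = sum(frequencies.values())
    if total == 0:
        raise ValueError("no recorded moves from the focused product")
    return {product: count / total for product, count in frequencies.items()}


# Moves from P1 to the other six products (the P1 row of Table 1).
p1_row = {"P2": 45, "P3": 46, "P4": 25, "P5": 9, "P6": 1, "P7": 1}
shares = area_shares(p1_row)

# Hypothetical display area, in pixels, outside the reserved P1 area.
AVAILABLE_AREA = 600 * 400

ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
for product, share in ranked:
    print(f"{product}: {share:.0%} of the area, about {share * AVAILABLE_AREA:.0f} px^2")
```

Sorting by share places P3 and P2 first, so the most frequently visited products receive the largest sub-areas.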

3.4 A View on Treemap

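A treemap is a useful structure for showing relationships among objects. When a user is viewing product Pi and the total number of products is n, the relative weights of the other products can be calculated as below:

$$W(P_{ij}) = \frac{f_{ij}}{\sum_{k=1,\, k \neq i}^{n} f_{ik}} \qquad (1)$$

where f_ij denotes the CFM entry for moves from Pi to Pj.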

Therefore, the CFM of Table 1 can be represented as in Table 2.

Table 2. Relative Weights between Pi and Pj

i \ j / P1 / P2 / P3 / P4 / P5 / P6 / P7 / Sum
P1 / - / 0.35 / 0.36 / 0.2 / 0.07 / 0.01 / 0.01 / 1
P2 / 0.02 / - / 0.01 / 0.35 / 0.32 / 0.3 / 0.01 / 1
P3 / 0.11 / 0.2 / - / 0.17 / 0.1 / 0.03 / 0.39 / 1
P4 / 0.13 / 0.25 / 0.12 / - / 0.29 / 0.09 / 0.12 / 1
P5 / 0.23 / 0.26 / 0.15 / 0.16 / - / 0.15 / 0.05 / 1
P6 / 0.21 / 0.14 / 0.23 / 0.19 / 0.18 / - / 0.05 / 1
P7 / 0.4 / 0.38 / 0.12 / 0.03 / 0.04 / 0.03 / - / 1
Sum / 1.1 / 1.59 / 0.99 / 1.1 / 0.99 / 0.61 / 0.62 / 7
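The relative weights of Table 2 can be reproduced from the CFM of Table 1 by dividing each entry by its row sum. The following sketch assumes a list-of-lists layout with the unused diagonal stored as 0; the function name is chosen for this example only.

```python
PRODUCTS = ["P1", "P2", "P3", "P4", "P5", "P6", "P7"]

# CFM of Table 1; the diagonal is not used and is stored as 0 here.
CFM = [
    [0, 45, 46, 25, 9, 1, 1],
    [3, 0, 1, 52, 47, 45, 1],
    [24, 45, 0, 38, 22, 6, 87],
    [47, 91, 44, 0, 106, 33, 44],
    [57, 65, 37, 40, 0, 37, 12],
    [41, 27, 45, 37, 35, 0, 9],
    [54, 51, 16, 4, 5, 4, 0],
]


def relative_weights(cfm):
    """Divide every off-diagonal entry by the sum of the other entries in its row."""
    weights = []
    for i, row in enumerate(cfm):
        total = sum(value for j, value in enumerate(row) if j != i)
        if total == 0:
            raise ValueError(f"row {i} has no outgoing moves")
        weights.append([None if j == i else value / total
                        for j, value in enumerate(row)])
    return weights


weights = relative_weights(CFM)
for product, row in zip(PRODUCTS, weights):
    cells = ["-" if w is None else f"{w:.2f}" for w in row]
    print(product, " / ".join(cells))
```

Each row of the result sums to 1, so a row can be read as the distribution of next moves from the product being viewed.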

For example, when a user is viewing P3, the weight of P4 can be calculated as W(P34) = 38 / (24+45+38+22+6+87) = 38/222 = 0.17. Fig. 6 shows a connected graph representing the relationships among the items in Table 2 using different line types. Bold lines represent strong relationships, and dotted lines represent weak relationships.

Fig. 6. A Connected Graph Based on Relationships in Table 2

Now we can present a treemap view focused on a certain product. If P1 is the focus, strong links can be visualized with a treemap as in Fig. 7. The numbers in parentheses are the cell values in Table 2, which are the weights among products.

Fig. 7. A Tree for Customers Viewing P1

Fig. 7 shows the next items to which a user can move from P1. These are the nodes linked with weights of more than 0.3. This treemap shows the probable travel patterns that users most commonly follow from P1 to other products.
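The selection of strong links can be sketched as below, using the rounded weights of Table 2 and the threshold of 0.3. The nested-dictionary tree and the depth limit of two levels are assumptions made for this example; they are not a description of how Fig. 7 was actually drawn.

```python
# Rounded relative weights copied from Table 2.
WEIGHTS = {
    "P1": {"P2": 0.35, "P3": 0.36, "P4": 0.20, "P5": 0.07, "P6": 0.01, "P7": 0.01},
    "P2": {"P1": 0.02, "P3": 0.01, "P4": 0.35, "P5": 0.32, "P6": 0.30, "P7": 0.01},
    "P3": {"P1": 0.11, "P2": 0.20, "P4": 0.17, "P5": 0.10, "P6": 0.03, "P7": 0.39},
    "P4": {"P1": 0.13, "P2": 0.25, "P3": 0.12, "P5": 0.29, "P6": 0.09, "P7": 0.12},
    "P5": {"P1": 0.23, "P2": 0.26, "P3": 0.15, "P4": 0.16, "P6": 0.15, "P7": 0.05},
    "P6": {"P1": 0.21, "P2": 0.14, "P3": 0.23, "P4": 0.19, "P5": 0.18, "P7": 0.05},
    "P7": {"P1": 0.40, "P2": 0.38, "P3": 0.12, "P4": 0.03, "P5": 0.04, "P6": 0.03},
}


def strong_successors(weights, focus, threshold=0.3):
    """Return (product, weight) pairs above the threshold, strongest first."""
    links = [(p, w) for p, w in weights[focus].items() if w > threshold]
    return sorted(links, key=lambda link: link[1], reverse=True)


def build_tree(weights, focus, depth, threshold=0.3, visited=None):
    """Expand strong links from the focus down to the given depth (assumed limit)."""
    visited = (visited or set()) | {focus}  # products already on this branch
    if depth == 0:
        return {}
    tree = {}
    for product, _ in strong_successors(weights, focus, threshold):
        if product not in visited:
            tree[product] = build_tree(weights, product, depth - 1, threshold, visited)
    return tree


first_level = strong_successors(WEIGHTS, "P1")
tree = build_tree(WEIGHTS, "P1", depth=2)
print(first_level)
print(tree)
```

Because the values of Table 2 are rounded, the link from P2 to P6 (0.3) falls exactly on the threshold and is excluded by the strict comparison used here; with unrounded weights the outcome for such borderline links may differ.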

4. Tests

4.1 Recommendation with Association Rules

To verify the effectiveness of the suggested method, it is compared with a recommendation system using association rules. Association rules provide information on relationships among items. In this study, rules with a Support value of more than 50% and a Lift value of more than 1 were selected. Table 3 shows the selected association rules.

Table 3. Selected Association Rules

Rule No. / Support (%) / Confidence (%) / Lift / Rule and its content
1 / 19 / 62 / 2 / P479 → P477 (Samsung Laptop → LG-IMB Laptop)
2 / 16 / 63 / 2 / P479 → P439 (Sony Laptop → Dell Desktop)
3 / 11 / 57 / 2 / P479 → P446 (USB HDD → Samsung Laptop)
4 / 10 / 64 / 1.5 / P479 → P102 (MP3 Player → Digital Camera)
5 / 10 / 51 / 1.5 / P479 → P465 (Digital Camera → Cell Phone)
… / … / … / … / …
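For readers less familiar with the three measures in Table 3, the sketch below computes support, confidence, and lift for a single rule using their standard definitions. The sessions and the item names X, Y, and Z are hypothetical and are unrelated to the products of this study.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    count_a = sum(1 for t in transactions if antecedent in t)
    count_c = sum(1 for t in transactions if consequent in t)
    count_both = sum(1 for t in transactions if antecedent in t and consequent in t)
    if n == 0 or count_a == 0 or count_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = count_both / n             # share of sessions containing both items
    confidence = count_both / count_a    # share of antecedent sessions with the consequent
    lift = confidence / (count_c / n)    # confidence relative to the consequent's base rate
    return support, confidence, lift


# Hypothetical sessions, each given as the set of items viewed.
transactions = [
    {"X", "Y"},
    {"X", "Y", "Z"},
    {"X", "Z"},
    {"Y"},
    {"Z"},
]
support, confidence, lift = rule_metrics(transactions, "X", "Y")
print(f"support={support:.0%} confidence={confidence:.0%} lift={lift:.2f}")
```

A lift above 1 indicates that the consequent appears together with the antecedent more often than its overall frequency alone would suggest.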

4.2 Research Design

An empirical test is designed to compare a visualization technique based on association rules with one based on the method suggested in this study. Two different web sites are constructed for these two methods: one based on association rules and the other based on the adjacency and treemap theories. Fig. 8 shows a web page based on association rules.

Fig. 8. A Web Page based on Association Rules

The right side of Fig. 8 shows products related to the product on the left. Fig. 9 shows the recommended products in a treemap based on our method.

Fig. 9. A Web Page based on CFM

A total of 320 users participated in the tests, which compare the efficiency of the above two recommendation systems. To reduce bias, the users are divided into two groups: one for the association rule based system (A) and the other for the suggested system (B). The two groups evaluate their respective recommendation systems separately. The variables used in the evaluation and the results of the ANOVA tests are shown in Table 4.

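Table 4. Descriptive Statistics

Measure / Variable / Subject Group / No. of Sample / Mean / Std. Dev. / ANOVA (F-value) / Sig.
Loyalty / User Satisfaction / A / 160 / 3.21 / 0.56 / 24.362* / .000
Loyalty / User Satisfaction / B / 160 / 3.58 / 0.38
Loyalty / User Satisfaction / Total / 320 / 3.35
Loyalty / Intention to Re-visit / A / 160 / 3.40 / 0.75 / 12.822* / .000
Loyalty / Intention to Re-visit / B / 160 / 3.67 / 0.54
Loyalty / Intention to Re-visit / Total / 320 / 3.54
Loyalty / Intention to Purchase / A / 160 / 3.20 / 0.77 / 38.636* / .000
Loyalty / Intention to Purchase / B / 160 / 3.68 / 0.63
Loyalty / Intention to Purchase / Total / 320 / 3.44
Web Usability / System Quality / A / 160 / 3.66 / 0.58 / 18.033* / .000
Web Usability / System Quality / B / 160 / 3.87 / 0.64
Web Usability / System Quality / Total / 320 / 3.77
Web Usability / Information Quality / A / 160 / 3.27 / 0.63 / 22.874* / .000
Web Usability / Information Quality / B / 160 / 3.55 / 0.60
Web Usability / Information Quality / Total / 320 / 3.41
Web Usability / Service Quality / A / 160 / 3.52 / 0.62 / 11.276* / .004
Web Usability / Service Quality / B / 160 / 3.67 / 0.67
Web Usability / Service Quality / Total / 320 / 3.60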

* Significant at α = 0.05. All constructs are measured on five-point scales with the anchors 1 = Very Disagree, 3 = Neutral, 5 = Very Agree.

The ANOVA tests show that there are statistically significant differences between the two groups. Table 4 shows that the suggested system has higher means than the association rule based system in loyalty (user satisfaction, intention to re-visit, intention to purchase) and web usability (system quality, information quality, service quality).
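As a reminder of how a one-way ANOVA F-value is formed, the sketch below computes the ratio of the between-group to the within-group mean square. The scores are hypothetical five-point ratings, not the responses collected in this study, and the function does not compute the significance level.

```python
def one_way_anova_f(groups):
    """Return the F statistic: between-group mean square over within-group mean square."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]

    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, means))
    ss_within = sum((value - mean) ** 2
                    for group, mean in zip(groups, means)
                    for value in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical ratings from two small groups of respondents.
group_a = [3, 3, 4, 2, 3]
group_b = [4, 4, 3, 5, 4]
f_value = one_way_anova_f([group_a, group_b])
print(f"F = {f_value:.3f}")
```

A larger F-value means that the difference between the group means is large relative to the spread of scores within each group.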

5. Conclusions

This study suggests a visualized product recommendation system based on data adjacency and fisheye views. To test the effectiveness of the suggested system, it is compared with a recommendation system based on association rules. The test results indicate that the suggested system outperforms the association rule based system: it achieves significantly higher scores on both loyalty and web usability.

This study also has limitations. First, the number of products in the test is comparatively small. Second, our method should be compared with more diverse methods from the literature on product recommendation systems.

References

1. Allen, C., Kania, D., Beth, Y.: Internet World Guide to One-to-One Web Marketing. John Wiley & Sons Inc., New York (1998)

2. Ansari, A., Essegaier, S., Kohli, R.: Internet Recommendation Systems. Journal of Marketing Research, Vol. 37, No. 3. (2000) 363-375

3. Becker, R.A., Cleveland, W.S., Martin, R.D.: Trellis graphics displays: a multi-dimensional data visualization tool for data mining. Proceedings of 3rd Annual Conference on Knowledge Discovery in Databases, Newport Beach, CA. August (1997)

4. Błażewicz, J., Pesch, E., Sterna, M.: A novel representation of graph structures in web mining and data analysis. Omega, Vol. 33, No. 1. (2005) 65-71

5. Cleveland, W.S.: Visualizing Data. AT&T Bell Laboratories, Murray Hill, NY. (1993)

6. Cooley, R., Mobasher, B., Srivastava, J.: Web Mining: Information and Pattern Discovery on the World Wide Web. Proceedings of the 9th IEEE International Conference. (1997) 558-567

7. Eirinaki, M., Vazirgiannis, M.: Web Mining for Web Personalization. ACM Transactions on Internet Technology, Vol. 3, No. 1. (2003) 1-27

8. Furnas, G.W.: Generalized Fisheye Views. Human Factors in Computing Systems CHI '86 Conference Proceedings. (1986) 16-23

9. Jicheng, W., Yuan, H., Gangshan, W., Fuyan, Z.: Web Mining: Knowledge Discovery on the Web. Systems, Man, and Cybernetics, IEEE SMC '99 Conference Proceedings, Vol. 2. (1999) 137-141

10. Kim, Jongwoo, Bae, Sejin, Lee, Hongjoo: Sparsity Effect on Collaborative Filtering-based Personalized Recommendation. The Journal of MIS Research. The Korea Society of Management Information Systems, Vol. 14, No. 2. (2004) 131-149

11. Kosala, R., Blockeel, H.: Web Mining Research: A Survey. SIGKDD Explorations: Newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining, ACM Press, Vol. 2, No. 1. (2000) 1-15

12. Madria, S.K., Bhowmick, S.S., Ng, W.K., Lim, E.P.: Research Issues in Web Data Mining. Proceedings of the 1st International Conference on Data Warehousing and Knowledge Discovery (DAWAK 99). (1999)

13. Mobasher, B., Cooley, R., Srivastava, J.: Automatic Personalization based on Web usage mining. Communications of the ACM, Vol. 43, No. 8. (2000) 142-151

14. Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., Riedl, J.: GroupLens: An Open Architecture for Collaborative Filtering of Netnews. Proceedings of the ACM 1994 Conference on Computer Supported Cooperative Work, New York, ACM. (1994) 175-186