However, I Have the Privilege Should I Introduce It Now?

> Lev Nachmansen: Everybody is familiar with his seminal work on treemaps and many other works that have to do with information visualization and human-computer interaction. And of course he has a long list of awards that I cannot talk about, because if I say all of this then we wouldn't have time to hear his speech.

However, I have the privilege... Should I introduce it now?

> Ben Shneiderman: Please.

> Lev Nachmansen: I have the privilege to announce that he received the IEEE Career Award on visualization, which will be awarded at InfoVis next week.

> Ben Shneiderman: That's right.

[applause]

> Ben Shneiderman: All right. Thank you for that kind introduction. I'm very, very pleased to be here. I've for years followed the Graph Drawing Conference, but I don't think I've ever been. I was invited to be on the program committee. So I'm very pleased that you're seeking the interactive element as part of this community.

My background is very much in this framework of algorithm design for database and file strategies and indexing and search, and those kinds of algorithms still to me are the heart of computer science, but I've become what I say is 20 percent of an experimental psychologist in trying to study the way people use technology.

So I was pleased that at least one of the talks had some empirical results about people, not just empirical results with data.

And so I'm here today in appreciation and recognition of your work, but also I hope to change your position a little, your attitude, and maybe shift your attention towards the real opportunities. Because 20 years ago when graph drawing started, it was a very different world. Networks were a rare beast. It was hard to get the data. There weren't that many people who were interested. Now suddenly we're surrounded by social media, and the opportunities and the demand and the pressure and the interest in visualization overall, the number of blogs and the cultural phenomenon that visualization has become, is startling.

So I'm very pleased with that. So maybe I'll just plant one idea in your mind; that maybe graph drawing should rename, still keep a GD, but call it graph discovery. Because the idea is discovering and making insights. The purpose of drawing graphs is not pictures, it's insight. And that's what I hope to show and promote.

But first I have great appreciation for the organizers, Maurizio and Walter, and a copy of the book, which I've signed for them, so...

>: Thank you very much.

> Ben Shneiderman: Very much appreciate the opportunity, and thank you very much.

So today is meant as kind of a review, and you can look in the paper for a more detailed analysis, and I'm pleased that Cody Dunne, a Ph.D. student working on these problems, will present the last part of the talk about his work.

So also I'm proud to represent the Human-Computer Interaction Lab, which this year celebrates its 30th anniversary. And I gave up being director to Ben Bederson and then Allison Druin, and Jen Golbeck's now the director. We are supported and administered by both computer science and the College of Information Studies, and we have many relations around campus with different departments, including the wonderfully titled Maryland Institute for Technology and the Humanities.

So these humanities applications are increasingly interesting ones. In fact, I'm working with a classics professor who has the social network of Alexander the Great, over 32 years of his life as he traveled around and 650 connections, and it's a fascinating story, and how do we make a visual representation of that social network is the kind of challenge that she's asking.

So MITH is the Maryland Institute for Technology and Humanities. If you visit our lab's Web site, you'll find 650 technical reports, 200 videos, 40 pieces of software, and lots more about our projects.

I hope you know me from the book Designing the User Interface, which is now in its 5th edition. My coauthor for the fifth edition is Catherine Plaisant, who's been my collaborator for 25 years, and the work you'll hear about has partially come from her, but also from the wonderful graduate students that I've had the pleasure to work with over the years.

The story for you here is to recognize that when the 5th edition came out it had a whole new section on social media. In 2004, when we did the 4th edition, there was no Twitter, there was no Facebook, YouTube was small, Wikipedia was just starting. And now all of a sudden we're surrounded by social networks. If you haven't heard, that's the hot story around.

And so visualization has also gained a separate chapter, and so those are the important issues.

With Stu Card and Jock Mackinlay we tried to lay out the basis for this new field. This is a book that goes back a few years now. And Stu Card gave the title Using Vision to Think; that visual representations are not just a representation but a way of solving problems. And that was really the significant point; that within 400 milliseconds, if the interface is correctly designed with color, shape, size, and proximity, then you will be able to spot clusters, gaps, outliers, and trends in that short amount of time. So there's many implications, and we've explored that. This book collects 47 papers from different sources and 60,000 words of our own work.

I think I just can't resist telling you about Spotfire, our early work. The paper was in 1994 at the CHI conference, remains one of the most cited papers, and led to the company formed by Chris Ahlberg in '97, which grew to 200 people by 2007 and was purchased by TIBCO. So it was a great success story.

Here we're looking at 15,000 births in Washington, D.C. The red dots are girls, the blue dots are boys. The age of the mother is over here. You can see they go from about 12 to 50. The age of the father from about 13 to 65. And you get to see many, many patterns in these multiple coordinated windows. And the dynamic query sliders were the key features of that innovation.

Spotfire has grown to be a tool for analyzing large complex datasets, and one of the lessons we've learned that's appropriate for today's talk is that one single visualization is not the way to show a complex amount of information, but here are 27 windows that are coordinated so that if you filter, it filters everywhere. If you select in one window, it highlights in the other windows. That's the way to deal with complexity in data, not by trying to pack everything into one screen.

Okay. So the visual world is getting richer and richer, and here you see examples of the kind of environments people are working in to increase productivity, make better decisions and understand the world around us.

And, as I said, there's just a rich cultural phenomenon around it. Just last night I saw there's two new blogs and a new conference. The New York Times will have a conference November 8th and 9th in New York called Visualized, which brings together 25 designers who are looking at or making creative visualizations.

Control rooms with lots of visual information and collaborative environments are becoming more and more the way realtime decisions are made. People, this is the counterterrorism center.

Also on small devices we see increasing use of visualization, and that's become another popular phenomenon. You can even see some treemaps over here to get an idea of what's going on.

So we learn from that. I wrote it down one day in a very playful way and called it the information seeking mantra, and I wrote it in this paper, 12 lines, each one representing one project where we struggled for weeks or months to find the right design, and it turned out to be: show the overview first, even if it's a million or a billion items, so the users can get an understanding of the range of data, the clusters, the size of the clusters, the gaps, the outliers, and so on, and then allow the user to zoom in on what they want, filter out what they don't want, and click for details on demand.
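The four steps of the mantra can be read as four operations over a dataset. The following is only a minimal sketch on invented toy records; the function names (`overview`, `zoom`, `filter_out`, `details`) and the record fields are assumptions made for illustration and do not come from the talk or from any tool mentioned in it.

```python
from collections import Counter

# Toy records; the fields and values are invented for illustration.
items = [
    {"id": 1, "group": "a", "value": 3},
    {"id": 2, "group": "a", "value": 7},
    {"id": 3, "group": "b", "value": 42},
    {"id": 4, "group": "b", "value": 5},
    {"id": 5, "group": "c", "value": 6},
]


def overview(records):
    """Overview first: summarize range and group sizes without listing items."""
    values = [r["value"] for r in records]
    return {
        "count": len(records),
        "min": min(values),
        "max": max(values),
        "group_sizes": dict(Counter(r["group"] for r in records)),
    }


def zoom(records, lo, hi):
    """Zoom: keep only records whose value lies in the range [lo, hi]."""
    return [r for r in records if lo <= r["value"] <= hi]


def filter_out(records, unwanted_groups):
    """Filter: drop records that belong to groups the user does not want."""
    return [r for r in records if r["group"] not in unwanted_groups]


def details(records, item_id):
    """Details on demand: return the full record for one item, or None."""
    for r in records:
        if r["id"] == item_id:
            return r
    return None


summary = overview(items)            # range, count, and group sizes
zoomed = zoom(items, 3, 7)           # drops the outlier with value 42
kept = filter_out(zoomed, {"c"})     # removes group "c"
one = details(kept, 4)               # full record for item 4
```

Note that the overview already exposes the outlier (the maximum of 42 against a cluster of small values) before any single item is inspected, which is the point of putting it first.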

And this has collected almost 2,000 citations, which is kind of amazing, and there are people who use it, people who contradict it, people who extend it, people who make jokes about it, so it's become its own kind of little phenomenon.

And I think what people like about it is that it asserts this centrality of human decision making [phone ringing] (that's embarrassing), the centrality of human decision making, where the user gets the overview, the user zooms in on what they want and then filters out what they don't want.

So we're not talking about algorithms or data mining. We're talking about a process by which users make decisions, make discoveries and make insights.

I was very pleased, for example, in March the White House issued its statement about big data and its expenditure, about $220 million in this country from seven different research agencies, and I had some influence, I'm pleased to say, but in that three-page press release the word "visualization" appeared five times. The words "data mining" did not appear at all.

So we're seeing this sort of shift in understanding that visual analytics and visual approaches are the way people make discoveries and that we support discovery by people aided by rich and powerful statistical methods; that integration of statistics and visualization is what I really want to stress with you.

So a little bit of my way of seeing the field. We have the traditional field of scientific visualization, which has a 50-year history, including geographic information systems and medical and architecture and so on. These are great success stories, especially if you go to Hollywood movies or play video games.

But the story that I'm talking about is here. Multivariate data, where Spotfire has been joined by very effective competitors like Tableau and many other tools, temporal data series, tree structures, and it seems many people know about treemaps, so I'm pleased about that.

And then I save networks for last in my work, because they seem to me the most difficult aspect of the work; that is, when I think of networks, I think of nodes and edges, but the nodes may have many attributes and the edges may have many attributes, and the questions we have to ask against those networks are very complex. And so that I felt was a substantive challenge.

And so I've become more and more devoted to this issue, especially because the social media have produced such huge resources and such important questions that we need to understand, not just for entertainment or e-commerce, but also for important national priorities, such as disaster response, health care, community safety, just so many ways that the benefits... I noticed outside I think there's a sign left from yesterday where Christakis was speaking here. Maybe someone attended his talk. But he's the key figure at Harvard Medical School who has promoted the notion that if you study the networks of patients, you will find out that patients become obese if their friends become obese, they lose weight if their friends lose weight, they stop smoking if their friends stop. And the social networks determine these medical outcomes in a way that's remarkably powerful. In fact, so powerful that there are many sceptics of Christakis' work.

We've run this Summer Social Webshop with 50 doctoral students from around the country twice now, and it's been a great success story, and we're just happy to continue that.

And I just want to end the introduction by saying I hope you will think every time you go do your work that in some way you're contributing to these important priorities, not only national but international, and I like to use this as an illustration: the goals set out by the United Nations in the year 2000 of ending poverty and hunger, universal education, gender equality, child health, maternal health, combating HIV/AIDS, environmental sustainability, and global partnership.

In some way I want my discipline and the work I do, and I hope you devote yourself also, to working in ways that your work gets applied to making the world a better place.

Okay. So we turn now narrowly to focus on networks, and I hope some of you know this wonderful Web site by Manuel Lima called Visual Complexity, ironically called Visual Complexity. He has a new book out called Visual Complexity that I think you might want to take a look at, a beautifully produced book that shows many network drawings. And he has 772 examples of network systems and pointers to those working tools. And as you can see many of them are very colorful and very beautiful, but many of them are also a mess, and the usual talk of hairball or bird's nest or spaghetti is what we see.

So some of them are beautiful and we might admire them, like Hubble Telescope photographs, and we can say something about the clusters and the size of groups here, but it's pretty hard to make sense of it. Some of you might want to frame it and put it on the wall, but I'm not sure if you can make any insights or discoveries in which you would make a decision to change things.

And some more examples, these tangled messes where you cannot see what's going on, there are some labels, but you don't even know what the labels are connected to, et cetera.

Okay. So one time, to continue the mantra idea, I made this little phrase of NetViz Nirvana. Our goal I would say for network visualization is that every node should be visible. I think you all agree with that, and the metrics developed... Peter. I should say it's great to be in the room with heroes of mine like Peter Eades and Ulrik Brandes, other leaders, and Roberto Tamassia and others. And actually all four authors of the great graph drawing book are in the room together, which is quite a wonderful thing. And also new younger stars, people who are working and doing great work in this area.

So, I mean, the idea that every node be visible is pretty common in this community, and there are metrics for visibility, et cetera, but also for every node you can count its degree, for every link you can follow it from source to destination, and for the clusters you can even see them all and maybe see their sizes and also spot the outliers.

So I wrote this down in a rather playful way, but it's become a pretty important thing. And like nirvana, it's never really attainable. It's not always attainable. But it's something we should strive for in order to make graphs visible and comprehensible in a way that people can make insights that they can depend on, that they can make a decision on, that they can commit action to.

So here's the outline for the talk. There are four methods I want to talk about. These are all interactive and dynamic approaches that we have been developing and refining inside the tool NodeXL. That's the book that I handed out, and you'll see more of that.

And there are the basic ideas of filtering. The dynamic filtering queries are alive and well in NodeXL: double-box sliders by which you can filter out the low edge density or the high edge density or both, or you can look for the high eigenvector centrality or low eigenvector centrality. All these different metrics are built in. And then we'll look at clustering, grouping, and motif simplification. So those are the goals here.
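A double-box slider amounts to a range query: keep whatever has a metric between a low and a high bound. The sketch below illustrates that idea on a toy weighted edge list. It is not NodeXL's implementation or API; the function names are hypothetical, and node degree is used as a simple stand-in for richer metrics such as eigenvector centrality.

```python
from collections import defaultdict

# Toy undirected graph as (source, target, weight) triples; data is invented.
edges = [
    ("a", "b", 5),
    ("a", "c", 1),
    ("b", "c", 3),
    ("c", "d", 9),
    ("d", "e", 2),
]


def filter_edges(edge_list, lo, hi):
    """Keep edges whose weight lies in [lo, hi], like a two-handled slider."""
    return [(u, v, w) for u, v, w in edge_list if lo <= w <= hi]


def degrees(edge_list):
    """Compute node degree, used here as a stand-in for any node metric."""
    deg = defaultdict(int)
    for u, v, _ in edge_list:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def filter_nodes_by_metric(edge_list, metric, lo, hi):
    """Keep edges whose endpoints both have a metric value in [lo, hi]."""
    keep = {node for node, value in metric.items() if lo <= value <= hi}
    return [(u, v, w) for u, v, w in edge_list if u in keep and v in keep]


mid_weight = filter_edges(edges, 2, 5)               # drops weights 1 and 9
deg = degrees(edges)                                 # degree of every node
core = filter_nodes_by_metric(edges, deg, 2, 3)      # drops degree-1 node "e"
```

Moving either bound and recomputing the filtered list on each change is what makes the query feel dynamic; the metric itself can be swapped without touching the filter.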

And in a way I see this as the beginnings, the beginnings of a process model. What do I do first? Well, first I want to filter to look at a simplified graph. Let me try that, see what I can learn from filtering, then let me try clustering, see what I get from that, grouping, maybe grouping first or clustering first, and then we'll see about motif simplification.

Okay. So we just start, and we'll just take quick examples of these. There's more examples in the paper.

So here is a great story that came to us from a practical problem. A journalist named Chris Wilson working in Washington, D.C., for Slate Magazine wanted to analyze the senate voting pattern. So there are in the U.S. 100 senators, and he had the data for the year 2007. And what he was trying to look at is the similarity in voting patterns. Okay. So the strength of each edge is an indication of how many times they voted the same way on a bill. Okay. So if there are a hundred senators, how many edges are there?

>: [inaudible]

> Ben Shneiderman: No, no. Not N squared. 100 choose two, which is?

[laughter]

> Ben Shneiderman: I'll wait.

>: [inaudible]
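For reference, the number of unordered pairs among 100 senators is 100 choose 2 = (100 × 99) / 2 = 4,950, compared with 10,000 for N squared. The sketch below counts the pairs and computes the edge strength described above (how many times two senators voted the same way) on a tiny invented voting record; the names and votes are made up for illustration and are not the 2007 data.

```python
from itertools import combinations
from math import comb


def agreement_edges(votes):
    """Build one weighted edge per unordered pair of senators.

    votes maps each senator to a list of votes, one per bill, in the same
    bill order for everyone. The edge weight is the number of bills on
    which the two senators cast the same vote.
    """
    weights = {}
    for a, b in combinations(sorted(votes), 2):
        weights[(a, b)] = sum(x == y for x, y in zip(votes[a], votes[b]))
    return weights


# Invented toy record: three senators, three bills.
votes = {
    "S1": ["yea", "nay", "yea"],
    "S2": ["yea", "yea", "yea"],
    "S3": ["nay", "nay", "yea"],
}

weights = agreement_edges(votes)
pairs_for_100 = comb(100, 2)  # 4950 possible edges for 100 senators
```

Because every pair of senators gets an edge, the raw graph is complete, which is why filtering on edge strength is needed before any structure becomes visible.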

> Ben Shneiderman: Let's see. This is who is this story? This is