
LOOKING AT VISUALIZATION
by Ed Colet


Previous articles in the online pages of DS* have attested to the importance and value of data visualization. In general, visualization has been (and continues to be) useful in helping one see subtle trends and patterns in large amounts of data, patterns that otherwise might not have been apparent. As such, visualization has a close parallel with data mining. But like many data mining tools, visualization tools and software often require a high level of technical sophistication on the part of users in order to be effective. In this column, we'll take a look at some end-user issues that influence the effectiveness of visualizations.

A Human Factors and Usability perspective explicitly considers the cognitive factors involved in using graphics and visualizations. From this perspective, there are at least three important issues: (1) What mental and cognitive processes are required of the user? (2) What characteristics of the data make visualizations effective for end users? (3) What are the background and skills of the users?

Mental and cognitive processes in using visualizations:

Early research on visualization asked the obvious question: which is more effective, tables of data or graphs of the same data? Effectiveness has typically been measured in terms of the accuracy and/or speed of the user's response or judgment. The results have been mixed, making it difficult to come to a definitive conclusion. This is due in part to the fact that the early research question proved to be too broad; the answer depends on other factors. These studies can, however, be organized in terms of the mental demands placed on end users, and doing so leads to some general conclusions (for details, see Aaronson & Colet, "Computer-based visualization for communicating multivariate information: Cognitive-Factors considerations", 1998).

In general, for "low-level" tasks, presenting the raw data (perhaps in a table) seems to work better than presenting a graph, but for "high-level" tasks graphics and visualizations appear to provide an advantage. This of course assumes that the presentation, whichever method is used, is well designed and its information is well organized. Low-level tasks are those in which the information is directly observable from the presentation. For these tasks, cognitive processing is primarily perceptual: seeing and looking is all that is required of the user. High-level tasks, in contrast, are those in which information has to be mentally processed in some way, by remembering it, integrating it with other information, and/or making judgments. High-level processing typically requires comparisons, mental computations, or transformations of information held in human memory. If such processing is required of the user, then visualizations or some other form of graphical presentation appear to be better. High-level processing may require "deeper and more elaborate mental processing", and thus the information also tends to be remembered well. This suggests that the result of a simple data query to retrieve a fact may not be best presented via visualization. But if high-level thinking is required, as in a trend analysis, then visualization tools can support it effectively.
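As a rough illustration of the distinction, the sketch below contrasts a low-level task (reading one value) with a high-level one (summarizing a trend across all values). The sales figures and the function names `lookup` and `trend_slope` are invented for this example and do not come from the studies cited above.

```python
# Hypothetical monthly sales figures (illustrative values only).
sales = {"Jan": 120, "Feb": 135, "Mar": 128, "Apr": 150, "May": 162, "Jun": 171}


def lookup(month):
    """Low-level task: the answer is directly readable from one table row."""
    return sales[month]


def trend_slope(values):
    """High-level task: integrate every point into one summary number.

    Returns the least-squares slope of the values against their
    positions 0, 1, 2, ... (average change per step).
    """
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


print("April sales:", lookup("Apr"))
print("Average monthly change:", round(trend_slope(list(sales.values())), 2))
```

The lookup touches a single cell, which a table shows at a glance. The slope requires comparing and combining all six values, which is the kind of mental computation a line graph lets the viewer do by eye.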

Data characteristics for effective visualizations:

A second consideration pertains to the attributes of the data. In general, the larger the amount of data, the more effective visualization is at giving the user an overall feel for the data. No one would seriously expect to get a feel for a large data set by viewing tables. However, the benefit of visualization holds only if the data attributes are compatible with the particular visualization. For example, data that are proportions suit a pie-chart presentation, and data that represent a time series suit various types of line graphs. The association of a data attribute with its representation is not usually as clear-cut as in these examples. For instance, some research on clustering approaches suggests that if the data are not inherently spatial in nature (e.g., the data are about several product categories), then it can be difficult to present results effectively in a spatial manner. One can creatively attempt to "invent" a new graphical presentation for particular data attributes, but this would first require the user to learn how particular visualization features (e.g., color or shading) map onto the appropriate data attributes. If this mapping is not intuitive to the user, then the visualization may not be effective.
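The two clear-cut pairings mentioned above can be written down as a small rule of thumb. This is only a sketch: the function name `suggest_chart`, its `time_ordered` flag, and the test for "proportions" (non-negative values summing to 1) are assumptions made for illustration, and real data will often fall into the "no obvious default" case.

```python
import math


def suggest_chart(values, time_ordered=False):
    """Suggest a chart type from two simple data attributes (illustrative rule only)."""
    if time_ordered:
        # Time series: a line graph shows change over time.
        return "line graph"
    looks_like_proportions = (
        len(values) > 0
        and all(v >= 0 for v in values)
        and math.isclose(sum(values), 1.0, abs_tol=1e-9)
    )
    if looks_like_proportions:
        # Parts of a whole: a pie chart shows shares.
        return "pie chart"
    # Many data sets have no clear-cut representation.
    return "no obvious default"


print(suggest_chart([0.5, 0.3, 0.2]))
print(suggest_chart([120, 135, 128, 150], time_ordered=True))
print(suggest_chart([3, 7, 12]))
```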

End-user background:

A third consideration is the user's background and experience. Using graphics and visualization requires graphic-specific skills: reading axes, knowing to compare the slopes of lines, and interpolating or estimating values in high-dimensional spaces. These skills can for the most part be learned. Visualization also requires general skills in spatial visualization or spatial cognition. For the most part, these are influenced by experience. Evidence has shown that many childhood activities involving spatial play are important in developing this ability. An observed consequence of this appears to be some gender differences in spatial abilities, with males tending to do better at spatial tasks than females. An ongoing debate centers on the causal factors responsible for this difference, but that is beyond the scope of this column.

In addition to general influences and abilities, day-to-day experience can influence how effective a visualization is perceived to be. Studies have shown that business students were more adept and effective at working with tables of data than with graphs, and that the opposite pattern held for engineering students, who were better at working with graphs than with tables. This is attributed to the day-to-day experience they have grown accustomed to. Regardless of the cause, the point is that end users' skills and background are important considerations.

To conclude, visualization can be, and has been, important. Just as in data mining, much of it requires a high level of sophistication on the part of the end user. But with careful attention to the characteristics of the users, tools and technologies can be developed that are more useful to them, and that ultimately reveal hidden and subtle patterns of value.

---

For more information, see http://www.virtualgold.com.

