In preparation for a project, I've been reading papers about data sonification, which were suggested to me by a former colleague,
Jordan Wirfs-Brock, who knows a lot about data sonification. They were insightful. Here are the highlights for me.
The first paper was
Is Sonification Doomed to Fail? by John G. Neuhoff, which details problems that data sonification has yet to shake off.
In humans, there are approximately ten times as many cortical neurons devoted to vision as there are to hearing. It should come as no surprise then that in all but the perception of time, perceptual judgments made with the eyes are usually more precise than those made with the ears.
I've tried to convey magnitude with pitch in the past, and yes, it is really hard to pick out fine differences in pitch, especially when they need to be really fine in order to represent a really large range of values. It is, however, very interesting that hearing is better at picking up changes in time.
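To put a rough number on how fine those pitch differences get, here is a small sketch of my own (not from the paper). The function names, the 220 to 880 Hz range, and the 0 to 10,000 data range are all assumptions chosen for illustration. It maps values evenly across two octaves and measures the gap between neighboring values in cents, where 100 cents is one semitone.

```python
import math

def value_to_frequency(value, vmin, vmax, f_low=220.0, f_high=880.0):
    """Map a data value onto [f_low, f_high] so that equal data steps
    give equal pitch steps (linear in log-frequency).
    The default frequency range is an arbitrary two-octave choice."""
    fraction = (value - vmin) / (vmax - vmin)
    return f_low * (f_high / f_low) ** fraction

def cents_between(f1, f2):
    """Size of the interval from f1 to f2 in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f2 / f1)

# Hypothetical data range: 0 to 10,000, squeezed into two octaves.
step_cents = cents_between(
    value_to_frequency(5000, 0, 10000),
    value_to_frequency(5001, 0, 10000),
)
print(f"Neighboring values differ by {step_cents:.2f} cents")
```

Under these assumptions, two octaves (2400 cents) spread over 10,000 steps leaves about 0.24 cents between neighboring values, well under a hundredth of a semitone.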
Perhaps the most critical difference between musicians and non-musicians in extracting information from sonification lies in the ability to segregate auditory streams.
Musicians are used to separating a sound into components that come from different sources. Moreover, the paper points out all sorts of assumptions that data sonification designers make because they are musicians themselves, assumptions that don't hold across the general population.
Leveraging audition's temporal advantage would likely be a more fruitful approach than concentrating on other perceptual dimensions (e.g. pitch and loudness). Similarly, although the spatial resolution of the visual system is better than that of the auditory system, we can only see a limited field of vision while we can hear in 360 degrees.
In general, I've noticed organizations (and individuals) have difficulty exploiting the actual advantages of a medium to which they are unaccustomed, preferring instead to translate whatever works in their comfort medium to something else.
The second paper is
Consistency of Magnitude Estimations with Conceptual Data Dimensions Used for Sonification. It is a writeup of two experiments conducted to determine how people perceive changes in various aspects of sound in terms of magnitude. The results from both experiments were highly correlated.
Experiment 1 found that:
- Increasing urgency can be reliably represented by increasing pitch, tempo, and especially brightness. (19 of 19 participants thought that increasing brightness meant increasing urgency.)
- Increasing brightness also indicated increasing proximity for a majority of subjects (12 out of 19). Pitch and tempo did not work here.
In most cases, changes in brightness sent a clearer message than the other aspects. (In the intro, the paper mentions that sometimes people can't correctly identify the direction of a pitch change. Each experiment includes a test to see how subjects correlate changes in a sound element to itself, e.g. "The pitch went up; subject reports that the pitch went up." Sort of an A-A test.)
Experiment 2 had more "wins" for brightness. Significant majorities of subjects thought that an increase in brightness worked as an indicator for increases in size, temperature, pressure, and velocity.
The only other results that stand out to me are that 1) pitches getting higher reliably suggests increases in temperature and 2) pitches getting lower suggests greater attractiveness. (17 out of 20 subjects thought that decreases in pitch suggested increases in attractiveness.) I'm not actually sure that it is possible to explain a consistent definition of attractiveness to test subjects, though. As the author says:
However, it is possible that the term attractiveness is strongly associated with aesthetic judgements and rarely considered as a data dimension, so it may be that some listeners simply made subjective assessments of the sounds themselves, despite the instructions to consider the sounds in terms of the data values they would represent.
I think the biggest takeaways for me were:
1) I shouldn't count on pitch or tempo to represent most data dimensions. (However, I still believe that using tempo to represent time — basically, using time to represent time — is worth trying.)
2) Brightness is a reliable indicator of a lot of data dimensions; worth considering for usage.
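
As a way of thinking about what "using brightness" could look like in practice, here is a minimal sketch. It is my own assumption, not the method used in the paper: it treats brightness as how much energy sits in the upper harmonics of a tone, and uses the spectral centroid as a rough stand-in for how bright the result would sound. All function names and constants are hypothetical.

```python
def value_to_brightness(value, vmin, vmax):
    """Normalize a data value to a brightness setting in [0, 1]."""
    return (value - vmin) / (vmax - vmin)

def harmonic_amplitudes(brightness, n_harmonics=16):
    """Amplitudes for harmonics 1..n_harmonics of a single tone.
    brightness=0 gives a steep rolloff (dull sound); brightness=1 gives
    a shallow rolloff (bright sound). The exponent range is arbitrary."""
    if not 0.0 <= brightness <= 1.0:
        raise ValueError("brightness must be between 0 and 1")
    rolloff = 3.0 - 2.5 * brightness  # exponent runs from 3.0 down to 0.5
    return [1.0 / (k ** rolloff) for k in range(1, n_harmonics + 1)]

def spectral_centroid(amplitudes, fundamental=220.0):
    """Amplitude-weighted mean frequency of the harmonics, in Hz."""
    weighted = sum(a * fundamental * k
                   for k, a in enumerate(amplitudes, start=1))
    return weighted / sum(amplitudes)

# Hypothetical data values on a 0 to 100 scale. The fundamental (the pitch)
# stays at 220 Hz throughout; only the balance of harmonics changes.
centroids = [
    spectral_centroid(harmonic_amplitudes(value_to_brightness(v, 0, 100)))
    for v in (0, 25, 50, 75, 100)
]
print([round(c) for c in centroids])
```

The point of the design is that the data value never touches pitch or tempo, so those stay free for something else (such as using time to represent time).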