Scarlet & Grey
Ohio State University
School of Music


Consonance and Dissonance - Roughness Theory

Helmholtz (1877) proposed the notion that dissonance arises due to beating between adjacent harmonics of complex tones. In effect, dissonance arises due to rapid amplitude fluctuations. Helmholtz proposed that maximum dissonance would arise between two pure tones when the beat rate is roughly 35 cycles per second.
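
The beating described above follows from a trigonometric identity: the sum of two equal-amplitude sine tones equals a tone at the mean frequency whose amplitude is multiplied by a slow cosine envelope. The envelope's magnitude rises and falls at the difference frequency. The Python sketch below checks this numerically. The frequencies 500 Hz and 535 Hz are illustrative values chosen to give a 35 Hz beat rate, and the function names are hypothetical; none of them comes from Helmholtz.

```python
import math

def two_tone_sum(f1, f2, t):
    """Sum of two equal-amplitude sine tones at time t (in seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

def carrier_times_envelope(f1, f2, t):
    """The same signal written as a mean-frequency carrier times a slow envelope."""
    envelope = 2 * math.cos(math.pi * (f2 - f1) * t)
    carrier = math.sin(math.pi * (f1 + f2) * t)
    return envelope * carrier

def beat_rate(f1, f2):
    """Amplitude fluctuations per second, equal to the frequency difference."""
    return abs(f2 - f1)

# Illustrative pair of pure tones (assumed values, not from the source).
print(beat_rate(500.0, 535.0))
```

The two functions agree at every sample time, which is why a pair of nearby pure tones is heard as a single fluctuating tone and not as two steady ones.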

It is important to note that Helmholtz himself was aware that beats alone could not account for the phenomenon of dissonance. He noted, for example, that for a fixed interval size, beating is highly dependent on the register:

"Observation shews us, then, on the one hand, that equally large intervals by no means give equally distinct beats in all parts of the scale. The increasing number of beats in a second renders the beats in the upper part of the scale less distinct. The beats of a Semitone remain distinct to the upper limits of the four-times accented octave [say 4000 vib.], and this is also about the limit for musical tones fit for the combinations of harmony. The beats of a whole tone, which in deep positions are very distinct and powerful, are scarcely audible at the upper limit of the thrice-accented octave [say at 2000 vib.]. The major and minor Third, on the other hand, which in the middle of the scale [264 to 528 vib.] may be regarded as consonances, and when justly intoned scarcely shew any roughness, are decidedly rough in the lower octaves and produce distinct beats.
"On the other hand we have seen that distinctness of beating and the roughness of the combined sounds do not depend solely on the number of beats. For if we could disregard their magnitudes all the following intervals, which by calculation should have 33 beats, would be equally rough:
the Semitone      b' c''   [528-495=33]
the whole Tones   c' d'    [major, 297-264] and d' e' [minor, 330-297]
the minor Third   e g      [198-165]
the major Third   c e      [165-132]
the Fourth        G c      [132-99]
the Fifth         C G      [99-66]
and yet we find that these intervals are more and more free from roughness." [pp. 171-172]
Helmholtz himself advocated a sort of "duplex" theory in which dissonance is attributable to two factors: (1) the overall size of the interval, and (2) any beating.
"The roughness arising from sounding two tones together depends, then, in a compound manner on the magnitude of the interval and the number of beats produced in a second." [p. 172]
[quotations from Hermann Helmholtz, On the Sensations of Tone As a Physiological Basis for the Theory of Music. Translation by Alexander J. Ellis of the German Edition of 1877. Second English Edition.]
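
The arithmetic behind Helmholtz's table can be checked directly. The Python sketch below takes the frequency pairs exactly as printed in the quoted table and computes the beat rate (the frequency difference) and the frequency ratio for each interval. The variable and function names are hypothetical.

```python
# Frequencies in vibrations per second, as printed in Helmholtz's table.
INTERVALS = [
    ("Semitone b'-c''", 495, 528),
    ("Major whole tone c'-d'", 264, 297),
    ("Minor whole tone d'-e'", 297, 330),
    ("Minor third e-g", 165, 198),
    ("Major third c-e", 132, 165),
    ("Fourth G-c", 99, 132),
    ("Fifth C-G", 66, 99),
]

def beat_rates(intervals):
    """Return (name, beats per second, frequency ratio) for each interval."""
    return [(name, high - low, high / low) for name, low, high in intervals]

for name, beats, ratio in beat_rates(INTERVALS):
    print(f"{name:24s} {beats:3d} beats/s   ratio {ratio:.3f}")
```

Every row yields the same beat rate even though the interval sizes differ widely, which is the point of Helmholtz's argument: beat rate alone cannot determine roughness.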

Status of Theory

Several researchers have suggested that maximum dissonance arises for a fixed beat rate. Plomp and Levelt concurred with Helmholtz that maximum roughness occurs for a beat rate of approximately 35 cycles per second. However, Zwicker and Fastl (1990) collected data showing maximum roughness at a beat rate of 70 cycles per second.

In modern times, this theory has been broadened to refer to any temporal or time-related disturbance in the auditory system.

The following description comes from a message posted Aug. 1, 1999 on the Auditory List by Prof. Bill Hartmann at Michigan State University (hartmann@pa.msu.edu).

Comments and Conjectures on Roughness

The question on sensory dissonance from Leman and Sethares reminds me of the thesis work on roughness that Jian-Yu Lin did in my lab in 1995.

Jian-Yu studied roughness produced by both amplitude modulation of sine tones and beats of sine tones. He studied center frequencies from 70 Hz to 2000 Hz. He found, like everybody else, that as the center frequency increases, the modulation rate or beat rate that produces maximum roughness also increases. Beyond that, we ran into two kinds of troubles.

Theoretical trouble: Like others, we imagine that roughness is caused by fluctuations of excitation in the auditory system. This involves three components: (1) The spectral components must excite the same auditory filter. This is the critical-band connection with roughness noted by everybody. (2) There is a temporal modulation transfer function. The TMTF is a lowpass filter on fluctuation rates because the auditory system cannot follow very rapid fluctuations. (3) There is a "speeding factor" whereby increased fluctuation rate leads to increased sense of roughness. The need for the speeding factor is clear because the other two factors favor small separations between the spectral components. Without the speeding factor, one would expect maximum roughness at the lowest possible modulation frequencies or beat rates. Because of all these factors, we do not expect, a priori, any simple relationship between maximum roughness and critical band.

Experimental trouble: We found it difficult to get consistent judgements of maximum roughness over time. We tend to think that roughness may be multidimensional, perhaps related to the tradeoffs between the three factors above. In the end, we asked our listeners a different question. We asked them to find the highest modulation frequency or beat rate that produced a large sense of fluctuations. This question was based on our observation that as the modulation frequency increases, there comes a point where listeners experience a rather rapid falloff in perceived fluctuation strength. We got stable results with this question. We think that it is a better question than maximum roughness.

Asking listeners to find this maximum modulation rate or beat rate has the additional advantage of eliminating the speeding factor. Instead of asking listeners to maximize a perceptual quantity (roughness) this new question asks listeners to find the limiting characteristic. Therefore model calculations intended to explain the results do not need to include the speeding factor. It is enough to consider auditory filtering and TMTF.

Now to the question from Leman and Sethares: When Jian-Yu did the experiment in this way, he found the 70 Hz limit suggested by Zwicker and Fastl. As the center frequency of the components increases (frequency of the carrier for AM or mean frequency of a beating pair) the "highest-modulation-rate-before-a-dramatic-decrease-in-fluctuation" increases until about 1000 Hz, and there it tends to saturate at a value of about 70 Hz.

More trouble: We think that it is important to study both AM and beats. They sound rather similar, and any good model ought to be able to deal with them both. For the same fluctuation rate, the AM signal has a bandwidth that is twice as large as the beating sines, and that is where the model problems begin. In the end, we were not satisfied with our ability to find a model that would explain both. Therefore, this work appears only in an ASA abstract, JASA volume 97, page 3275.
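
[Editorial illustration, not part of the original message.] The factor-of-two bandwidth difference can be seen by listing the spectral components of each signal. The Python sketch below assumes sinusoidal amplitude modulation of a sine carrier, which produces the carrier plus one sideband on each side, and a beating pair placed symmetrically about the same center frequency. The 1000 Hz center and 70 Hz rate are illustrative values, and the function names are hypothetical.

```python
def am_components(center_hz, rate_hz):
    """Spectral lines of a sinusoidally amplitude-modulated sine tone
    (assumed form): the carrier plus a sideband on each side."""
    return [center_hz - rate_hz, center_hz, center_hz + rate_hz]

def beat_components(center_hz, rate_hz):
    """Two sine tones centered on center_hz whose difference is rate_hz."""
    return [center_hz - rate_hz / 2, center_hz + rate_hz / 2]

def bandwidth(components):
    """Spread between the lowest and highest spectral component."""
    return max(components) - min(components)

# Same fluctuation rate, different spectral extent.
print(bandwidth(am_components(1000.0, 70.0)))    # AM signal
print(bandwidth(beat_components(1000.0, 70.0)))  # beating pair
```

For a given fluctuation rate the AM signal spans twice the frequency range of the beating pair, so the two signals load the auditory filter differently even when they fluctuate at the same rate.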

Conjecture about the 70 Hz limit: This conjecture must be seen as heretical because it is outside the bounds of any modeling concept for roughness, including our own. There are also no data to support it. Nevertheless, here it is: As the modulation frequency or beat rate grows to become as large as 70 Hz, the listener starts to hear the fluctuation rate as a low pitch. Maybe it's like the missing fundamental effect or maybe it's a difference tone. Whichever it is, the presence of this low pitch tends to homogenize the auditory sensation and lead to a dramatic decrease in perceived roughness. That is why one does not see maximum roughness fluctuation rates greater than 70 Hz.

Bottom line: If this conjecture is correct, then it means that there is a 70-Hz boundary. Studies of critical bands and roughness really ought not to go there because the rules suddenly change at fluctuation rates that high. That, in turn, would limit roughness studies to center frequencies less than 1000 Hz.

Bill Hartmann