Music coloring project

Multivariate Modeling Sound and

Interactive Multi-Media Workshop

School of Computer Science, Tel-Aviv University

Student: Mor Dunsky

ID: 039713870

Course Number: 0368-3500-07 (Autumn 2004-5)

Instructor: Prof. Nathan Intrator
Table of Contents

Introduction

Theoretical Background

The frequency spectrum

Reverberation

modify~

Input

Output

Example

Implementation

balance~

Input

Output

Example

Implementation

fixvolume~

Input

Output

Example

Implementation

References

Introduction

This project was developed as part of the workshop in Multivariate Modeling Sound and Interactive Multi-Media (course number 0368-3500-07, autumn 2004-5) under the supervision of Prof. Nathan Intrator. It consists of several Max/MSP patches (a patch is a Max/MSP document). The top-level patch (modify~) manipulates an input signal (typically a music WAV file) in the following aspects:

  • reverb time – influences the amount of reverberation added to the signal.
  • high frequency roll-off – influences the nature of the reverberation added.
  • pitch shift – shifts the signal up or down by up to one octave.
  • left-right balance
  • volume

The patches were developed in the Max/MSP environment using its powerful built-in objects and signal processing capabilities. This project is the final component of the whole workshop. The other components generate the parameters that control the five characteristics mentioned above.

The following sections provide some theoretical background and then describe the project structure in detail. Each module is described in general, its input/output parameters are documented, an example is provided and finally a comprehensive explanation of its implementation is presented.

Theoretical Background

The frequency spectrum

The frequency spectrum shows the decomposition of a signal into its frequency components (the sinusoidal basis functions of the Fourier series). It gives information about the amplitude and the phase of each frequency component. For audio signals, amplitude corresponds to the air pressure, or to the movements of the diaphragm of a speaker. Amplitude is usually expressed on a logarithmic scale in dB, so a null amplitude corresponds to −∞ dB.
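The relation between a linear amplitude and its level in dB can be illustrated with a short Python sketch. The function name amplitude_to_db and the reference amplitude of 1.0 are illustrative assumptions, not part of the project:

```python
import math


def amplitude_to_db(amplitude, reference=1.0):
    """Return the level of `amplitude` relative to `reference`, in decibels."""
    ratio = abs(amplitude) / reference
    if ratio == 0.0:
        # A null amplitude has no finite level on a logarithmic scale.
        return -math.inf
    return 20.0 * math.log10(ratio)


print(amplitude_to_db(1.0))    # level of the reference amplitude itself
print(amplitude_to_db(0.001))  # one thousandth of the reference amplitude
print(amplitude_to_db(0.0))    # null amplitude
```

An amplitude of 1/1000 of the reference corresponds to a level of about -60 dB.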

Example:

The frequency spectrum can be found from the result of a Fourier-related transform. Ideally, one would use the full Fourier transform, but often only a finite set of discrete samples is available. In this case the discrete Fourier transform can be applied.
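As an illustration only, the following Python sketch computes a naive discrete Fourier transform of a short sinusoid and reads the amplitude and phase of each component. It uses only the standard library; the signal length, the number of cycles and all names are assumptions chosen for the example, and the project itself relies on the FFT performed inside Max/MSP.

```python
import cmath
import math


def dft(samples):
    """Naive O(N^2) discrete Fourier transform of a list of samples."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Test signal: a sinusoid completing 3 full cycles over 16 samples.
N, CYCLES = 16, 3
signal = [math.sin(2 * math.pi * CYCLES * t / N) for t in range(N)]
spectrum = dft(signal)

# Amplitude and phase of each component in the lower half of the spectrum.
magnitudes = [abs(c) for c in spectrum[: N // 2]]
phases = [cmath.phase(c) for c in spectrum[: N // 2]]

# The strongest component sits at the bin matching the number of cycles.
peak_bin = max(range(N // 2), key=lambda k: magnitudes[k])
print(peak_bin, magnitudes[peak_bin], phases[peak_bin])
```

For a pure sinusoid whose period divides the window length, all the energy falls into a single bin. Real music signals spread their energy over many bins.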

Note:

In this project the MSP object pfft~ is used. This object enables spectral audio processing by performing the Fast Fourier Transform and the Inverse Fast Fourier Transform. Inside this object, the gizmo~ object performs a pitch shift by finding the peaks in the frequency spectrum and shifting them along the frequency axis to transpose the sound.

Reverberation

Excerpts from “Electronic Methods for Simulating Acoustic Reverberation”

by Mike Wozniewski

Reverberation can be thought of as the combined effect of all sound reflections that a sound produces in a room. Take the simple example of a rectangular room with nothing inside except a theoretical source of sound, and a theoretical listener:

Fig. 1

The direct signal will arrive at the listener first, since the reflections have a greater distance to travel. Furthermore, the amplitude of a sound becomes inversely proportional to the distance that it travels through its medium.

There are several properties that characterize a particular room's reverberation. One is the reverberation time, which is the amount of time required for a sound to decay away to 1/1000 (-60 dB) of its amplitude. Another is the frequency dependence of this decay. Low frequencies reflect better and therefore usually decay more slowly. A third property is the time delay until the first reflection is heard. This gives the listener some indication of the general size of the room. The final important characteristic is the rate of buildup in echo density. In a smaller room, echoes from reflections will build up in density much faster than in bigger rooms.

Simulating reverberation
Many digital modules that simulate reverberation have been designed. The comb filter is a basic component in such modules. It is used to simulate delayed reflections of sound.

Fig. 2

When a signal passes through such a filter, one copy of the signal will pass through to the output and another copy will be fed back into the system. The fed back copy will be delayed by time d and multiplied by a user-controlled gain value g. This new signal is then processed again. If we pass a sound pulse through this filter, we will get an impulse response similar to the one illustrated in figure 3a:

Fig. 3a , Fig. 3b

Furthermore, since the response is a train of equally spaced pulses, it will output the pulses at a frequency of 1/d, which is termed the natural frequency of the filter. This can be seen in the spectral representation of the response, where the peaks are on multiples of 1/d (figure 3b). The response also decays exponentially, just as natural sound does within a room.
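A minimal Python sketch of such a filter is given here, assuming the common feedback form y[n] = x[n] + g·y[n−d] with the delay d expressed in samples. Fig. 2 is not reproduced in this text, so this form, the names comb_filter and reverb_time, and the parameter values are illustrative assumptions. The second function estimates the time the pulse train needs to fall to 1/1000 of its initial amplitude, given that each pass through the loop multiplies the amplitude by g.

```python
import math


def comb_filter(signal, delay, gain):
    """Feedback comb filter: y[n] = x[n] + gain * y[n - delay]."""
    output = []
    for n, x in enumerate(signal):
        fed_back = gain * output[n - delay] if n >= delay else 0.0
        output.append(x + fed_back)
    return output


def reverb_time(delay_seconds, gain):
    """Time for the pulse train to decay by 60 dB, for 0 < gain < 1.

    After k passes the amplitude is gain**k, so gain**k = 10**-3
    gives k = -3 / log10(gain) passes of delay_seconds each.
    """
    return -3.0 * delay_seconds / math.log10(gain)


# Impulse response for a delay of 4 samples and a feedback gain of 0.5.
impulse = [1.0] + [0.0] * 15
response = comb_filter(impulse, delay=4, gain=0.5)
print(response)
print(reverb_time(0.03, 0.5))
```

The response contains one pulse every 4 samples, each half the amplitude of the previous one, which is the exponentially decaying pulse train described above.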

In order to simulate a complicated hall using comb filters alone, one would have to combine many units in parallel. Ideally, for every path that sound can take from source to listener, there should be an equivalent filter. Such a representation is impractical, requiring an arbitrarily large number of these units.

One method for enriching the reverberation without having to include a large number of extra comb filters is to apply some all-pass filters to the signal in series after the comb filter network.

Fig. 4

An all-pass filter (in figure 4) equally passes all frequency components of a signal. Thus the spectral components of the signal are not altered, yet the phases of the components do change. This means a steady state sound will be perceptually unchanged by the filter, yet sharp changes will ring slightly at the filter's natural frequency. For example, if a tone was played then abruptly stopped like the one depicted in the following diagram, the sudden change would result in ringing at the filter's natural frequency.

Fig. 5

Since this natural frequency represents the travel time of sound reflections, this ringing will sound like reverberation within the room and will add to the density of echoes.
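The all-pass behaviour can be checked numerically. The sketch below assumes the Schroeder form y[n] = −g·x[n] + x[n−d] + g·y[n−d], since Fig. 4 is not reproduced in this text. The function names and the parameter values are illustrative.

```python
import cmath


def allpass_filter(signal, delay, gain):
    """Schroeder all-pass: y[n] = -gain*x[n] + x[n-delay] + gain*y[n-delay]."""
    output = []
    for n, x in enumerate(signal):
        delayed_in = signal[n - delay] if n >= delay else 0.0
        delayed_out = output[n - delay] if n >= delay else 0.0
        output.append(-gain * x + delayed_in + gain * delayed_out)
    return output


def magnitude_at(response, omega):
    """Magnitude of the frequency response at omega radians per sample."""
    return abs(sum(h * cmath.exp(-1j * omega * n) for n, h in enumerate(response)))


# A long impulse response, so that the truncated tail is negligible.
impulse = [1.0] + [0.0] * 399
response = allpass_filter(impulse, delay=4, gain=0.5)

# The gain stays at 1 across frequencies although the response rings in time.
for omega in (0.3, 1.0, 2.5):
    print(omega, magnitude_at(response, omega))
```

The impulse response is a decaying train of pulses spaced by the delay, so transients are smeared in time, while the magnitude of every frequency component is left unchanged.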

The following is an example of a system suggested by M.R. Schroeder:

Fig. 6

The network of comb filters initiates feedback similar to acoustic feedback in a room, and the all-pass filters help to increase the echo density of the reverberation.

Note:

In this project reverberation was added to the signal using the patch yafr (yet another free reverb) by Randy Jones of 2up technologies. This object enables modification of the reverb time, the reverb/early reflection balance and the high frequency roll-off. Its implementation includes the components discussed above.

modify~

A patch which receives a signal, a mode flag and 5 additional parameters. Each of the 5 parameters modifies the signal in a distinct manner. The patch has two modes, normal and sensitive. The parameters affect the reverb time, high frequency roll-off, pitch shift, left-right balance and volume.

Input:

signal    In 1st inlet: The signal to be modified according to the parameters received in the other inlets.

0 or 1    In 2nd inlet: 1 sets the patcher mode to sensitive, 0 sets it to normal. Parameters received in the 3rd–7th inlets are interpreted differently in the different modes. The default mode is normal.

float     In 3rd inlet: The reverb time sampled from a Poisson Distribution. Negative values are clipped to 0. Larger values signify longer reverb time. When in sensitive mode, modify~ is most sensitive to changes in [0, 0.75]. The default value is 0.15.

In 4th inlet: The amount of high frequency roll-off sampled from a Poisson Distribution. Negative values are clipped to 0. Larger values signify less absorption of high frequencies. When in sensitive mode, modify~ is most sensitive to changes in [0, 0.75]. The default value is 0.15.

In 5th inlet: The pitch shift sampled from a Gaussian Distribution with mean 0 and variance 1. Values outside the range [-2, 2] are clipped. -2 means one octave down, 0 means no change in pitch and 2 means one octave up. The default value is 0.

In 6th inlet: The left-right balance sampled from a Gaussian Distribution with mean 0 and variance 1. Negative values create a shift left effect, positive values create a shift right effect. The absolute value determines the amount of shift. When in sensitive mode, modify~ is most sensitive to changes in [-0.5, 0.5]. The default value is 0.15.

In 7th inlet: The volume sampled from a Poisson Distribution. Negative values are clipped to 0. 0 means half of the maximum volume, higher values raise this percentage. When in sensitive mode, modify~ is most sensitive to changes in [0,0.75]. The default value is 2.

Output:

signal    In left outlet: The left channel of the modified signal.

In right outlet: The right channel of the modified signal.

Example:

see modify~.help

Implementation

All input is passed through the “validate_input_type” sub patch, which prints error messages to the Max window in case of a type mismatch. (The route object sends input of the correct type to the outlet, and triggers an error message if illegal input is received.)


Input from the 3rd – 7th inlets is then transposed in two ways. For normal mode, the “transpose_normal” sub patch linearly transposes the input to the desired intervals, aside from the pitch shift which is transposed using the function .

For sensitive mode, the “transpose_sensitive” sub patch transposes the input as follows:


The reverb time, high frequency roll-off and volume are first mapped to [0.5, 1] using the function:


This magnifies changes in the interval [0, 0.75]. The result is then linearly mapped to the appropriate range if needed.


The left-right balance is mapped to [-1, 1] using the function:

This magnifies changes to the input value in the interval [-0.5,0.5].


The pitch shift input is mapped to [0.5, 2] (0.5 indicating one octave down, 2 indicating one octave up, 1 indicating no shift) using the function:


In the sub patch “transpose_input” the router object routes either the output of “transpose_normal” or the output of “transpose_sensitive” to the outlets, according to the current mode (normal/sensitive). The patch also displays the current parameter settings in the sliders at the bottom. The matrix (object matrixctrl) displays the connections in the router object reflecting the current mode.

The yafr object (yet another free reverb) is used to add reverberations according to the 3rd and 4th parameters (reverb time and high freq. roll-off). The “reverb/early reflections balance” value required by this object is automatically set to 30 on load.
The resulting signal is then mixed with the original signal in the mix sub patch.

pfft~ is then used to control the pitch shift of the signal according to the 5th parameter. An additional patch (named “pfft_gadget” in this case) is needed for this object to work.
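The pitch parameter arrives in the range [-2, 2] and has to become a frequency ratio in [0.5, 2], with 0 mapped to 1 (no shift). The mapping formula used by the patch is not reproduced in this text. The Python sketch below uses the exponential mapping ratio = 2^(p/2), which is only an assumption that matches the three stated points (-2 → 0.5, 0 → 1, 2 → 2). The function names are hypothetical.

```python
def clip(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def pitch_ratio(pitch_parameter):
    """Map a pitch parameter in [-2, 2] to a frequency ratio in [0.5, 2].

    Assumed mapping, not taken from the patch: ratio = 2 ** (p / 2).
    Values outside [-2, 2] are clipped first, as documented for the 5th inlet.
    """
    clipped = clip(pitch_parameter, -2.0, 2.0)
    return 2.0 ** (clipped / 2.0)


for p in (-3.0, -2.0, 0.0, 1.0, 2.0):
    print(p, pitch_ratio(p))
```

With this choice, equal steps of the parameter correspond to equal musical intervals, so a parameter of 1 gives half an octave up.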

The signal is then passed through the fixvolume~ patch (described in a different section) in order to keep the volume constant, and multiplied by the volume parameter (received in the 7th inlet).

Finally the balance~ patch (described in a different section) is used to control the left-right balance according to the balance parameter (received in the 6th inlet). balance~ outputs two balanced channels which are then sent to the two outlets.

The presets object and number boxes are used to store the default values and to load them when the patcher is loaded. These are the default values for the five parameters, the “reverb/early reflections balance” value for yafr and the patcher mode (normal mode).

balance~

An object which receives a signal and a float between –1 and 1. The object creates an effect of left – right balance without modifying the volume of the signal. The float indicates the level of balance.

Input:

signal    In left inlet: The signal to be balanced.

float     In right inlet: A value between -1 and 1. Negative values create a shift left effect, positive values create a shift right effect. The absolute value determines the amount of shift. Values outside the range are clipped. The default is 0.

Output:

signal    In left outlet: The left channel.

In right outlet: The right channel.

Note:

The balance~ object does not change the amplitude of the signal. The shift effect is achieved by delaying one of the channels.

Example:


see balance~.help

Implementation

First the input is sent to the validate_input_type sub patch which prints error messages to the max window in case of a type mismatch.

(The route object sends input of the correct type to the outlet, and triggers an error message if illegal input is received.)

A positive float received in the right inlet indicates the left channel’s delay time in milliseconds (values over 1 are clipped). The float is converted to number of samples using the mstosamp~ object. The result is sent to the delay~ object connected to the left outlet (corresponding to the left channel) along with the original signal. The right channel is left unchanged. This causes the shift right effect.

A negative float received in the right inlet indicates the right channel’s delay time in milliseconds (values under -1 are clipped). The float is multiplied by -1 and symmetrically causes the right channel to be delayed while the left channel is left unchanged. This causes the shift left effect.
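The delay-based balance described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patch itself: the sample rate of 44100 Hz, the rounding to whole samples and the names ms_to_samples and balance are all hypothetical, and the output is truncated to the input length for simplicity.

```python
def ms_to_samples(milliseconds, sample_rate=44100):
    """Convert a time in milliseconds to a whole number of samples."""
    return int(round(milliseconds * sample_rate / 1000.0))


def balance(signal, amount, sample_rate=44100):
    """Return (left, right) channels with one channel delayed by |amount| ms.

    amount is clipped to [-1, 1]. A positive amount delays the left channel
    (shift right), a negative amount delays the right channel (shift left).
    Amplitudes are never changed.
    """
    amount = max(-1.0, min(1.0, amount))
    delay = ms_to_samples(abs(amount), sample_rate)
    original = list(signal)
    # Prepend silence, then drop the tail so both channels keep the same length.
    delayed = ([0.0] * delay + original)[: len(original)]
    if amount > 0:
        return delayed, original
    if amount < 0:
        return original, delayed
    return original, original


# At a sample rate of 1000 Hz, 1 ms corresponds to exactly one sample.
left, right = balance([1.0, 2.0, 3.0, 4.0], 1.0, sample_rate=1000)
print(left, right)
```

Because the maximum delay is only 1 ms, the two channels are not heard as separate events. The listener perceives the sound as coming from the side of the channel that arrives first.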

fixvolume~

A patcher which receives a signal and normalizes it. The result is an output signal of constant volume.

Input:

signal    In left inlet: The input signal, which will be normalized in order to keep its volume constant.

int       In left inlet: Sets the interval in samples for the STD calculation. Larger values result in a more stable estimated STD value and a smoother output signal. The default value is 2,000.

Output:

signal    The normalized signal.

Example:

see fixvolume~.help


Implementation

First the input is sent to the validate_input_type sub patch which prints error messages to the max window in case of a type mismatch.

(The route object sends input of the correct type to the outlet, and triggers an error message if illegal input is received.)

fixvolume~ estimates the STD of the signal using the average~ object with the rms flag (root mean square). (This is a good estimate provided the signal has a mean of 0.)
The signal is divided by 5 times the estimated STD in order to keep most output samples between -1 and 1. (Otherwise clipping occurs in the dac~ object and the sound is distorted.)

Dividing by 5 was chosen as a compromise between clipping and volume.

(It is possible to increase the volume of the signal by multiplying it by a factor > 1. Beware of clipping.)
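The normalization can be sketched in Python as follows. The sketch assumes a sliding window over the most recent samples as the RMS estimate. This is meant to mimic, not reproduce, the behaviour of average~, and the names fix_volume, interval and headroom are illustrative.

```python
import math


def fix_volume(signal, interval=2000, headroom=5.0):
    """Divide each sample by headroom times a running RMS estimate.

    The RMS is taken over the last `interval` samples (fewer at the start).
    For a zero-mean signal the RMS equals the standard deviation.
    """
    output = []
    sum_of_squares = 0.0
    for n, x in enumerate(signal):
        sum_of_squares += x * x
        if n >= interval:
            # Drop the sample that just left the window.
            old = signal[n - interval]
            sum_of_squares -= old * old
        count = min(n + 1, interval)
        rms = math.sqrt(max(sum_of_squares, 0.0) / count)
        # Silence stays silent instead of dividing by zero.
        output.append(x / (headroom * rms) if rms > 0.0 else 0.0)
    return output


quiet = fix_volume([0.1, -0.1] * 50, interval=10)
loud = fix_volume([1.0, -1.0] * 50, interval=10)
print(quiet[:4], loud[:4])
```

Both test signals end up with the same output level, although one is ten times louder than the other at the input. This is the constant-volume behaviour described above.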

References

  1. Max/MSP documentation, specifically: “Max Getting Started”, “Max Tutorials and Topics”, “Max Reference”, “MSP Tutorials and Topics” and “MSP Reference”.

  2. Mike Wozniewski, “Electronic Methods for Simulating Acoustic Reverberation”.

  3. Wikipedia: “Frequency spectrum”, “Fourier transform”, “Short-time Fourier transform”.
