13th EAEEIE conference, York, 2002
An introductory curriculum on computer multimedia using the signal processing approach
Jean-Paul Stromboni, Maître de Conférences
Ecole Supérieure des Sciences Informatiques
930, route des Colles, B.P. 145, 06905 Sophia Antipolis France
Phone (33) 04 92 96 51 64
and
M@inline project, I3S Laboratory, CNRS
Abstract
Is it possible to teach Signal Processing in a Computer Science Department? This was difficult a few years ago, but with the multimedia capabilities of up-to-date computers, it has become not only a possibility but a necessity.
The challenge of this curriculum is to show our students in Computer Science and programming that many of the parameters, formats, bounds, features, and tools they will have to deal with when processing multimedia documents can be explained from signal theory. This contribution therefore proposes a selection of knowledge, experiments, and skills that can easily be included in such an introductory curriculum on Signal Processing for Computer Science.
1. Introduction
What makes this curriculum innovative and attractive is first its subject, second the use of digital sounds for illustration and experiment, and third the extensive use of the web and computer multimedia.
The curriculum is an introduction to signals and systems in computer science (in French, S.S.I.). All students follow it during their first year at ESSI.
They will become engineers in the field of computer science, and therefore their main interests are programming, building man-machine interfaces, graphical user interfaces (GUIs), and so on. The purpose is then to show them that many problems they will encounter when using multimedia computers can be understood with some basic knowledge of signal theory and electronics.
In this context, digital sounds offer an attractive means to this end: they are easy to sample, record, synthesize, filter, store, and play, using computers equipped with sound cards, simulation software, and cost-effective headsets (a microphone and loudspeakers).
Third, gathering all the documents used in the curriculum at a local URL gives our students another motivation. The teachers can then easily distribute a set of self tests that allow students to train and evaluate themselves at any time and from anywhere in the school.
2. Curriculum contents
As stated above, our curriculum combines knowledge, skills, and means for student self testing.
For this, a set of learning objectives, the key knowledge and skills, first had to be defined, depending on the context, the students, and the general purpose of the curriculum. Let us first list some of the key knowledge retained.
2.1. Some key knowledge
Each item is given to the students as an explicit sentence, so as to show precisely what knowledge is addressed. The first thing that any student should learn is:
- “multimedia computers use digital signals to interact with the continuous analog world”. The teacher should emphasize that this concerns sounds, images, and movies, and even text processing. The teacher must also explain what a signal is: a sound is not a signal, but the microphone output is one, and the ADC output in the sound card is another signal of a different kind.
- “second, digital means periodically sampled and quantized with a sample length of B bits”. Even if this is difficult to see, digital sounds are discrete-time signals with a sampling frequency fs and only 2^B different quantization levels. This makes it possible to compute the size of wav files when the PCM format is used.
- “in order to sample a sound properly, the Shannon constraint must be fulfilled”: the sampling frequency must exceed twice the highest frequency present in the signal. If not, information is lost. Combined with the physiological limits of human hearing, i.e. 20 Hz – 20 kHz, this explains the 44.1 kHz sampling frequency used for CD audio.
- To quantize properly, one should take into account that “when the sample length B used for a signal decreases, the quantization noise increases and the signal to noise ratio (SNR) decreases”. As an order of magnitude, each bit contributes about 6 dB, so SNR = 72 dB corresponds to B = 12 bits, as for the analog telephone.
- “the widely used time representation of a signal is equivalent to a frequency representation called the spectrum”. Fourier series and the Fourier transform are the tools that translate from time to frequency. The Fourier transform is defined here, with four basic properties (linearity, duality, the convolution theorem, and the delay theorem) and a small set of transforms (cosine, Dirac impulse, rectangle), and the link with the spectrum (magnitude, phase, and power spectrum) is made.
- “to compute the Fourier transform, computers use the Fast Fourier Transform algorithm (FFT)”. The resulting spectrum is periodic in frequency. Only N samples are used (e.g. N = 1024) and only M spectrum frequencies are computed (e.g. M = N = 1024) over one spectrum period.
- “using only N successive signal samples, e.g. rectangular windowing, interferes with the signal spectrum”. The shape of the window’s Fourier transform is replicated at every frequency present in the signal spectrum. A trade-off must be made when choosing the window shape.
- “sampling a signal has the effect of aliasing its spectrum”. Sampling replicates the signal spectrum infinitely along the frequency axis, at every multiple of the sampling frequency. Here one can show the why and how of the Shannon constraint, and define the Shannon filter in charge of alias cancellation. This is also an occasion to introduce undersampling and oversampling and their effect on the spectrum.
- “A filter is a signal spectrum modifier”. This leads to the definition of low-pass, high-pass, and band-pass filters. Four basic properties of filters can be presented here: linearity, time invariance, causality, and stability. Three equivalent models can be given to represent a filter: the discrete convolution, the difference equation, and the transfer function with the associated frequency response.
- “Even continuous-time processes can be digitized when driven by a computer through a zero-order hold.” Their differential equation is then replaced by a difference equation obtained from a discretization formula.
- “Using a nonlinear quantization characteristic may result in a signal compression ratio of up to 2”, as used for voice in mu-law and A-law CODECs. The underlying principle is a better use of the quantization steps through an amplification of the low-level signal values, which are the most frequent.
- “MPEG Audio layer 3 is currently the most efficient method for audio compression; it is based upon the frequency analysis of the signal”.
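Several of the statements in this list reduce to short calculations. The following Python sketch is an illustration only and not part of the course material: all function names are hypothetical, the 6.02 dB per bit figure is the usual approximation for uniform quantization (with an optional 1.76 dB term for a full-scale sine), and mu = 255 is assumed for the mu-law characteristic.

```python
import math


def pcm_size_bytes(duration_s, fs_hz, bits, channels=1):
    """Size of the raw PCM samples; the wav header adds a few dozen bytes."""
    return int(duration_s * fs_hz * channels * bits / 8)


def min_sampling_rate(f_max_hz):
    """Shannon constraint: fs must exceed twice the highest signal frequency."""
    return 2.0 * f_max_hz


def snr_db(bits, full_scale_sine=False):
    """Approximate SNR of a uniform quantizer with a sample length of `bits`."""
    return 6.02 * bits + (1.76 if full_scale_sine else 0.0)


def mu_law_compress(x, mu=255.0):
    """Nonlinear characteristic applied to x in [-1, 1] before quantization."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=255.0):
    """Inverse characteristic, applied after decoding."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


if __name__ == "__main__":
    print(pcm_size_bytes(1.0, 8000, 8))   # one second of 8-bit, 8 kHz mono sound
    print(min_sampling_rate(20000.0))     # bound derived from the 20 kHz hearing limit
    print(snr_db(12))                     # order of magnitude quoted for B = 12
    print(mu_law_compress(0.01))          # a low-level value is strongly amplified
```

The last line illustrates the principle stated for mu-law: small amplitudes are mapped onto a much larger share of the quantization range than they would occupy with a linear characteristic.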
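The link between the difference equation and the transfer function of a filter can be illustrated on the simplest case, a first-order low-pass filter. The sketch below is a minimal example under stated assumptions: the function names are hypothetical, and the pole a = exp(-2π fc/fs) is the value a zero-order-hold discretization of an analog first-order low-pass would give (the one-sample input delay of that exact model is dropped here, which leaves the gain unchanged).

```python
import cmath
import math


def lowpass_coeff(fc_hz, fs_hz):
    """Pole of the discrete filter for an (assumed) cut-off frequency fc."""
    return math.exp(-2.0 * math.pi * fc_hz / fs_hz)


def lowpass(x, a):
    """Difference equation y[n] = a*y[n-1] + (1-a)*x[n], zero initial state."""
    y, prev = [], 0.0
    for sample in x:
        prev = a * prev + (1.0 - a) * sample
        y.append(prev)
    return y


def gain(a, f_hz, fs_hz):
    """Magnitude of H(z) = (1-a) / (1 - a*z^-1) evaluated at z = exp(j*2*pi*f/fs)."""
    z_inv = cmath.exp(-2j * math.pi * f_hz / fs_hz)
    return abs((1.0 - a) / (1.0 - a * z_inv))


if __name__ == "__main__":
    fs = 8000.0
    a = lowpass_coeff(1000.0, fs)
    for f in (0.0, 500.0, 1000.0, 3000.0):
        print(f, round(gain(a, f, fs), 3))  # the gain falls as the frequency rises
```

Since 0 < a < 1 the single pole lies inside the unit circle, so the filter is stable, and it is causal because each output depends only on present and past samples.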
2.2. Some tutorial experiments
Digital sound is used to illustrate the knowledge presented above, because a headset connected to the sound card is enough to carry out many experiments that are often attractive and closely related to multimedia, and thereby to acquire several extra skills.
By contrast, in a previous curriculum dealing with control theory, it was nearly impossible in a school context to perform relevant experiments during the tutorials; we ran some simulations using MATLAB (Matrix Laboratory) M-files, but this was not sufficient to convince the students of the importance of the subject, which seemed only weakly related to their future jobs and to computer science.
Let us now list some of the experiments that can be carried out with a multimedia computer equipped with a sound card, the simulation software MATLAB, the sound processing shareware GoldWave, and a headset (microphone and loudspeakers):
- First, the students must observe and measure the very nature of digital sound, that is, the sampling period and the quantization step, and find their relation to the sound file size. It is better to restrict B to 8 bits and fs to 8 kHz for easy measurement with a sound processing shareware such as GoldWave.
- Next, they can distinguish between music and voice, find the sound envelope, and zoom in to see the pitch. They can relate the loudness of the sound to the signal magnitude and its pitch to the signal frequency.
- Using dedicated simulation software such as MATLAB (MathWorks) or Scilab (INRIA, France), or even GoldWave again, they can quickly generate, plot, and play some discrete-time signals, cosines, or short Fourier series. They were asked to synthesize the note A at 440 Hz (La3 in French notation, A4 in scientific pitch notation), an A major chord, or a jingle using these notes, with an exponential envelope of 1.5 s duration.
- To understand the principle of Fourier series, one can build a 440 Hz triangle wave using only the first three harmonics (this works well, and better than for a square wave), plot it, hear it, create a wave file, and find the three harmonics again in the spectrum of the triangle.
- With MATLAB, the students can investigate the FFT algorithm for basic signals (constant, cosine) and highlight the synchronisation problem.
- They can discover what happens to a note of music when the Shannon constraint is not fulfilled.
- They can investigate the effect of the window shape when windowing a discrete-time sine signal, and compare for instance a Gaussian window with a rectangular one.
- It is easy to hear the quantization noise when B is small enough, and to relate this noise to the signal to noise ratio. The students can check how mu-law, a nonlinear quantization law, improves the SNR and provides compression.
- Build a filter with MATLAB or GoldWave, for instance a first-order low-pass filter, and determine its effect on a short musical sound as the cut-off frequency is varied.
- Build a rectangular antialiasing filter of given width fs/N in order to extract the first harmonic of a square or triangle wave.
- Build a filter bank and split a given digital sound (wav file) into N frequency bands. Show that every band can then be undersampled by a ratio N and that the initial sound can to some extent be recovered.
- Conclude with the principle of MPEG audio layer 3 compression, using an energy criterion and a simple strategy for bit allocation.
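The synthesis experiments in this list were carried out with MATLAB, Scilab, or GoldWave; the Python sketch below only suggests how the same steps could look in a general-purpose language. Everything specific to it is an assumption: the function names, the 8 kHz sampling frequency, the envelope time constant of 0.4 s, the equal-temperament intervals used for the chord, and the reading of "first three harmonics" as the three nonzero (odd) harmonics of the triangle wave.

```python
import io
import math
import struct
import wave

FS = 8000  # assumed sampling frequency in Hz


def note(freq_hz, duration_s=1.5, tau_s=0.4, fs=FS):
    """Cosine at freq_hz shaped by a decaying exponential envelope exp(-t/tau)."""
    n = int(duration_s * fs)
    return [math.exp(-k / fs / tau_s) * math.cos(2 * math.pi * freq_hz * k / fs)
            for k in range(n)]


def a_major_chord(duration_s=1.5, fs=FS):
    """Root at 440 Hz plus major third (+4 semitones) and fifth (+7 semitones)."""
    freqs = [440.0 * 2 ** (s / 12) for s in (0, 4, 7)]
    notes = [note(f, duration_s, fs=fs) for f in freqs]
    return [sum(v) / len(notes) for v in zip(*notes)]


def triangle3(freq_hz, duration_s=1.5, fs=FS):
    """Triangle wave approximated by its harmonics 1, 3 and 5 (Fourier series)."""
    n = int(duration_s * fs)
    scale = 8 / math.pi ** 2
    return [scale * sum((-1) ** i
                        * math.sin(2 * math.pi * (2 * i + 1) * freq_hz * k / fs)
                        / (2 * i + 1) ** 2
                        for i in range(3))
            for k in range(n)]


def to_wav_bytes(samples, fs=FS):
    """Pack samples in [-1, 1] as 16-bit mono PCM and return the wav content."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)   # B = 16 bits per sample
        w.setframerate(fs)
        w.writeframes(b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples))
    return buf.getvalue()
```

Writing the returned bytes to a file (for instance a hypothetical `a440.wav`) gives a sound that any audio editor can open, so that the envelope and the harmonics can be inspected as in the tutorials.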
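Two of the spectral experiments above can also be reproduced with a few lines of code. The sketch below is an illustration under assumptions, not the tutorial code: the names are hypothetical, the transform is written as a naive O(N^2) DFT for readability (an FFT computes the same values faster), and the "synchronisation problem" is read here as the leakage that appears when the N-sample window does not hold a whole number of signal periods.

```python
import cmath
import math


def dft_magnitude(x):
    """Magnitude of the N-point DFT of x, normalized by N."""
    n = len(x)
    return [abs(sum(x[k] * cmath.exp(-2j * math.pi * m * k / n)
                    for k in range(n))) / n
            for m in range(n)]


def cosine(freq_hz, fs_hz, n):
    """n samples of a unit-amplitude cosine sampled at fs_hz."""
    return [math.cos(2 * math.pi * freq_hz * k / fs_hz) for k in range(n)]


def apparent_frequency(freq_hz, fs_hz):
    """Frequency in [0, fs/2] at which a sampled tone of freq_hz is observed."""
    f = freq_hz % fs_hz
    return min(f, fs_hz - f)


FS, N = 8000, 64  # assumed values; the spacing between spectrum lines is FS/N = 125 Hz

# 1000 Hz falls exactly on line 8: the window holds 8 whole periods.
on_bin = dft_magnitude(cosine(1000, FS, N))

# 1060 Hz falls between lines 8 and 9: the energy leaks into neighbouring lines.
off_bin = dft_magnitude(cosine(1060, FS, N))

if __name__ == "__main__":
    print([round(v, 3) for v in on_bin[:16]])
    print([round(v, 3) for v in off_bin[:16]])
    print(apparent_frequency(5000, FS))  # a 5 kHz tone violates the Shannon constraint at 8 kHz
```

The function `apparent_frequency` gives the pitch at which a note is heard once the Shannon constraint is no longer fulfilled, which is the effect the students are asked to discover by ear.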
2.3. Self tests
For every key knowledge item in the curriculum contents, at least one self test is provided, available at a local URL. The principle is that the questions and the right answers are in the same document. The answers are hidden but can be revealed by the user at will.
Several alternative techniques were used to do this: MS Word with hypertext, MS PowerPoint with movies, and Macromedia Flash with reactive buttons.
For student evaluation, there is only one final exam, which may not satisfy the Shannon constraint. That’s why several pieces of student work are also evaluated and graded:
(1) the tutorial reports, gathered into a public directory;
(2) the work we call TL (meaning free work), where teams of students have to find some information related to the curriculum contents and publish it on a web site using HTML, Java, ...
(3) this year, several short quizzes on the previous lecture, to detect possible misunderstandings.
3. Conclusion
The planned evolution is fourfold:
- make wider use of well-known sound processing software, such as GoldWave, which can record in many formats, compute spectra and sonograms, compress and uncompress wav into mp3, and apply various effects;
- after simulation with MATLAB, show our students how to process sounds on their personal computers (with Java, C and/or C++);
- sound is the only medium considered so far; an introduction to digital images, and even to movies, would be relevant in this curriculum to better describe the reality of computer multimedia;
- start all the collaborations proposed by colleagues involved in teaching in this field (email, internet …).
4. Appendix and acknowledgments
Thanks to my colleagues, particularly Prof. Joël Leroux, and also Olivier Meste and Elena Fortunato, for their contributions to the curriculum, and thanks to my students, wherever their minds may be.