Multimedia Systems Giorgio Leonardi A.A.2014-2015 Lecture 2: A brief history of image and sound recording and storage
Overview Course page (D.I.R.): https://disit.dir.unipmn.it/course/view.php?id=639 Consulting: Office hours by appointment: giorgio.leonardi@mfn.unipmn.it Office #182 (in front of Sala Seminari ) Email me any time
Outline From physical phenomenon to signals Representation of audio and light as recordable entities: electric signals What is sound Sound recording What is light Image recording Images as electric signals Analog video encoding a little history of media supports!
Multimedia representation Multimedia can originate In digital form E.g., a picture created using a painting program E.g., a document created with a word processor E.g., CAD schemas From a physical phenomenon E.g., sound from a loudspeaker In both cases, they can be formally described by means of signals We will learn how to transform sound and images into analog signals, and then into their digital representation
Sound Sound perception is due to variations of air pressure around our eardrum. The vibration of a source object disturbs the particles in the surrounding medium; those particles disturb those next to them and so on, thus creating a wave pattern (compression/decompression), which propagates through the air. This wave reaches our ear, making our eardrum vibrate. Through the ear's complex organic system, the vibration is then received, stored and interpreted by the brain.
Sound waveforms We are used to «seeing» sound as waveforms. A waveform is the transformation of the air pressure, which changes in space, into a signal which changes in time.
Sound waveforms A waveform can be recorded as a signal which describes the variations of our eardrum's position in time while we are hearing a sound. This is a transformation from the space domain, y = f(s), to the time domain, y = f(t). [Figure: wave pattern direction in space; eardrum vibration extension plotted against time.]
Sound waveform The transformed sound wave can be stored on an analog medium as an analog electric signal, captured by means of a condenser microphone. From left to right: the longitudinal sound wave hits the microphone's diaphragm, which vibrates just as the eardrum does; the diaphragm's vibration generates an electric signal proportional to the strength of this vibration; this signal is recorded on a magnetic medium.
Definition of signal A signal g is a mathematical abstraction representing a quantity whose values change as a function of an independent parameter k ∈ K. Usually, the parameter k represents (belongs to) either the time or the space. Without loss of generality, in the following we assume that k represents the time. Either k or g(k) can be multidimensional.
Properties of analog signals The recorded signal has physical and mathematical properties. Among all of them, we will define: Waveform; Volume/Amplitude; Wavelength/Frequency; Pitch; Bandwidth; Timbre.
Waveform The shape of a signal is called waveform Shapes of the base periodic waves
Waveform Signals can be classified as: Continuous: $\lim_{x \to x_0} g(x) = g(x_0) = c$ for every $x_0 \in \mathrm{Dom}(g)$. Non-continuous: all the signals without the latter property.
Waveform Signals can be classified as: Periodic: signals repeating with a fixed period T: $g(x + T) = g(x)$ for every $x$. Aperiodic: all the signals without the latter property.
Volume/Amplitude The volume of a sound signal is proportional to its amplitude: the more the eardrum extends from its initial position, the higher the amplitude of the signal (and therefore its volume). Amplitude is defined as: 1. Peak amplitude (Û) 2. Peak-to-peak amplitude (2Û) 3. Root mean square amplitude (Û/√2, for a sinusoidal signal)
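The three amplitude measures can be checked numerically. The following is a minimal sketch, assuming a pure sine wave sampled over exactly one period; the peak value 2.0, the sample count and the function name `amplitude_measures` are illustrative choices, not part of the lecture. For a sine wave the root mean square amplitude comes out as Û/√2.

```python
import math

def amplitude_measures(samples):
    """Return (peak, peak_to_peak, rms) for a list of signal samples."""
    peak = max(abs(s) for s in samples)
    peak_to_peak = max(samples) - min(samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, peak_to_peak, rms

# One full period of a sine wave with peak amplitude U (hypothetical value),
# sampled at N evenly spaced points.
U = 2.0
N = 1000
sine = [U * math.sin(2 * math.pi * n / N) for n in range(N)]

peak, peak_to_peak, rms = amplitude_measures(sine)
print("peak:", peak)
print("peak-to-peak:", peak_to_peak)
print("rms:", rms, "expected U/sqrt(2):", U / math.sqrt(2))
```

For non-sinusoidal waveforms the ratio between peak and RMS amplitude is different, so the Û/√2 shortcut applies only to sine waves.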
Wavelength/Frequency The wavelength λ is the distance the wave travels through its medium within one period. It is inversely proportional to the frequency f: $\lambda = v / f$, where $v$ is the speed of sound. In dry air (i.e., at 0% humidity) at τ °C, $v \approx (331.3 + 0.606\,\tau)$ m/s. E.g., at 20 °C, $v \approx 343.4$ m/s.
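A short sketch of the two relations on this slide: the linear approximation of the speed of sound in dry air, and the wavelength obtained as speed divided by frequency (λ = v / f). The function names and the 440 Hz test tone are illustrative assumptions.

```python
def speed_of_sound(temp_c):
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 + 0.606 * temp_c

def wavelength(freq_hz, temp_c=20.0):
    """Wavelength in metres of a sound wave of frequency freq_hz (lambda = v / f)."""
    return speed_of_sound(temp_c) / freq_hz

print("v at 20 C:", speed_of_sound(20.0), "m/s")
print("wavelength of a 440 Hz tone at 20 C:", wavelength(440.0), "m")
```

Doubling the frequency halves the wavelength, which is what "inversely proportional" means here.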
Pitch Pitch is often related to the (perceived) frequency of a sound wave. Frequency is measured in Hertz (Hz): 1 Hz = signal oscillating 1 time per second; 1 kHz = signal oscillating 1000 times per second. [Figure: a high-pitch and a low-pitch waveform.]
Pitch When playing instruments, a different pitch defines a different note which can be played:
Note   Frequency (Hz)   Wavelength (cm)
A3     220.00           156.82
B3     246.94           139.71
C4     261.63           131.87
D4     293.66           117.48
E4     329.63           104.66
F4     349.23            98.79
G4     392.00            88.01
A4     440.00            78.41
B4     493.88            69.85
C5     523.25            65.93
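The wavelength column of the note table can be reproduced from the frequency column as wavelength = speed of sound / frequency. The sketch below assumes a speed of sound of 345 m/s; that value is inferred from the table's numbers and is not stated in the slides (it is slightly higher than the 343.4 m/s quoted for 20 °C).

```python
# Rows copied from the note table: (note, frequency in Hz, wavelength in cm).
NOTES = [
    ("A3", 220.00, 156.82),
    ("B3", 246.94, 139.71),
    ("C4", 261.63, 131.87),
    ("D4", 293.66, 117.48),
    ("E4", 329.63, 104.66),
    ("F4", 349.23, 98.79),
    ("G4", 392.00, 88.01),
    ("A4", 440.00, 78.41),
    ("B4", 493.88, 69.85),
    ("C5", 523.25, 65.93),
]

# Assumption: inferred from the table, not given in the lecture.
ASSUMED_SPEED_M_S = 345.0

def wavelength_cm(freq_hz, speed_m_s=ASSUMED_SPEED_M_S):
    """Wavelength in centimetres for a given frequency in Hz."""
    return speed_m_s / freq_hz * 100.0

for note, freq, table_cm in NOTES:
    print(f"{note}: table {table_cm:.2f} cm, computed {wavelength_cm(freq):.2f} cm")
```

Computed and tabulated values agree to within about 0.01 cm under this assumption.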
Pitch Pitch is what changes in the «Doppler» effect. Let a sound source produce a sound wave with a defined frequency f. When the sound source moves towards us, its sound wave is «squeezed» and perceived at a higher pitch. Finally, when the sound source moves away from us, the sound wave reaches our ears «stretched», and is therefore perceived at a lower pitch.
Doppler effect Doppler effect rule: given a sound source producing a waveform with frequency $f_s$ and moving towards/away from an observer at speed $v_s$, the waveform «hitting» the observer will have frequency $f_o = f_s \, \frac{v}{v - v_s}$, where $v$ is the speed of sound at τ °C: $v \approx (331.3 + 0.606\,\tau)$ m/s. 1) Ambulance does not move: $v_s = 0 \Rightarrow f_o = f_s$ (same pitch). 2) Ambulance moves towards the observer: $v_s > 0 \Rightarrow f_o > f_s$ (higher pitch). 3) Ambulance moves away from the observer: $v_s < 0 \Rightarrow f_o < f_s$ (lower pitch).
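The three ambulance cases can be tried out with a small function. This is a sketch assuming the usual formula for a moving source and a stationary observer, f_o = f_s · v / (v − v_s), with v_s positive when the source approaches; the function name, the 440 Hz siren and the 20 m/s speed are illustrative values.

```python
def observed_frequency(f_source, v_source, temp_c=20.0):
    """Frequency heard by a stationary observer.

    v_source > 0: the source approaches the observer.
    v_source < 0: the source moves away from the observer.
    """
    v = 331.3 + 0.606 * temp_c  # speed of sound in dry air, m/s
    if v_source >= v:
        raise ValueError("formula is only valid for sources slower than sound")
    return f_source * v / (v - v_source)

siren = 440.0  # Hz, hypothetical siren frequency
print("standing still:", observed_frequency(siren, 0.0))
print("approaching at 20 m/s:", observed_frequency(siren, 20.0))
print("leaving at 20 m/s:", observed_frequency(siren, -20.0))
```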
Other devices exploiting the Doppler effect?
OUCH!
Bandwidth A signal may be composed of (as the sum of) multiple frequency components. Frequency components are periodic sine waves, each with a particular frequency. The component at the lowest frequency is called the Fundamental; the other components add Harmonics to the fundamental wave. [Figure: the signal shown as the sum of its frequency components.]
Bandwidth Spectrum: the range of frequencies that a signal contains. Fmin = fundamental frequency; Fmax = highest harmonic frequency. Bandwidth (also called width of the signal): the width of the spectrum. In this example: Fmin = fundamental frequency = 50 Hz; Fmax = harmonic 9 frequency = 450 Hz. Spectrum: S = [Fmin; Fmax] = [50 Hz; 450 Hz]. Bandwidth: W = Fmax − Fmin = 450 Hz − 50 Hz = 400 Hz.
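The spectrum and bandwidth of the example can be computed from the list of component frequencies. This is a minimal sketch; the function name is illustrative, and the assumption that the signal contains harmonics 1 through 9 of 50 Hz is made only for the example (the slide gives just the lowest and highest component).

```python
def spectrum_and_bandwidth(component_freqs_hz):
    """Return ((f_min, f_max), bandwidth) for a list of component frequencies."""
    f_min = min(component_freqs_hz)
    f_max = max(component_freqs_hz)
    return (f_min, f_max), f_max - f_min

# Assumed components: fundamental at 50 Hz plus harmonics 2..9.
components = [50 * k for k in range(1, 10)]

spectrum, bandwidth = spectrum_and_bandwidth(components)
print("spectrum:", spectrum, "Hz")
print("bandwidth:", bandwidth, "Hz")
```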
Bandwidth Finite bandwidth: the total signal can be fully reconstructed by adding a fundamental and a finite number of harmonics. Periodic sine-based signals usually have limited bandwidth. [Figure: the total signal as the sum of a finite number of components.]
Bandwidth Infinite bandwidth: to reconstruct the total signal perfectly, a fundamental and an infinite number of harmonics must be added. Digital signals, such as (a), need the sum of an infinite number of components (b), (c), (d), ... to obtain the original form. The sum of a finite number of components generates only the approximation (e). [Figure: total signal; fundamental + harmonic 3 + harmonic 5 = approximate sum of finite components.]
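The effect of adding more harmonics can be seen numerically. The sketch below assumes the digital signal is an ideal square wave of amplitude 1, whose Fourier series contains only odd harmonics with amplitudes 4/(πk); the function name and the 50 Hz fundamental are illustrative. It evaluates the partial sum at a quarter of the period, where the ideal square wave equals 1.

```python
import math

def square_wave_partial_sum(t, f0, n_terms):
    """Sum of the first n_terms odd harmonics of a unit square wave at time t."""
    total = 0.0
    for i in range(n_terms):
        k = 2 * i + 1  # harmonic number: 1, 3, 5, ...
        total += math.sin(2 * math.pi * k * f0 * t) / k
    return 4.0 / math.pi * total

f0 = 50.0              # fundamental frequency in Hz (hypothetical)
t = 1.0 / (4.0 * f0)   # a quarter of the period

for n in (1, 3, 10, 1000):
    value = square_wave_partial_sum(t, f0, n)
    print(f"{n} component(s): value {value:.4f}, error {abs(value - 1.0):.4f}")
```

The error shrinks as components are added but never reaches zero with a finite number of them, which is why such a signal is said to have infinite bandwidth.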
Bandwidth A second example of a signal with infinite bandwidth: a sawtooth function
Timbre A pure sound is a wave made of only a single frequency and has a sinusoidal form In nature, there are no pure sounds Pure sounds can be produced artificially E.g., a tuning fork (or diapason) is an acoustic resonator which vibrates at a precise frequency
Timbre In general, sound sources vibrate in more complicated ways, creating the rich variety of sounds and noises we are familiar with The timbre of a sound is defined by its wave form, which means the «shape» of its wave Different instruments playing the same note (same pitch, different waveforms)
What about images?
Photography The term «Photography» derives from two Greek words: Phos, which means «light», and Graphos, which means «to draw». What we see when we look at a picture is a «drawing with light» performed by the photographer. But what is light, and how can we «capture» it?
Light Visible light (or visible, or, simply, light) is an electromagnetic (EM) radiation that is visible to the human eye An EM radiation is a transverse wave, that is a wave that is oscillating perpendicularly to the direction of propagation
Light waves properties EM radiations propagate at the speed of light c, which in a vacuum is ~2.998 × 10^8 m/s. Frequency ν and wavelength λ are strictly related by λν = c. According to the particle-wave duality (from quantum mechanics), EM radiations can be thought of both as propagating waves and as a stream of elementary (massless) particles (called photons), each traveling in a wavelike pattern and moving at the speed of light.
Light waves properties Each colour we can see has a different wavelength/frequency Red has the longest wavelength and violet has the shortest wavelength
Visible light Visible light represents a very small portion of the EM spectrum Visible light has wavelengths between ~380 nm (violet color) and ~740 nm (red color)
The human eye: Analog recording of light Light waves reflected by the cyclist hit the cornea and, through the pupil, are focused by the lens. From the lens, the light reaches the retina and is captured by its photoreceptors. Finally, the (upside-down) image is transmitted to the brain by the optic nerve.
Analog recording of light The analog camera: cameras use the same principle to store analog images on chemical photographic film. Light hits the camera's lens and is focused by the photographer, who moves some of the lens elements to adjust the focus. The photographer opens the shutter for a short time, during which the light hits the photographic film, exposing the chemical material in it. [Figure labels: lens, shutter, photographic film.]
Images as signals Even images can be represented as electrical signals. Remember what we said before? A signal g is a mathematical abstraction representing a quantity that changes its values as a function of an independent parameter k ∈ K. Usually, the parameter k represents either the time or the space. Without loss of generality, in the following we assume that k represents the time. Either k or g(k) can be multidimensional.
Monodimensional signal Considering only a horizontal «line» of the following image, it can be represented as a monodimensional signal: k represents the horizontal position in the image and g(k) is the variation of the greyscale values from left to right. $g: \mathbb{R} \to \mathbb{R}$, $y = g(k)$. [Figure: greyscale value plotted against pixel position.]
Multidimensional signal A complete black & white image is a signal from 2D points (positions in the space) to light intensities. It is a 2D surface: $g: \mathbb{R}^2 \to \mathbb{R}$, $y = g(k, j)$.
Multidimensional signal A color image is a signal from 2D points (positions in the space) to red, green and blue light intensities. It becomes three 2D surfaces, one for each colour: $g: \mathbb{R}^2 \to \mathbb{R}^3$, $\langle r, g, b \rangle = g(k, j)$.
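The idea of an image as a signal from positions to intensities can be illustrated with a tiny sampled image. This is only a sketch: the nested lists are a discrete stand-in for the continuous signals of the slides, and the pixel values, sizes and function names are invented for the example.

```python
# A 2-row, 3-column greyscale image: gray[j][k] is the intensity at
# horizontal position k and vertical position j (values are made up).
gray = [
    [0, 128, 255],
    [64, 192, 32],
]

def g_line(k, j=0):
    """Monodimensional signal: intensity along one horizontal line j."""
    return gray[j][k]

def g_gray(k, j):
    """2D signal: position (k, j) -> one light intensity."""
    return gray[j][k]

# The same idea for colour: each position maps to an (r, g, b) triple.
rgb = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(10, 20, 30), (40, 50, 60), (70, 80, 90)],
]

def g_rgb(k, j):
    """2D signal: position (k, j) -> (red, green, blue) intensities."""
    return rgb[j][k]

def channel(image, index):
    """Extract one of the three 2D surfaces (0 = red, 1 = green, 2 = blue)."""
    return [[pixel[index] for pixel in row] for row in image]

print("line 0:", [g_line(k) for k in range(3)])
print("grey at (2, 1):", g_gray(2, 1))
print("colour at (1, 0):", g_rgb(1, 0))
print("red surface:", channel(rgb, 0))
```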
Analog video encoding
What is video A video is a sequence of images, played at a constant frame rate: PAL: 25 frames/sec; NTSC: 29.97 frames/sec. Images are recorded on a negative film strip (such as Super 8 mm, in use nowadays), where each frame is exposed with (about) the same technique we have seen for analog cameras. Video encoding (for video transmission over media) is a two-pass process.
Video encoding Pass 1 Each frame is treated separately: Frame F is divided into lines (called scanlines) Each scanline generates a 1-dimensional electrical signal (as seen for still images)
Video encoding Pass 2 Video is then recorded (on a magnetic tape, remember VHS?) or transmitted as a time-dependent signal, encoding the sequence of all the scanned images.
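The two passes can be sketched on sampled data: each frame is read scanline by scanline into a 1-dimensional signal, and the frames are then placed one after another along the time axis. This is a simplified illustration with invented frame contents and function names; real analog video signals also carry synchronization and blanking intervals (and are interlaced), which are ignored here.

```python
def frame_to_signal(frame):
    """Pass 1: concatenate the scanlines of one frame into a 1-D signal."""
    signal = []
    for scanline in frame:
        signal.extend(scanline)
    return signal

def video_to_signal(frames):
    """Pass 2: chain the 1-D signals of all frames in playback order."""
    signal = []
    for frame in frames:
        signal.extend(frame_to_signal(frame))
    return signal

def frame_duration(frames_per_second):
    """Time slot available to each frame, in seconds."""
    return 1.0 / frames_per_second

# Two tiny 2x3 greyscale frames (made-up values).
frame_a = [[0, 1, 2], [3, 4, 5]]
frame_b = [[9, 8, 7], [6, 5, 4]]

print("frame A as a signal:", frame_to_signal(frame_a))
print("video as a signal:", video_to_signal([frame_a, frame_b]))
print("PAL frame duration:", frame_duration(25), "s")
```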
Why going digital?
Problems in analog processing Magnetic devices used to store analog signals are: affected by mechanical noise; perishable, with loss of quality over time; space-hungry. Treatment of analog signals requires: dedicated hardware; very complex real-time calculations (real-time integration, as we will see). Image processing is a completely physical/chemical process. Analog recovery of video quality is limited.
Analog audio: vinyl records Sound is literally carved into a phonograph record: the groove undulations are analogous to the sound waves they represent. A classical 33 RPM record contains about 20-30 minutes of analog audio per side, compared to the CD, which holds from 80 minutes to 800 minutes of digital audio, depending on format (CD-DA or MP3). Don't laugh, it is still my favourite medium!!! And the favourite choice of audiophiles: all the harmonics are intact.
Analog audio: tapes Analog tapes record a magnetic pattern proportional to the electrical signal. A blast from the '80s! They revolutionized the world of media recording, because everyone could record their own music collection, or their favourite radio programs on the fly, with a cheap device and physical storage media. Up here you will find a common, low-cost audio cassette for home and «portable» use. The one to the left is a professional recording and production device. Many masterpieces from the late '70s and the '80s were recorded on a master tape, instead of a master vinyl as used before.
Analog tapes for computer storage Analog tapes were also used to store backups, computer programs and games. The sequence of 0s and 1s was coded as an electric (digital) sound signal and stored as if it were real audio. The world-famous IOMEGA zip tape devices for incremental data backup. One of the much more famous Commodore 64 cassettes!
Analog photography: 35mm Images are stored on negative 35mm colour/BW films, or on single 35mm positive slides. Images can be viewed only when printed, or by means of special (and costly) viewers. Anyway, analog photography is still a good alternative to digital (better colours and gamut), but: you need to shoot accurately (no possibility of taking 1,200 pictures and then choosing: it would cost as much as a new car!); you will know the result of a photographic session only after the development of the negative/positive film; no EXIF data available! If you want to try different settings, you must write down/remember all the exposure data by yourself!
Analog video: The VHS The VHS was the «DVD system of the '90s», and allowed the large-scale diffusion of motion pictures and the recording of favourite programs at home. The VHS system consists of a magnetic tape, whose storage technique combines that of the audio cassette with the coding of analog video signals. The video signal is transformed into an electric signal in the way we have seen before, and this signal is recorded on the VHS tape as the corresponding magnetic pattern. VHS for PAL and VHS for NTSC were not compatible, due to the difference in frame rate and in color representation.
and nowadays? Digitized audio, video and pictures can be stored in cheap, high-capacity, reliable(?) and portable devices Is it all gold? What are the pros and cons?