Computer Audio and Music

Perry R. Cook
Princeton Computer Science (also Music)

Music/Sound Overview
  Basic audio storage/playback (sampling)
  Human audio perception
  Sound and music compression and representation
  Sound synthesis
  Music control and expression

Waveform Sampling and Playback
  Sample and Hold: sample rate vs. aliasing
  Quantize: word size vs. quantization noise
  Reconstruct: hold and smooth (filter)

Waveform Sampling: Quantization
  Quantization introduces noise
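The two trade-offs named on the sampling slide (sample rate vs. aliasing, word size vs. quantization noise) can be checked numerically. The following is a minimal sketch, not taken from the slides: the function names, the 440 Hz test tone, and the 0.9 amplitude are all assumptions chosen for illustration.

```python
import math

def alias_frequency(freq, sample_rate):
    """Frequency (Hz) at which a sampled sine of `freq` appears after sampling."""
    nearest_multiple = sample_rate * round(freq / sample_rate)
    return abs(freq - nearest_multiple)

def quantize(x, bits):
    """Round x in [-1, 1) to the nearest level of a signed `bits`-bit word."""
    levels = 2 ** (bits - 1)                  # e.g. 32768 for 16 bits
    q = round(x * levels)
    q = max(-levels, min(levels - 1, q))      # clip to the representable range
    return q / levels

def snr_db(bits, freq=440.0, sample_rate=44100, amplitude=0.9, n=4410):
    """Signal-to-quantization-noise ratio of a sine test tone, in dB."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
        err = quantize(x, bits) - x
        signal_power += x * x
        noise_power += err * err
    return 10 * math.log10(signal_power / noise_power)

# A 30 kHz tone sampled at 44.1 kHz folds back below the Nyquist frequency.
print("alias of 30000 Hz:", alias_frequency(30000, 44100), "Hz")

# Each extra bit of word size buys roughly 6 dB less quantization noise.
for bits in (8, 12, 16):
    print(bits, "bits:", round(snr_db(bits), 1), "dB")
```

The alias function only handles a single sine; real signals need a smoothing (anti-aliasing) filter before sampling, as the Reconstruct step hints for playback.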

Compression and Representation (Why Bother??)
  So Many Bits, So Little Time (Space)
    CD audio rate: 2 * 2 * 8 * 44100 = 1,411,200 bps
    CD audio storage: 10,584,000 bytes / minute
    A CD holds only about 70 minutes of audio
    An ISDN line can only carry 128,000 bps
    Even a cable modem might carry only 1 Mbps
  Security: the best representation removes everything recognizable about the original sound
  Graphics people get all the bandwidth, cycles, memory
  Expression, composition, interaction wanted too!

Views of Sound
  Sound is Perceived: Perception-Based
    Psychoacoustically motivated compression
  Sound is Produced: Production-Based
    Physics/source-model motivated compression
  Music (Sound) is Performed/Published/Represented: Event-Based compression
  Sound is a Waveform / Statistical Distribution / etc.
    (these are not very good ideas in general, unless we get lucky (LPC))

Psychoacoustics
  Human sound perception:
    Ear: receive 1-D waves
    Cochlea: convert to frequency-dependent nerve firings
    Auditory cortex: further refine time & frequency information
    Brain: higher-level cognition, object formation, interpretation

Perceptual Models
  Exploit masking, etc., to discard perceptually irrelevant information.
  Example: Quantize soft sounds more accurately, loud sounds less accurately
  Generic: does not require assumptions about what produced the sound
  Drawbacks: Highest compression is difficult to achieve
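The CD figures on the "So Many Bits" slide follow from a few multiplications. A short sketch that reproduces them, reading the slide's factors as 2 channels, 2 bytes per sample, 8 bits per byte, and 44,100 samples per second (the constant names are illustrative):

```python
CHANNELS = 2            # stereo
BYTES_PER_SAMPLE = 2    # 16-bit words
BITS_PER_BYTE = 8
SAMPLE_RATE = 44100     # samples per second per channel

bits_per_second = CHANNELS * BYTES_PER_SAMPLE * BITS_PER_BYTE * SAMPLE_RATE
bytes_per_minute = bits_per_second // BITS_PER_BYTE * 60

ISDN_BPS = 128_000      # ISDN line rate quoted on the slide
compression_needed = bits_per_second / ISDN_BPS

print("CD audio rate:", bits_per_second, "bps")
print("CD audio storage:", bytes_per_minute, "bytes / minute")
print("Compression ratio needed to fit ISDN:", compression_needed)
```

So streaming CD-quality audio over the quoted ISDN line would need a bit-rate reduction of about 11 to 1, which is the motivation for the compression views that follow.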

Production Models
  Build a model of the sound production system, then fit the parameters
  Example: If the signal is speech, then a well-parameterized vocal model can yield highest quality and compression ratio
  Highest possible compression
  Drawbacks: Signal source(s) must be assumed, known, or identified

Audio Compression
  Classical data compression view: take advantage of
    Redundancy/correlation
    Statistics (local/global)
    Assumptions / models
  Problem: Much of this doesn't work directly on sound waveform data

Transform (Subband) Coders
  Split signal into frequency subbands, then allocate bits to regions adaptively, based on where the ear is most sensitive
  Lossless (variable bit rate & comp. ratio)
  Lossy (fixed rate and ratio)
    MP3

Production Models
  Build a parametric model of the production system, then either
    Fit the parameters to a given signal
      Use signal processing techniques to extract parameters
    Drive the parameters directly (no encode/decode)
      Examples:
        Rule system to drive speech synthesizer
        MIDI file to drive music synthesizer
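The subband idea (split into frequency bands, then spend bits adaptively) can be sketched in a few lines. This is a toy illustration under loud assumptions: it uses raw band energy as a crude stand-in for a psychoacoustic model, a plain DFT instead of a real filter bank, and a greedy rule that treats each bit as about 6 dB of noise reduction. It is not how MP3 or any particular coder allocates bits, and all names are hypothetical.

```python
import cmath
import math

def band_energies(frame, num_bands):
    """Energy in each of `num_bands` equal-width bands, via a plain DFT."""
    n = len(frame)
    half = n // 2                      # bins up to the Nyquist frequency
    energies = [0.0] * num_bands
    for k in range(half):
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        energies[k * num_bands // half] += abs(coeff) ** 2
    return energies

def allocate_bits(energies, total_bits):
    """Greedy allocation: give the next bit to the band with the most
    remaining noise, assuming one bit lowers that noise by about 6 dB."""
    noise_db = [10 * math.log10(e + 1e-12) for e in energies]
    bits = [0] * len(energies)
    for _ in range(total_bits):
        worst = max(range(len(noise_db)), key=lambda b: noise_db[b])
        bits[worst] += 1
        noise_db[worst] -= 6.02
    return bits

# Test frame (assumed values): a strong 1 kHz tone plus a weak 9 kHz tone,
# 256 samples at a 32 kHz sample rate, split into 8 bands of 2 kHz each.
RATE = 32000
frame = [math.sin(2 * math.pi * 1000 * t / RATE)
         + 0.01 * math.sin(2 * math.pi * 9000 * t / RATE)
         for t in range(256)]
energies = band_energies(frame, 8)
bits = allocate_bits(energies, 32)
print("bits per band:", bits)
```

Bands that contain no signal receive no bits at all, which is where the saving comes from; a perceptual coder would additionally withhold bits from bands that are present but masked.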

Speech Coders (production)
  Assume speech is produced by a source-filter system (vocal folds/noise + vocal tract tube)
  Identify filter, type of source, then code parameters
  Takes advantage of slowly varying nature of vocal tract shape and other speech parameters

Future: Multi-Model Parametric Compressors?
  Analysis front end identifies source(s)
  Audio is (separated and) sent to optimal model(s)
  High compression
  Other knowledge
  Drawbacks: We don't know how to do all this yet

Sound Analysis and Classification
  Cochlear modeling
  Multi-feature analysis (Tzanetakis)
  Segmentation, classification, annotation, thumbnails

MIDI and Other Event Models
  Musical Instrument Digital Interface: represents music as notes and events and uses a synthesis engine to render it.
  An Edit Decision List (EDL) is another example.
    A history of source materials, transformations, and processing steps is kept.
    Operations can be undone or recreated easily.
    Intermediate non-parametric files are not saved.

Speaking of MIDI and scores, a brief aside on Computing History:
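The source-filter view of speech can be sketched as a decoder that turns a handful of slowly varying parameters back into samples. This is a minimal illustration, assuming a single two-pole resonator as the "vocal tract" and a pulse train or noise as the source; real speech coders use richer filters, and the frame values, function names, and 8 kHz rate below are hypothetical.

```python
import math
import random

def resonator_coeffs(freq, bandwidth, rate):
    """Coefficients of a two-pole resonator: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth / rate)      # pole radius (< 1, so stable)
    a1 = 2 * r * math.cos(2 * math.pi * freq / rate)
    a2 = -r * r
    return a1, a2

def synthesize(frames, rate=8000, frame_len=160):
    """Render frames of (pitch_hz, formant_hz, bandwidth_hz).
    pitch_hz > 0 selects a pulse-train source, pitch_hz == 0 a noise source."""
    rng = random.Random(0)                         # fixed seed for repeatability
    out = []
    y1 = y2 = 0.0                                  # filter state carried across frames
    phase = 0.0
    for pitch_hz, formant_hz, bandwidth_hz in frames:
        a1, a2 = resonator_coeffs(formant_hz, bandwidth_hz, rate)
        for _ in range(frame_len):
            if pitch_hz > 0:                       # voiced: one pulse per pitch period
                phase += pitch_hz / rate
                if phase >= 1.0:
                    phase -= 1.0
                    x = 1.0
                else:
                    x = 0.0
            else:                                  # unvoiced: low-level noise
                x = rng.uniform(-0.1, 0.1)
            y = x + a1 * y1 + a2 * y2
            y2, y1 = y1, y
            out.append(y)
    return out

# Three 20 ms frames: two voiced, one noisy (illustrative values only).
frames = [(120, 700, 100), (110, 500, 100), (0, 2500, 400)]
samples = synthesize(frames)
print(len(frames) * 3, "parameters ->", len(samples), "samples")
```

Only the three numbers per frame would need to be coded, which is why a production model can reach high compression when the source really is speech.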

History of Programmable Machines
  First programmable system was the early printing process developed in China circa 800 C.E.
    First program was perhaps the Chinese translation of the Buddhist Canon (the Tipitaka)
  Gutenberg's Printing Press (circa 1450)
    Main Contribution: Just a few basic instructions (smaller alphabet size) suffice.
  Jacquard's Loom (circa 1810)
    Punched cards stored the program for weaving patterns.

Wait: Gutenberg 1450, Jacquard 1810. Are we missing something here????? Nothing happened in 350 years?
  Huygens's Pendulum Clock (circa 1650)
    Main Contribution: Timing, clock ticks increase accuracy

Musical Machines:
  Barrel Organs (1500!)
  Music boxes (between)
  Player Pianos (c. 1700)
  Main Contributions:
    Drive cylinder or disk with pins (bits!!) which play notes at the right time
    Change disk -> change song (programmable!)

Charles Babbage (1822-64):
  Input -- Punched Cards
  Hardware -- general-purpose mechanical mathematical system (Analytical Engine) -- never built
  Could be programmed: a punched card could say "Go back 5 punched cards"
  Instructions could be executed repeatedly, or in a different order.

The Modern Computer
  von Neumann (1945), Princeton, NJ
  Basic idea still the same: a machine that can execute certain instructions.
  Machine instructions represented by sequences of 0's and 1's (Machine Language)
  Instructions stored in memory

Anyway, Event Based Music Representation: MIDI

MIDI and Other Scorefiles
  A musical score is a very compact representation of music
  Even the score itself can be compressed further
  Highest possible compression
  Encodes expression
  Drawbacks:
    Cannot guarantee the performance
    Cannot assure the quality of the sounds
    Cannot make arbitrary sounds (yet)

Event Based Representation: Enter General MIDI
  Guarantees a base set of instrument sounds, and a means for addressing them, but doesn't guarantee any quality
  Better Yet, Downloadable Sounds
    Download samples for instruments
    Does more to guarantee quality
    Drawbacks: Samples aren't reality

Event Based Representation: Downloadable Algorithms
  Specify the algorithm, the synthesis engine runs it, and we just send parameter changes
  Part of Structured Audio (MPEG-4)
  Can upgrade algorithms later
  Can implement scalable synthesis
  Drawbacks: Different algorithm for each class of sounds (but can always fall back on samples)
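To see how compact an event-based representation is, the sketch below encodes a five-note melody as MIDI note-on and note-off channel messages (three bytes each: a status byte of 0x90 or 0x80 plus the channel, then note number and velocity) and compares the size with the same duration of CD audio. The melody, the velocity of 100, and the helper names are assumptions for illustration; the timestamps are kept alongside the messages and not counted, whereas a real MIDI file would also store delta-times.

```python
def note_on(channel, note, velocity):
    """3-byte MIDI note-on message."""
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """3-byte MIDI note-off message (release velocity 0)."""
    return bytes([0x80 | channel, note, 0])

# (MIDI note number, duration in seconds): C D E F G, illustrative values
melody = [(60, 0.5), (62, 0.5), (64, 0.5), (65, 0.5), (67, 1.0)]

events = []          # list of (time in seconds, message bytes)
t = 0.0
for note, duration in melody:
    events.append((t, note_on(0, note, 100)))
    t += duration
    events.append((t, note_off(0, note)))

midi_bytes = sum(len(msg) for _, msg in events)
cd_bytes = int(t * 2 * 2 * 44100)     # stereo, 2 bytes per sample, 44.1 kHz

print("event bytes:", midi_bytes)
print("CD audio bytes for the same", t, "seconds:", cd_bytes)
```

The events say nothing about what the notes sound like, which is exactly the drawback the slides list: the receiver's synthesis engine decides the performance and the sound quality.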

Physical Modeling for Music
  Strings (plucked, struck, bowed)
  Winds (clarinet, flute, brass), voice
  Synthesizing Solids (O'Brien, Cook, and Essl, SIGGRAPH 01)
  Plates, membranes, bar percussion
  Shakers, scrapers
  The Voice

Physical Modeling: the Real World
  Sound Effects (PhOLISE)

Composition and Creation
  Garton: Rough Raga Riffs

Expression and Control
  Cook/Morrill Trumpet
  Lansky: mild und leise
  Music for Unprepared Piano: Bargar, Choi, Betts, Cook

Other Controllers
  Trueman: BoSSA
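As a concrete taste of the plucked-string entry on the Physical Modeling slide, here is a sketch of the Karplus-Strong algorithm, a well-known and very simple string model. The slides do not name it, so treat it as a swapped-in illustration: a noise burst circulates in a delay line and is averaged and slightly damped on each pass. The damping factor of 0.996 and the function name are assumptions.

```python
import random

def pluck(freq, duration, rate=44100, damping=0.996, seed=0):
    """Karplus-Strong plucked string: returns a list of samples in [-1, 1]."""
    rng = random.Random(seed)
    period = int(rate / freq)                      # delay-line length in samples
    # The "pluck": fill the delay line with a burst of white noise.
    line = [rng.uniform(-1.0, 1.0) for _ in range(period)]
    out = []
    for i in range(int(duration * rate)):
        j = i % period
        nxt = (j + 1) % period
        out.append(line[j])
        # Averaging two neighbours is a gentle low-pass filter, so high
        # harmonics die away faster than low ones, as on a real string.
        line[j] = damping * 0.5 * (line[j] + line[nxt])
    return out

tone = pluck(440, 0.5)
early = sum(s * s for s in tone[:1000])
late = sum(s * s for s in tone[-1000:])
print("samples:", len(tone), " energy early vs late:", early > late)
```

Because the delay line has an integer length, the pitch only approximates the requested frequency; the model's appeal is that two parameters (pitch and damping) stand in for a whole recorded note.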

PICOs (musical and real-world sonic controllers)
  K-Frog
  J-Mug
  P-Pedal
  PhilGlas
  P-Grinder
  T-shoe
  T-bourine
  Pico Glove
  P-Ray's Cafe

Audio and Computer Music
Questions?