English Project. Contents


1. Introduction
2. Many-Talker Prompt-File Distribution
3. Few-Talker Prompt-File Distribution
4. Very-Few-Talker Prompt Files

1. Introduction

This report documents the subjects, equipment, recording environments, materials and the file structure details involved in making the speech and laryngograph recordings for the English section of the SAM EUROM1 speech database.

The corpus consists of 4 components:

a) systematically structured C(C)VC monosyllables to be produced in isolation and in a number of controlled precursive and following contexts (see section 5.1).
b) selected numbers from 0-9999, such that all the phonotactic possibilities of the English number system were covered (see section 5.2).
c) short passages containing 5 thematically connected sentences (see section 5.3).
d) sentences composed to compensate the phoneme-frequency imbalance resulting from the thematic (i.e. not structurally orientated) composition of the passages (see section 5.4).

Different parts and increasing amounts of this material were recorded by three sets of native speakers, recruited from Southeast England:

i) Many Talker Set (MT) - (30 women, 30 men):
   100 numbers
   3 passages
   5 sentences

ii) Few Talker Set (FT) - (5 women and 5 men selected from MT):
   Isolated C(C)VCs
   5 x 100 numbers
   15 passages
   25 sentences

For the few talker data set, laryngograph signals were recorded together with the acoustic speech signal. However, for the C(C)VC material only the first of the five repetitions is available on CD5. See CD5_E.TXT for more information.

iii) Very Few Talker Set (VFT) - (1 woman and 1 man selected from FT):
   Contextualised C(C)VCs
   5 x Context words

For the very few talker data set, laryngograph data is available for all the recordings.

The recordings were carried out in anechoic rooms at University College London (UCL) and at the National Physical Laboratory (NPL), Teddington. SAM colleagues from UCL and NPL provided technical support in calibration tests of the recording room to comply with the conditions stipulated in the Recording Protocol document (SAM-RSRE-15, Dec. 1990). Files recorded before 30/5/91 were made in the Anechoic Chamber at UCL. Files recorded on or after 30/5/91 were made in the Anechoic Chamber at NPL.

Two operators shared the task of recording. Calibration was carried out for each subject prior to recording, and continual monitoring of each speaker's performance ensured that the recordings contain a minimum of deviations from the prompt text and a minimum of articulatory lapses. Any error noted by the operator led to a repeat recording of the prompt item (i.e. a block of CVCs, a block of 20 numbers, a five-sentence passage, or a block of five sentences). Inspection and backup (on Exabyte) of the recorded material followed immediately after each session.

The subjects were selected so that there was an equal number of women and men, as good a coverage of age groups, and as wide a range of voice types as possible (cf. SAM-UCL-030, May 1991).
There was also considerable variation in body size, and no direct means of calculating vocal-tract dimensions was available for the subjects who were not recorded using the laryngograph. For the FT and VFT subjects, who all gave simultaneous speech and laryngograph recordings, the precise positioning of the microphone relative to the subject's lips gives a basis for vocal tract length estimation (see Appendix A).

Age groupings of subjects are given in the following table:

Age Group  Sex     No.  Subject Codes
20-29      male    15   *MP, MR, MU, MV, MX, MY, MZ, NA, ND, NI, NJ, NK, NL, OG, OL
           female  13   *MC, *MD, MN, MS, MT, MW, NB, NM, NN, NS, NW, NY, OA
30-39      male     6   *MB, *, NE, NF, NT, NV
           female   5   *MK, ML, NH, OE, OJ
40-49      male     4   !MA, MM, MO, NO
           female   7   MQ, , , OF, OH, OI, OK
50 +       male     5   NC, NG, NQ, *NX, NZ
           female   5   !MJ, NP, NR, NU, *

Table 1: The subject codes given are those used in the speech-signal filenames. The codes preceded by * refer to the subjects belonging to the FT set, and those preceded by ! are those who were also in the Very-Few-Talker set. All subject details are given in section 7.

The distribution of texts by subject is given in the following tables. All MT-subjects produced one repetition of the numbers (prompt files N1-N5).

2. Many-Talker Prompt-File Distribution

Passage  Speakers
O1       MA MT NG NU OG
O2       MA MT NH NU OG
O3       MA MU NH NU OI
O4       MK MU NH NV OI
O5       MK MU NI NV OI
O6       MK MV NI NV OH
O7       MB MV NI NW OH
O8       MB MV NJ NW OH
O9       MB MW NJ NW OJ
O0       ML MW NJ NX OJ
P1       ML MW NK NX OJ
P2       ML MX NK NX OK
P3       MC MX NK NY OK
P4       MC MX NL NY OK
P5       MC MY NL NY OL
P6       MM MY NL NZ OL
P7       MM MY NM NZ OL
P8       MM MZ NM NZ MJ
P9       MD MZ NM OA MJ
P0       MD MZ NN OA MJ
Q1       MD NA NN OA
Q2       MN NA NN
Q3       MN NA NO
Q4       MN NB NO
Q5       MO NB NO
Q6       MO NB NP
Q7       MO NC NP
Q8       MP NC NP
Q9       MP NC NQ
Q0       MP ND NQ
R1       MQ ND NQ
R2       MQ ND NR
R3       MQ NE NR
R4       MR NE NR OE
R5       MR NE NS OE
R6       MR NF NS OE
R7       MS NF NS OF
R8       MS NF NT OF
R9       MS NG NT OF
R0       MT NG NT OG

Sentence  Speakers
F1        MA MQ NA NK NU
F2        MK MR NB NL NV OE
F3        MB MS NC NM NW OF
F4        ML MT ND NN NX OG
F5        MC MU NE NO NY OI
F6        MM MV NF NP NZ OH
F7        MD MW NG NQ OA OJ
F8        MN MX NH NR OK
F9        MO MY NI NS OL
F0        MP MZ NJ NT MJ
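The Many-Talker tables map each prompt file to the speakers who recorded it. To find the prompt files for a given speaker, the table can be inverted. The following is a minimal sketch and not part of the original report: it uses only the first three passage rows (O1 to O3) as sample data, and the names PASSAGE_SPEAKERS and passages_by_speaker are illustrative assumptions.

```python
from collections import defaultdict

# Sample rows copied from the Many-Talker passage table (O1 to O3 only).
# The dictionary name is an assumption made for this illustration.
PASSAGE_SPEAKERS = {
    "O1": ["MA", "MT", "NG", "NU", "OG"],
    "O2": ["MA", "MT", "NH", "NU", "OG"],
    "O3": ["MA", "MU", "NH", "NU", "OI"],
}


def passages_by_speaker(table):
    """Invert a passage -> speakers table into speaker -> passages."""
    inverted = defaultdict(list)
    for passage, speakers in table.items():
        for speaker in speakers:
            inverted[speaker].append(passage)
    return dict(inverted)


by_speaker = passages_by_speaker(PASSAGE_SPEAKERS)
print(by_speaker["MA"])  # ['O1', 'O2', 'O3']
print(by_speaker["NH"])  # ['O2', 'O3']
```

With the full table loaded in the same way, each speaker's list would give the passage prompt files to look for under that subject code.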

3. Few-Talker Prompt-File Distribution

The Few-Talker set subjects were drawn from the Many-Talker Group. The relations between the Many-Talker and Few-Talker codes are as follows:

MA = FA
MB = FB
MC = FC
MD = FG
   = FE
MJ = FJ
MK = FF
MP = FD
NX = FI
   = FH

All FT-subjects recorded 5 repetitions of the numbers (N1 - N5) and 5 repetitions of the isolated C(C)VC material (S1 - S5).

Passages  Speakers
O1 - O5   FA FE FD FI
O6 - O0   FA FF FD FI
P1 - P5   FA FG FF FI
P6 - P0   FC FG FF FJ
Q1 - Q5   FB FC FG FJ
Q6 - Q0   FB FC FJ FH
R1 - R5   FB FE FH
R6 - R0   FE FD FH

Sentences  Speakers
F1 - F5    FA FB FG FE FI
F6 - F0    FC FJ FF FD FH

4. Very-Few-Talker Prompt Files

The two VFT subjects were members of both the Many-Talker set and the Few-Talker set. The same subject codes were used in the Very-Few-Talker set as in the Few-Talker set. As immediately above, FA corresponds to MA and FJ to MJ. These two subjects recorded ALL the contextualised C(C)VC stimuli (files T1 - T5, U1 - U5, V1 - V5, W1 - W5, X1 - X5) and 5 repetitions of the context words (Z1) in isolation. All their recordings were made using both condenser microphone and laryngograph signals.
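The code relations in sections 3 and 4 can be expressed as a small lookup table. The following is a minimal sketch under stated assumptions and not part of the original report: the names MT_TO_FT, FT_TO_MT, VFT_CODES and talker_sets are illustrative. The Many-Talker codes corresponding to FE and FH are not given in the list above, so those two pairs are omitted rather than guessed.

```python
# Many-Talker -> Few-Talker code pairs as listed in section 3.
# FE and FH are omitted because their Many-Talker codes are not given.
MT_TO_FT = {
    "MA": "FA", "MB": "FB", "MC": "FC", "MD": "FG",
    "MJ": "FJ", "MK": "FF", "MP": "FD", "NX": "FI",
}

# Reverse lookup: Few-Talker code -> Many-Talker code.
FT_TO_MT = {ft: mt for mt, ft in MT_TO_FT.items()}

# The two Very-Few-Talker subjects keep their Few-Talker codes (section 4).
VFT_CODES = {"FA", "FJ"}


def talker_sets(mt_code):
    """Return the talker sets (MT, FT, VFT) that a Many-Talker code belongs to."""
    sets = ["MT"]
    ft_code = MT_TO_FT.get(mt_code)
    if ft_code is not None:
        sets.append("FT")
        if ft_code in VFT_CODES:
            sets.append("VFT")
    return sets


print(talker_sets("MA"))  # ['MT', 'FT', 'VFT']
print(talker_sets("MK"))  # ['MT', 'FT']
print(talker_sets("MR"))  # ['MT']
print(FT_TO_MT["FJ"])     # MJ
```

Because two pairs are left out, a Many-Talker code that is absent from MT_TO_FT is reported here as MT only, which may under-report the two subjects whose codes are missing.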