ASPA: Automatic speech-pause analyzer*

D. GERVER† and G. DINELEY
University of Durham, Durham, England

This paper describes a suite of computer programs for monitoring patterns of speaking and pausing from twin-channel tape-recorder output.

The programs to be described below were developed in connection with research on patterns of speaking and pausing in simultaneous conference interpretation. The aim was to monitor, in real time, twin-channel tape recordings of speech in order to produce as output a digital record of the presence or absence of speech on each channel, together with the length of time the channels remained in any particular state:

Channel 1 on - Channel 2 on
Channel 1 on - Channel 2 off
Channel 1 off - Channel 2 on
Channel 1 off - Channel 2 off

The programs were developed with the following points in mind:

(1) In order to provide a more consistent waveform for sampling, the envelope of the speech waveform, rather than the speech waveform itself, was to be sampled.
(2) In order to reduce the amount of data handled by the programs, data should be initially processed in the sampling program.
(3) Since the signal being sampled (i.e., the envelope of the speech waveform) would be of a low frequency, and in order to reduce the amount of data handled, a relatively slow sampling rate (100 times/sec) was decided upon.
(4) For speed and convenience, the data were to be stored in disk files.
(5) The criterion for a pause was to be 1/4 sec. Any break in speech on either channel of less than 1/4 sec was to be discounted, and the interval regarded as continuous with the speech signal on either side of it on the relevant channel.
(6) For simplicity, and in order to facilitate debugging, the programs should be developed in modular form.

ASPA: The Programs

Since the actual details of interface sampling, disk storage routines, etc., will depend upon the user's own particular computer configuration, the programs are presented in flow-chart form with textual descriptions. The flow chart in Fig. 1 illustrates the sequence of eight programs.
These are described more fully below.

(1) CLEAR

CLEAR works entirely with disk output. The sole purpose of CLEAR is to clear all disk data files used to zero. This is necessary at the beginning of each run in order to avoid pickup of data from previous runs. See Fig. 2.

(2) OMOTS

OMOTS consists essentially of three programs that sample the digitized envelope of the speech waveform at the interface and store the results of the sampling on disk. The configuration of equipment necessary for operating this program is shown in Fig. 3 and is explained in the text following.

EQUIPMENT

(1) Computer: IBM 1130, with 8,000 16-bit words of core storage and disk drive holding 512,000 words; also a 1134 paper-tape reader and a 1131 console typewriter output.
(2) Interface: With analog input and output, variable-rate interrupt generator, and clock.
(3) Oscilloscope: Four-track slow scan.
(4) Stereo tape recorder.
(5) Two envelope follower circuits, each consisting of a simple rectifier with capacitor smoothing and a decay resistor.

Fig. 1. Sequence of programs and operations in ASPA.

*The research for which these programs were developed was supported by the Social Science Research Council.
†Requests for reprints should be addressed to Dr. D. Gerver, Department of Psychology, Queen's University, Kingston, Ontario, Canada.

Behav. Res. Meth. & Instru., 1972, Vol. 4 (5) 265
Fig. 2. CLEAR.

The composite program first reads a piece of paper tape which defines the maximum number of disk file sectors which can be used, the trigger levels, and bias for sampling the envelope of the speech signal. The program then prepares itself to accept interrupts and waits. This allows the operator to prepare the equipment, i.e., advance the experimental tapes to the point required and start the replay. When the Program Start key on the computer keyboard is pressed, the generation by TMON of Level 3 interrupts at the interface is initiated. These are timed to arrive at 10-msec intervals.

On receipt of an interrupt, the program samples two of the interface analog inputs and decides whether they exceed a certain critical level, which is independently variable on both channels by means of the trigger levels and bias defined on the paper tape. The program then decides whether one or both channels have changed from one state to another since the previous sample. It constructs a data set of the form TTTS and stores it in a buffer in the program, where TTT = time in 1/100 sec since the last change, and S = present state of the channels:

0 = both off
1 = Channel 1 on, 2 off
2 = Channel 1 off, 2 on
3 = both on

This program has two buffers so that when the first is full it can interchange them, storing data in the second while writing the contents of the first onto disk. This process of double buffering enables the program to continue sampling unimpaired, even though it is simultaneously sampling and transferring data to disk and despite the fact that data are being produced from an irregular sampling of the channels.
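As an illustration of the TTTS encoding described above, the following Python sketch thresholds each pair of envelope samples, derives the state code S, and emits one word whenever the state changes. It is not the original IBM 1130 routine. The function names, the packing of a word as TTT x 10 + S, and the assumption that the record starts with both channels off are all made here for illustration only. Intervals longer than 999 ticks, which would overflow the three-digit time field, are not handled.

```python
from typing import Iterable, List, Tuple


def channel_state(ch1_on: bool, ch2_on: bool) -> int:
    """State code S: 0 = both off, 1 = Channel 1 only, 2 = Channel 2 only, 3 = both on."""
    return (1 if ch1_on else 0) + (2 if ch2_on else 0)


def encode_ttts(
    samples: Iterable[Tuple[float, float]],
    trigger1: float,
    trigger2: float,
    initial_state: int = 0,  # assumption: the record starts with both channels off
) -> List[int]:
    """Turn 10-msec envelope samples into packed TTTS words.

    Each word is TTT * 10 + S, where TTT is the number of 1/100-sec ticks
    since the last change and S is the state just entered.
    """
    words: List[int] = []
    state = initial_state
    ticks = 0
    for level1, level2 in samples:
        ticks += 1  # one interrupt = one 1/100-sec tick
        new_state = channel_state(level1 > trigger1, level2 > trigger2)
        if new_state != state:
            words.append(ticks * 10 + new_state)
            state = new_state
            ticks = 0
    return words


# Three silent samples, three samples with Channel 1 active, then silence again.
demo = [(0, 0), (0, 0), (0, 0), (9, 0), (9, 0), (9, 0), (0, 0)]
print(encode_ttts(demo, trigger1=5, trigger2=5))  # [41, 30]
```

In the demonstration, the word 41 records that Channel 1 came on 4 ticks after the start, and the word 30 records that both channels went off 3 ticks after that.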
On receiving a signal from the keyboard (i.e., on termination of the relevant portion of tape being analyzed), the program commands the interface to stop generating interrupts, writes the contents of the last buffer onto the disk, and ends, leaving the computer ready for the next program.

The program also has a facility for clamping both channels so that neither can change state more often than once per predetermined interval (in this case 5/100 sec); this was incorporated to reduce the amount of data produced. OMOTS and the sequence of subprograms are shown in Figs. 4 and 5.

Fig. 3. Input configuration.
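The text does not say how the clamping facility is implemented. One plausible reading, sketched below in Python, is a per-channel hold-off: once a channel has changed state, further changes are ignored until a minimum number of ticks has elapsed. The function name `clamp_channel`, its parameters, and the hold-off logic itself are assumptions made for illustration, not a description of the original routine.

```python
from typing import List, Sequence


def clamp_channel(raw: Sequence[bool], min_ticks: int = 5, initial: bool = False) -> List[bool]:
    """Suppress state changes that follow the previous accepted change too closely.

    raw       -- thresholded on/off value of one channel at each 1/100-sec tick
    min_ticks -- minimum spacing between accepted changes (5 ticks = 5/100 sec)
    initial   -- assumed state of the channel before the first sample
    """
    clamped: List[bool] = []
    state = initial
    since_change = min_ticks  # allow the very first change immediately
    for on in raw:
        if on != state and since_change >= min_ticks:
            state = on
            since_change = 0
        clamped.append(state)
        since_change += 1
    return clamped


# A one-tick dropout right after onset is ignored; the later, sustained
# switch-off is accepted.
raw = [True, False, True, True, True, True, True, False, False, False, False, False]
print(clamp_channel(raw))
```

With this reading, a channel that flickers within 5/100 sec of its last change produces no extra data words, which is consistent with the stated purpose of reducing the amount of data.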
The following inputs and outputs are employed with OMOTS:

(1) Paper-tape input describing various parameters: length of disk file to be used, trigger level, bias for differential trigger levels, if required.
(2) Analog inputs on the interface for monitoring tape recorder output. It should be stressed that the tapes to be analyzed should be as free as possible of any background noise. Where necessary, a bandpass filter should be used to eliminate unwanted high- or low-frequency interference.
(3) Disk output, for recording the data.
(4) Analog outputs from the interface. These (as shown in Fig. 3) are displayed on two channels of the four-channel oscilloscope and provide the operator with a picture of what the program "thinks it is hearing." These can be compared with the auditory signals and their displays on the other two oscilloscope channels, and the level of the tape-recorder output can be adjusted until the operator is satisfied that the computer is satisfactorily tracking the recorded signals.

Before analyzing each tape, OMOTS is run in order to set the appropriate levels. This is done using a test tape of known signal-silence times, and the actual experimental tape in question. When setting up the equipment for use, the test tape should be prepared and run through ASPA, comparing the final output with the known test tape events and event times.

Fig. 4. OMOTS.

(3) SHUFF

This program (see Fig. 6) reads from disk and writes back onto disk. It inverts the data produced by OMOTS, converting an assembler array into a FORTRAN-compatible array. It also converts the data words into a new form, TTTS, by building up a new word, using the time from the present data word and the state from the previous data word. One data word is lost in this process.
This program is necessary, since the data produced by OMOTS represent the time the system was in a state before it changed to its current state, whereas the final analysis requires the time the system is actually in a particular state.
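The re-pairing performed by SHUFF can be sketched in Python as follows. The sketch covers only the rebuilding of data words (time from the present word, state from the previous word) and omits the array inversion and all disk handling. The function name `shuff` and the packing of a word as TTT x 10 + S are assumptions made for illustration.

```python
from typing import List


def shuff(words: List[int]) -> List[int]:
    """Re-pair packed TTTS words so each time goes with the state it was spent in.

    An input word holds (time since last change, state just entered), so the
    time in word k was actually spent in the state named by word k - 1.
    The output therefore has one word fewer than the input.
    """
    out: List[int] = []
    for prev, cur in zip(words, words[1:]):
        ticks = cur // 10       # time from the present word
        prev_state = prev % 10  # state from the previous word
        out.append(ticks * 10 + prev_state)
    return out


# Input: Channel 1 comes on after 4 ticks, both go off 3 ticks later,
# Channel 2 comes on 12 ticks after that.
print(shuff([41, 30, 122]))  # [31, 120]
```

The output reads directly as durations: state 1 lasted 3 ticks and state 0 lasted 12 ticks. The first input time has no preceding state to attach to, which is why one data word is lost.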
Fig. 5. OMOTS (SET, TMON, RET).
Fig. 6. SHUFF.

Fig. 7. DROPM.
(4) DROPM

This program (Fig. 7) reads from disk and writes back onto disk, preparing the data from SHUFF so that pauses of less than 1/4 sec can be eliminated by effectively switching back on any channel which has switched off for less than 1/4 sec. This interval can, of course, be varied to suit the user.

(5) CMPRS

This program (Fig. 8) also works from disk to disk, taking the data from DROPM and compressing any sequences of data words which have the same state into one data word which has as its time the sum of sequence times.

Fig. 8. CMPRS.

(6) CLEAN

CLEAN is used to eliminate noise at the start of the record. This could be due to switching on the tape recorder, noise on the tape, etc. The program (Fig. 9) prints out the start (e.g., the first 10 events and times) of the data produced by the previous programs, and waits for a number to be typed in. The number is selected by the operator on inspecting the data printout. For instance, in the experiment for which these programs were developed, all data sets should have started with a short interval (e.g., .8 sec) with Channel 1 on, Channel 2 off. If the data set showed 0102 0053 0041 0033 0801, the operator would type in the last figure, and all preceding data words would be eliminated for the final analyses.

Fig. 9. CLEAN.

(7) CRASH

CRASH reads from disk, analyzes the data, and prints the results. Since the type of analysis required (e.g., time × state frequency distributions, means and variances of times spent in each state, etc.) depends on the user, no description of this program is provided here.

(8) SHOWD

SHOWD reads from disk files and prints the stored data. Since this depends on the user's individual requirements, no further description is provided.
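The DROPM and CMPRS steps described above can be sketched in Python as follows. Words are held as (ticks, state) pairs rather than packed TTTS values, so that summed times are not limited to three digits. The function names, the `min_pause` parameter, and the rule that only gaps with speech on both sides are filled are assumptions made for illustration, the last of these being based on the stated pause criterion. Disk handling is omitted.

```python
from itertools import groupby
from typing import List, Tuple

Word = Tuple[int, int]  # (ticks in 1/100 sec, state code 0-3)


def dropm(words: List[Word], min_pause: int) -> List[Word]:
    """Switch a channel back on wherever it was off for fewer than min_pause ticks."""
    states = [s for _, s in words]
    n = len(words)
    for bit in (1, 2):  # bit 1 = Channel 1, bit 2 = Channel 2
        i = 0
        while i < n:
            if states[i] & bit:  # channel is on in this word
                i += 1
                continue
            j, gap = i, 0
            while j < n and not states[j] & bit:  # measure the whole off-run
                gap += words[j][0]
                j += 1
            # Fill only gaps that have speech on both sides.
            if i > 0 and j < n and gap < min_pause:
                for k in range(i, j):
                    states[k] |= bit
            i = j
    return [(ticks, s) for (ticks, _), s in zip(words, states)]


def cmprs(words: List[Word]) -> List[Word]:
    """Merge consecutive words with the same state, summing their times."""
    return [(sum(t for t, _ in run), state)
            for state, run in groupby(words, key=lambda w: w[1])]


data = [(80, 1), (10, 0), (50, 1), (40, 3), (5, 1), (60, 3), (30, 0)]
print(cmprs(dropm(data, min_pause=25)))  # [(140, 1), (105, 3), (30, 0)]
```

In the demonstration, the 10-tick break on Channel 1 and the 5-tick break on Channel 2 are both shorter than the 25-tick criterion and are filled in, after which the runs of equal states collapse into single words.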