Integrating Word Processing, Term Management, and Machine Translation

Deseret Language and Linguistic Society Symposium, Volume 8, Issue 1, Article 16 (3-26-1982)
Alan K. Melby

BYU ScholarsArchive Citation: Melby, Alan K. (1982) "Integrating Word Processing, Term Management, and Machine Translation," Deseret Language and Linguistic Society Symposium: Vol. 8: Iss. 1, Article 16. Available at: http://scholarsarchive.byu.edu/dlls/vol8/iss1/16

INTEGRATING WORD PROCESSING, TERM MANAGEMENT, AND MACHINE TRANSLATION

Alan K. Melby
Linguistics Department
Brigham Young University

At last year's DLLS symposium (March 1981), the author proposed a "suggestion box" translator aid. In October 1981, the system became operational and was tested by the students in a translation seminar. Further consideration of the problem of computer aids for translation, together with the many good ideas put forth by the seminar students, has resulted in a proposal for a significantly expanded system which includes the "suggestion box" aid as one component. This new translator aid system integrates word processing, term management, and machine translation.

Traditionally, machine translation systems were designed with the long-range goal of replacing the human translator. The system proposed in this paper, on the other hand, is designed to be a tool for a human translator, never a replacement.

The new system will have three levels. Level two corresponds to the "suggestion box" aid of last year. Level one is a lower level which does not even require the source text to be available in machine-readable form. Level three is the highest level and requires a remote machine translation system which can operate without the presence of a translator. Levels one and two are now being programmed on the IBM 370/138 computer at the BYU Humanities Research Center. Work on level three will begin next year.

THE "ALL OR NOTHING" SYNDROME

Originally, fully automatic high-quality translation was the only goal of research in machine translation. Until recently, there seemed to be a widely shared assumption that the only excuse for the inclusion of a human translator in a machine translation system was as a temporary, unwanted appendage to be eliminated as soon as research progressed a little further. This "all or nothing" syndrome drove early machine translation researchers to aim for a fully automatic system or nothing at all.

It is now quite respectable in computational linguistics to develop a computer system which is a tool used by a human expert to access information helpful in arriving at a diagnosis or other conclusion. Perhaps, then, it is time to entertain the possibility that it is also respectable to develop a machine translation system which includes sophisticated linguistic processing yet is designed to be used as a tool for the human translator.

If each sentence of the final translation is expected to be a straight machine translation or at worst a slight revision of a machine-translated sentence, then disappointment is probable. After experimentation, Brinkmann concluded that "the post-editing effort required to provide texts having a correctness rate of 75 or even 80 percent with the corrections necessary to reach an acceptable standard of quality is unjustifiable as far as expenditure of money and manpower is concerned" (Brinkmann, 1980). Thus, a strict post-edit approach must be nearly perfect or it is almost useless. Many projects start out with high goals, assuming that post-editing can surely rescue them if their original goals are not achieved. But even post-editing may not make the system viable.

A PROPOSED ALTERNATIVE

This paper proposes an alternative to the "all or nothing" approach: anticipate from the beginning that not every sentence of every text will be translated by computer and find its way to the target text with little or no revision. Then an effort can be made from the beginning to provide for a smooth integration of human and machine translations. The proposed translator-aid system (TAS) will have three integrated levels of aid under the control of the translator.
We will now describe the three levels.

Level one translator aids can be used immediately even without the source text being in machine-readable form. In other words, the translator can sit down with a source text on paper and begin translating much as if at a typewriter. Level one includes a word processor with integrated terminology aids. For familiar terms that recur, there is a monolingual expansion code table which allows the user to insert user-defined abbreviations in the text and let the machine expand them. This feature is akin to the "macro" capability on some word processors. The key can be several characters long instead of a single control character, so the number of expansion codes available is limited principally by the desire of the translator. Level one also provides access to a bilingual terminology data bank. There is a term file in the microcomputer itself under the control of the individual translator. The translator may also have access to a larger, shared term bank (through telecommunications or a local network). Level one is similar to a translator aid proposed by Leland Wright, a well-known professional translator. Ideally, the translator would also have access to a data base of texts (both original and translated) which may be useful as research tools.

Level two translator aids require the source text to be in machine-readable form. Included in level two are utilities to process the source text according to the desires of the translator. For example, the translator may run across an unusual term and request a list of all occurrences of that term in that text. Level two also includes a "suggestion box" option (Melby, 1981) which the translator can invoke. This feature causes each word of the current text segment to be automatically looked up in the term file and displays any matches in a field of the screen called the suggestion box. If the translator opts to use the suggested translation of a term, a keystroke or two will insert it into the text at the point specified by the translator. If the translator desires, a morphological routine can be activated to inflect the term according to evidence available in the source and target segments.
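The expansion code table of level one and the suggestion box lookup of level two can be sketched roughly as follows. This is only an illustrative sketch, not the system described in the paper: the `#` marker that introduces an abbreviation, the sample table entries, and the function names `expand_codes` and `suggestion_box` are all assumptions made for the example, and the morphological routine is left out.

```python
import re

# Hypothetical expansion-code table (level one): abbreviation -> full text.
EXPANSION_CODES = {
    "mt": "machine translation",
    "tb": "terminology bank",
}

# Hypothetical bilingual term file (level two): source term -> target term.
TERM_FILE = {
    "traduction": "translation",
    "ordinateur": "computer",
}


def expand_codes(text, codes=EXPANSION_CODES, marker="#"):
    """Replace each marker-prefixed abbreviation with its expansion.

    Keys may be several characters long. Unknown codes are left untouched.
    The marker character is an assumption of this sketch.
    """
    pattern = re.compile(re.escape(marker) + r"(\w+)")

    def _substitute(match):
        return codes.get(match.group(1), match.group(0))

    return pattern.sub(_substitute, text)


def suggestion_box(segment, term_file=TERM_FILE):
    """Look up every word of a source segment in the term file.

    Returns the (source word, suggested translation) pairs that matched,
    in the order the words occur in the segment.
    """
    words = re.findall(r"\w+", segment.lower())
    return [(word, term_file[word]) for word in words if word in term_file]


if __name__ == "__main__":
    print(expand_codes("The #mt system consults a #tb."))
    print(suggestion_box("La traduction par ordinateur"))
```

A real term file would presumably hold multi-word terms and inflected forms, which a word-by-word dictionary lookup like this one does not handle.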
Level three translator aids integrate the translator work station with a full-blown machine translation (MT) system. The MT component can be any machine translation system that includes a self-evaluation procedure. The system uses that procedure to assign to each of the translated sentences a problem rating (e.g. "A" means no detected problems, "B" means some uncertainty about parsing or semantic choices made, "C" means probable flaw, and "D" means severely deficient).

The actual machine translation for level three is done remotely on a separate computer without the direct involvement of a human translator. Then the segmented source text and the machine translation for each segment, together with its self-assigned "grade", are placed on a diskette and sent to the translator.

The translator works at a small station which, ideally, is a self-contained microcomputer programmed to support all three levels of aid. Level one, as mentioned previously, requires no diskette containing source text. This means that at level one, the translator can get straight to work on a new document. At level two, a diskette containing source text is needed before the translator can begin work. And at level three, a diskette containing source text and machine translation is needed before work can begin.

At level three, on any segment, the translator may request to see the machine translation of that segment. If it looks good, the translator can pull it down into the work area, revise it as needed, and thus incorporate it into the translation being produced. Or the translator may request to see all those machine translations that have a rating above a specified threshold (e.g. above "C"). Of course, the translator is never obliged to use the machine translation unless the translator feels it is more efficient to use it than to translate manually. No pressure is needed other than the pressure to produce rapid, high-quality translations. If using the machine translations makes the translation process go faster and better, then the translator will naturally use them.

A positive aspect of this three-level approach is that while level three is dramatically more complex linguistically and computationally than level two, level three appears to the translator to be very similar to level two. Level two presents key terms in the sentence; level three presents whole sentences.
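The request to see only those machine translations rated above a threshold can be illustrated with a short sketch. It assumes the four-letter scale from the rating example ("A" best, "D" worst) and reads "above" as strictly better than the threshold. The `Segment` record and the function names are hypothetical, not part of the proposed system.

```python
from dataclasses import dataclass

# Rating scale ordered from best to worst, following the A-D example.
GRADE_ORDER = "ABCD"


@dataclass
class Segment:
    """One unit delivered on the diskette (field names are assumptions)."""
    source: str
    machine_translation: str
    grade: str  # one of "A", "B", "C", "D"


def better_than(grade, threshold):
    """True when `grade` is strictly better than `threshold`."""
    return GRADE_ORDER.index(grade) < GRADE_ORDER.index(threshold)


def qualifying(segments, threshold="C"):
    """Return the segments whose machine translation is rated above the threshold."""
    return [seg for seg in segments if better_than(seg.grade, threshold)]


if __name__ == "__main__":
    segments = [
        Segment("source 1", "target 1", "A"),
        Segment("source 2", "target 2", "C"),
        Segment("source 3", "target 3", "B"),
        Segment("source 4", "target 4", "D"),
    ]
    for seg in qualifying(segments, threshold="C"):
        print(seg.grade, seg.machine_translation)
```

Keeping the threshold as a parameter reflects the point that the translator, not the system, decides how much machine output to look at.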
At level three, any segment which does not have a qualifying machine translation will cause a smooth, automatic shift to level two for that segment and back to level three for the next qualifying segment. So, when good level three segments are available, they can speed up the translation considerably, but their absence does not stop the translation process or even greatly hinder it. Thus, a multi-level system can be put into production much sooner than a conventional post-edit system. And the sooner a system is put into production, the sooner useful feedback is obtained from the users.
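The automatic shift between levels can be pictured as a per-segment decision. The following is a minimal sketch under stated assumptions: grades run from "A" (best) to "D" (worst), a segment qualifies when its grade is strictly better than the threshold, and the term file is a plain word-to-word dictionary. The function name `aid_for_segment` and its return shape are invented for the example.

```python
GRADE_ORDER = "ABCD"  # best to worst; assumed rating scale


def aid_for_segment(source, machine_translation, grade, term_file, threshold="C"):
    """Choose the aid shown for one segment.

    Returns (3, machine translation) when a qualifying machine translation
    exists, otherwise falls back to (2, term suggestions for the segment).
    """
    has_mt = machine_translation is not None and grade is not None
    if has_mt and GRADE_ORDER.index(grade) < GRADE_ORDER.index(threshold):
        return 3, machine_translation
    # Fallback: level-two suggestion box built from the term file.
    suggestions = [
        (word, term_file[word])
        for word in source.lower().split()
        if word in term_file
    ]
    return 2, suggestions


if __name__ == "__main__":
    term_file = {"ordinateur": "computer"}  # hypothetical entries
    print(aid_for_segment("Un ordinateur", "A computer", "A", term_file))
    print(aid_for_segment("Un ordinateur", "One computers", "D", term_file))
```

Because the fallback needs only the source text and the term file, a segment with a poor or missing machine translation still gets level-two help, which is the property the paragraph above relies on.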

CONCLUSION

The multi-level approach described in this paper is designed to please (a) the sponsors (because the system is useful early in the project and becomes more useful with time), (b) the users (because they are in control and choose the level of aid), and (c) the linguists and programmers (because they are not pressured to make compromises just to get automatic translation on every sentence). Future papers will report on progress and problems in the design and implementation of the translator aid system described in this paper.

REFERENCES

(1) Andreyewski, Alexander, Translation: Aids, Robots, and Automation, META, Vol. 26, No. 1 (March 1981) 57-66.
(2) Baudot, Jean, Andre Clas, and Irene Gross, Un modele de mini-banque de terminologie bilingue, META, Vol. 26, No. 4 (1981) 315-331.
(3) Boitet, Ch., P. Chatelin, P. Daun Fraga, Present and Future Paradigms in the Automatized Translation of Natural Languages, in: COLING80 (Tokyo, 1980).
(4) Brinkmann, Karl-Heinz, Terminology Data Banks as a Basis for High-Quality Translation, in: COLING80 (Tokyo, 1980).
(5) Kay, Martin, The Proper Place of Men and Machines in Language Translation, Xerox Palo Alto Research Center Report (October 1980).
(6) Lippman, Erhardt, Computer Aids for the Human Translator, Report presented at the VIII World Congress of FIT, Montreal (1977).
(7) Melby, Alan K., Melvin R. Smith, and Jill Peterson, ITS: Interactive Translation System, in: COLING80 (Tokyo, 1980).
(8) Melby, Alan K., Linguistics and Machine Translation, in: James Copeland and Philip Davis (eds.), The Seventh LACUS Forum 1980 (Hornbeam Press, Columbia, SC, 1981).
(9) Melby, Alan K., A Suggestion Box Translator Aid, in: Proceedings of the annual symposium of the Deseret Language and Linguistic Society (Brigham Young University, Provo, Utah, 1981).