Submitted by: National Virtual Translation Center (NVTC)

Speaker: Carol Van Ess-Dykema and Flo Reeder

Topic: Paralinguist Assessment Decision Factors For Multi-Engine Machine Translation Output

This study examines whether Machine Translation (MT) enables translators to translate faster while producing better quality translations than they would without MT. It examines how well developers’ automatic metrics correlate with a human translator’s ability to post-edit a text. The study also seeks decision factors that enable a translation professional, known as a Paralinguist, to determine whether MT output is of sufficient quality to serve as a “seed translation” for translators. Unlike developers’ metrics, these decision factors must function without a reference translation. This presentation reports the results of that study.

The study consists of two investigations. The first investigation answers the question: can translators post-edit MT-produced “seed translations” with increased speed and accuracy? The first step is to machine translate candidate texts, selected on the basis of subject and genre. Translators are then asked to post-edit the MT output, and their words-per-hour translation rate is measured. Next, the post-edited MT output is assessed by quality control personnel using a US Government assessment standard. Analysis then compares translator speed and accuracy under test and control conditions. Additionally, developers’ MT metrics are compared with translator words per hour, the translators’ opinion of the post-editing activity, and the quality control score.
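
As a rough illustration of that final analysis step, the Python sketch below correlates a developer metric with the measured outcomes. The figures are invented placeholders, not data from the study.

```python
# Minimal sketch of the correlation analysis described above, under
# assumed inputs: per-document developer metric scores (e.g. BLEU)
# alongside measured translator words-per-hour and QC scores.
# All values are illustrative placeholders, not study data.
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical per-document measurements (one entry per candidate text).
bleu         = [0.31, 0.42, 0.18, 0.55, 0.27]   # developer metric
words_per_hr = [410, 520, 300, 610, 380]        # post-editing throughput
qc_score     = [2.5, 3.0, 2.0, 3.5, 2.5]        # QC assessment score

print("BLEU vs. words/hour:", round(pearson(bleu, words_per_hr), 3))
print("BLEU vs. QC score:  ", round(pearson(bleu, qc_score), 3))
```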

The second investigation answers the question: which decision factors help a Paralinguist determine whether MT output is post-editable? This investigation starts with the MT output from the first investigation and uses the scores determined there. Candidate decision factors are then analyzed for correlation with translator words per hour, the translators’ opinion of the post-editing activity, and the quality control score. Because a Paralinguist will not have the benefit of a reference translation, part of this study is a search for easily calculated metrics that require no reference translation yet indicate a document’s suitability for post-editing.
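
The sketch below illustrates what such reference-free factors might look like. The three factors shown (out-of-vocabulary rate, mean sentence length, type/token ratio) are assumptions for illustration, not the factors identified by the study.

```python
# Illustrative reference-free candidate decision factors, computed
# directly from the MT output with no reference translation required.
import re

def decision_factors(mt_output: str, known_vocab: set[str]) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", mt_output) if s.strip()]
    tokens = re.findall(r"[A-Za-z']+", mt_output.lower())
    oov = [t for t in tokens if t not in known_vocab]
    return {
        "oov_rate": len(oov) / len(tokens) if tokens else 0.0,
        "mean_sentence_len": len(tokens) / len(sentences) if sentences else 0.0,
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
    }

vocab = {"the", "report", "was", "submitted", "on", "time"}
print(decision_factors("The report was submitted on time. Xzqt frobnil.", vocab))
```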

Submitted by: Language Weaver, SDL

Speaker: Daniel Marcu

Contributors: Dr. Kathleen Egan, Chuck Simmons, Ning-Ning Mahlmann

Topic: Utilizing Automated Translation with Quality Scores to Increase Productivity

Automated translation can assist with a variety of translation needs in government, from speeding up access to information for intelligence work to helping human translators increase their productivity. However, government entities need to have a mechanism in place so that they know whether or not they can trust the output from automated translation solutions.

In this presentation, Language Weaver will present a new capability, TrustScore: an automated scoring algorithm that communicates how good the automated translation is, using a meaningful metric. With this capability, each translation is automatically assigned a score from 1 to 5. A score of 1 indicates that the translation is unintelligible; a score of 3 indicates that the meaning has been conveyed and that the translated content is actionable; a score approaching 4 or higher indicates that meaning and nuance have been carried through. This automatic prediction of quality has been validated by testing across significant numbers of data points in different companies and on different types of content.
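
The actual TrustScore algorithm is proprietary; purely to illustrate the banding described above, the following sketch maps a hypothetical raw quality-estimation score in [0, 1] onto the 1-to-5 scale. The thresholds are invented.

```python
# Hypothetical banding sketch; Language Weaver's actual TrustScore
# algorithm is not public and is certainly more sophisticated.
def trust_band(qe_score: float) -> int:
    """Map a raw quality estimate onto the 1..5 scale from the abstract."""
    thresholds = [0.2, 0.4, 0.6, 0.8]  # illustrative cut points
    return 1 + sum(qe_score >= t for t in thresholds)

labels = {1: "unintelligible", 3: "actionable", 4: "nuance preserved"}
for qe in (0.1, 0.45, 0.65, 0.9):
    band = trust_band(qe)
    print(qe, "->", band, labels.get(band, ""))
```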

After outlining TrustScore and how it works, Language Weaver will discuss how a scoring mechanism like TrustScore could be used in a government translation productivity workflow to assist linguists with day-to-day translation work, enabling them to benefit further from their investments in automated translation software. Language Weaver will also share how TrustScore is used in commercial deployments to cost-effectively publish information in near real time.

Submitted by: United Nations Translation Services

Speaker: Li Zuo

Topic: Machine translation from English to Chinese: A study of Google’s performance on UN documents

This study examines, from the user’s perspective, the performance of Google’s online translation service on United Nations documents. Since at least 2004, the United Nations has been exploring, piloting, and implementing computer-assisted translation (CAT), with Trados as the officially selected vehicle. A more recent development is the spontaneous adoption of Google translation among Chinese translators as an easy, versatile, and labor-saving tool. With machine translation becoming a practical reality for developers and end users, there is a need for a reality check on how well it serves its purpose. The study evaluates Google translation from English to Chinese and, in particular, its degree of assistance to professional Chinese translators at the United Nations. It uses a variety of UN documents: 3 resolutions, 2 letters, 2 provisional agendas, 1 plenary verbatim record, 1 report, 1 note by the Secretariat, and 1 budget.

The results confirm Google’s cutting edge in English-to-Chinese machine translation, thanks to its powerful infrastructure and immense translation database. The conversion between the two languages takes only an instant, even for a fairly long piece. On top of that, Google gets terminology right more frequently and seems better able to make an intelligent guess than other translation tools such as MS Bing. But Google’s Chinese output is far from intelligible, especially at the sentence level, primarily because of serious problems with word order and sentence parsing. There are also technical problems such as added or omitted words and erroneous rendering of numbers.

Nevertheless, Google translation offers translators the option of working from its rough draft to save the time and effort of typing. The challenges of post-editing, however, may offset the time saved. Even if Google translation does not necessarily yield net speed gains when used to assist translation, it is certainly a beneficial saver of labor, including mental labor, when it performs at its best.

Submitted by: National Air and Space Intelligence Center (NASIC)

Presenter: Chuck Simmons

Topic: Foreign Media Collaboration Framework (FMCF)

The Foreign Media Collaboration Framework (FMCF) is NASIC’s latest approach to providing a comprehensive system for processing foreign language materials. FMCF is a Service-Oriented Architecture (SOA) that provides an infrastructure to manage human language technology (HLT) tools, products, workflows, and services. This federated SOA solution adheres to DISA’s NCES SOA Governance Model, DDMS XML for metadata capture and dissemination, and IC-ISM for security.

The FMCF provides a cutting-edge infrastructure that encapsulates multiple capabilities from multiple vendors in one place. This approach will accelerate HLT development, contain sustainment costs, minimize training, and bring MT, OCR, ASR, audio/video, entity extraction, analytic tools, and databases under one umbrella, thus reducing the total cost of ownership.

Submitted by: Technical Support Working Group

Presenter: Kathleen Egan

Topic: Cross Lingual Arabic Blog Alerting (COLABA)

Social media and tools for communication over the Internet have expanded a great deal in recent years. This expansion offers a diverse set of users a means to communicate more freely and spontaneously in mixed languages and genres (blogs, message boards, chat, texting, video, and images). Dialectal Arabic is pervasive in written social media; however, current state-of-the-art tools built for Modern Standard Arabic (MSA) fail on Arabic dialects.

COLABA enables MSA users to interpret dialects correctly. It helps find Arabic colloquial content that is currently not easily searchable and accessible to MSA queries.

The COLABA team has built a suite of tools that offers users the ability to anonymously capture unstructured online media content such as blogs, and to comprehend, organize, and validate content from informal and colloquial genres of online communication in MSA and a variety of Arabic dialects.

The DoD/Combating Terrorism Technical Support Office/Technical Support Working Group (CTTSO/TSWG) awarded the contract to Acxiom Corporation and partners from MTI/IBM, Columbia University, Janya and Wichita State University to bring joint expertise to address this challenge.

The suite has several applications:

· Support for language and cultural learning by making colloquial Arabic intelligible to students of MSA

· Retrieval and prioritization for triage and content analysis: finding Arabic colloquial and dialect terms that today’s search engines miss; providing appropriate interpretations of colloquial Arabic, which is opaque to current analytic approaches; and identifying named entities, events, topics, and sentiment.

· Enabling improved translations by MSA-trained MT systems through decreases in out-of-vocabulary terms, achieved by converting colloquial terms to MSA (sketched below).
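
As a minimal sketch of that last point, the following shows how mapping colloquial forms to MSA before translation reduces the out-of-vocabulary rate. The lexicon, the MT vocabulary, and the romanized tokens are invented for illustration; COLABA’s actual resources are far richer.

```python
# Toy colloquial-to-MSA normalization ahead of an MSA-trained MT system.
colloquial_to_msa = {"3ayez": "yurid", "fein": "ayna", "delwa2ti": "alaan"}
mt_vocab = {"yurid", "ayna", "alaan", "kitab"}  # illustrative MT vocabulary

def normalize(tokens: list[str]) -> list[str]:
    """Replace known colloquial forms with their MSA equivalents."""
    return [colloquial_to_msa.get(t, t) for t in tokens]

def oov_rate(tokens: list[str]) -> float:
    return sum(t not in mt_vocab for t in tokens) / len(tokens)

raw = ["3ayez", "kitab", "fein", "delwa2ti"]
print("OOV before:", oov_rate(raw))             # 0.75
print("OOV after: ", oov_rate(normalize(raw)))  # 0.0
```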

Submitted by: National Air and Space Intelligence Center (NASIC)

Presenter: Weimin Jiang

Topic: Pre-editing for Machine Translation

It is common practice for linguists to post-edit MT output to improve translation accuracy and fluency. This presentation, however, examines the importance of pre-editing source material to improve MT. Even when a literally correct digital source file is used for MT, some factors still have a significant effect on MT accuracy and fluency.

Based on 35 examples from more than 20 professional journals and websites, this presentation describes an experiment in pre-editing source material for Chinese-to-English MT in the S&T domain. Pertinent examples illustrate how machine translation accuracy and fluency can be enhanced by pre-editing in four areas: providing a straightforward sentence structure, improving punctuation, using straightforward wording, and eliminating redundancy and superfluous elements.
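
A minimal sketch of two of the four areas, punctuation cleanup and removal of superfluous elements, appears below. The specific rules and the sample sentence are invented, not drawn from the presentation’s 35 examples.

```python
# Toy pre-editor: normalize full-width punctuation that can confuse an
# MT tokenizer and drop hedging filler that disrupts parsing.
import re

SUPERFLUOUS = [r"\bit should be noted that\b,?\s*",
               r"\bas is well known\b,?\s*"]

def pre_edit(text: str) -> str:
    # Normalize full-width punctuation to ASCII equivalents.
    text = text.replace("，", ", ").replace("。", ". ").replace("、", ", ")
    # Remove superfluous phrases that add no content.
    for pattern in SUPERFLUOUS:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    # Collapse whitespace left behind by the edits.
    return re.sub(r"\s+", " ", text).strip()

print(pre_edit("As is well known，the sensor operates at 10 GHz。"))
# -> "the sensor operates at 10 GHz."
```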

Submitted by: Basis Technology Inc.

Speaker: Brian Roberson

Topic: Multi-Language Desktop Suite

Professional language analysts leverage a myriad of tools in their quest to produce accurate translations of foreign language material. The effectiveness of these tools ultimately affects resource allocation, information dissemination, and follow-on mission planning, all three of which are vital, time-critical components of the intelligence cycle.

This presentation will highlight the need for interactive tools that perform jointly in an operational environment, focusing on a dynamic suite of foreign language tools packaged into a desktop application and serving in a machine translation role.

Basis Technology’s Arabic/Afghan Desktop Suite (ADS) supports DOMEX, CELLEX, and HUMINT missions and is the most powerful Arabic, Dari, and Pushto text analytic and processing software available. The ADS translates large-scale lists of names from foreign languages into English and pinpoints place names appearing in reports, locating them by coordinates on maps.

With standardized output required to be more accurate than ever, the ADS ensures conformance with USG transliteration standards for Arabic-script languages, including IC, BGN/PCGN, SATTS, and MELTS. The ADS enables optimization of limited resources and allows analysts and linguists to be tasked more efficiently throughout the workflow.
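
Purely as an illustration of a conformance check, the sketch below validates a candidate transliteration against a romanization table. The tiny table is a toy fragment, not the actual IC, BGN/PCGN, SATTS, or MELTS tables, which cover the full script plus context-dependent rules.

```python
# Toy romanization table; real standards are far more extensive.
TABLE = {"ب": "b", "ت": "t", "س": "s", "ا": "a"}  # illustrative only

def romanize(word: str) -> str:
    return "".join(TABLE.get(ch, "?") for ch in word)

def conforms(source: str, candidate: str) -> bool:
    """Does a candidate transliteration match the table's output?"""
    return romanize(source) == candidate

print(conforms("بات", "bat"))  # True under this toy table
```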

Submitted by: CACI Inc. and AppTek

Presenter: Kristen Summers and Hassan Sawaf

Topic: User-generated System for Critical Document Triage and Exploitation–Version 2011

CACI has developed and delivered systems for document exploitation and processing to Government customers around the world. Many of these systems include advanced language processing capabilities in order to enable rapid triage of vast collections of foreign language documents, separating the content that requires immediate human attention from the less immediately pressing material.

AppTek provides key patent-pending machine translation technology for this critical process, translating material in Arabic, Farsi, and other languages into English renditions that enable both further automated processing and rapid review by monolingual analysts to identify the documents that require immediate linguist attention.
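
A minimal sketch of this triage flow follows, under assumed names: translate() stands in for the deployed MT engine, and the watchlist routing rule is invented for illustration.

```python
# Toy triage: machine translate, then route by watchlist hits.
WATCHLIST = {"shipment", "transfer", "meeting"}  # analyst-defined terms

def translate(doc_text: str) -> str:
    """Stand-in for the MT engine; pretend input is already English."""
    return doc_text

def triage(doc_text: str) -> str:
    english = translate(doc_text).lower()
    hits = [w for w in WATCHLIST if w in english]
    # Route to a linguist queue if any watchlist term appears.
    return "linguist-review" if hits else "background-queue"

print(triage("Routine status update."))           # background-queue
print(triage("The shipment arrives on Friday."))  # linguist-review
```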

Both CACI and AppTek have been working with customers to develop capabilities that put the users themselves in command of making their systems learn and continuously improve. We will describe how we built this critical user requirement into the systems and the key role that user involvement played.

We will also discuss key components of the system and its planned customer-centric evolution, including the document translation workflow, the machine translation technology within it, and our approaches to supporting the technology and sustaining its success by adapting to users’ needs.

Submitted by: AMTA Government Track Organizers

Panel Moderator: Judith L. Klavans

Topic: Task-based evaluation methods for machine translation, in practice and theory

A panel of industry and government experts will discuss ways in which they have applied task-based evaluation for Machine Translation and other language technologies in their organizations and share ideas for new methods that could be tried in the future. As part of the discussion, the panelists will address some of the following points:

· What task-based evaluation means within their organization, i.e., how task-based evaluation is defined

· How task-based evaluation impacts the use of MT technologies in their work environment

· Whether task-based evaluation correlates with MT developers’ automated metrics and, if not, how automated metrics can be developed that do correlate with the more expensive task-based evaluation

· What lessons were learned in the course of performing task-based evaluation

· How task-based evaluations can be generalized to multiple workflow environments

Submitted by: MITRE

Presenter: Rod Holland

Topic: Exploring the AFPAK Web

In spite of low literacy levels in Afghanistan and the Tribal Areas of Pakistan, the Pashto and Dari regions of the World Wide Web manifest diverse content from authors with a broad range of viewpoints. We have used cross-language information retrieval (CLIR) with machine translation to explore this content, and we present an informal study of the principal genres we have encountered. We also discuss the suitability and limitations of existing machine translation packages for these languages in exploiting this content.
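
For readers unfamiliar with CLIR, the toy sketch below shows the query-translation approach in miniature. The bilingual lexicon and the romanized documents are invented; real systems use full MT and proper relevance ranking.

```python
# Toy query-translation CLIR: map English query terms into the target
# language, then match against target-language documents.
LEXICON = {"water": ["obeh"], "market": ["bazaar"]}  # toy EN->Pashto entries

def clir_search(query: str, docs: list[str]) -> list[str]:
    terms = [t for w in query.lower().split() for t in LEXICON.get(w, [w])]
    return [d for d in docs if any(t in d.lower() for t in terms)]

docs = ["Da bazaar narkhuna ...", "Hava aw obeh ..."]
print(clir_search("market prices", docs))  # matches the first document
```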

Submitted by: Raytheon BBN Technologies

Presenter: Sean Colbath

Topic: Terminology Management for Web Monitoring

The current state of the art in speech recognition, machine translation, and natural language processing (NLP) has enabled the development of powerful media monitoring systems that provide today’s analysts with automatic tools for ingesting and searching different types of data, such as broadcast video, web pages, documents, and scanned images.

However, the core human language technologies (HLT) in these media monitoring systems are static learners, which means that they learn from a pool of labeled data and apply the induced knowledge to operational data in the field. To enable successful, widespread deployment and adoption of HLT, these technologies need to adapt effectively to new operational domains on demand.

To provide US Government analysts with dynamic tools that adapt to these changing domains, HLT systems must support customizable lexicons. Lexicon customization, however, presents a unique challenge, especially when a typical media monitoring installation in the field serves multiple users. Customization requests from multiple users can be extensive and may conflict in orthographic representation (spelling, transliteration, or stylistic consistency) or in overall meaning. To protect against spurious and inconsistent updates, a media monitoring system needs a central terminology management capability to collect, manage, and execute customization requests across all users of the system.
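
A minimal sketch of such a capability follows, under assumed names and a simple conflict rule: requests for the same source term with differing renderings are flagged before anything reaches the deployed systems.

```python
# Toy central terminology store with cross-user conflict detection.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Request:
    user: str
    source_term: str   # term in the source language
    translation: str   # requested English rendering

class TermStore:
    def __init__(self):
        self.requests = defaultdict(list)  # source_term -> [Request]

    def submit(self, req: Request) -> None:
        self.requests[req.source_term].append(req)

    def conflicts(self) -> dict[str, set[str]]:
        """Source terms whose requested renderings disagree across users."""
        return {
            term: {r.translation for r in reqs}
            for term, reqs in self.requests.items()
            if len({r.translation for r in reqs}) > 1
        }

store = TermStore()
store.submit(Request("analyst1", "پیشاور", "Peshawar"))
store.submit(Request("analyst2", "پیشاور", "Pishawar"))
print(store.conflicts())  # flags the disagreeing renderings for review
```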