Tutorial Presenters: Nicola Ueffing (eBay MT Science), Pete Smith (University of Texas Arlington) and Silvio Picinini (eBay Localization)

Target audience: anyone involved in deploying MT who is interested in the quality of the corpus data underlying that deployment. Roles include managers, linguists, engineers and scientists.

The quality of the corpora used to train MT systems has not been a prominent topic of discussion compared with MT technology. Most research uses the same well-established corpora so that results can be reproduced. However, corpora can strongly determine the quality of the MT output, and neural MT relies more on data quality than previous technologies did. We therefore think the time has come to take a deeper look at corpus quality. We will look at this from a scientist's view and from a linguist's view, exploring current and future roles in the evolving MT scenario.

From the eBay experience, participants will learn and discuss:

Considering that data curation of corpora may become a task for a Language Professional, learn from Univ. of Texas what the profile of a Language Professional could be:



AMTA and IAMT are proud to announce the following keynote speakers for MT Summit XV in Miami:





Come to MT Summit XV to see all of these events:

These presentations were accepted for the Commercial MT Users & Translators track:

Farkhat Aminov: Yandex.Translate approach to the translation of Turkic languages
Nikhil Bojja: Machine Translation in Mobile Games – Augmenting Social Media Text Normalization with Incentivized Feedback
Robin Bonthrone, Konstantin Lakshin: Why are we (still) waiting? What premium translators need to use MT effectively.
Fang Cai: Productivity Promotion Strategies for Collaborative Translation on Huge-Volume Technical Documents
Dick Csaplar: An Advanced Engine for New and Improved MT Results
Jennifer DeCamp: Machine Translation and Terminology Management
Alan K. Melby: Quality Evaluation of Four Translations of a Kidney Document: focus on reliability
Tomoki Nagase, Tatsuhiro Kudoh, Katsunori Kotani, Wenjun Ye, Takeshi Mori, Yoshiyuki Sakamoto, Nobutoshi Hatanaka, Takamitsu Takeda, Shu Hirata, Hiromi Nakaiwa: A Survey of Usage Environment of Machine Translation by Professional Translators
Morgan O’Brien, Ana Duarte: Machine Translation for enterprise technical communications – a journey of discovery
Elaine O’Curran: MT Quality Evaluations: From Test Environment to Production
Craig Plesco, Nestor Rychtyckyj: Enterprise Application of MT: Progress and Challenges
Juan Rowda: A Linguist’s Approach to Quality Estimation
Achim Ruopp: Industry Shared Metrics with the TAUS Dynamic Quality Dashboard and API
Carla Schelfhout: Accurately Predicting Post-editing Time & Labor for Cost-Management
Mark Seligman, Mike Dillinger: Adjusting Interaction Levels in a Speech Translation System for Healthcare
Jean Senellart: Beyond Text, Machine Translation and NLP for e-discovery
José G. C. de Souza, Marcello Federico: MT Quality Estimation for E-Commerce Data
Tanvi Surti: User Experience on Skype Translator
Pablo Vazquez, Edith Bendermacher: Setting up MT at NetApp – challenges and learnings while implementing MT
Kristina Vulgan: Preparing Your Suppliers for Machine Translation
Alex Yanishevsky: How Much Cake to Eat: The Case for Targeted MT Engines


Friday, October 30

Morning Session

Introduction to Machine Translation — Jay Marciano, Lionbridge
Target Audience:  Translators and other translation industry professionals
This tutorial is for people who are beginning their journey with machine translation and want an overview of what it is, how it works, how it can be used, and whether it can fulfil their needs. No previous knowledge of machine translation is assumed, and all levels of skepticism are welcome. The focus will be on providing background knowledge that will help you get more out of the rest of MT Summit XV and make more informed decisions about how to use or invest in machine translation. Past participants have ranged from translation professionals who want to understand changes in their field to corporate executives who are evaluating technology strategies for their organizations. The main topics for discussion are common questions about MT (What is MT and how does it differ from other translation technologies? How well can machines really translate? What’s the latest and greatest in MT?), the quality of the translations it produces (Why is the output sometimes so bad? How can the quality be improved? Can translation quality be measured objectively?), its application (What is MT good for? Can it improve a translator’s efficiency? What are the implications of this technology for translators?), return on investment (Is “free” MT really free? How much does MT really cost?), and steps to deployment (Which MT system should we use? What do we do next?). You will leave this tutorial with the tools you need to take part in an informed discussion of MT.
Bio: Jay Marciano has been the Director of Real-Time Translation Development at Lionbridge Technologies since 2010 and is currently serving on the Board of Directors of AMTA. Prior to joining Lionbridge, he worked for SDL (2001-2010), where he was responsible for the development and refinement of natural language processing software. He has also worked as a product manager and development lead at Transparent Language, where he was on the team that launched FreeTranslation.com in 1999, a lecturer in the English Department at the University of Bonn, Germany, and as an editor at Houghton Mifflin on the staff of the American Heritage Dictionary.
Computer Aided Translation: Advances and Challenges — Philipp Koehn, Johns Hopkins
Target Audience:  MT Researchers & developers
Moving beyond post-editing machine translation, a number of recent research efforts have advanced computer aided translation methods that allow for more interactivity, richer information such as confidence scores, and the completed feedback loop of instant adaptation of machine translation models to user translations.
This tutorial will explain the main techniques for several aspects of computer aided translation: confidence measures, interactive machine translation (interactive translation prediction), bilingual concordancers, translation option display, online adaptation, eye tracking, logging, and cognitive user models. For each of these, the state of the art and open challenges are presented.
The tutorial will also look under the hood of the open-source CASMACAT toolkit, which is based on MATECAT and available as a “Home Edition” that can be installed on a desktop machine. The target audience of this tutorial is researchers interested in computer aided machine translation and practitioners who want to use or deploy advanced CAT technology.
Bio: Philipp Koehn, PhD, Professor, Department of Computer Science, Whiting School of Engineering, Johns Hopkins University

Afternoon Session

Using Microsoft Translator Hub and Collaborative Translator Framework — Chris Wendt
Target Audience:  Translators and other translation industry professionals
This is a hands-on introduction to using a publicly available SMT system and translation environment. We will discuss the concepts using Microsoft Translator Hub and the Microsoft Translator API as examples. Many of the concepts are transferable to other MT systems, but other MT systems will not be discussed in detail.
Agenda: Short overview of statistical MT; Overview of the Microsoft Translator web service and API; Probabilistic models and their function; The power of customization; The Translator Hub: Walk through an engine customization; Training data, suitable and unsuitable documents for training; Domains; Test and tuning sets; Collaborative features in the Translator Hub; Collaborative features in the Translator API and web site widget; Continuous retraining, or one-time shots; Deploying your customized system; Using your customized system in applications, TMSs and CMSs; Using your customized system in your own code
Bring your problems, questions and issues: we will discuss them and work toward a good solution, which will be helpful for the other attendees as well. It is fine if you have none. Ideally, bring a laptop and a translation memory or previously translated documents with at least 5000 segments, but it is no problem if you cannot.
Bio: Chris Wendt graduated as Diplom-Informatiker from the University of Hamburg, Germany, and subsequently spent a decade on software internationalization for a multitude of Microsoft products, including Windows, Internet Explorer, MSN and Windows Live, bringing these products to market with equal functionality worldwide. Since 2005 he has led the program management team for Microsoft’s Machine Translation development, responsible for Microsoft Translator services, including Bing Translator and Skype Translator, connecting Microsoft’s research activities with its practical use in services and applications. He is based at Microsoft headquarters in Redmond, Washington.
Neural Networks in MT — Kyunghyun Cho, NYU
Target Audience: MT Researchers & developers
Neural machine translation is a new approach to MT, where a single, large neural network is trained to maximize translation performance. This radical departure from existing (phrase-based) machine translation approaches was proposed independently by several research groups during the last two years (2013-2014). This tutorial will present a detailed description of the two dominant models in neural machine translation; discuss a number of fundamental differences from conventional statistical machine translation; cover a series of recent improvements that have made neural machine translation competitive with phrase-based statistical machine translation; and present future directions of neural machine translation.
Bio: Kyunghyun Cho has been an assistant professor in the Department of Computer Science, Courant Institute of Mathematical Sciences and the Center for Data Science at New York University (NYU) since September 2015. Previously, he was a postdoctoral researcher at the University of Montreal under the supervision of Prof. Yoshua Bengio, after obtaining a doctorate at Aalto University (Finland) in early 2014 under the supervision of Prof. Juha Karhunen, Prof. Tapani Raiko and Dr. Alexander Ilin. Kyunghyun’s main research interests include neural networks, generative models and their applications, especially to natural language understanding.

Tuesday, Nov. 3

Afternoon Session

Beyond Post-editing: Interactive and Adaptive Translation Technologies — Spence Green, lilt.com
Target Audience:  Translators and other translation industry professionals; MT developers
Post-editing is an increasingly common mode of machine-assisted translation. Case studies in both academia and industry have recently shown that machine translations can reduce translation time and effort for some domains and language pairs. However, there are two reasons to look beyond current implementations of post-editing. First, the commercial systems most commonly used—Google Translate and Microsoft Translator—are tuned for coverage rather than specific localization settings. Second, it has been known since the 1960s that post-editing is a poor user experience.
Interactive and adaptive translation tools were conceived decades ago to address these shortcomings. Until recently, implementations of these ideas were not widely available outside of the research community. This tutorial will introduce tools and techniques that augment the human translator, offering the potential for significant productivity improvements. The material will be accessible to both translation professionals and those interested in new translation technology. There will be four parts: (1) The history of and motivation for interactive translation techniques, (2) the translator’s workflow and where machine intervention can be effective, (3) interactive and adaptive translation technology, (4) practicum: introduction to some tools.
Bio: Spence Green recently completed a PhD in computer science at Stanford University supervised by Chris Manning. He was given a Best Paper award at CHI 2013 for his work on mixed-initiative translation. He holds an MS in computer science from Stanford and a BS in computer engineering from the University of Virginia. Currently he is a co-founder of Lilt, a provider of interactive translation systems.

Precision Translation Tools announces the release of Slate, the first packaged SMT toolkit for native Windows x86-64 operating systems.

Note: “native” means without Cygwin. There is also a parallel Slate package for Linux. The packages include the command-line utilities from Moses, MGIZA++ and PTTools needed to train, tune and evaluate phrase and phrase-factored SMT models. Detailed specs for the packages and where to get them are available at this URL and through the “More about Slate for Windows” link at the bottom of the page:


AMTA 2020 | Conference Announcement

by Darius Hughes | January 12, 2020

The 14th biennial conference of the Association for Machine Translation in the Americas. September 8-12, Sheraton Orlando Lake Buena Vista Resort, Orlando, Florida, USA. AMTA conferences are unique in bringing together MT researchers, developers, and users of MT technology from academia, industry, and government. This year, for AMTA 2020, we have arranged for a spectacular venue in […]

AMTA 2020

AMTA 2018 | Proceedings for the Conference, Keynotes, Workshops and Tutorials

by Mike Dillinger | March 21, 2018

Main Conference Research Track Download (2.7 MB) Commercial and Government Tracks Download (28.4 MB)   Keynotes Arianna Bisazza – Leiden University Research Keynote | Unveiling the Linguistic Weaknesses of Neural MT Download (4 MB) Macduff Hughes – Google Commercial Keynote | Machine Translation Beyond the Sentence Download (2.7 MB) Carl Rubino – IARPA Government Keynote […]

AMTA 2018

AMTA 2018 | Workshop | The Role of Authoritative Standards in the MT Environment

by Mike Dillinger | January 30, 2018

In this workshop, we will bring together experts from across the standards community, including from the American Society for Testing and Materials (now just “ASTM International”), the American National Standards Institute (ANSI), the International Organization for Standardization (ISO), the Globalization and Localization Association (GALA), and the World Wide Web Consortium (W3C). These experts will discuss authoritative standards that […]

AMTA 2018

AMTA 2018 | Tutorial | ModernMT: Open-Source Adaptive Neural MT for Enterprises and Translators

by Mike Dillinger | January 30, 2018

Nowadays, computer-assisted translation (CAT) tools represent the dominant technology in the translation market – and those including machine translation (MT) engines are on the increase. In this new scenario, where MT and post-editing are becoming the standard portfolio for professional translators, it is of the utmost importance that MT systems are specifically tailored to translators. […]

AMTA 2018

AMTA 2018 | Tutorial | MQM-DQF: A Good Marriage (Translation Quality for the 21st Century)

by Mike Dillinger | January 30, 2018

In the past three years, the language industry has been converging on the use of the MQM-DQF framework for analytic quality evaluation. It emerged from two separate quality-evaluation approaches: the European Commission-funded Multidimensional Quality Metrics (MQM) and the Dynamic Quality Framework (DQF) from TAUS. Harmonized in 2015, the resulting shared hierarchy of error types allows […]

AMTA 2018

AMTA 2018 | Tutorial | A Deep Learning curve for Post-Editing

by Mike Dillinger | January 30, 2018

Does post-editing also require a deep learning curve? How do the neural networks of post-editors work in concert with neural MT engines? Can post-editors and engines be retrained to work more effectively with each other? In this tutorial, we demystify the process, focus on the latest MT developments and their impact on post-editing practices. We […]

AMTA 2018