Boston, Massachusetts, March 21, 2018

The goal of quality estimation is to evaluate a translation system’s quality without access to reference translations (Blatz et al., 2004; Specia et al., 2013). This has many potential uses: informing an end user about the reliability of translated content; deciding whether a translation is ready for publishing or requires human post-editing; and highlighting the words that need to be changed. Quality estimation systems are particularly appealing for crowd-sourced and professional translation services, due to their potential to dramatically reduce post-editing times and to save labor costs (Specia, 2011). The increasing interest in this problem from an industrial angle comes as no surprise (Turchi et al., 2014; de Souza et al., 2015; Martins et al., 2016, 2017; Kozlova et al., 2016). A related task is automatic post-editing (Simard et al., 2007; Junczys-Dowmunt and Grundkiewicz, 2016), which aims to automatically correct the output of machine translation. Recent work (Martins et al., 2017; Kim et al., 2017; Hokamp, 2017) has shown that the tasks of quality estimation and automatic post-editing benefit from being trained or stacked together.

In this workshop, we will bring together researchers and industry practitioners interested in the tasks of quality estimation (word, sentence, or document level) and automatic post-editing, both from a research perspective and with the goal of applying these systems in industrial settings for routing, for improving translation quality, or for making human post-editors more efficient. Special emphasis will be given to the case of neural machine translation and the new open problems that it poses for quality estimation and automatic post-editing.

 

ORGANIZERS

André Martins (Unbabel and University of Lisbon): andre.martins@unbabel.com

Ramon Astudillo (Unbabel and INESC-ID Lisboa): ramon@unbabel.com

João Graça (Unbabel): joao@unbabel.com

 

SCHEDULE

9:00 — Welcome

9:15 – 10:00 — Nicola Ueffing: “Automatic Post-Editing and Machine Translation Quality Estimation at eBay”

10:00 – 10:30 — Rebecca Knowles: “Lightweight Word-Level Confidence Estimation for Neural Interactive Translation Prediction”

10:30 – 11:00 — Coffee Break

11:00 – 11:45 — João Graça: “Unbabel: How to combine AI with the crowd to scale professional-quality translation”

11:45 – 12:30 — Maxim Khalilov: “Machine translation at Booking.com: what’s next?”

12:30 – 14:00 — Lunch break

14:00 – 14:45 — Marcin Junczys-Dowmunt: “Are we experiencing the Golden Age of Automatic Post-Editing?”

14:45 – 15:30 — Marcello Federico: “Challenges in Adaptive Neural Machine Translation”

15:30 – 16:00 — Coffee Break

16:00 – 16:20 — Eleftherios Avramidis: “Fine-grained evaluation of Quality Estimation for Machine translation based on a linguistically-motivated Test Suite”

16:20 – 16:40 — Rebecca Knowles: “A Comparison of Machine Translation Paradigms for Use in Black-Box Fuzzy-Match Repair”

16:45 – 17:30 — Discussion Panel (Nicola Ueffing, Maxim Khalilov, Marcello Federico, Alon Lavie)

 

INVITED SPEAKERS

Nicola Ueffing (eBay)

Title: Automatic Post-Editing and Machine Translation Quality Estimation at eBay

Abstract: This presentation will give an overview of Automatic Post-Editing and Quality Estimation research and development for e-commerce data at eBay. I will highlight two projects: (1) Application of Automatic Post-Editing and Machine Translation for Natural Language Generation for e-commerce browse pages, where the structured data describing the products is automatically “translated” into natural language; and (2) Quality Estimation for Machine Translation of eBay item titles, which compares general models with models trained specifically for three different categories in the inventory of eBay’s marketplace platform to predict post-editing effort.

Bio: Nicola joined eBay’s machine translation research team in May 2016. Her focus is on machine translation, both for e-commerce content and for natural language generation, and quality estimation. Prior to working for eBay, Nicola was a language modeling research scientist at Nuance Communications, leading the research and development for dictation products like Dragon NaturallySpeaking. Nicola received a PhD in computer science from RWTH Aachen University, specializing in confidence estimation for machine translation. She then joined the Interactive Language Technologies team at the National Research Council Canada as a postdoctoral research associate. Her research interests include machine translation as well as most other areas of computational natural language processing.

 

Maxim Khalilov (Booking)

Title: Machine translation at Booking.com: what’s next?

Abstract: For many years, machine translation (MT) was primarily focused on the post-editing scenario, in which MT serves to increase productivity within a professional translation pipeline. However, in e-commerce the most desirable application of MT is direct publishing of machine-translated content, which places different requirements on MT and on the MT quality evaluation model.

In this talk, Maxim Khalilov will discuss the Booking.com approach to MT and its evaluation. He will also cover some scenarios in which e-commerce can benefit from advancements in quality estimation and automatic post-editing.

Bio: Maxim Khalilov is a product owner – data science at Booking.com, responsible for business aspects of scaled content product development. Prior to his current role, Maxim was CTO at bmmt GmbH, an innovative German language service provider, an R&D manager at TAUS, and a post-doctoral researcher at the University of Amsterdam. Maxim has a Ph.D. from Polytechnic University of Catalonia (Barcelona, 2009) and an MBA from IE Business School (Madrid, 2016), and is the author of more than 30 scientific publications.

 

Marcello Federico (MMT Srl/FBK Trento, Italy)

Title: Challenges in Adaptive Neural Machine Translation

Abstract: Neural machine translation today represents the state of the art in terms of performance. However, its deployment in a real-life and dynamic scenario, where multiple users work on different tasks, presents some important trade-offs and challenges. In my talk, I will describe the development and deployment of adaptive neural machine translation within the ModernMT EU project, from phrase-based to neural machine translation. Besides discussing the technological solutions adopted in ModernMT, I will connect them to the underlying research efforts conducted at FBK in recent years, including online learning, automatic post-editing, and translation quality estimation.

Bio: Founder and CEO of MMT Srl, Trento, Italy. Research director (on leave) and Affiliated Fellow at Fondazione Bruno Kessler, Trento, Italy. Lecturer at the ICT International Doctoral School of the University of Trento. Co-founder and scientific advisor of MateCat Srl. Research interests: machine translation, natural language processing, machine learning and artificial intelligence.

João Graça (Unbabel)

Title: Unbabel: How to combine AI with the crowd to scale professional-quality translation

Abstract: Unbabel is accelerating the shift towards a world without language barriers by enabling trustworthy, seamless and scalable translations between companies and their customers. In this talk we will show how we combine different Machine Learning techniques with a crowd of non-professional translators to achieve professional-quality translations at unprecedented speed and scale. We will also show how quality estimation is used in different steps of the pipeline.

Bio: João Graça is currently the CTO of Unbabel. He was previously the data scientist and natural language processing expert at Dezine and Flashgroup. João did his PhD in Natural Language Processing and Machine Learning at Instituto Superior Técnico together with the University of Pennsylvania with Professors Fernando Pereira, Ben Taskar and Luísa Coheur. He is the author of several papers in the area; his main research topics are machine learning with side information, unsupervised learning and machine translation. João is one of the co-founders of the Lisbon Machine Learning Summer School (LxMLS).

Marcin Junczys-Dowmunt (Microsoft Research)

Title: Are we experiencing the Golden Age of Automatic Post-Editing?

Abstract: In this talk I will describe the rise of neural methods in Automatic Post-Editing and why I believe that we might have reached a “Golden Age” of (neural) post-editing methods. This will be mostly based on the example of the recent WMT shared tasks on Automatic Post-Editing and my own contributions to that task. I will contrast current architectures with historic solutions and will argue that only now, with the onset of neural sequence-to-sequence methods, has automatic post-editing matured enough to have the potential for practical applications. However, there is a risk that this Golden Age might be very short-lived and that future results might be much less encouraging than the last two WMT shared tasks on APE might imply.

Bio: Marcin has been working in the Machine Translation team at Microsoft AI and Research — Redmond as a Principal NLP Scientist since January 2018. Before joining Microsoft he was an Assistant Professor at the Adam Mickiewicz University in Poznan, Poland, and a visiting researcher in the MT group at the University of Edinburgh. He also collaborated for many years with the World Intellectual Property Organization and the United Nations, helping with the development of their in-house statistical and neural machine translation systems. His main research interests are neural machine translation, automatic post-editing and grammatical error correction. Most of his open-source activity is being eaten up by his NMT pet-project Marian (http://github.com/marian-nmt/marian).

 


 

Want to read more about the conference?

Be our guest!

 

https://sites.google.com/view/loresmt/

Statistical and neural machine translation (SMT/NMT) methods have been used successfully over the last two decades to build MT systems for many popular languages, with significant improvements in the quality of automatic translation. However, these methods still rely on a few natural language processing (NLP) tools to pre-process human-generated texts into the forms these methods require as input, and/or to post-process the output into proper textual forms in the target languages.

In many MT systems, the performance of these tools has a great impact on the quality of the resulting translation. However, there is not much discussion of these NLP tools, their methods, their roles in MT systems built with different methods, or how well they cover the many languages of the world. In this workshop, we would like to bring together researchers who work on these topics and help review which tasks we will most need these tools to perform for MT in the coming years.

These NLP tools include, but are not limited to, several kinds of word tokenizers/de-tokenizers, word segmenters, morphology analysers, etc. In this workshop, we solicit papers dedicated to these supplementary tools as used in any language, and especially in low-resource languages. We would like to have an overview of these NLP tools from our community. The evaluations of these tools in research papers should include how they have improved the quality of MT output.

TOPICS
We solicit original research papers, review papers as well as position papers on these tools in the
workshop. Multilingual and/or Cross-lingual NLP tools for MT of low resource languages are
especially welcome. Topics of the workshop include, but are not limited to:

IMPORTANT DATES
December 22, 2017: First call for papers
January 8, 2018: Second call for papers
February 4, 2018: Submission deadline of workshop papers
February 11, 2018: Notification of acceptance
February 16, 2018: Camera-ready papers due
March 21, 2018: LoResMT workshop

CONTACT
chaohong.liu@adaptcentre.ie

 


 


AMTA will be hosting a one-day Technology Forum and is looking for demonstrations of

In the following areas:

There is no exhibit fee. Exhibitors coming only for the Technology Forum will not need to pay registration fees.

AMTA will provide wifi, tables, chairs, draping, easels, and signs. You are also welcome to bring your own signage, although the hotel does not permit taping or tacking materials to the walls. You will be responsible for bringing computers and monitors, as well as any brochures or papers you would like to distribute. AMTA will organize the rental of large monitors. If you would like a monitor, you can rent one on the AMTA registration site for approximately $200. AMTA will arrange to have the monitor delivered to and picked up from your exhibit table. (The monitor rentals are on the same page as the banquet registration.)

This year, we will also give interested exhibitors the opportunity to provide more extensive demonstrations in a classroom setting. If you are interested in this option, please send a short proposal to Jennifer DeCamp at jdecamp@mitre.org.

To apply to exhibit in the Technology Forum, please send the attached form (AMTA_2018_Call_for_Participation_Technology_Showcase.docx) to Jennifer DeCamp at jdecamp@mitre.org. You are welcome to provide a logo and up to one additional page of information regarding the software.

 


 


Contact: Steve Richardson (tutorials@amtaweb.org)

AMTA 2018 is seeking proposals for tutorials on all topics related to MT. We are interested in tutorials presented by experts from the following categories:
– MT Researchers and Developers
– Commercial MT Users
– Government MT Users
– Translators using MT
Any themes connected to MT research, development, deployment, use, and evaluation are welcome. Past tutorials have included: Introduction to MT, Statistical MT, Advances in Neural MT, Interactive and Adaptive MT, Advances in Computer-Aided Translation, and using specific systems such as MateCat, ModernMT, and the Microsoft Translator Hub.

Important dates:

– Proposal submission deadline: Monday, December 11, 2017
– Notification: Friday, January 5, 2018
– Tutorial materials due date: Friday, March 2, 2018.

– Tutorial days: Saturday, March 17, 2018 or Wednesday, March 21, 2018.

NOTE: AMTA will not print out materials for tutorial organizers – if the organizers plan to distribute such materials, they should bring them to AMTA.

Guidelines:

Tutorials are a forum for experts in MT and MT-related areas to deliver concentrated training on a topic of interest in half-day (or occasionally full-day) teaching sessions. Please consult the accompanying AMTA2018_Tutorial_Proposal_Form.docx for a detailed description of appropriate formats, appropriate supporting material, and compensation for AMTA 2018 tutorials.

How to submit a tutorial proposal:

To submit your proposal, fill in the AMTA2018_Tutorial_Proposal_Form.docx and send it to the Tutorials Chair: Steve Richardson (tutorials@amtaweb.org) with the subject line: “AMTA 2018 tutorial proposal.”

After you have submitted your proposal, you can continue to edit it until the submission deadline. If you have any questions about the tutorial proposal submission process, please contact the Tutorials Chair: Steve Richardson (tutorials@amtaweb.org) with the subject line: “AMTA 2018 tutorial proposal.”

 


 


Contact: Steve Richardson (workshops@amtaweb.org)

AMTA workshops are intended to provide the opportunity for MT-related communities of interest to spend time together advancing the state of the art, both in ideas and practices, in their area of endeavor. We are particularly interested in submissions related to commercialization of MT and/or its use by professional translators. However, any themes connected to MT research, development, deployment, use, and evaluation are welcome. Topics for past workshops have included MT post-editing, CAT tools, treatment of Semitic languages, translation of patents and scientific literature, collaborative translation, lexical resources, quality metrics for MT, and the impact of MT research on the translation industry.

Important dates:

– Proposal deadline: Monday, December 11, 2017
– Notification of workshop acceptance: Ongoing
– Camera-ready deadline for workshops with proceedings: Friday, February 16, 2018
– Workshop days: Saturday, March 17, 2018 or Wednesday, March 21, 2018.

NOTE: It is possible that workshops scheduled for March 17 may be organized jointly with the GALA 2018 conference, which ends on March 16.

How to submit:

Workshop proposals must be submitted to workshops@amtaweb.org. They should include:
– Workshop title
– Dates for important milestones (call for papers, recruitment of speakers, etc.)
– Description of the workshop content, technical requirements
– Expected number of participants
– Whether this is an ongoing or new workshop, and
– A signed copy of the AMTA2018_Workshop_Proposal_Form.docx

 


 


Contacts:

Jen Doyon and Doug Jones (governmentmtusers@amtaweb.org)

Government and Military MT Stakeholders:

We encourage you to submit a proposal to speak at AMTA 2018 about your insights on research, development and operational use of MT and MT-related technologies in government and military settings. We especially encourage perspectives that challenge the broader MT community, including issues with implementing MT, whether on the technical side, the human side or both.

Important dates:

Topics of interest:

  1. MT as an operational tool for translation, analysis, information discovery.
  2. MT for “non-standard” language in chats, blogs and social networks.
  3. Evaluation of MT including estimation of ROI and human factors.
  4. MT research and development in government and military settings.
  5. Integration of MT into broader workflow, including case studies.
  6. Linguistic resources for MT, especially those hard to find or make.
  7. Neural MT opportunities and challenges.
  8. MT in Humanitarian Assistance / Disaster Relief contexts.
  9. Challenges, opportunities and insights into MT needs for the government and military.

Submit your abstract:

Initial submissions should be 250-500 word abstracts.  The following should accompany each abstract submission:

  1. Presentation Title
  2. Presenter Name
  3. Representing Organization
  4. Email Address
  5. Phone Number

All accepted submissions will be allotted 30-minute time slots. 

If you have original software that you would like to show, you may also consider submitting a proposal for the technology showcase.

Publication:

While not mandatory, presenters wishing to have their submissions published in the AMTA 2018 Proceedings are required to produce papers in accordance with the MT Research Track submission guidelines. Slide decks may also be accepted. While only abstracts are required to be submitted by the initial submission date, only papers or slide decks will be accepted by the final camera-ready date for publication in the Proceedings.

How to submit:

Please email your abstract to the Government/Military MT Stakeholders Chairs (governmentmtusers@amtaweb.org) by Monday, December 11, 2017.

 


 


AMTA 2018 | Proceedings for the Conference, Keynotes, Workshops and Tutorials

by Mike Dillinger | March 21, 2018

Main Conference
Research Track: Download (2.7 MB)
Commercial and Government Tracks: Download (28.4 MB)

Keynotes
Arianna Bisazza – Leiden University, Research Keynote | Unveiling the Linguistic Weaknesses of Neural MT: Download (4 MB)
Macduff Hughes – Google, Commercial Keynote | Machine Translation Beyond the Sentence: Download (2.7 MB)
Carl Rubino – IARPA, Government Keynote […]

AMTA 2018 | Tutorial | Getting Started Customizing MT with Microsoft Translator Hub: From Pilot Project to Production

by Mike Dillinger | January 30, 2018

Develop an Effective MT Customization Pilot Project Learn strategies to plan and carry out an effective pilot project to train a customized MT engine and learn tips to evaluate the MT pilot project against your goals so you can move it toward production. Participants will know how to plan a pilot project, select appropriate training […]

AMTA 2018 | Workshop | The Role of Authoritative Standards in the MT Environment

by Mike Dillinger | January 30, 2018

In this workshop, we will bring together experts from across the standards community, including from the American Society for Testing and Materials (now just “ASTM International”), the American National Standards Institute (ANSI), the International Organization for Standardization (ISO), the Globalization and Localization Association (GALA), and the World Wide Web Consortium (W3C). These experts will discuss authoritative standards that […]

AMTA 2018 | Tutorial | ModernMT: Open-Source Adaptive Neural MT for Enterprises and Translators

by Mike Dillinger | January 30, 2018

Nowadays, computer-assisted translation (CAT) tools represent the dominant technology in the translation market – and those including machine translation (MT) engines are on the increase. In this new scenario, where MT and post-editing are becoming the standard portfolio for professional translators, it is of the utmost importance that MT systems are specifically tailored to translators. […]

AMTA 2018 | Tutorial | MQM-DQF: A Good Marriage (Translation Quality for the 21st Century)

by Mike Dillinger | January 30, 2018

In the past three years, the language industry has been converging on the use of the MQM-DQF framework for analytic quality evaluation. It emerged from two separate quality-evaluation approaches: the European Commission-funded Multidimensional Quality Metrics (MQM) and the Dynamic Quality Framework (DQF) from TAUS. Harmonized in 2015, the resulting shared hierarchy of error types allows […]

AMTA 2018 | Tutorial | A Deep Learning curve for Post-Editing

by Mike Dillinger | January 30, 2018

Does post-editing also require a deep learning curve? How do the neural networks of post-editors work in concert with neural MT engines? Can post-editors and engines be retrained to work more effectively with each other? In this tutorial, we demystify the process, focus on the latest MT developments and their impact on post-editing practices. We […]