Tutorial Presenters: Nicola Ueffing (eBay MT Science), Pete Smith (University of Texas Arlington) and Silvio Picinini (eBay Localization)

Target audience: anyone involved in the deployment of MT who is interested in the quality of the corpus data that forms the foundation of that deployment. Roles include managers, linguists, engineers and scientists.

The quality of the corpora used to train MT systems has not been a prominent topic of discussion compared with MT technology itself. Most research uses the same well-established corpora so that results can be reproduced. However, corpora strongly influence the quality of MT output, and neural MT relies more heavily on data quality than previous technologies did. We therefore think the time has come to take a deeper look at corpus quality. We will examine it from both a scientist’s and a linguist’s point of view, exploring current and future roles in the evolving MT landscape.

From the eBay experience, participants will learn and discuss:

Considering that data curation of corpora may become a task for a Language Professional, learn from Univ. of Texas what the profile of a Language Professional could be:

 



AMTA and IAMT are proud to announce the following keynote speakers for MT Summit XV in Miami:

 

 

 

 

Come to MT Summit XV to see all of these events:

These presentations were accepted for the Commercial MT Users & Translators track:

Farkhat Aminov: Yandex.Translate approach to the translation of Turkic languages
Nikhil Bojja: Machine Translation in Mobile Games – Augmenting Social Media Text Normalization with Incentivized Feedback
Robin Bonthrone, Konstantin Lakshin: Why are we (still) waiting? What premium translators need to use MT effectively.
Fang Cai: Productivity Promotion Strategies for Collaborative Translation on Huge-Volume Technical Documents
Dick Csaplar: An Advanced Engine for New and Improved MT Results
Jennifer DeCamp: Machine Translation and Terminology Management
Alan K. Melby: Quality Evaluation of Four Translations of a Kidney Document: focus on reliability
Tomoki Nagase, Tatsuhiro Kudoh, Katsunori Kotani, Wenjun Ye, Takeshi Mori, Yoshiyuki Sakamoto, Nobutoshi Hatanaka, Takamitsu Takeda, Shu Hirata, Hiromi Nakaiwa: A Survey of Usage Environment of Machine Translation by Professional Translators
Morgan O’Brien, Ana Duarte: Machine Translation for enterprise technical communications – a journey of discovery
Elaine O’Curran: MT Quality Evaluations: From Test Environment to Production
Craig Plesco, Nestor Rychtyckyj: Enterprise Application of MT: Progress and Challenges
Juan Rowda: A Linguist’s Approach to Quality Estimation
Achim Ruopp: Industry Shared Metrics with the TAUS Dynamic Quality Dashboard and API
Carla Schelfhout: Accurately Predicting Post-editing Time & Labor for Cost-Management
Mark Seligman, Mike Dillinger: Adjusting Interaction Levels in a Speech Translation System for Healthcare
Jean Senellart: Beyond Text, Machine Translation and NLP for e-discovery
José G. C. de Souza, Marcello Federico: MT Quality Estimation for E-Commerce Data
Tanvi Surti: User Experience on Skype Translator
Pablo Vazquez & Edith Bendermacher: Setting up MT at NetApp – challenges and learnings while implementing MT
Kristina Vulgan: Preparing Your Suppliers for Machine Translation
Alex Yanishevsky: How Much Cake to Eat: The Case for Targeted MT Engines

 

Friday, October 30

Morning Session

Introduction to Machine Translation — Jay Marciano, Lionbridge
Target Audience:  Translators and other translation industry professionals
This tutorial is for people who are beginning their journey with machine translation and want an overview of what it is, how it works, how it can be used, and whether it can fulfil their needs. No previous knowledge of machine translation is assumed, and all levels of skepticism are welcome. The focus will be on providing background knowledge that will help you get more out of the rest of MT Summit XV and make more informed decisions about how to use or invest in machine translation. Past participants have ranged from translation professionals who want to understand changes in their field to corporate executives who are evaluating technology strategies for their organizations. The main topics for discussion are common questions about MT (What is MT and how does it differ from other translation technologies? How well can machines really translate? What’s the latest and greatest in MT?), the quality of the translations it produces (Why is the output sometimes so bad? How can the quality be improved? Can translation quality be measured objectively?), its application (What is MT good for? Can it improve a translator’s efficiency? What are the implications of this technology for translators?), return on investment (Is “free” MT really free? How much does MT really cost?), and steps to deployment (Which MT system should we use? What do we do next?). You will leave this tutorial with the tools you need to take part in an informed discussion of MT.
Bio: Jay Marciano has been the Director of Real-Time Translation Development at Lionbridge Technologies since 2010 and is currently serving on the Board of Directors of AMTA. Prior to joining Lionbridge, he worked for SDL (2001-2010), where he was responsible for the development and refinement of natural language processing software. He has also worked as a product manager and development lead at Transparent Language, where he was on the team that launched FreeTranslation.com in 1999; as a lecturer in the English Department at the University of Bonn, Germany; and as an editor at Houghton Mifflin on the staff of the American Heritage Dictionary.
Computer Aided Translation: Advances and Challenges — Philipp Koehn, Johns Hopkins
Target Audience:  MT Researchers & developers
Moving beyond post-editing machine translation, a number of recent research efforts have advanced computer aided translation methods that allow for more interactivity, richer information such as confidence scores, and the completed feedback loop of instant adaptation of machine translation models to user translations.
This tutorial will explain the main techniques for several aspects of computer aided translation: confidence measures, interactive machine translation (interactive translation prediction), bilingual concordancers, translation option display, online adaptation, eye tracking, logging, and cognitive user models. For each of these, the state of the art and open challenges are presented.
The tutorial will also look under the hood of the open source CASMACAT toolkit, which is based on MATECAT and available as a “Home Edition” to be installed on a desktop machine. The target audience of this tutorial is researchers interested in computer aided machine translation and practitioners who want to use or deploy advanced CAT technology.
Bio: Philipp Koehn, PhD, Professor, Department of Computer Science, Whiting School of Engineering, Johns Hopkins University

Afternoon Session

Using Microsoft Translator Hub and Collaborative Translator Framework — Chris Wendt
Target Audience:  Translators and other translation industry professionals
This is a hands-on introduction to using a publicly available SMT system and translation environment. We will discuss the concepts using Microsoft Translator Hub and the Microsoft Translator API as examples. Many of the concepts are transferable to other MT systems, but other MT systems will not be discussed in detail.
Agenda: Short overview of statistical MT; Overview of the Microsoft Translator web service and API; Probabilistic models and their function; The power of customization; The Translator Hub: Walk through an engine customization; Training data, suitable and unsuitable documents for training; Domains; Test and tuning sets; Collaborative features in the Translator Hub; Collaborative features in the Translator API and web site widget; Continuous retraining, or one-time shots; Deploying your customized system; Using your customized system in applications, TMSs and CMSs; Using your customized system in your own code
Bring your problems, questions and issues. We’ll discuss them and work toward good solutions, which will be helpful for the other attendees as well. It is fine if you have none. Ideally, bring a laptop and a translation memory or previously translated documents with at least 5000 segments, but neither is required.
Bio: Chris Wendt graduated as Diplom-Informatiker from the University of Hamburg, Germany, and subsequently spent a decade on software internationalization for a multitude of Microsoft products, including Windows, Internet Explorer, MSN and Windows Live, bringing these products to market with equal functionality worldwide. Since 2005 he has led the program management team for Microsoft’s Machine Translation development, responsible for Microsoft Translator services, including Bing Translator and Skype Translator, connecting Microsoft’s research activities with its practical use in services and applications. He is based at Microsoft headquarters in Redmond, Washington.
Neural Networks in MT — Kyunghyun Cho, NYU
Target Audience: MT Researchers & developers
Neural machine translation is a new approach to MT, where a single, large neural network is trained to maximize translation performance. This radical departure from existing (phrase-based) machine translation approaches was proposed independently by several research groups during the last two years (2013-2014). This tutorial will present a detailed description of the two dominant models in neural machine translation; discuss a number of fundamental differences from conventional statistical machine translation; cover a series of recent improvements that have made neural machine translation competitive with phrase-based statistical machine translation; and present future directions of neural machine translation.
Bio: Kyunghyun Cho has been an assistant professor in the Department of Computer Science, Courant Institute of Mathematical Sciences and the Center for Data Science at New York University (NYU) since September 2015. Previously, he was a postdoctoral researcher at the University of Montreal under the supervision of Prof. Yoshua Bengio, after obtaining a doctorate degree at Aalto University (Finland) in early 2014 under the supervision of Prof. Juha Karhunen, Prof. Tapani Raiko and Dr. Alexander Ilin. Kyunghyun’s main research interests include neural networks, generative models and their applications, especially to natural language understanding.

Tuesday, Nov. 3

Afternoon Session

Beyond Post-editing: Interactive and Adaptive Translation Technologies — Spence Green, lilt.com
Target Audience:  Translators and other translation industry professionals; MT developers
Post-editing is an increasingly common mode of machine-assisted translation. Case studies in both academia and industry have recently shown that machine translations can reduce translation time and effort for some domains and language pairs. However, there are two reasons to look beyond current implementations of post-editing. First, the commercial systems most commonly used—Google Translate and Microsoft Translator—are tuned for coverage rather than specific localization settings. Second, it has been known since the 1960s that post-editing is a poor user experience.
Interactive and adaptive translation tools were conceived decades ago to address these shortcomings. Until recently, implementations of these ideas were not widely available outside of the research community. This tutorial will introduce tools and techniques that augment the human translator, offering the potential for significant productivity improvements. The material will be accessible to both translation professionals and those interested in new translation technology. There will be four parts: (1) The history of and motivation for interactive translation techniques, (2) the translator’s workflow and where machine intervention can be effective, (3) interactive and adaptive translation technology, (4) practicum: introduction to some tools.
Bio: Spence Green recently completed a PhD in computer science at Stanford University supervised by Chris Manning. He was given a Best Paper award at CHI 2013 for his work on mixed-initiative translation. He holds an MS in computer science from Stanford and a BS in computer engineering from the University of Virginia. Currently he is a co-founder of Lilt, a provider of interactive translation systems.

Precision Translation Tools announces the release of Slate, the first packaged SMT toolkit for native Windows x86-64 operating systems.

Note: “native” means without Cygwin. There is also a parallel Slate package for Linux. Packages include the command-line utilities from Moses, MGIZA++ and PTTools that are necessary to train, tune and evaluate phrase and phrase-factored SMT models. You can find detailed specifications for the packages and where to get them at the following URL, and via the “More about Slate for Windows” link at the bottom of that page:

http://www.precisiontranslationtools.com/slate/
http://www.precisiontranslationtools.com/downloads/slate-version-1-0-for-windows/

AMTA 2020 | Announcing Conference Keynote Speakers

by Darius Hughes | March 18, 2020

We are pleased to announce that the following machine translation experts from the research, commercial, and government sectors will be giving keynote presentations at the conference: Research Keynote Speakers Colin Cherry – Google Research Colin Cherry is a Research Scientist at Google Translate in Montreal. Previously, he was a Senior Research Officer at Canada’s National […]


AMTA 2020 | 2nd Call for Student Research Workshop Papers

by Darius Hughes | March 18, 2020

The 14th biennial conference of the Association for Machine Translation in the Americas has been rescheduled to OCTOBER 6-9 and will be held as a virtual conference using Microsoft Teams, a powerful, enterprise collaboration platform. It was previously scheduled from September 8-12 in Orlando, Florida.  This year, AMTA will hold a Student Research Workshop together […]


AMTA 2020 | Announcing: Registration Details and Program Outline

by Darius Hughes | March 14, 2020

AMTA 2020 – VIRTUAL October 6-9 Registration for AMTA 2020 will open soon. Fees are below. The 14th biennial conference of the Association for Machine Translation in the Americas. The single registration fees listed in the table below will include attendance at all tutorials and workshops, as well as at all presentations in the three […]


AMTA 2020 | 2nd Call for Papers, Presentations, Workshops, and Tutorials

by Darius Hughes | March 4, 2020

The 14th biennial conference of the Association for Machine Translation in the Americas.  September 8-12 Sheraton Orlando Lake Buena Vista Resort Orlando, Florida, USA AMTA conferences are unique in bringing together MT researchers, developers, and users of MT technology from academia, industry, and government. This year AMTA 2020 we have arranged for a spectacular venue […]


AMTA 2020 | Be a Sponsor or Exhibitor at the AMTA 2020 Conference

by Darius Hughes | March 1, 2020

Sponsorship Prospectus for AMTA 2020 – October 6-9, 2020 Virtual Conference All sponsorship levels receive: Logo with hyperlink to sponsor website on conference website: http://www.conference.amtaweb.org/ Acknowledgement at conference opening w/name and logo displayed at opening and closing sessions Contact Ray Flournoy, sponsorship chair: sponsorship@amtaweb.org Sponsorship Level Sponsors at this level receive Enthusiast – $1,000 One […]


AMTA 2020 | 1st Call for Papers, Presentations, Workshops, and Tutorials

by Darius Hughes | January 25, 2020

The 14th biennial conference of the Association for Machine Translation in the Americas.  September 8-12 Sheraton Orlando Lake Buena Vista Resort Orlando, Florida, USA The 2nd Call for Papers has been published and the information below may be out of date, please click here to go to the newer version. AMTA conferences are unique in […]
