The Translation API Cases and Classes: Project Statement (TAPICC)

Authored By: Serge Gladkoff (Logrus Global LLC) and Klaus Fleischmann (Kaleidoscope GmbH)

With Contributions From: Achim Ruopp (TAUS), David Filip (ADAPT Centre, Trinity College Dublin), and Kevin O’Donnell (Microsoft)

Join the discussion about this project in our TAPICC Connect Group.
 


Introduction

The term “digital transformation” is overused in many industries, and yet it remains highly relevant to our sector. For the translation industry, digital transformation does not simply mean machine translation or smarter CAT tools. While MT is currently getting a lot of press through its use of AI technologies,(11, 12) the industry is looking for performance and efficiency gains well beyond what can be expected from MT or CAT in the near future.(3, 4, 13)

It has often been said that the biggest productivity improvement lies in workflow automation, and that workflow automation and other systems offered as a service are most readily available, and in the largest numbers, from the cloud. However, true digital transformation will come from connecting a company’s workflow to the translation market directly, without repetitive and redundant human steps and without media breaks. Rather than having people perform automatable tasks that pose no challenge to the human intellect, we envision an automated process in which human project managers stand by a virtual conveyor belt, supervising robotic operations rather than doing data entry or moving files.

In order to allow systems to talk to each other and exchange both data and workflow information, we must put more emphasis on the great potential of APIs for the translation industry.

An API is the developer-designed protocol and interface through which a system communicates with the “outer world”. While this sounds great, the issue is that an “API” in itself is not a common standard. On the contrary, developers understand the term as the name for a very specific set of programming calls belonging to a concrete, particular, and often proprietary system. Each vendor exposes as much of its tool’s unique functionality to other developers as it deems reasonable. For this reason, there is currently no unified, standard API for the translation industry.

As a consequence, there are as many APIs in the world as there are systems. This “Babylonian API confusion” creates difficulties not only for LSPs and end clients in our industry; it also poses a challenge to tool providers, since they need to constantly test and adapt their APIs to a rapidly growing number of other systems. This costs industry agents and practitioners significant headaches, time, and money. Unfortunately, each one is currently trying to find the answer alone, only sometimes polling vendor ecosystems or participating in similar projects with the goal of collecting (and not so much sharing) information, and in total disconnect from research.

It should therefore be in the interest of all parties (LSPs, clients, tool providers) to establish some API standard which allows us to reduce the efforts required to connect our systems.

But how do we tackle this problem?

Our New Attempt

In the past there have been several attempts to address these challenges. You may remember the Linport project led by Alan Melby and supported by GALA. There was also the TIPP project, and numerous APIs have been published by different vendors such as SDL or Welocalize (normally subsets of their own APIs, primarily intended to connect LSPs to their systems). Meanwhile, in Germany, for example, the producers of component-based content management systems united to create a common translation interchange standard (COTI).

TAUS made a courageous attempt to tackle the issue by starting the TAUS API project in 2012 (originally conceived by Brian McConnell). The goal was to support all file format standards present in the industry, and the initiative saw some adoption from language service providers and localization teams. There was also a “wish list” on interoperability that was gathered at the TAUS Annual Conference 2014.(10) The project also published a test implementation of the API, and Microsoft did some internal derivative work.

Our industry’s initiatives to solve the API conundrum feel strangely like Isaac Asimov’s science fiction novel “The End of Eternity.” In that plot, safely isolated time travelers carefully enact changes to reality to improve the overall happiness of humanity, and yet by doing so they introduce changes that kill humanity’s desire to build spaceships for interstellar travel. The topic of APIs pops up again and again, and yet we have not succeeded, despite the clear demonstration that there is a dire need for coordination.

Generally, we know that APIs are a hot topic: as of this writing, news has come out that Google acquired Apigee for $625 million. According to Apigee statistics, API traffic is increasing sharply: enterprises with public API programs saw their API traffic almost triple year over year.(7, 8) In our own industry, Diego Bartolomé, a forward-looking enthusiast of APIs, has written a detailed explanation of the benefits of APIs for translation.(9)

A New Start

This is why GALA, TAUS and LT-Innovate, all three quite prominent industry associations, have now launched a new joint project called TAPICC (Translation API Cases and Classes). The first steps of this new project are:

  • Collect and document all existing API initiatives and groundwork research
  • Identify additional use cases that are not yet covered by research, development, or standardization
  • Catalog and classify APIs into basic “use cases” which are shared by all LSPs, clients, and tools
  • Prioritize and start to manage the standardization development step by step

The goals are:

  • To provide a basic subset of API use cases and actual API classes along with sample implementations which all the parties in our industry can base their developments on.
  • To enable these standard API classes to become a reliable and stable, interoperable “middleware” between tool providers, end clients, and LSPs in a new ecosystem where all parties can concentrate on their particular specialties without having to worry about compatibility for the most “basic” use cases.
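To illustrate what a “basic use case” expressed as API classes might look like, here is a minimal sketch in Python. It is not a TAPICC deliverable: the `Job`, `TranslationService`, and `InMemoryService` names, the three operations (submit, status, retrieve), and the status strings are all hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Job:
    """Hypothetical record for one translation request."""
    job_id: str
    source_lang: str
    target_lang: str
    content: str
    status: str = "submitted"
    translation: Optional[str] = None


class TranslationService(ABC):
    """Hypothetical contract covering three basic use cases."""

    @abstractmethod
    def submit(self, source_lang: str, target_lang: str, content: str) -> str:
        """Create a job and return its identifier."""

    @abstractmethod
    def status(self, job_id: str) -> str:
        """Return the current status of a job."""

    @abstractmethod
    def retrieve(self, job_id: str) -> Optional[str]:
        """Return the translation, or None if it is not ready yet."""


class InMemoryService(TranslationService):
    """Toy implementation standing in for a real TMS or LSP system."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Job] = {}

    def submit(self, source_lang: str, target_lang: str, content: str) -> str:
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = Job(job_id, source_lang, target_lang, content)
        return job_id

    def status(self, job_id: str) -> str:
        return self._jobs[job_id].status

    def retrieve(self, job_id: str) -> Optional[str]:
        return self._jobs[job_id].translation

    def complete(self, job_id: str, translation: str) -> None:
        """Simulate the vendor side delivering a finished translation."""
        job = self._jobs[job_id]
        job.translation = translation
        job.status = "completed"
```

In such a design, a client system would be written against `TranslationService` only, so any tool or LSP offering a conforming implementation could be swapped in, while proprietary functionality would remain available on top of the shared contract.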

What is different this time?

Our objective is to bring all parties together and focus on practical approaches. The aim is to provide a basic standard which makes systems and agents interoperable on a basic level, leaving room for further sophistication and advanced proprietary functionality on top. 

GALA has agreed to be the project coordinator. Member leaders supporting the revival of this API project are Serge Gladkoff (GALA Ambassador), Klaus Fleischmann (GALA Board), and Laura Brandon (GALA Executive Director), as well as Achim Ruopp from TAUS and David Filip from the ADAPT Centre, Trinity College Dublin.

GALA will host the group on the GALA Connect community as a communication platform and will also provide a GitHub account: https://github.com/GALAglobal/translationapi

First Steps

An initial meeting was held at the end of July 2016 with a representative working group consisting of prominent industry experts from Microsoft, SAP, Google, Mozilla, Enlaso, tauyou, Kaleidoscope, Logrus, Lilt, and others.

There was a good discussion about how to reignite this work.

The new group has already identified a new angle to the problem. During LocWorld Dublin, Serge Gladkoff and David Filip established a core need for the TAUS API: compliance with the XLIFF 2.0 Object Model. Focusing on this key development is a near-term goal for the project to help with wider adoption. XLIFF 2.0 is THE only global standard for the translation industry, and therefore any API project must ensure compliance with the XLIFF 2.0 Object Model.
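To make the XLIFF 2.0 requirement more concrete, here is a minimal sketch in Python of reading an XLIFF 2.0 payload into simple objects. It is an illustration only, not a project deliverable: the sample document, the `Segment` class, and the `read_segments` function are hypothetical. Only the namespace and the element names (`xliff`, `file`, `unit`, `segment`, `source`, `target`) are taken from the XLIFF 2.0 core.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

# Namespace of the XLIFF 2.0 core.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:2.0"}

# Hypothetical sample payload, made up for this illustration.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0"
       version="2.0" srcLang="en" trgLang="de">
  <file id="f1">
    <unit id="u1">
      <segment>
        <source>Hello world</source>
        <target>Hallo Welt</target>
      </segment>
    </unit>
    <unit id="u2">
      <segment>
        <source>Not yet translated</source>
      </segment>
    </unit>
  </file>
</xliff>"""


@dataclass
class Segment:
    """Hypothetical flattened view of one XLIFF segment."""
    unit_id: Optional[str]
    source: str
    target: Optional[str]  # None when no <target> element is present


def read_segments(xliff_text: str) -> List[Segment]:
    """Collect source/target text for every segment in the document."""
    root = ET.fromstring(xliff_text)
    segments: List[Segment] = []
    for unit in root.iterfind(".//x:unit", XLIFF_NS):
        for seg in unit.iterfind("x:segment", XLIFF_NS):
            source = seg.find("x:source", XLIFF_NS)
            target = seg.find("x:target", XLIFF_NS)
            segments.append(
                Segment(
                    unit_id=unit.get("id"),
                    # itertext() flattens any inline markup to plain text.
                    source="".join(source.itertext()) if source is not None else "",
                    target=None if target is None else "".join(target.itertext()),
                )
            )
    return segments


if __name__ == "__main__":
    for segment in read_segments(SAMPLE):
        print(segment)
```

This sketch discards inline codes and segment state; an API that is genuinely compliant with the object model would need to preserve them.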

Also, David Filip has pointed out that there are at least two committees doing standards work along these lines: the OASIS XLIFF Object Model and Other Serializations (XLIFF OMOS) TC(5) and the OASIS Content Management Interoperability Services (CMIS) TC.(6) The latter specifically addresses the issue of creating a common standard for connecting CMS systems. (CMIS doesn't have localization in scope at all, but it's the standard we should look at if we want to target standardized interchange with the CMS area.) UBL is the standard to look at for business data models and data exchange templates.

Clearly, neither GALA nor LT-Innovate is a standards body, and neither is going to duplicate what OASIS is doing. In general, we should not reinvent data models and exchange formats wherever those already exist, be it UBL, CMIS, TBX, or XLIFF. Likewise, nobody should try to reinvent business data exchange if the community decides that it is in scope of the TAPICC effort.

So how do we match the needs of various constituencies and their capabilities to create value for everybody in the industry?

Here’s what we decided that GALA, TAUS, LT-Innovate, and FEISGILTT (Federated Event for Interoperability Standardization in GILT Technologies) can do.

The TAPICC Project Mission

The TAPICC project mission is to connect forward-looking standards development work and industry practitioners. The industry will provide the experts and developers with descriptions of their use cases and needs, and the developers and researchers will provide recommendations and information about cutting-edge technologies. Together we will try to catalog, prioritize, and define basic functionalities which a common API standard can fulfill. We will endeavor to fill this standard with life, meaning, actual cases, and code on GitHub to take and implement.

To keep this promise, clear policy and content ownership for the project need to be defined. GALA, providing the platform for discussions and assistance in project coordination, will not own the deliverables, but will simply act as a facilitator. The group agreed to ensure that it is safe to participate, implement, and contribute to open source. David Filip has championed the intellectual property rights model called Royalty Free (RF) and Non-Assertion. This requires that collaborating parties sign a Non-Assertion Covenant, making the open source development safe for all parties. (Traditional RF licenses, often negotiated by lawyers, will be replaced by the Non-Assertion Covenants.) This will also be detailed further down the road.

Of course, for the project to take off, we need to better define in-scope and out-of-scope topics for the working group. We should not underestimate the importance of the requirements-gathering phase for setting and prioritizing the scope of the TAPICC effort. In order for this to happen, the following first steps have been identified:

  • Bring all knowledge on this subject together as concisely, clearly, and simply as possible. Establish what has been done and where current research and development is taking place.
  • Solicit clear use cases from industry practitioners, with the goal of defining the most acute and common industry needs to address in future steps of this project.

So where do we stand now?

Your Participation is Needed

The first step, as outlined above, is a brainstorming and material-gathering step. For this, we have opened up three resources:

  1. A group on the GALA Connect Platform as a central hub of communication, discussion and resource collection.
  2. The GitHub account, which currently hosts the TAUS API resources.
  3. A whole afternoon of the FEISGILTT session at LocWorld Montreal to enable everyone who has something to share to come forth and share.

The input we are looking for at this stage is:

  • What needs (“Use Cases”) do you have in the area of APIs for workflow automation and/or connecting systems with each other?
  • What experience have you gathered so far?
  • Are you aware of any existing APIs which are open and could be included in a standard?
  • If you have implemented API connections to external systems, can you please describe the technology choices you made?
  • If you implemented API connections to external systems, can you please share your productivity improvement story?
  • What is it that you would like to achieve with standardized APIs: productivity improvement, avoidance of media breaks, better interoperability, system independence?
  • What is your wish list for the technology providers?
  • As a technology provider, what is it that you would like to know from practitioners?  What would motivate you to adopt a new industry-standard API format?

Of course, these are many open-ended questions that require deeper discussion. That is why all substantive answers will be considered significant contributions to the project, and their authors will be acknowledged as significant contributors.

As you see, our goal is to offer a win-win-win scenario for all industry stakeholders by offering concrete take-aways, reducing the burden of the API providers, increasing interoperability and making our entire process smoother. But we depend on contribution and collaboration to make it work.

Call to Action

As an immediate call to action, please consider one of the following ways of collaborating:

  • Join us in Montreal at the FEISGILTT event dedicated to this topic. Send us your proposal to present your case study, project, or needs. Participate in the discussion.
  • If you do not wish to give a FEISGILTT presentation or cannot make it to Montreal, please submit your use cases and thoughts through the GALA Connect Group, or feel free to communicate directly with the project leader: [email protected].

 

Sources:

[1] https://en.wikipedia.org/wiki/Digital_transformation

[2] https://www.theguardian.com/media-network/media-network-blog/2013/nov/21/digital-transformation

[3] http://robertfortner.posthaven.com/take-two-ibms-watson-portent-or-pretense

[4] http://robertfortner.posthaven.com/rest-in-peas-the-unrecognized-death-of-speech

[5] https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xliff-omos

[6] https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=cmis

[7] http://apigee.com/about/tags/digital-transformation

[8] The State of APIs, Apigee 2016 report (available free of charge after registration).

[9] Bringing APIs to the Translation Industry, Diego Bartolomé, Multilingual Computing, July/August 2016.

[10] The interoperability panel summary at the TAUS Annual Conference 2014. https://www.taus.net/think-tank/articles/event-articles/translation-supply-chain-interoperability-through-apis

[11] http://www.statmt.org/wmt16/

[12] https://slator.com/technology/neural-machine-translation-can-unlock-europes-digital-single-market/

[13] The Google Neural Machine Translation Marketing Deception, Kirti Vashee, http://kv-emptypages.blogspot.ru/2016/09/the-google-neural-machine-translation.html