MT for sign languages is a very broad topic. To recognise sign language we use neural models built with a large amount of generic pose images and videos, which we adapt to the use case of sign language. To translate the output of this first step, we use a multilingual neural model which we call the InterL. This model was originally built for spoken languages, and we now use it to translate between sign and spoken languages too. The model is based on the well-known multilingual BART (mBART) [https://arxiv.org/abs/2001.08210], which we have fine-tuned to improve performance on the sign and spoken languages covered by SignON and the use cases we want to address. To generate audio we use text-to-speech software from the Acapela Group (www.acapela-group.com), and to generate sign language, a series of linguistically motivated transformations draw information from the InterL to drive our avatar, one utterance at a time.
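To make the overall flow concrete, the pipeline above can be sketched as three stages wired together: recognition of the signed input, translation through the InterL, and synthesis of either audio or avatar output. The sketch below is purely illustrative; every function name, data shape, and return value is an assumption standing in for the real neural components, not the project's actual API.

```python
# Hypothetical sketch of the pipeline described above. All names and data
# shapes are illustrative assumptions, not the SignON project's real API.

def recognise_signing(video_frames):
    """Stub for the pose-based sign language recognition model.

    In the real system this is a neural model adapted from generic pose
    estimation; here we simply return a placeholder gloss sequence.
    """
    return ["HELLO", "WORLD"]

def interl_translate(glosses, target_lang):
    """Stub for the InterL (fine-tuned mBART) translation step."""
    # A real call would run the multilingual model; we fake it with a lookup.
    fake_lexicon = {("HELLO", "WORLD"): "Hello, world."}
    return fake_lexicon.get(tuple(glosses), " ".join(glosses).lower())

def synthesise_output(text, modality):
    """Stub for the output stage: TTS audio or avatar signing."""
    if modality == "audio":
        # Real system: hand the text to the TTS engine.
        return {"kind": "audio", "tts_input": text}
    # Real system: linguistically motivated transformations drive the avatar.
    return {"kind": "avatar", "utterance": text}

def translate_signed_video(video_frames, target_lang, modality):
    """End-to-end flow: recognition -> InterL -> synthesis, one utterance."""
    glosses = recognise_signing(video_frames)
    text = interl_translate(glosses, target_lang)
    return synthesise_output(text, modality)

print(translate_signed_video([], "en", "audio"))
```

The point of the sketch is the one-utterance-at-a-time flow: each stage consumes the previous stage's output, so the recognition, translation, and synthesis components can be developed and swapped independently.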