Cassie looks great, but few would mistake her for anything but a super-advanced avatar. So why would anyone ask such an odd question? Investigation reveals that people are not asking about the authenticity of Cassie, but about the sign language. They have been bitten by fake-it-till-you-make-it outfits claiming to have cracked the problem of translating between spoken languages and sign languages. But in reality these are no more high-tech than recordings of human beings, with their performances either mapped directly onto a computer model, or run through filters to make the video look like a sequence of computer-generated images.
So what?
This Monkey-See, Monkey-Do approach can deliver beautiful, convincing results. Of course it can: it is exactly reproducing translators' or interpreters' performances. But translators and interpreters are in critically short supply, so employing them in this way carries an opportunity cost for those who prefer British Sign Language: while they are making these Max Headroom translations, what critical work are they not doing?
The translators may also need to dedicate more time and energy to supporting this type of work than they would to simply appearing in front of a video camera, which makes it less productive and more expensive. How can these companies compete? Maybe they are living off grants, or convincing less-canny investors that they are this close to being able to pull it off for real. Neither is a long-term strategy that your organisation should depend upon. Delivering translations at scale requires real artificial intelligence, not artificial artificial intelligence.
Deep Fakes and generative AI can be harnessed to produce incredible results. However, they are at their best when they aren't constrained by specific needs, like exact facial expressions, hand gestures, or even an accurate number of fingers!
An alternative to captions
At the heart of everything we do at Robotica is the unshakeable belief that humans are best: no matter how incredible AI may be, it isn't a substitute for a person. The warmth, empathy and trust of interacting with a human being is irreplaceable. With that in mind, we always encourage the use of Deaf translators or RSLI interpreters as the default choice: nobody wants a computer explaining a court ruling or giving them a medical diagnosis. When a human translation is a realistic possibility, hire a person for the job. And for everything else, there's Robotica.
We fill in the gaps: where translators are not available, or where interpreters may not be the best fit. We make sign language translations that would not otherwise get made.
There are simply too few translators and interpreters, nowhere near the numbers needed to keep up with the accelerating rate of content creation. Every single minute of every day, 8 new books are published. In that same minute, 500 hours of video are uploaded to YouTube, and 175 new websites are created - an average of 2,000 pages per minute. Even with 1,000 times as many translators, people who use sign languages would still be denied almost all of the information and entertainment that most of us take for granted.
Reading isn’t a great option for everyone, including the 87,000 people in the UK whose first or only language is British Sign Language. Before Robotica, the choice d/Deaf people faced for these gaps was “captions or do without?”. Cassie and our other avatars add a third possibility - AI sign language.