oai:arXiv.org:2406.11049
Computer Science
2024
19-06-2024
Historically, sign language machine translation has been posed as a sentence-level task: datasets consisting of continuous narratives are chopped up and presented to the model as isolated clips.
In this work, we explore the limitations of this task framing.
First, we survey a number of linguistic phenomena in sign languages that depend on discourse-level context.
Then, as a case study, we perform the first human baseline for sign language translation that actually substitutes a human into the machine learning task framing, rather than providing the human with the entire document as context.
This human baseline -- for ASL to English translation on the How2Sign dataset -- shows that for 33% of sentences in our sample, our fluent Deaf signer annotators were able to understand key parts of the clip only in light of additional discourse-level context.
These results underscore the importance of understanding and sanity-checking examples when adapting machine learning to new domains.
Tanzer, Garrett; Shengelia, Maximus; Harrenstien, Ken; Uthus, David, 2024, Reconsidering Sentence-Level Sign Language Translation