The pressures to publish research data have increased considerably in recent years, and there is a growing call for the building of larger research infrastructures.
At the same time, however, European legislation protecting personal data has become more restrictive. Caught between these cross-pressures, researchers have ended up in something of a limbo between the aspirations of science policy and the demands of privacy regulation.
How can one publish material whose publication inevitably discloses personal data?
This question is relevant to the research of Finnish sign languages, which is nationally coordinated by the Sign Language Centre at the University of Jyväskylä.
For the last six years, the centre has been collecting a large set of machine-readable sign language material, the Corpus of Finland’s Sign Languages. In linguistics research, a corpus has traditionally referred to a text bank of billions of words.
Because sign languages have no written form, however, a sign language corpus is naturally based on video material. The videos also show the sign language user’s face, which is interpreted as direct personal data.
Contemporary culture is characterized by a vast number of moving images in different forms. It is therefore almost incomprehensible how little explicit information is available about research materials based on moving images and about the requirements for archiving and publishing them.
It is not the duty of a single researcher to fill this gap. Instead, it requires policies made in collaboration with different parties. This provides a good opportunity to build a positive profile within our own university as well.
The Corpus of Finland’s Sign Languages includes more than 700 hours of video-recorded discussions and narration by a total of 104 sign language users. The corpus forms a data infrastructure without which current sign language research would no longer be credible.
At the same time, the corpus is a permanent record of the Finnish and Finland-Swedish sign languages and their cultures, both of which remain under strong pressure to change in our society.
The efforts made so far to publish the Corpus of Finland’s Sign Languages can be described by a sentence borrowed from J.R.R. Tolkien’s The Lord of the Rings: “Yet such is oft the course of deeds that move the wheels of the world: small hands do them because they must, while the eyes of the great are elsewhere.”
Help with our contribution to these heroic deeds has come from the language database of the FIN-CLARIN Consortium. It has constantly encouraged us to identify and create practices that protect the privacy of the people shown in the videos without abandoning the aim of publication.
In fact, if all goes well, the first fifth of our corpus will be available through the language database by the end of February 2019.
Professor, Department of Language and Communications Studies