Publications & Conference Papers
2022
Conferences
- Lange, Herbert (2022): Semi-automatic quality assurance for audiovisual corpus data. CLT seminar. 20.10.2022, Göteborg, Sweden.
- Lange, Herbert, Aznar, Jocelyn & Isard, Amy (2022): Demonstrating an Automatic Gloss Checker for Annotated Corpora. "Where Do We Need To Go From Here?" - Language Documentation and Archiving during the Decade of Indigenous Languages. Conference and training sessions, Berlin, 3-7 October 2022. 05.10.2022, Berlin.
- Aznar, Jocelyn & Lange, Herbert (2022): Training Session: Improving Corpus Quality in Language Documentation: Introduction to QUEST and the RefCo process. "Where Do We Need To Go From Here?" - Language Documentation and Archiving during the Decade of Indigenous Languages. Conference and training sessions, Berlin, 3-7 October 2022. 04.10.2022, Online.
- Isard, Amy (2022): Approaches to the Anonymisation of Sign Language Corpora. 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources. 25 June 2022. Marseille.
- Aznar, Jocelyn & Lange, Herbert (2022): RefCo and its Checker: Improving Language Documentation Corpora’s Reusability Through a Semi-Automatic Review Process. In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). 20-25 June 2022, Marseille. pp. 2721–2729. http://www.lrec-conf.org/proceedings/lrec2022/LREC-2022.pdf
Publications
- Wamprechtshammer, A., Arestau, E., Aznar, J., Hedeland, H., Isard, A., Khait, I., Lange, H., Majka, N. & Rau, F. (2022): QUEST. Guidelines and Specifications for the Assessment of Audiovisual, Annotated Language Data. In: Working Papers in Corpus Linguistics and Digital Technologies: Analyses and Methodology. Vol. 8 (2022). https://ojs.bibl.u-szeged.hu/index.php/wpcl/issue/view/2474/382.
- Rau, F., Majka, N. & Schwiertz, G. (2022). Metadata Recommendations for Audio-Visual Language Data. DOI: 10.5281/zenodo.7346840
- Aznar, Jocelyn & Seifart, Frank (2022): The RefCo Toolkit. Zenodo. https://zenodo.org/record/7380448#.Y6A1HH2ZNPY
- Aznar, Jocelyn (2022): Nisvai DoReCo dataset. In: Seifart, Frank, Ludger Paschen and Matthew Stave (eds.). Language Documentation Reference Corpus (DoReCo) 1.1. Berlin & Lyon: Leibniz-Zentrum Allgemeine Sprachwissenschaft & laboratoire Dynamique Du Langage (UMR5596, CNRS & Université Lyon 2). https://doreco.huma-num.fr/languages/nisv1234 (Accessed on 13/09/2022). DOI:10.34847/nkl.2801565f
- Krifka, Manfred (2022): Daakie DoReCo dataset. In: Seifart, Frank, Ludger Paschen and Matthew Stave (eds.). Language Documentation Reference Corpus (DoReCo) 1.1. Berlin & Lyon: Leibniz-Zentrum Allgemeine Sprachwissenschaft & laboratoire Dynamique Du Langage (UMR5596, CNRS & Université Lyon 2). https://doreco.huma-num.fr/languages/port1286 (Accessed on 13/09/2022). DOI:10.34847/nkl.efeav5l9
- Seifart, Frank, Ludger Paschen & Matthew Stave (eds.) (2022): Language Documentation Reference Corpus (DoReCo) 1.1. Berlin & Lyon: Leibniz-Zentrum Allgemeine Sprachwissenschaft & laboratoire Dynamique Du Langage (UMR5596, CNRS & Université Lyon 2). DOI:10.34847/nkl.7cbfq779
- Seifart, Frank (2022): Bora DoReCo dataset. In: Seifart, Frank, Ludger Paschen and Matthew Stave (eds.). Language Documentation Reference Corpus (DoReCo) 1.1. Berlin & Lyon: Leibniz-Zentrum Allgemeine Sprachwissenschaft & laboratoire Dynamique Du Langage (UMR5596, CNRS & Université Lyon 2). https://doreco.huma-num.fr/languages/bora1263 (Accessed on 13/09/2022). DOI:10.34847/nkl.6eaf5laq
- Seifart, Frank (2022): Resígaro DoReCo dataset. In: Seifart, Frank, Ludger Paschen and Matthew Stave (eds.). Language Documentation Reference Corpus (DoReCo) 1.1. Berlin & Lyon: Leibniz-Zentrum Allgemeine Sprachwissenschaft & laboratoire Dynamique Du Langage (UMR5596, CNRS & Université Lyon 2). https://doreco.huma-num.fr/languages/resi1247 (Accessed on 13/09/2022). DOI:10.34847/nkl.ffb96lo8
- Isard, Amy & Arestau, Elena (2022): Curation Criteria for Multimodal and Multilingual Data: A Mixed Study within the QUEST project. In: Navarreta, Constanza / Eskevich, Maria (Ed.): Selected Papers from the CLARIN Annual Conference 2021: Virtual Event, 2021, 27 - 29 September. (Linköping Electronic Conference Proceedings 189). Linköping: Linköping University Electronic Press. pp. 56-68. https://ecp.ep.liu.se/index.php/clarin/article/view/417/375
- Hedeland, Hanna (2022): FAIR-Prinzipien und Qualitätskriterien für Transkriptionsdaten: Empfehlungen und offene Fragen. In: Schwarze, C. & Grawunder, S. (eds.) Transkription und Annotation gesprochener Sprache und multimodaler Interaktion. Konzepte, Probleme, Lösungen. Narr Francke Attempto.
2021
Conferences
- Aznar, Jocelyn & Seifart, Frank (2021): "Silent gaps and interjections as prosodic cues for Direct speech: Preliminary results of a cross-linguistic study on three language documentation corpora". Language Documentation and Linguistic Theory 6 (hybrid). 16.12.2021 - 18.12.2021. SOAS University of London.
- Khait, Ilya & Lukschy, Leonore & Seyfeddinipur, Mandana (2021): Linguistic Archives and Language Communities Questionnaire: Establishing (Re-)Use Criteria. 1st International Workshop on Digital Language Archives (LangArch-2021), Joint Conference on Digital Libraries (virtual). 30.09.2021. University of North Texas. https://www.ideals.illinois.edu/bitstream/handle/2142/111704/LangArc-2021_paper_KhaitEtAl.pdf?sequence=2&isAllowed=y
- Hedeland, Hanna & Schmidt, Thomas (2021): The TEI-based ISO standard “Transcription of Spoken Language” as an exchange format within CLARIN and beyond. In: Monachini, Monica/Eskevich, Maria (Ed.): Proceedings of CLARIN Annual Conference 2021. 27 – 29 September 2021, Virtual Edition. Utrecht: CLARIN, 2021. pp. 100-104. https://ids-pub.bsz-bw.de/frontdoor/deliver/index/docId/10717/file/Hedeland_Schmidt_The_TEI_based_ISO_standard_2021.pdf
- Isard, Amy & Arestau, Elena (2021): Curation Criteria for Multimodal and Multilingual Data: A Mixed Study within the Quest Project. In: Monachini, Monica/Eskevich, Maria (Ed.): Proceedings of CLARIN Annual Conference 2021. 27 - 29 September 2021, Virtual Edition. Utrecht: CLARIN, 2021, pp. 105-109. https://office.clarin.eu/v/CE-2021-1923-CLARIN2021_ConferenceProceedings.pdf.
- Arestau, Elena (2021): Nachhaltige Dokumentation von Metadaten für audiovisuelle Lernerkorpora: Zwischenergebnisse aus dem Projekt QUEST. Poster presented at GAL-Sektionentagung 2021, Würzburg, 15.09. - 17.09.2021.
- Hedeland, Hanna & Schmidt, Thomas (2021): Transkription audiovisueller Daten - Werkzeuge und gute Praktiken. Workshop as part of "vDhd 2021 Experimente", virtual event. Workshop as part of Hands on Research Data. Eine Workshopreihe der AG Datenzentren zur Transparenz und Dokumentation von Forschungsdaten. 24.03.2021 and 16.09.2021. https://vdhd2021.hypotheses.org/290
- Schmidt, Thomas & Hedeland, Hanna & Frick, Elena (2021): Ein Standard in der Praxis: ISO 24624:2016. Transcription of spoken language. In: Helling, Patrick / Speer, Andreas / Eide, Øyvind (Ed.): FORGE 2021: Forschungsdaten in den Geisteswissenschaften – Mapping the Landscape – Geisteswissenschaftliches Forschungsdatenmanagement zwischen lokalen und globalen, generischen und spezifischen Lösungen (FORGE2021). Köln: Zenodo. pp. 100-106. https://ids-pub.bsz-bw.de/frontdoor/deliver/index/docId/10783/file/Schmidt_Hedeland_Frick_ein_Standard_2021.pdf
- Wamprechtshammer, Anna & Arestau, Elena (2021): Generische und disziplinspezifische Zugänge zur Qualität audiovisueller, annotierter Sprachdaten im BMBF-Projekt QUEST. Poster presented at FORGE 2021: Forschungsdaten in den Geisteswissenschaften - Mapping the Landscape - Geisteswissenschaftliches Forschungsdatenmanagement zwischen lokalen und globalen, generischen und spezifischen Lösungen (FORGE 2021), Cologne, 08.09.2021 - 10.09.2021. https://zenodo.org/record/5379583#.YVVwK31CQ2w.
- Hedeland, Hanna (2021): „Standardisierung, Interoperabilität – und FAIRness? ISO/TEI für Transkriptionen gesprochener Sprache“. Workshop „APIs und andere Werkzeuge für die Vernetzung zwischen Programmen und Projekten“. 02.09.2021.
- Wamprechtshammer, Anna, Elena Arestau & Amy Isard (2021): Generic and discipline-specific approaches to the quality of audiovisual, annotated language data in the BMBF project QUEST. Presented at the 11th European Summer University in Digital Humanities "Culture and Technology", University of Leipzig, 02 - 13 August 2021.
- Aznar, Jocelyn (2021). "Getting the speaker's voices into the language resources: Using annotated narratives to build critical language resources for the Nisvai community, Vanuatu". ASA2021: Responsibility - Panel: Responsible Documentation? (virtual). 31.03.2021. University of St Andrews. https://hal.archives-ouvertes.fr/hal-03457994/document
- Majka, N., Schwiertz, G. & Rau, Felix (2021): Vagueness in Metadata. Presented at PARADISEC at 100, Sydney, Australia, 17 - 19 February 2021. https://www.youtube.com/watch?v=rBNh9Es1Tas&t=82s / https://zenodo.org/record/4506936#.YOb1HUxCQ2x.
- Aznar, Jocelyn (2021): "Silent gaps as a distinctive feature of Direct Speech: a Corpus study of Nisvai Narratives, Vanuatu". From dialogue to grammar - Language emerging from human sociality: the case of speech representation (virtual). 11.02.2021. University of Helsinki. https://www.soas.ac.uk/linguistics/events/language-documentation-and-linguistic-theory-6/language-documentation-and-linguistics-theory-6-conference---non-plenary-talks.html
Publications
- Arkhangelskiy, Timofey, Hedeland, Hanna & Riaposov, Aleksandr (2021): Evaluating and assuring research data quality for audiovisual annotated language data. In: Navarreta, Constanza / Eskevich, Maria (Ed.): Selected Papers from the CLARIN Annual Conference 2020: Virtual Event, 2020, 5-7 October. (Linköping Electronic Conference Proceedings 180). Linköping: Linköping University Electronic Press. pp. 1-7. https://ecp.ep.liu.se/index.php/clarin/article/view/1/1
- Hedeland, Hanna (2021): Towards comprehensive definitions of data quality for audiovisual annotated language resources. In: Navarreta, Constanza / Eskevich, Maria (Ed.): Selected Papers from the CLARIN Annual Conference 2020: Virtual Event, 2020, 5-7 October. (Linköping Electronic Conference Proceedings 180). Linköping: Linköping University Electronic Press. pp. 93-103. https://ids-pub.bsz-bw.de/frontdoor/deliver/index/docId/10518/file/Hedeland_Towards_comprehensive_definitions_2021.pdf
2020
Conferences
- Arkhangelskiy, Timofey, Hedeland, Hanna & Riaposov, Aleksandr (2020): Evaluating and Assuring Research Data Quality for Audiovisual Annotated Language Data. Paper presented at CLARIN Annual Conference 2020, 7 October 2020. https://ids-pub.bsz-bw.de/frontdoor/deliver/index/docId/1007/file/Arkhangelskiy_Hedeland_Riaposov_Evaluating_research_data_quality_2020.pdf.
- Hedeland, Hanna (2020): Towards Comprehensive Definitions of Data Quality for Audiovisual Annotated Language Resources. Paper presented at CLARIN Annual Conference 2020, 7 October 2020. https://ids-pub.bsz-bw.de/frontdoor/deliver/index/docId/10076/file/Hedeland_Towards_comprehensive_definitons_of_data_quality_2020.pdf.
Publications
- Seyfeddinipur, Mandana & Rau, Felix (2020): "Keeping it real: Video data in language documentation and language archiving". In: Language Documentation and Conservation 14, pp. 503-514. URL: https://scholarspace.manoa.hawaii.edu/handle/10125/24965.
- Nordhoff, Sebastian (2020). Modelling and annotating interlinear glossed text from 280 different endangered languages as Linked Data with LIGT. In: Proceedings of the 14th Linguistic Annotation Workshop (LAW XIV). https://www.aclweb.org/anthology/2020.law-1.9.pdf.
- Aznar, Jocelyn & Seifart, Frank (2020): RefCo: An initiative to develop a set of quality criteria for fieldwork corpora. 2èmes journées scientifiques du Groupement de Recherche Linguistique Informatique Formelle et de Terrain (LIFT), Montrouge, France. pp. 95-101. https://hal.archives-ouvertes.fr/hal-03047143/document.
- Isard, Amy (2020): Approaches to the Anonymisation of Sign Language Corpora. In: Proceedings of the 9th Workshop on the Representation and Processing of Sign Languages (LREC-2020 workshop). https://www.aclweb.org/anthology/2020.signlang-1.15.pdf
- Nordhoff, Sebastian (2020): From the attic to the cloud: mobilization of endangered language resources with linked data. In: Proceedings of LR4SSHOC: Workshop about Language Resources for the SSH Cloud (LREC-2020). https://www.aclweb.org/anthology/2020.lr4sshoc-1.3.pdf.
- von Prince, Kilu & Sebastian Nordhoff (2020): An Empirical Evaluation of Annotation Practices in Corpora from Language Documentation. In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020). https://www.aclweb.org/anthology/2020.lrec-1.338.pdf