Project Phylomilia: Phylogenetic linguistic inference from acoustic speech data

Basic data

Acronym:
Phylomilia
Title:
Phylogenetic linguistic inference from acoustic speech data
Duration:
01.10.2024 to 30.10.2027
Abstract / short description:
Computational comparative linguistics traditionally relies heavily on manual data preprocessing, which limits progress, scalability, and reproducibility. This project aims to revolutionize this field by leveraging advanced Deep Learning methodologies, specifically focusing on automatic speech processing, to perform phylogenetic analysis directly from acoustic speech data without manual intervention. Utilizing speech as the primary data source marks a significant shift from writing-based analyses, allowing for more direct and nuanced insights into language evolution.

The proposed methodology simplifies the traditional multi-step workflow of linguistic analysis into two core processes:
1. Transforming speech data into vector space representations using self-supervised Deep Learning models such as wav2vec-u, which effectively captures the linguistic features directly from audio data.
2. Conducting phylogenetic inference from these vectorized representations to construct language family trees and deduce historical language relationships.
In preparation for the pre-training in the first step, the project will devise a language-independent end-to-end automatic speech recognition tool that transcribes spoken language into IPA.
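
The second step of the workflow above (vectors in, tree out) can be illustrated with a minimal sketch. It assumes each language has already been reduced to a single fixed-length vector, and it uses UPGMA (average-linkage clustering over Euclidean distances) purely as a simple stand-in; the abstract does not say which inference method the project uses. The function name `upgma`, the placeholder language names, and the toy vectors are all hypothetical and do not come from the project.

```python
from itertools import combinations
from math import dist


def upgma(vectors):
    """Build a nested-tuple tree from {name: vector} by average linkage.

    Hypothetical helper for illustration only; not the project's method.
    """
    # Map each cluster (a name or a nested tuple) to its member names.
    clusters = {name: [name] for name in vectors}

    def avg_distance(a, b):
        # UPGMA distance: mean Euclidean distance over all member pairs.
        pairs = [(x, y) for x in clusters[a] for y in clusters[b]]
        return sum(dist(vectors[x], vectors[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters that are closest on average.
        a, b = min(combinations(clusters, 2), key=lambda p: avg_distance(*p))
        members = clusters.pop(a) + clusters.pop(b)
        clusters[(a, b)] = members

    # The single remaining key is the full tree.
    return next(iter(clusters))


# Toy 2-D vectors with placeholder names (assumed values, not real data).
vectors = {
    "lang_A": (0.0, 0.0),
    "lang_B": (0.1, 0.0),
    "lang_C": (1.0, 1.0),
    "lang_D": (1.2, 1.0),
}
tree = upgma(vectors)
print(tree)
```

With these toy values the two nearby pairs are grouped first and then joined at the root. Real speech embeddings would have hundreds of dimensions, and a probabilistic inference method would likely replace the simple distance-based clustering shown here.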

Leveraging autoencoder techniques, the project will, furthermore, probabilistically reconstruct aspects of the vocabulary and phonotactics of earlier language stages.

As the Deep-Learning methods utilized in the project have a black-box character, the project will, finally, devote attention to post-hoc explainability of the trained model in linguistic terms.
Keywords:
linguistics
deep learning
automatic speech recognition
phylogenetics

Staff involved

Project leaders

Seminar für Sprachwissenschaft (SfS)
Fachbereich Neuphilologie, Philosophische Fakultät
SFB 833 - Bedeutungskonstitution: Dynamik und Adaptivität sprachlicher Strukturen
Collaborative Research Centres and Transregios

Local institutions

Seminar für Sprachwissenschaft (SfS)
Fachbereich Neuphilologie
Philosophische Fakultät

Funders

Hannover, Lower Saxony, Germany