LIA Norwegian - Corpus of historical dialect recordings

LIA Norwegian comprises 3.5 million words elicited from 1374 informants from 226 local areas in Norway. The material is transcribed both (quasi-)phonetically and orthographically (in Nynorsk), and it is morphologically tagged with the LIA tagger, a newly developed tagger for spoken Nynorsk.

LIA Norwegian is accessible via the corpus search interface Glossa.

The recordings and transcriptions were provided by four universities: NTNU, UiB, UiO and UiT. There is also material from Målførearkivet (the dialect archive at the University of Oslo) that was previously available in the Nordic Dialect Corpus.

Search the corpus
Read the user manual for LIA Norwegian

Audio files, transcriptions and metadata from the corpus are available at a file depot, along with audio that has not been transcribed in the project. The transcriptions can be downloaded in ELAN format from the depot, while the audio can be streamed.

Search the LIA file depot
Read about the depot

All transcriptions are downloadable in plain text format. In addition, a folder containing 553 transcriptions from LIA Norwegian, in ELAN format, can be downloaded along with their corresponding audio. The recordings in this folder contain no sensitive information and can be used freely for linguistic research or for language technology purposes. (Many of the other LIA recordings have content that has been deemed sensitive. Such content has not been transcribed, so the recordings can still be used in the corpus, but these recordings are not available for download.)

Download selected audio files and transcriptions from LIA Norwegian

Download all transcriptions from LIA in plain text format
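
The ELAN transcriptions mentioned above are XML files (.eaf), so they can be read with a standard XML parser. The following is a minimal sketch in Python, assuming only the general ELAN annotation format (TIME_SLOT, TIER and ALIGNABLE_ANNOTATION elements). The function name read_eaf_annotations is hypothetical, and the tier names used in the LIA files are not specified here, so the sketch simply returns whatever tiers it finds.

```python
import xml.etree.ElementTree as ET


def read_eaf_annotations(source):
    """Return (tier_id, start_ms, end_ms, text) tuples from an ELAN .eaf file.

    `source` may be a file path or an open binary file-like object.
    Only time-aligned annotations are read; annotations on dependent
    tiers (REF_ANNOTATION elements) are skipped in this sketch.
    """
    root = ET.parse(source).getroot()

    # TIME_ORDER maps time-slot ids to offsets in milliseconds.
    # A slot without TIME_VALUE is unaligned and is stored as None.
    slots = {}
    for slot in root.iter("TIME_SLOT"):
        value = slot.get("TIME_VALUE")
        slots[slot.get("TIME_SLOT_ID")] = int(value) if value is not None else None

    rows = []
    for tier in root.iter("TIER"):
        tier_id = tier.get("TIER_ID")
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            # An empty ANNOTATION_VALUE element yields an empty string.
            text = (ann.findtext("ANNOTATION_VALUE") or "").strip()
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            rows.append((tier_id, start, end, text))
    return rows
```

Calling read_eaf_annotations on the path of a downloaded .eaf file returns a list that can be sorted by start time or filtered by tier before further processing.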

Contact:

tekstlab-post at iln.uio.no

Read more about the LIA project