NoTa-Oslo: Norwegian Speech Corpus - the Oslo part

NoTa-Oslo is a speech corpus of interviews and conversations with 166 informants born and raised in Oslo and the surrounding area. The informants were carefully selected according to sociolinguistic variables, so the sample is representative in terms of age, gender, place of residence and education. NoTa-Oslo consists of approximately 957,000 words, orthographically transcribed and morphologically tagged. The corpus can be searched through the Glossa search interface, and the transcriptions are linked to audio and video files.

The NoTa-Oslo corpus was built between 2004 and 2006.

NoTa-Oslo now uses the new version of Glossa, a search and post-processing tool developed by the Text Laboratory.
Log in with Feide or CLARIN. Contact us if you need another login alternative.

Search in NoTa-Oslo

Download the transcriptions:

License for downloading
In html format
In txt format with informant codes
In txt format without informant codes