Lexical database of selected Norwegian words

Words have many different characteristics, in terms of both form and content. Examples include word length, word structure, imageability and usage frequency. These and other characteristics affect both language acquisition (how early children learn different words) and language use (how easy or hard it is for us to find the right words when we need them).

Norwegian Words is a searchable lexical database containing approximately 1600 Norwegian words (900 nouns, 500 verbs and 200 adjectives). For each word, you can look up characteristics that previous research has shown to affect language acquisition and the storage and processing of words in populations both with and without speech and language impairments.

Where do the words come from?

The 1600 words in Norwegian Words were chosen mainly from the following assessment tools, which were developed to investigate language and language use in children and adults:

Who can use Norwegian Words?

Norwegian Words is free to use, for instance by students, researchers and practitioners in fields such as linguistics, psychology and speech and language therapy. You can search for single words, for particular characteristics, or for words from specific assessment tools. Currently the database is only accessible through the interface developed by the Text Laboratory. Please cite Norwegian Words as follows:

Lind, M., Simonsen, H.G., Hansen, P., Holm, E. & Mevik, B.-H. (2015). Norwegian Words: A lexical database for clinicians and researchers. Clinical Linguistics & Phonetics 29(4), pp. 276-290. Downloaded DATE from: http://www.tekstlab.uio.no/ordforradet/en/

Who has developed Norwegian Words?

The database was developed by:

In addition, the following people have contributed to the database:

Calculations of phonological neighbourhood density are based on data from NorKompLeks: Nordgård, T. (1998). Norwegian Computational Lexicon (NorKompLeks). Proceedings of the 11th Nordic Conference of Computational Linguistics NODALIDA 98. CST, Copenhagen.

Word frequencies are based on the NoWaC corpus: Guevara, E. (2010). NoWaC: A large web-based corpus for Norwegian. Proceedings of the NAACL HLT 2010 Sixth Web as Corpus Workshop. Los Angeles, CA: Association for Computational Linguistics.

We would like to thank all the informants who participated in the surveys on imageability and age of acquisition.