The Oslo-Bergen Tagger
- A grammatical tagger for Bokmål and Nynorsk
The Oslo-Bergen Tagger is a robust morphological tagger developed over several years at the University of Oslo and at Uni Computing in Bergen. The tagger consists of three main modules: a preprocessor with a multitagger and compound analyser, a Constraint Grammar module for morphological disambiguation, and a statistical module that removes the remaining morphological ambiguity (Bokmål only). The Constraint Grammar module uses a compiler developed at the University of Southern Denmark in Odense. The multitagger uses the lexicon Norsk ordbank.
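To illustrate how three such modules can fit together, here is a minimal Python sketch. It is not the tagger's actual code or interface: the toy lexicon, the tag frequencies, the single constraint rule and all function names are assumptions made up for this example. The lexicon stands in for Norsk ordbank, the rule stands in for a full Constraint Grammar, and the frequency lookup stands in for the statistical module.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for Norsk ordbank:
# word form -> every tag the word form can carry.
LEXICON = {
    "en": ["det", "pron"],
    "fisker": ["subst", "verb"],
    "sover": ["verb"],
}

# Hypothetical tag frequencies standing in for a trained statistical model.
TAG_COUNTS = Counter({"det": 50, "pron": 5, "subst": 40, "verb": 30})


def multitag(tokens):
    """Stage 1: attach every reading the lexicon allows to each token."""
    return [(tok, list(LEXICON.get(tok.lower(), ["ukjent"]))) for tok in tokens]


def apply_constraints(cohorts):
    """Stage 2: one illustrative Constraint Grammar style rule.

    Remove the verb reading of a word when the preceding word has a
    determiner reading, but never remove the last remaining reading.
    """
    result = []
    for i, (tok, readings) in enumerate(cohorts):
        prev = cohorts[i - 1][1] if i > 0 else []
        if "det" in prev and "verb" in readings and len(readings) > 1:
            readings = [r for r in readings if r != "verb"]
        result.append((tok, readings))
    return result


def pick_most_frequent(cohorts):
    """Stage 3: resolve leftover ambiguity by choosing the most frequent tag."""
    return [(tok, max(readings, key=lambda r: TAG_COUNTS[r]))
            for tok, readings in cohorts]


def tag(tokens):
    """Run the three stages in order and return one tag per token."""
    return pick_most_frequent(apply_constraints(multitag(tokens)))
```

Calling tag(["En", "fisker", "sover"]) first gives "fisker" both a noun and a verb reading, the rule then discards the verb reading because of the preceding "En", and the frequency step finally chooses between the two readings left on "En". The point of the ordering is that the rules settle what they can, so the statistical step only has to handle what they leave open.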
The tagger can be downloaded under a GPL license, and it can now also be run online.
The Oslo-Bergen Tagger is being improved and modernised through the infrastructure project Clarino+.
How to refer to the tagger:
Johannessen, Janne Bondi; Hagen, Kristin; Lynum, André and Nøklestad, Anders. 2012. OBT+stat: A combined rule-based and statistical tagger. In Andersen, Gisle (ed.): Exploring Newspaper Language: Corpus compilation and research based on the Norwegian Newspaper Corpus. John Benjamins Publishing Company, 51-65.