cmdp:resourceCommonInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485126’] [cmd:ref=‘humit-tagger’]:
cmdp:resourceType [cmd:ref=‘obt’]: toolService
cmdp:identificationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485125’] [cmd:ref=‘humit-tagger’]:
cmdp:resourceName [cmd:ref=‘obt’] [xml:lang=‘en’]: The Humit Tagger
cmdp:resourceName [cmd:ref=‘obt’] [xml:lang=‘no’]: Humit-taggeren
cmdp:description [cmd:ref=‘obt’] [xml:lang=‘en’]: The Humit Tagger is a morphological AI tagger for Norwegian Bokmål and Nynorsk developed at Humit, University of Oslo. The tagger is based on a neural network, more precisely a pre-trained BERT model for Norwegian, developed by the National Library of Norway. The tagger is a so-called sequence classifier, which selects morphological tags but not lemmas. In this first version of the Humit Tagger, the full-form word list from Norsk ordbank is used as a basis for lemma selection.
cmdp:resourceShortName [cmd:ref=‘obt’]: humit-tagger
cmdp:url [cmd:ref=‘obt’]: https://www.hf.uio.no/humit/english/resources/humit-tagger/index.html
cmdp:distributionInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485124’] [cmd:ref=‘humit-tagger’]:
cmdp:licenceInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485158’] [cmd:ref=‘humit-tagger’]:
cmdp:distributionAccessMedium: Downloadable
cmdp:downloadLocation: https://github.com/humit-oslo/humit-tagger
cmdp:executionLocation: https://tekstlab.uio.no/humtag_nett/
cmdp:licence [cmd:ComponentRef=‘clarin.eu:cr1:c_1447674760330’]:
cmdp:licenceURL: http://en.wikipedia.org/wiki/MIT_License
cmdp:licensor:
cmdp:actorInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485194’]:
cmdp:actorType: organization
cmdp:organizationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1407745711883’]:
cmdp:organizationName [xml:lang=‘en’]: University of Oslo
cmdp:organizationName [xml:lang=‘no’]: Universitetet i Oslo
cmdp:organizationShortName [xml:lang=‘no’]: UiO
cmdp:organizationShortName [xml:lang=‘en’]: UoO
cmdp:departmentName [xml:lang=‘no’]: Humit – senter for digital utvikling på HF
cmdp:departmentName [xml:lang=‘en’]: Humit – Centre for digital development at HF
cmdp:communicationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1352813745460’]:
cmdp:email: humit@hf.uio.no
cmdp:url: https://www.hf.uio.no/humit/english/
cmdp:distributionRightsHolder:
cmdp:actorInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485194’]:
cmdp:actorType: organization
cmdp:organizationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1407745711883’]:
cmdp:organizationName [xml:lang=‘en’]: University of Oslo
cmdp:organizationName [xml:lang=‘no’]: Universitetet i Oslo
cmdp:organizationShortName [xml:lang=‘no’]: UiO
cmdp:organizationShortName [xml:lang=‘en’]: UoO
cmdp:departmentName [xml:lang=‘en’]: Humit – Centre for digital development at HF
cmdp:departmentName [xml:lang=‘no’]: Humit – senter for digital utvikling på HF
cmdp:communicationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1352813745460’]:
cmdp:email: humit@hf.uio.no
cmdp:url: https://www.hf.uio.no/humit/english/
cmdp:iprHolder:
cmdp:actorInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485194’] [cmd:ref=‘cg’]:
cmdp:actorType [cmd:ref=‘cg’]: organization
cmdp:organizationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1407745711883’]:
cmdp:organizationName: Humit – Centre for digital development at HF
cmdp:organizationShortName: Humit
cmdp:actorInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485194’]:
cmdp:contact [cmd:ref=‘humit-tagger’]:
cmdp:actorInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485194’] [cmd:ref=‘humit-tagger’]:
cmdp:organizationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1407745711883’]:
cmdp:organizationName: Humit – Centre for digital development at HF
cmdp:organizationShortName: Humit
cmdp:communicationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1352813745460’]:
cmdp:email: humit@hf.uio.no
cmdp:url: https://www.hf.uio.no/humit/english/index.html
cmdp:metadataInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1407745711922’] [cmd:ref=‘humit-tagger’]:
cmdp:metadataCreationDate: 2025-01-10
cmdp:metadataCreator [cmd:ref=‘humit-tagger’]:
cmdp:actorInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485194’]:
cmdp:personInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485192’]:
cmdp:organizationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1407745711883’]:
cmdp:organizationName: Humit – Centre for digital development at HF
cmdp:organizationShortName: Humit
cmdp:communicationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1352813745460’]:
cmdp:email: kristin.hagen@iln.uio.no
cmdp:url: https://www.hf.uio.no/humit/english/
cmdp:versionInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1430905751648’] [cmd:ref=‘humit-tagger’]:
cmdp:version [cmd:ref=‘obt’]: First version
cmdp:validationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1407745711923’] [cmd:ref=‘humit-tagger’]:
cmdp:validationModeDetails [cmd:ref=‘cg’]:
So far, the tagger has only been evaluated on a test part of the
Norwegian Dependency Treebank, where there is only one correct answer for
each word form. On this test set, the Humit Tagger has an accuracy of
0.98 for tags and 0.99 for lemmas.
cmdp:validationReportUnstructured [cmd:ComponentRef=‘clarin.eu:cr1:c_1353678848789’]:
cmdp:documentUnstructured: See the home page: https://www.hf.uio.no/humit/english/resources/humit-tagger/index.html
cmdp:resourceDocumentationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1355150532301’] [cmd:ref=‘humit-tagger’]:
cmdp:documentationUnstructured [cmd:ComponentRef=‘clarin.eu:cr1:c_1355150532302’]:
cmdp:documentUnstructured: See the home page: https://www.hf.uio.no/humit/english/resources/humit-tagger/index.html
cmdp:documentationUnstructured [cmd:ComponentRef=‘clarin.eu:cr1:c_1355150532302’]:
cmdp:documentUnstructured:
Haug, D. T. T., Yildirim, A., Hagen, K., & Nøklestad, A. (2023).
Rules and neural nets for morphological tagging of Norwegian - Results and
challenges. NEALT Proceedings Series, 425-435.
cmdp:resourceCreationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1407745711921’] [cmd:ref=‘humit-tagger’]:
cmdp:creationStartDate: 2022
cmdp:creationEndDate: 2024
cmdp:resourceCreator [cmd:ref=‘humit-tagger’]:
cmdp:actorInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1396012485194’]:
cmdp:actorType: organization
cmdp:communicationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1352813745460’]:
cmdp:email: humit@hf.uio.no
cmdp:fundingProject [cmd:ref=‘humit-tagger’]:
cmdp:projectInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1430905751647’]:
cmdp:projectName: Common Language Resources and Technology Infrastructure Norway +
cmdp:projectShortName: CLARINO +
cmdp:url: http://clarin.b.uib.no/
cmdp:fundingType: nationalFunds
cmdp:funder: the Research Council of Norway
cmdp:fundingCountry: Norway
cmdp:projectStartDate: 2020-03-01
cmdp:projectEndDate: 2023-12-31
cmdp:toolInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1422885449327’]:
cmdp:inputInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1360931019804’]:
cmdp:resourceType: corpus
cmdp:modalityType: writtenLanguage
cmdp:languageName: Norwegian
cmdp:languageName: Norwegian Bokmål
cmdp:languageName: Norwegian Nynorsk
cmdp:characterEncoding: utf-8
cmdp:annotationType: lemmatization
cmdp:annotationType: morphosyntacticAnnotation-posTagging
cmdp:tagset: http://www.tekstlab.uio.no/obt-ny/english/tagset.html
cmdp:segmentationLevel: word
cmdp:segmentationLevel: clause
cmdp:outputInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1360931019824’]:
cmdp:resourceType: corpus
cmdp:modalityType: writtenLanguage
cmdp:languageName: Norwegian
cmdp:languageName: Norwegian Bokmål
cmdp:languageName: Norwegian Nynorsk
cmdp:characterEncoding: utf-8
cmdp:tagset: http://www.tekstlab.uio.no/obt-ny/english/tagset.html
cmdp:segmentationLevel: clause
cmdp:segmentationLevel: word
cmdp:toolServiceOperationInfo [cmd:ComponentRef=‘clarin.eu:cr1:c_1360931019835’]:
cmdp:operatingSystem: See https://github.com/humit-oslo/humit-tagger