CMDI 1.1. Metadata
Header
MdCreator: Kristin Hagen
MdCreationDate: 2015-08-07
MdSelfLink:
MdProfile: clarin.eu:cr1:p_1407745711925
MdCollectionDisplayName: Clarino - Textlab
Resources
ResourceProxyList:
JournalFileProxyList:
ResourceRelationList:
IsPartOfList:
Components
corpusProfile:
resourceCommonInfo [ComponentId=‘clarin.eu:cr1:c_1396012485126’]:
resourceType: corpus
identificationInfo [ComponentId=‘clarin.eu:cr1:c_1396012485125’]:
resourceName [xml:lang=‘nn’]: Oslo-korpuset av taggede norske tekster (nynorskdelen)
resourceName [xml:lang=‘en’]: The Oslo Corpus of Tagged Norwegian Texts (the Nynorsk part)
description: The corpus consists of the texts that were available at the Text Laboratory in January 1999. It is composed of texts from three genres: fiction (2.1 mill. words), newpapers/magazines (nynorsk: 1 mill.), and factual prose (700.000), all in all 3.8 mill. words.

All fiction comes from ECI (European Corpus Initiative) and Norsk Tekstarkiv (Norwegian Text Archive). The texts from newspapers and magazines have been collected by the Text Laboratory with kind permission from the various editorial offices. The factual prose consists mainly of NOU reports (Norwegian Official Reports) and Norwegian laws and regulations.

The corpus is not meant to be representative in any sense, although it contains texts from a variety of genres.

The corpus is grammatically tagged with the original version of The Oslo-Bergen tagger.
resourceShortName [xml:lang=‘nn’]: Oslo-korpuset
resourceShortName [xml:lang=‘en’]: The Oslo Corpus
url: http://www.tekstlab.uio.no/norsk/bokmaal/english.html
url: http://www.tekstlab.uio.no/norsk/nynorsk/index.html
PID: http://hdl.handle.net/11538/0000-0005-E7C3-A
distributionInfo [ComponentId=‘clarin.eu:cr1:c_1396012485124’]:
licenceInfo [ComponentId=‘clarin.eu:cr1:c_1396012485158’]:
userCategory: Academic
distributionAccessMedium: accessibleThroughInterface
executionLocation: http://www.tekstlab.uio.no/norsk/bokmaal/english.html
licence [ComponentId=‘clarin.eu:cr1:c_1447674760330’]:
licenceFamily: CLARIN
licenceName: CLARIN_ACA-NC-LOC-ND
licenceURL: https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&NORED=1&ND=1
conditionsOfUse: BY
conditionsOfUse: ID
conditionsOfUse: LOC
conditionsOfUse: NC
conditionsOfUse: ND
conditionsOfUse: NORED
licensor:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/english/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
distributionRightsHolder:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/english/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
contact:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
metadataInfo [ComponentId=‘clarin.eu:cr1:c_1407745711922’]:
metadataCreationDate: 2015-08-07
metadataLastDateUpdated: 2017-06-08
metadataCreator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: person
personInfo [ComponentId=‘clarin.eu:cr1:c_1396012485192’]:
surname: Hagen
givenName: Kristin
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: kristin.hagen@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
versionInfo [ComponentId=‘clarin.eu:cr1:c_1430905751648’]:
version: 2
resourceDocumentationInfo [ComponentId=‘clarin.eu:cr1:c_1355150532301’]:
documentationUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532302’]:
role: documentation
documentUnstructured: http://www.tekstlab.uio.no/norsk/bokmaal/english.html
resourceCreationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711921’]:
creationEndDate: 1999-12-31
resourceCreator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Text Lab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
corpusInfo [ComponentId=‘clarin.eu:cr1:c_1407745711878’]:
corpusType: Written Corpus
corpusPartInfo [ComponentId=‘clarin.eu:cr1:c_1407745711885’]:
mediaType: text
corpusTextInfo [ComponentId=‘clarin.eu:cr1:c_1396012485188’]:
textFormatInfo [ComponentId=‘clarin.eu:cr1:c_1427452477072’]:
mimeType: txt
characterEncodingInfo [ComponentId=‘clarin.eu:cr1:c_1447674760355’]:
characterEncoding: latin1
corpusPartGeneralInfo [ComponentId=‘clarin.eu:cr1:c_1407745711882’]:
sourceWorkInfo [ComponentId=‘clarin.eu:cr1:c_1407745712071’]:
workDescription: The corpus is composed of texts from three genres:
- fiction (2.1 mill. words)
- newpapers/magazines (nynorsk: 1 mill.)
- factual prose (700.000)

All fiction comes from ECI (European Corpus Initiative) and Norsk Tekstarkiv (Norwegian Text Archive). The texts from newspapers and magazines have been collected by the Text Laboratory with kind permission from the various editorial offices. The factual prose consists mainly of NOU reports (Norwegian Official Reports) and Norwegian laws and regulations.
lingualityInfo [ComponentId=‘clarin.eu:cr1:c_1355150532313’]:
lingualityType: monolingual
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: No
languageName: Norwegian
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: Nn
languageName: Norwegian Nynorsk
modalityInfo [ComponentId=‘clarin.eu:cr1:c_1447674760356’]:
modalityType: writtenLanguage
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 18,5 mill
sizeUnit: words
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: morphosyntacticAnnotation-posTagging
annotationType: lemmatization
segmentationLevel: word
tagset: The Oslo Bergen-tagger tagset: http://tekstlab.uio.no/obt-ny/english/index.html
tagsetLanguageId: Nn
tagsetLanguageName: Norwegian Nynorsk
theoreticModel: Constraint grammar
annotationMode: automatic
annotationManualUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532325’]:
role: annotationManual
documentUnstructured: http://www.tekstlab.uio.no/obt-ny/english/index.html
annotationTool [ComponentId=‘clarin.eu:cr1:c_1355150532326’]:
targetResourceNameURI: The Oslo-Bergen Tagger: http://tekstlab.uio.no/obt-ny/english/index.html
classificationInfo [ComponentId=‘clarin.eu:cr1:c_1403588862809’]:
genreInfo [ComponentId=‘clarin.eu:cr1:c_1407745711877’]:
genreType: textGenre
genre: factual prose
classificationInfo [ComponentId=‘clarin.eu:cr1:c_1403588862809’]:
genreInfo [ComponentId=‘clarin.eu:cr1:c_1407745711877’]:
genreType: textGenre
genre: newspaper and magazines
classificationInfo [ComponentId=‘clarin.eu:cr1:c_1403588862809’]:
genreInfo [ComponentId=‘clarin.eu:cr1:c_1407745711877’]:
genreType: textGenre
genre: fiction and drama
timeCoverageInfo [ComponentId=‘clarin.eu:cr1:c_1447674760358’]:
timeCoverage: Texts that were available at the Text Laboratory in January 1999