CMDI 1.1. Metadata
Header
MdCreator: Kristin Hagen
MdCreationDate:
MdSelfLink:
MdProfile: clarin.eu:cr1:p_1407745711925
MdCollectionDisplayName: Clarino - Textlab
Resources
ResourceProxyList:
JournalFileProxyList:
ResourceRelationList:
IsPartOfList:
Components
corpusProfile:
resourceCommonInfo [ComponentId=‘clarin.eu:cr1:c_1396012485126’]:
resourceType: corpus
identificationInfo [ComponentId=‘clarin.eu:cr1:c_1396012485125’]:
resourceName [xml:lang=‘en’]: Nordic Dialect Corpus
resourceName [xml:lang=‘nb’]: Nordisk dialektkorpus
description [xml:lang=‘en’]: Nordic Dialect Corpus is a corpus of Norwegian, Swedish, Danish, Faroese, Icelandic and Övdalian spoken language. It consists of spontaneous speech data from dialects of the North Germanic languages across all of the Nordic countries. The linguistic data in the corpus come from a variety of sources, both old and new (see homepage - Data Collection). The corpus contains about 2.8 million words from conversations and interviews with dialect speakers. It is transcribed and linked to audio and video, has a map function, and can be searched in a large variety of ways. Although the corpus was built for Nordic syntax research, it is a general-purpose resource (in effect a Norwegian Dialect Corpus, a Swedish Dialect Corpus and so on) that can be used in a wide range of research areas, such as phonology, morphology and lexicography.
resourceShortName [xml:lang=‘en’]: NDC - Nordic Dialect Corpus
url: http://www.tekstlab.uio.no/nota/scandiasyn/
PID: http://hdl.handle.net/11538/0000-0005-E7C7-6
distributionInfo [ComponentId=‘clarin.eu:cr1:c_1396012485124’]:
licenceInfo [ComponentId=‘clarin.eu:cr1:c_1396012485158’]:
userCategory: Academic
distributionAccessMedium: accessibleThroughInterface
executionLocation: http://www.tekstlab.uio.no/nota/scandiasyn/
licence [ComponentId=‘clarin.eu:cr1:c_1447674760330’]:
licenceFamily: CLARIN
licenceName: CLARIN_ACA-NC-LOC-PRIV-ND-*
licenceURL: https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&PRIV=1&NORED=1&ND=1
conditionsOfUse: *
conditionsOfUse: BY
conditionsOfUse: ID
conditionsOfUse: LOC
conditionsOfUse: NC
conditionsOfUse: ND
conditionsOfUse: NORED
conditionsOfUse: PRIV
nonStandardConditionsOfUse: The corpus has audio and video recordings classified as personal data. In agreement with NSD, the Data Protection Official in Norway, the corpus is accessible only through Glossa, a search and post-processing tool developed by the Text Laboratory. The video and audio excerpts returned by the search interface cannot be shown in public unless you have an agreement with the Text Laboratory.
licensor:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/english/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
distributionRightsHolder:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/english/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
contact:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
metadataInfo [ComponentId=‘clarin.eu:cr1:c_1407745711922’]:
metadataCreationDate: 2015-02-03
metadataLastDateUpdated: 2017-06-08
metadataCreator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: person
personInfo [ComponentId=‘clarin.eu:cr1:c_1396012485192’]:
surname: Hagen
givenName: Kristin
sex: female
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
validationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711923’]:
validated: true
validationType: content
validationMode: manual
validationModeDetails: The transcriptions are proofread against the audio files. The national projects NorDiaSyn, DanDiaSyn and SweDiaSyn have proofread their own transcriptions; see homepage - Transcription.
validationExtent: full
resourceDocumentationInfo [ComponentId=‘clarin.eu:cr1:c_1355150532301’]:
documentationStructured [ComponentId=‘clarin.eu:cr1:c_1361876010648’]:
role: documentation
documentInfo [ComponentId=‘clarin.eu:cr1:c_1353678848788’]:
documentType: other
title: Nordic Dialect Corpus and Syntax Database
author: The Text Laboratory
year: 2013
url: http://www.tekstlab.uio.no/nota/scandiasyn/
documentLanguageId: en
documentationStructured [ComponentId=‘clarin.eu:cr1:c_1361876010648’]:
role: documentation
documentInfo [ComponentId=‘clarin.eu:cr1:c_1353678848788’]:
documentType: manual
title: The Nordic Dialect Corpus - Search Interface Documentation
author: Eirik Olsen
year: 2014
url: http://www.tekstlab.uio.no/nota/scandiasyn/help/
documentLanguageId: en
documentationStructured [ComponentId=‘clarin.eu:cr1:c_1361876010648’]:
role: documentation
documentInfo [ComponentId=‘clarin.eu:cr1:c_1353678848788’]:
documentType: book
title [xml:lang=‘nb’]: Om artiklene i denne boka og Nordisk dialektkorpus
editor: Janne Bondi Johannessen and Kristin Hagen
year: 2014
publisher: Novus forlag
bookTitle: Språk i Norge og nabolanda. Ny forskning om talespråk.
ISBN: 978-82-7099-795-4
documentLanguageName: Norwegian bokmål
documentLanguageId: nb
resourceCreationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711921’]:
creationStartDate: 2005-01-01
creationEndDate: 2013-12-24
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘en’]: Scandinavian Dialect Syntax
projectShortName: ScanDiaSyn
url: http://websim.arkivert.uit.no/scandiasyn/scandiasyn/index.html%3fcolapsemenu=colapsemenu
url: http://www.tekstlab.uio.no/nota/scandiasyn/index.html
fundingType: other
funder: http://websim.arkivert.uit.no/scandiasyn/scandiasyn/29
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘nb’]: NorDiaSyn - Norsk dialektsyntaks
projectName: NorDiaSyn - Norwegian Dialect Syntax
projectShortName: NorDiaSyn
url: http://www.tekstlab.uio.no/nota/NorDiaSyn/index.html
url: http://www.tekstlab.uio.no/nota/NorDiaSyn/english/index.html
fundingType: nationalFunds
funder: The Research Council of Norway
fundingCountry: Norway
projectStartDate: 2009-01-01
projectEndDate: 2013-12-31
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘nb’]: For the funding of the national projects in Norway, Sweden, Denmark, Iceland and the Faroe Islands, see under National Projects: http://www.tekstlab.uio.no/nota/scandiasyn/dialect_data_collection.html
url: http://www.tekstlab.uio.no/nota/scandiasyn/dialect_data_collection.html
fundingType: nationalFunds
corpusInfo [ComponentId=‘clarin.eu:cr1:c_1407745711878’]:
corpusType: Multimodal Corpus
corpusPartInfo [ComponentId=‘clarin.eu:cr1:c_1407745711885’]:
mediaType: text
corpusTextInfo [ComponentId=‘clarin.eu:cr1:c_1396012485188’]:
textFormatInfo [ComponentId=‘clarin.eu:cr1:c_1427452477072’]:
mimeType: txt
sizePerTextFormat [ComponentId=‘clarin.eu:cr1:c_1447674760342’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 2 829 878
sizeUnit: words
characterEncodingInfo [ComponentId=‘clarin.eu:cr1:c_1447674760355’]:
characterEncoding: latin1
corpusPartInfo [ComponentId=‘clarin.eu:cr1:c_1407745711885’]:
mediaType: video
corpusVideoInfo [ComponentId=‘clarin.eu:cr1:c_1407745711880’]:
videoContentInfo [ComponentId=‘clarin.eu:cr1:c_1360931019779’]:
typeOfVideoContent: (Some recordings in the corpus are audio only; the video recordings are listed below.)
Norway: informal conversations and semi-formal interviews. 438 informants from 111 places.
Älvdalen, Sweden: interviews and conversations. 17 informants from 7 places.
Faroe Islands: interviews and conversations. 20 informants from 5 places.
textIncludedInVideo: none
dynamicElementInfo [ComponentId=‘clarin.eu:cr1:c_1360931019781’]:
bodyParts: face
bodyParts: arms
settingInfo [ComponentId=‘clarin.eu:cr1:c_1360230992162’]:
naturality: spontaneous
conversationalType: dialogue
audience: few
interactivity: overlapping
interaction: Two scenarios in the corpus:
1) Semiformal interview: research assistant or researcher and informant(s).
2) Free conversation between two informants. Research assistants were sometimes passively present in the room during the conversations to prevent conversations about sensitive matters.
videoFormatInfo [ComponentId=‘clarin.eu:cr1:c_1427452477073’]:
mimeType: videos in mpeg4 streaming format available through Glossa
frameRate: 25
resolutionInfo [ComponentId=‘clarin.eu:cr1:c_1360931019784’]:
sizeWidth: 400
sizeHeight: 300
resolutionStandard: HD.720
compressionInfo [ComponentId=‘clarin.eu:cr1:c_1360230992165’]:
compression: true
compressionName: mpg
corpusPartInfo [ComponentId=‘clarin.eu:cr1:c_1407745711885’]:
mediaType: audio
corpusAudioInfo [ComponentId=‘clarin.eu:cr1:c_1404130561236’]:
audioSizeInfo [ComponentId=‘clarin.eu:cr1:c_1360230992160’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: approx 150 GB
sizeUnit: gb
audioContentInfo [ComponentId=‘clarin.eu:cr1:c_1360230992161’]:
textualDescription: Norway:
1) Old audio recordings from Målførearkivet, University of Oslo. Interviews: 126 informants from 52 places.
2) New recordings: informal conversations and semi-formal interviews. 438 informants from 111 places.

Sweden: interviews. 109 informants from 31 places.
+ Älvdalen, Sweden: interviews and conversations. 17 informants from 7 places.

Denmark: interviews. 81 informants from 15 places.

Iceland: interviews and conversations. 32 informants from 7 places.

Faroe Islands: interviews and conversations. 20 informants from 5 places.
settingInfo [ComponentId=‘clarin.eu:cr1:c_1360230992162’]:
naturality: spontaneous
conversationalType: dialogue
audience: few
interactivity: overlapping
interaction: Two scenarios:
1) (semiformal) interview: research assistant or researcher and informant(s).
2) Free conversation between two informants. Research assistants were sometimes passively present in the room during the conversations to prevent conversations about sensitive matters.
audioFormatInfo [ComponentId=‘clarin.eu:cr1:c_1427452477070’]:
mimeType: wav and mpeg4
signalEncoding: linearPCM
samplingRate: 32
quantization: 64
numberOfTracks: 1
recordingQuality: medium
compressionInfo [ComponentId=‘clarin.eu:cr1:c_1360230992165’]:
compression: true
compressionName: mpg
corpusPartGeneralInfo [ComponentId=‘clarin.eu:cr1:c_1407745711882’]:
personSourceSetInfo [ComponentId=‘clarin.eu:cr1:c_1360931019775’]:
numberOfPersons: 823
ageOfPersons: teenager
ageOfPersons: adult
ageOfPersons: elderly
ageRangeStart: 15
ageRangeEnd: 90
sexOfPersons: mixed
originOfPersons: native
dialectAccentOfPersons: Dialects from Norway, Sweden, Denmark, the Faroe Islands, Iceland and Älvdalen.
geographicDistributionOfPersons: Norway, Sweden, Denmark, the Faroe Islands, Iceland and Älvdalen
lingualityInfo [ComponentId=‘clarin.eu:cr1:c_1355150532313’]:
lingualityType: multilingual
multilingualityType: other
multilingualityTypeDetails: Interviews and conversations in 5 Scandinavian languages. Can be translated into English with Google Translate.
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: no
languageName: Norwegian
sizePerLanguage [ComponentId=‘clarin.eu:cr1:c_1447674760349’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 2 187 046
sizeUnit: words
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: Dialects from 163 places in Norway, 564 informants
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: sv
languageName: Swedish (Övdalian included)
sizePerLanguage [ComponentId=‘clarin.eu:cr1:c_1447674760349’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 307 861
sizeUnit: words
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: Dialects from 38 places in Sweden, 126 informants
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: da
languageName: Danish
sizePerLanguage [ComponentId=‘clarin.eu:cr1:c_1447674760349’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 211 266
sizeUnit: words
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: Dialects from 15 places in Denmark, 81 informants
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: is
languageName: Icelandic
sizePerLanguage [ComponentId=‘clarin.eu:cr1:c_1447674760349’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 61 294
sizeUnit: words
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: Dialects from 7 places in Iceland, 32 informants
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: fo
languageName: Faroese
sizePerLanguage [ComponentId=‘clarin.eu:cr1:c_1447674760349’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 62 411
sizeUnit: words
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: Dialects from 5 places in the Faroe Islands, 20 informants
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: nb
languageName: Norwegian Bokmål
modalityInfo [ComponentId=‘clarin.eu:cr1:c_1447674760356’]:
modalityType: spokenLanguage
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 2 829 878
sizeUnit: words
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: morphosyntacticAnnotation-posTagging
annotatedElements: other
segmentationLevel: word
annotationFormat: See http://www.tekstlab.uio.no/nota/scandiasyn/tagging.html for tagging of the five languages
tagset: See http://www.tekstlab.uio.no/nota/scandiasyn/tagging.html for tagging of the five languages
annotationMode: automatic
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: speechAnnotation-phoneticTranscription
annotationManualUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532325’]:
role: annotationManual
documentUnstructured: Norwegian and Övdalian have phonetic transcriptions, see http://www.tekstlab.uio.no/nota/scandiasyn/transcription.html
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: speechAnnotation-orthographicTranscription
annotationManualUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532325’]:
role: annotationManual
documentUnstructured: All languages are orthographically transcribed, see http://www.tekstlab.uio.no/nota/scandiasyn/transcription.html
annotationTool [ComponentId=‘clarin.eu:cr1:c_1355150532326’]:
targetResourceNameURI: Transcriber (http://trans.sourceforge.net/en/presentation.php)
classificationInfo [ComponentId=‘clarin.eu:cr1:c_1403588862809’]:
genreInfo [ComponentId=‘clarin.eu:cr1:c_1407745711877’]:
genreType: speechGenre
genre: informal
unstandardisedGenre: conversations
classificationInfo [ComponentId=‘clarin.eu:cr1:c_1403588862809’]:
genreInfo [ComponentId=‘clarin.eu:cr1:c_1407745711877’]:
genreType: speechGenre
genre: semi formal
unstandardisedGenre: interviews
timeCoverageInfo [ComponentId=‘clarin.eu:cr1:c_1447674760358’]:
timeCoverage: 1951 - 2014
geographicCoverageInfo [ComponentId=‘clarin.eu:cr1:c_1447674760357’]:
geographicCoverage: Norway, Sweden, Denmark, the Faroe Islands, Iceland and Älvdalen
recordingInfo [ComponentId=‘clarin.eu:cr1:c_1426673949970’]:
recordingDeviceType: tapeVHS
recordingDeviceType: other
recordingEnvironment: office
recordingEnvironment: closedPublicPlace
recordingEnvironment: conferenceRoom
recordingEnvironment: lectureRoom
recordingEnvironment: other
captureInfo [ComponentId=‘clarin.eu:cr1:c_1407745712025’]:
capturingDeviceType: closeTalkMicrophone
capturingDeviceType: camera
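
Consistency check (illustrative, not part of the CMDI record): the per-language word counts and per-country informant counts stated in this record can be summed and compared with the stated totals (size: 2 829 878 words; numberOfPersons: 823). The sketch below is a minimal Python example under the assumption that the figures are copied correctly from the record; the variable names and the helper function are hypothetical and chosen only for illustration.

```python
# Figures copied by hand from the languageInfo and audioContentInfo
# entries of this record; the names used here are illustrative only.
words_per_language = {
    "Norwegian": 2_187_046,
    "Swedish (Övdalian included)": 307_861,
    "Danish": 211_266,
    "Icelandic": 61_294,
    "Faroese": 62_411,
}

informants_per_country = {
    "Norway": 126 + 438,   # old Målførearkivet recordings + new recordings
    "Sweden": 109 + 17,    # Sweden + Älvdalen
    "Denmark": 81,
    "Iceland": 32,
    "Faroe Islands": 20,
}


def total(counts):
    """Return the sum of all values in a name -> count mapping."""
    return sum(counts.values())


total_words = total(words_per_language)
total_informants = total(informants_per_country)

# Compare against the totals stated in the record.
print(total_words == 2_829_878)
print(total_informants == 823)
```

Both comparisons hold for the figures given in this record, so the per-language and per-country breakdowns agree with the stated totals.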