CMDI 1.1. Metadata
Header
MdCreator: Kristin Hagen
MdCreationDate: 2015-03-04
MdSelfLink:
MdProfile: clarin.eu:cr1:p_1407745711925
MdCollectionDisplayName: Clarino - Textlab
Resources
ResourceProxyList:
JournalFileProxyList:
ResourceRelationList:
IsPartOfList:
Components
corpusProfile:
resourceCommonInfo [ComponentId=‘clarin.eu:cr1:c_1396012485126’]:
resourceType: corpus
identificationInfo [ComponentId=‘clarin.eu:cr1:c_1396012485125’]:
resourceName [xml:lang=‘nb’]: Amerikanordisk talespråkskorpus
resourceName [xml:lang=‘en’]: Corpus of American Nordic Speech
description [xml:lang=‘en’]: Corpus of American Nordic Speech (CANS) is a small speech corpus with informants from the USA and Canada speaking Norwegian and Swedish. Most of the informants learnt to speak their Nordic language as children at home.
CANS contains both conversations and interviews. The transcriptions are both phonetic and orthographic and are linked to audio and video.
The corpus will be extended with both new and older recordings and transcriptions.
resourceShortName: CANS
url: http://www.tekstlab.uio.no/norskiamerika/english/index.html
url: https://sites.google.com/site/svenskaniamerika/home/english
PID: http://hdl.handle.net/11538/0000-0005-E7C9-4
distributionInfo [ComponentId=‘clarin.eu:cr1:c_1396012485124’]:
licenceInfo [ComponentId=‘clarin.eu:cr1:c_1396012485158’]:
userCategory: Academic
distributionAccessMedium: accessibleThroughInterface
executionLocation: http://www.tekstlab.uio.no/norskiamerika/english/index.html
licence [ComponentId=‘clarin.eu:cr1:c_1447674760330’]:
licenceFamily: CLARIN
licenceName: CLARIN_ACA-NC-LOC-PRIV-ND-*
licenceURL: https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&PRIV=1&NORED=1&ND=1
conditionsOfUse: *
conditionsOfUse: BY
conditionsOfUse: ID
conditionsOfUse: LOC
conditionsOfUse: NC
conditionsOfUse: ND
conditionsOfUse: NORED
conditionsOfUse: PRIV
nonStandardConditionsOfUse: The corpus has audio and video recordings classified as personal data. In agreement with NSD, the Data Protection Official in Norway, the corpus is accessible only through Glossa, a search and post-processing tool developed by the Text Laboratory.
The video and audio excerpts provided by the search interface may not be shown in public without an agreement with the Text Laboratory.
Please note that every individual researcher is responsible for treating the participants in the corpus with respect and sincerity. Furthermore, the participants must be kept anonymous in every published paper or other output.
licensor:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
distributionRightsHolder:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
contact:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
metadataInfo [ComponentId=‘clarin.eu:cr1:c_1407745711922’]:
metadataCreationDate: 2015-03-04
metadataLastDateUpdated: 2017-10-13
metadataCreator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: person
personInfo [ComponentId=‘clarin.eu:cr1:c_1396012485192’]:
surname: Hagen
givenName: Kristin
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: kristin.hagen@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
versionInfo [ComponentId=‘clarin.eu:cr1:c_1430905751648’]:
version: version 2
updateFrequency: The corpus will be extended with both new and older recordings and transcriptions.
validationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711923’]:
validated: true
validationType: content
validationMode: manual
validationModeDetails: The transcriptions are proofread against the audio files.
validationExtent: full
validator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
resourceDocumentationInfo [ComponentId=‘clarin.eu:cr1:c_1355150532301’]:
documentationUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532302’]:
role: documentation
documentUnstructured: http://www.tekstlab.uio.no/norskiamerika/english/index.html
resourceCreationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711921’]:
creationStartDate: 2010-01-01
resourceCreator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: person
personInfo [ComponentId=‘clarin.eu:cr1:c_1396012485192’]:
surname: Larsson
givenName: Ida
sex: female
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: ida.larsson@iln.uio.no
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘en’]: Norwegian in America
projectShortName [xml:lang=‘en’]: NorAmDiaSyn
fundingType: nationalFunds
funder: The Research Council of Norway
fundingCountry: Norway
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘en’]: Norwegian in America
projectShortName [xml:lang=‘en’]: NorAmDiaSyn
fundingType: other
funder: Department of Linguistics and Scandinavian Studies, University of Tromsø (through Merete Anderssen and Marit Westergaard)
fundingCountry: Norway
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘en’]: Norwegian in America
projectShortName: NorAmDiaSyn
fundingType: ownFunds
funder: The Text Laboratory
fundingCountry: Norway
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘en’]: Language Infrastructure made Accessible
projectShortName [xml:lang=‘en’]: LIA
url: http://www.hf.uio.no/iln/english/research/projects/language-infrastructure-made-accessible/index.html
fundingType: nationalFunds
funder: The Research Council of Norway
fundingCountry: Norway
projectStartDate: 2014-04-01
projectEndDate: 2019-04-01
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘en’]: Swedish in America
projectShortName: Swedish in America
url: https://sites.google.com/site/svenskaniamerika/home/english
fundingType: nationalFunds
funder: Torsten Söderbergs stiftelse
fundingCountry: Sweden
projectStartDate: 2010-01-01
projectEndDate: 2011-12-31
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘en’]: Swedish in America
projectShortName: Swedish in America
fundingType: ownFunds
funder: Department of Linguistics and Scandinavian Studies, UiO
fundingCountry: Norway
projectStartDate: 2015-01-01
projectEndDate: 2017-08-31
corpusInfo [ComponentId=‘clarin.eu:cr1:c_1407745711878’]:
corpusType: Multilingual Corpus
corpusPartInfo [ComponentId=‘clarin.eu:cr1:c_1407745711885’]:
mediaType: text
corpusTextInfo [ComponentId=‘clarin.eu:cr1:c_1396012485188’]:
textFormatInfo [ComponentId=‘clarin.eu:cr1:c_1427452477072’]:
mimeType: .txt
sizePerTextFormat [ComponentId=‘clarin.eu:cr1:c_1447674760342’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 243 732
sizeUnit: words
characterEncodingInfo [ComponentId=‘clarin.eu:cr1:c_1447674760355’]:
characterEncoding: utf-8
corpusPartInfo [ComponentId=‘clarin.eu:cr1:c_1407745711885’]:
mediaType: video
corpusVideoInfo [ComponentId=‘clarin.eu:cr1:c_1407745711880’]:
videoContentInfo [ComponentId=‘clarin.eu:cr1:c_1360931019779’]:
typeOfVideoContent: Interviews and conversations between American Norwegians
textIncludedInVideo: none
settingInfo [ComponentId=‘clarin.eu:cr1:c_1360230992162’]:
naturality: spontaneous
conversationalType: multilogue
audience: some
interactivity: overlapping
videoFormatInfo [ComponentId=‘clarin.eu:cr1:c_1427452477073’]:
mimeType: video in streaming format mp4 available through Glossa
frameRate: 25
resolutionInfo [ComponentId=‘clarin.eu:cr1:c_1360931019784’]:
sizeWidth: 400
sizeHeight: 300
resolutionStandard: HD.720
compressionInfo [ComponentId=‘clarin.eu:cr1:c_1360230992165’]:
compression: true
compressionName: mpg
corpusPartInfo [ComponentId=‘clarin.eu:cr1:c_1407745711885’]:
mediaType: audio
corpusAudioInfo [ComponentId=‘clarin.eu:cr1:c_1404130561236’]:
audioSizeInfo [ComponentId=‘clarin.eu:cr1:c_1360230992160’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: approx 10 GB
sizeUnit: gb
audioContentInfo [ComponentId=‘clarin.eu:cr1:c_1360230992161’]:
textualDescription: Interviews and conversations between American Norwegians and American Swedes
settingInfo [ComponentId=‘clarin.eu:cr1:c_1360230992162’]:
naturality: spontaneous
conversationalType: dialogue
audience: some
interactivity: overlapping
interaction: Two scenarios: a semiformal interview between a research assistant or researcher and an informant, and a free conversation between two informants.
audioFormatInfo [ComponentId=‘clarin.eu:cr1:c_1427452477070’]:
mimeType: wav and mp4
signalEncoding: linearPCM
samplingRate: 32
quantization: 64
numberOfTracks: 1
recordingQuality: medium
compressionInfo [ComponentId=‘clarin.eu:cr1:c_1360230992165’]:
compression: true
compressionName: mp3
corpusPartGeneralInfo [ComponentId=‘clarin.eu:cr1:c_1407745711882’]:
personSourceSetInfo [ComponentId=‘clarin.eu:cr1:c_1360931019775’]:
numberOfPersons: 69
ageOfPersons: elderly
ageRangeStart: 50
ageRangeEnd: 98
sexOfPersons: mixed
originOfPersons: native
dialectAccentOfPersons: American-Norwegian and American-Swedish
geographicDistributionOfPersons: USA and Canada
lingualityInfo [ComponentId=‘clarin.eu:cr1:c_1355150532313’]:
lingualityType: bilingual
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: Nb
languageName: Norwegian Bokmål
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: American Norwegian dialects
sizePerLanguageVariety [ComponentId=‘clarin.eu:cr1:c_1447674760341’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 197 889
sizeUnit: words
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: Sv
languageName: Swedish
sizePerLanguage [ComponentId=‘clarin.eu:cr1:c_1447674760349’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 45 843
sizeUnit: words
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: American-Swedish dialects
modalityInfo [ComponentId=‘clarin.eu:cr1:c_1447674760356’]:
modalityType: spokenLanguage
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 243 732
sizeUnit: words
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: morphosyntacticAnnotation-posTagging
annotatedElements: other
segmentationLevel: word
tagset: POS tagset created for the statistical NoTa tagger, based on the tagset of the Oslo-Bergen Tagger.
tagsetLanguageId: nb
tagsetLanguageName: Norwegian Bokmål
theoreticModel: TreeTagger
annotationMode: automatic
annotationManualStructured [ComponentId=‘clarin.eu:cr1:c_1361876010647’]:
role: annotationManual
documentInfo [ComponentId=‘clarin.eu:cr1:c_1353678848788’]:
documentType: manual
title [xml:lang=‘nb’]: NoTa-taggeren: TAGGEVEILEDNING
author: Åshild Søfteland
year: 2007
url: http://www.tekstlab.uio.no/nota/oslo/Taggeveiledning2.pdf
documentLanguageName: Norwegian Bokmål
documentLanguageId: nb
annotationManualStructured [ComponentId=‘clarin.eu:cr1:c_1361876010647’]:
role: annotationManual
documentInfo [ComponentId=‘clarin.eu:cr1:c_1353678848788’]:
documentType: article
title [xml:lang=‘en’]: Tagging a Norwegian Speech Corpus
author: Anders Nøklestad and Åshild Søfteland
editor: Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek, Mare Koit
year: 2007
bookTitle: Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007
pages: 245–248
conference: Nodalida 2007
documentLanguageName: English
documentLanguageId: en
annotationManualStructured [ComponentId=‘clarin.eu:cr1:c_1361876010647’]:
role: annotationManual
documentInfo [ComponentId=‘clarin.eu:cr1:c_1353678848788’]:
documentType: article
title [xml:lang=‘nb’]: Manuell morfologisk tagging av NoTa-materialet med støtte fra en statistisk tagger.
author: Åshild Søfteland and Anders Nøklestad
editor: Janne Bondi Johannessen and Kristin Hagen
year: 2008
publisher: Novus forlag
bookTitle: Språk i Oslo. Ny forskning omkring talespråk
pages: 226–234
ISBN: 978-82-7099-471-7
documentLanguageName: Norwegian
documentLanguageId: nb
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: speechAnnotation-phoneticTranscription
segmentationLevel: word
annotationMode: manual
annotationManualUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532325’]:
role: annotationManual
documentUnstructured: http://www.tekstlab.uio.no/norskiamerika/english/index.html
annotationTool [ComponentId=‘clarin.eu:cr1:c_1355150532326’]:
targetResourceNameURI: Transcriber (http://trans.sourceforge.net/en/presentation.php)
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: morphosyntacticAnnotation-posTagging
annotatedElements: other
segmentationLevel: word
tagset: PAROLE tag set customized for Nordic Dialect Corpus
tagsetLanguageName: Swedish
theoreticModel: TnT
annotationMode: automatic
annotationManualUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532325’]:
role: annotationManual
documentUnstructured: See documentation on the Nordic Dialect Corpus web page: http://www.tekstlab.uio.no/nota/scandiasyn/tagging.html
classificationInfo [ComponentId=‘clarin.eu:cr1:c_1403588862809’]:
genreInfo [ComponentId=‘clarin.eu:cr1:c_1407745711877’]:
genreType: speechGenre
genre: informal
timeCoverageInfo [ComponentId=‘clarin.eu:cr1:c_1447674760358’]:
timeCoverage: Interviews and conversations from 2010 to 2015
geographicCoverageInfo [ComponentId=‘clarin.eu:cr1:c_1447674760357’]:
geographicCoverage: Informants from 25 places in the USA and Canada speaking Norwegian and Swedish
recordingInfo [ComponentId=‘clarin.eu:cr1:c_1426673949970’]:
recordingDeviceType: hardDisk
recordingEnvironment: office
recordingEnvironment: closedPublicPlace
recordingEnvironment: conferenceRoom
recordingEnvironment: lectureRoom
recordingEnvironment: other
recorderActor:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
captureInfo [ComponentId=‘clarin.eu:cr1:c_1407745712025’]:
capturingDeviceType: closeTalkMicrophone
capturingDeviceType: camera
creationInfo [ComponentId=‘clarin.eu:cr1:c_1360230992154’]:
creationMode: manual