Parallel Annotation of Speech and Text
Revision as of 13:45, 19 March 2010
This page is under construction
Project Description
The goal of this short pilot was parallel sound and text annotation. The study was conducted by Professor Wim van Dommelen and Associate Professor Dorothee Beermann at the Institute of Languages and Communication Studies at the Norwegian University of Science and Technology. The scientific assistant for the project was Asger Hagerup. The project was funded by the SSTL.
The pilot investigated how to integrate presentations of linguistically annotated audio and text material, combining Praat and TypeCraft.
Praat is signal analysis software developed by Paul Boersma and David Weenink from the University of Amsterdam. It is widely used for the annotation of sound objects. For the present study we took advantage of the fact that Praat annotation data resides in a TextGrid object that exists separately from the sound object. Annotated tiers make it easy to reference data across applications. At present our sound signal representations are static and selective: each focuses on one selected feature to illustrate interesting correlations across phonetic and linguistic categories.
Description of the material
For our study we selected 10 sentences from the phonetic database of the Sound to Sense project.
Sentences 1 to 3
Speaker dialect: Bergen
| Word | Morphemes | Gloss | POS |
| Jeg | e | 1SG | PN |
| ser | se:-r | see-PRES | V |
| bildet | bild-e | picture-DEFSG | N |
| kan | kan: | canPRES | V |
| du | ʉ | 2SG | CL |
| si | si: | sayINF | V |
| litt | lit: | a.little | ADVm |
| på | po | onDIR | PREP |
| skrått | skro:-t | diagonal-ADJ>ADV | ADVm |
| ned | ned | downDIR | ADVm |
| ovenifra | ovenifra | from.aboveDIRSRC | ADVm |
File download for viewing in Praat (Downloading Help):
| Word | Morphemes | Gloss | POS |
| Det | de | 3SGNEUT | PN |
| dekker | dek:-er | cover-PRES | V |
| omtrent | umtrent | approximately | ADVm |
| hele | he:l-e | whole-DEF | ADJ |
| det | de | DEFSGNEUT | ART |
| venstre | venstre | left | ADVm |
| mest | mest | mostSUP | ADJ |
| altså | aso | that.isDM | ADVm |
| venstreste | venstre-st-e | left-SUPMU-DEF | ADJ |
| kortsiden | kort-sid-en | short-side-DEFSG | N |
File download for Praat (Downloading Help):
| Word | Morphemes | Gloss | POS |
| Hun | hun | 3SGFEM | PN |
| står | sto:-r | stand-PRES | V |
| med | med | withMNR | PREP |
| ryggen | ryɡ:-en | back-DEFSG | N |
| mot | mut | againstDIR | PREP |
| veggen | veɡ:-en | wall-DEFSG | N |
| opp | up | upDIRMU | PREP |
| og | o | and | CONJC |
| ser | se:-r | see-PRES | V |
| på | po | atDIR | PREP |
| han | han | 3SGMASC | PN |
| som | som | | PNrel |
| skal | skal: | shallPRES | V |
| kaste | kast-e | throw-INF | V |
| ballen | bal:-en | ball-DEFSG | N |
| som | som | | PNrel |
| står | sto:-r | stand-PRES | V |
| utenfor | ʉtenfor | outside | ADVm |
| og | o | and | CONJC |
| peker | pe:k-er | point-PRES | V |
| på | po | atDIR | PREP |
| boksene | boks-ene | box-DEFPL | N |
File download for Praat (Downloading Help):
Speaker dialect: Trondheim
Parallel Annotation of Speech and Text - Part 2
Speaker dialect: Eastern Norway
Parallel Annotation of Speech and Text - Part 3
About the TextGrid files
The TextGrid files should be opened together with the matching sound files for viewing in Praat. Each TextGrid file consists of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (showing underlying segments), and 'Note' (showing the surface realisation in IPA symbols, along with other notes).
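As an illustration of how the three tiers could be read outside Praat, here is a minimal Python sketch. It assumes the TextGrid is stored in Praat's long text format and contains only interval tiers. The function name `read_interval_tiers`, the sample labels, and the time values are hypothetical and are not taken from the project files.

```python
import re

# Hypothetical sample in Praat's long text format (labels and times are made up).
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 0.5
tiers? <exists>
size = 3
item []:
    item [1]:
        class = "IntervalTier"
        name = "Word"
        xmin = 0
        xmax = 0.5
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = "ser"
    item [2]:
        class = "IntervalTier"
        name = "Phoneme"
        xmin = 0
        xmax = 0.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.4
            text = "se:"
        intervals [2]:
            xmin = 0.4
            xmax = 0.5
            text = "r"
    item [3]:
        class = "IntervalTier"
        name = "Note"
        xmin = 0
        xmax = 0.5
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = ""
'''

# Header of one interval tier: captures the tier name.
TIER_RE = re.compile(r'item \[\d+\]:\s*class = "IntervalTier"\s*name = "([^"]*)"')
# One interval: start time, end time, label (Praat doubles quotes inside labels).
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*xmin = ([\d.]+)\s*xmax = ([\d.]+)\s*text = "((?:[^"]|"")*)"'
)


def read_interval_tiers(text):
    """Return {tier name: [(start, end, label), ...]} for each interval tier."""
    tiers = {}
    headers = list(TIER_RE.finditer(text))
    for i, header in enumerate(headers):
        # A tier's body runs from its header to the next tier header (or end of text).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        body = text[header.end():end]
        tiers[header.group(1)] = [
            (float(start), float(stop), label.replace('""', '"'))
            for start, stop, label in INTERVAL_RE.findall(body)
        ]
    return tiers


if __name__ == "__main__":
    for name, intervals in read_interval_tiers(SAMPLE).items():
        print(name, intervals)
```

To apply the sketch to a downloaded file, read the file into a string first. Files containing IPA symbols may be stored as UTF-16, so the encoding passed to `open` may need adjusting.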
Here is a list of glosses used in the 'Note' tier:
Phonology/Phonetics:
BrV = Segment realised with breathy voice
CrV = Segment realised with creaky voice
DV = Underlying voiced segment realised devoiced
EPN = Epenthesis
RD = Reduction of segment (e.g. corner vowel realised as schwa or plosive as fricative).
V = Underlying non-voiced segment realised voiced
Morphophonology/Syntax:
CL = Clitic
Other:
ERR = The speaker errs and corrects himself
HES = (Audible) hesitation from speaker
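The gloss list above can be kept as a small lookup table when processing 'Note' tier labels programmatically. The following Python sketch is one possible way to do this. The names `NOTE_GLOSSES` and `expand_note` are hypothetical, and the assumption that several codes in one label are separated by spaces, commas, or semicolons is not confirmed by the project files.

```python
import re

# Codes and descriptions copied from the gloss list for the 'Note' tier.
NOTE_GLOSSES = {
    "BrV": "Segment realised with breathy voice",
    "CrV": "Segment realised with creaky voice",
    "DV": "Underlying voiced segment realised devoiced",
    "EPN": "Epenthesis",
    "RD": "Reduction of segment",
    "V": "Underlying non-voiced segment realised voiced",
    "CL": "Clitic",
    "ERR": "The speaker errs and corrects himself",
    "HES": "(Audible) hesitation from speaker",
}


def expand_note(label):
    """Return (code, description) pairs for every known code in a Note label.

    Assumes codes are separated by whitespace, commas, or semicolons.
    Unknown tokens (for example IPA symbols) are skipped.
    """
    tokens = re.split(r"[\s,;]+", label.strip())
    return [(tok, NOTE_GLOSSES[tok]) for tok in tokens if tok in NOTE_GLOSSES]


if __name__ == "__main__":
    for code, description in expand_note("DV, RD"):
        print(code, "=", description)
```

Note that `V` in the 'Note' tier marks voicing, whereas `V` in the interlinear tables above is the part-of-speech tag for verbs, so the lookup should only be applied to 'Note' tier labels.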
Downloading Help
Clicking the file links called Sound and TextGrid opens the files in a separate browser window.
Go to *FILE*, right-click and select *Save this Page as*.
You can now save the file to a location of your choice in your home directory.