Update TTS and ASR examples, with newest versions from Katri, both SME and SMJ
snomos committed Feb 28, 2024
1 parent 6f0ecf1 commit 66dcf1c
Showing 7 changed files with 71 additions and 10 deletions.
Binary file added public/1_Dálla_FP_660univnet-4416.wav
Binary file added public/80_99_north_s_00_001-4499.wav
Binary file removed public/referansa.wav
81 changes: 71 additions & 10 deletions slides.md
@@ -863,13 +863,12 @@ Under development (release this summer):
### Grammar Checker Demo
-->
---
layout: two-cols
---
## Text-to-speech (TTS)
<br/>
<br/>
<v-clicks>
* Commercial, closed source since 2014 — North Sámi
@@ -894,38 +893,100 @@ Test sample, North Sámi, 5h recordings:
</div>
::right::
<div v-after>
Test sample, old closed-source synthesis:
> Sámediggi lea Suoma sápmelaččaid alimus politihkalaš orgána, mii ovddasta sápmelaččaid sihke riikkadási ja riikkaidgaskasaš oktavuođain.
<audio controls="controls">
<source type="audio/wav" src="/5_Sámediggi_acapela-test-sentences-1-yle-4418.wav"/>
<p>Your browser does not support the audio element.</p>
</audio>
Same text with new, ML-based synthesis, ca 10 hours:
<audio controls="controls">
<source type="audio/wav" src="/5_Sámediggi_FP_F_470univnet_spkr2-4371.wav"/>
<p>Your browser does not support the audio element.</p>
</audio>
And finally, SMJ synthesis sample:
> – Dálla gå dáhtámasjijnaj giella ållåsit sijdajda tjágŋá, de sijda julevsámegielak ariednán aj ájteduvvi.
<audio controls="controls">
<source type="audio/wav" src="/1_Dálla_FP_660univnet-4416.wav"/>
<p>Your browser does not support the audio element.</p>
</audio>
</div>
---
layout: two-cols
---
## Automatic speech recognition (ASR)
## Automatic speech recognition (ASR) - North Sámi
<br/>
<br/>
* experiments w/ only 35 hours of transcribed speech
* experiments w/ only 35 (50?) hours of transcribed speech
* Whisper model
* very promising, given the starting point
* first target use:
* subtitling
::right::
### Example
### Example (from Norw. Sámi Parliament discussions)
> ja Norgga sámit riikkasearvvis mun maiddái geahčen dien ee dien ee total reanskkaskáhpa ja oidnen dan ahte ođđamárket diet buolbmát várggáid dá Várjjat guovllus
<audio controls="controls">
<source type="audio/wav" src="/80_99_north_s_00_001-4499.wav"/>
<p>Your browser does not support the audio element.</p>
</audio>
Generated transcript:
> Ja Norgga __Sámiid__ Riikkasearvvis. __Eh__ mun maiddái __gehččen__ dien eh dien __dien__ eh __[totalregnskap]__ ja oidnen dan ahte __ovdamearkka__ __dihte__ Buolbmát Várggáid __dahje__ Várjjat guovllus
- Errors are bold-faced
- upper/lower case errors are not marked
---
layout: two-cols
---
## Automatic speech recognition (ASR) - Lule Sámi
<br/>
<br/>
* experiments w/ just 20+ hours of transcribed speech
* also Whisper model
* surprisingly good
::right::
### Example (from Norw. public broadcaster NRK)
> Ja de bosui davvebiegga nu garrasit go sáhtii, muhto mađi eanet son bosui, dađi čavgadeappot vánddardeaddji giesai jáhka iežas birra. De beaivváš báitigođii hui lieggasit, nu lieggasit ahte vánddardeaddji ovttatmanos nuolai jáhka. Ja nie šattai davvebiegga mieđihit ahte beaivváš lei sudnos gievrrat.
> Ja mån lav badjánam jåhkå gasska tjielden danna muv mánnávuohta ja muv nuorravuohta årrum.
<audio controls="controls">
<source type="audio/wav" src="/referansa.wav"/>
<source type="audio/wav" src="/NRK_Dan_i_diede_jus_i_gatjada_Dalla_ja_dalloj_OF1_002-4726.wav"/>
<p>Your browser does not support the audio element.</p>
</audio>
Generated transcript:
> ja de bosui davvebiegga nu garrasit go sáhtii muhto __mađe eanas__ son bosui dađi čávgadeappot vánddardeaddji __geasái__ jahka iežas birra De beaivváš __báikegođii__ hui __lieggasiid__ nu lieggasiid ahte vánddardeaddji __ovttatmánus__ nuolai __jahka__ ja nie šattai davvebiegga __međihit__ ahte beaivváš lei __sutnos kievrrat__
> Ja mån lav badjánam __Jåhkågasska__ tjielden, danna __l__ muv mánnávuohta ja muv nuorravuohta årrum.
- Errors are bold-faced
- punctuation is lost
- upper/lower case errors are not marked
---