<a href="https://smidgeo.com/notes/deathmtn">deathmtn</a>

10/13/2024, 3:35:11 PM

Vosk's basic example took forever and transcribed nothing. Whisper is getting stuff, but the "medium"-sized model has taken 1.5 hours for 30 minutes of audio, which is about three times the length of the recording. It did say it would take double the time, so I guess that's sort of in the ballpark?

Anyway, the future is not as futuristic as one might expect.