#speechtech

Speech arises in the mind. Therefore, speech using only thought is not a big surprise. Brain-computer interfaces (BCIs) are decoding neural activity, translating thoughts directly into words. This breakthrough could revolutionize communication for those with speech impairments, offering new hope for expressing thoughts without vocalization. #Neuroscience #BCI #Innovation #SpeechTech

thedebrief.org/speech-using-on

#SpeechTech #SummerSchool 2024: Charting new futures

📅 June 3-7 (register before April 15)
📍 Fryslân, the Netherlands
🔗 rug.nl/education/summer-winter

📣 Join this 5-day journey for the innovators, the dreamers, and the hands-on learners. Dive into a dynamic curriculum that blends the arts, ethics, design + tech. Engage in stimulating discussions and thought-provoking lectures. Experience a unique convergence of human-centered design and futuring!

University of Groningen · Speech Technology: Charting New Futures
Explore, Create, Transform: Dive into the world of speech tech in our 5-day Speech Technology summer school – where innovation meets ethics and...

A conference on speech technology starts tomorrow… One can imagine algorithms, and the ‘AI’ variety in particular, being of special interest for translating between speech, text, languages, metadata and other pseudo-semantic instances. I will be arguing the risks, and offering a solution for human behavioural relations, pertaining to the use of what so-called ‘Large Language Models’ (#LLM) produce. Throw away after use!

There will be online streaming

lithme.eu/conference/
#AI #NLP #SpeechTech #Behaviour

lithme.eu · 3rd International Conference ‘Language in the Human-Machine Era’ – LITHME

Discover the Future of Speech Technology!

🤖 Speech Tech Summer School: robots, video games, voice cloning ... and more!
🗓 May 14-17
📍 Campus Fryslân - University of Groningen, the Netherlands

From robots 🤖 to video games 👾 to speech recognition 🗯, our hands-on summer school will feature workshops, plus speakers from Google, Respeecher, and more.

⏰ Space is limited!
rug.nl/education/summer-winter

#voicetechnology #speechtechnology #speechtech #summerschool #speechrecognition #voiceai

Replied to Nick Fisher

@nickfisherau Nice. Did you publish any details about your alignment code?
For post-editing, maybe #Praat is easier for #speechtech?
I recently built a phone alignment tool as a gift to Czech #phonetics. To minimize dependencies (and install hassle), I ended up going without anything like Kaldi, using just very basic #PyTorch to train the neural-network acoustic model and then to do the alignment. I guess starting from scratch really was easier than taking apart some big thing like Whisper.
github.com/vaclavhanzl/prak

GitHub · vaclavhanzl/prak: Czech phonetic alignment tool
Czech phonetic alignment tool. Contribute to vaclavhanzl/prak development by creating an account on GitHub.
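
For readers unfamiliar with the alignment step mentioned in the post above, here is a minimal sketch of forced alignment by dynamic programming. It is not the code from vaclavhanzl/prak; the function name `force_align`, the input layout (a frames-by-phones table of log-probabilities, such as an acoustic model might emit), and the rule that every phone gets at least one frame are all assumptions made for illustration.

```python
import math


def force_align(log_probs, phones):
    """Align a known phone sequence to frames (hypothetical helper).

    log_probs[t][p] is the log-probability of phone id p at frame t.
    phones is the expected phone id sequence, in order.
    Returns (phone, start_frame, end_frame) tuples, end exclusive.
    """
    T, N = len(log_probs), len(phones)
    if N == 0 or T < N:
        raise ValueError("need at least one frame per phone")
    NEG = -math.inf
    # score[t][i]: best score of a path that is inside phone i at frame t
    score = [[NEG] * N for _ in range(T)]
    # advanced[t][i]: True if the best path entered phone i exactly at frame t
    advanced = [[False] * N for _ in range(T)]
    score[0][0] = log_probs[0][phones[0]]
    for t in range(1, T):
        for i in range(min(t + 1, N)):
            stay = score[t - 1][i]
            move = score[t - 1][i - 1] if i > 0 else NEG
            if move > stay:
                best, advanced[t][i] = move, True
            else:
                best = stay
            if best > NEG:
                score[t][i] = best + log_probs[t][phones[i]]
    # Backtrack from the last frame of the last phone to find boundaries.
    i = N - 1
    bounds = [T]
    for t in range(T - 1, 0, -1):
        if advanced[t][i]:
            bounds.append(t)
            i -= 1
    bounds.append(0)
    bounds.reverse()
    return [(phones[k], bounds[k], bounds[k + 1]) for k in range(N)]
```

With a typical 10 ms frame shift (again an assumption), multiplying the frame indices by 0.01 gives boundaries in seconds. A real tool would also need to handle silence and pronunciation variants, which this sketch ignores.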
Continued thread

I still need to manually check (and occasionally correct) the alignments though.

I wrote a couple of Python scripts/extensions for Audacity that load up the audio/labels so you can manually drag the handles to adjust the alignments. Works pretty well!

I know there are various web interfaces for this, but it seems you need to use them for your entire pipeline (ingest/labelling/export/indexing/etc.) or not at all.
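
The poster's scripts are not shown, so the following is only a sketch of one way to get alignments into and out of Audacity for manual correction. It relies on Audacity's plain-text label track format (one label per line: start time, end time, and label text, separated by tabs, with times in seconds). The function names, the `(label, start_frame, end_frame)` input layout, and the 10 ms frame shift are assumptions for illustration.

```python
def to_audacity_labels(segments, frame_shift=0.01):
    """Render (label, start_frame, end_frame) tuples as Audacity label text.

    frame_shift is the assumed duration of one frame in seconds.
    """
    lines = []
    for label, start, end in segments:
        lines.append(f"{start * frame_shift:.6f}\t{end * frame_shift:.6f}\t{label}")
    return "\n".join(lines) + "\n"


def parse_audacity_labels(text):
    """Read Audacity label text back into (label, start_sec, end_sec) tuples."""
    segments = []
    for line in text.splitlines():
        # Skip blanks and the backslash-prefixed lines that Audacity may
        # write for spectral (frequency range) selections.
        if not line.strip() or line.startswith("\\"):
            continue
        start, end, label = line.split("\t", 2)
        segments.append((label, float(start), float(end)))
    return segments
```

A file written this way can be loaded through Audacity's label import, adjusted by dragging the label handles, exported again, and read back with `parse_audacity_labels`.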

💡 Interesting read on how one of the biggest commercial players out there plans to use Mozilla Common Voice data to make speech AI more inclusive and open to more languages.

💬 Sounds idealistic, but Common Voice datasets are created by unpaid volunteers who donate hours and hours of their speech. Not sure whether I feel comfortable with that, tbh.

💭 Thoughts?

venturebeat.com/ai/nvidia-ente

VentureBeat · Nvidia takes on Meta and Google in the speech AI technology race
By Victor Dey

We'll dig deeper into OpenAI Whisper (openai.com/blog/whisper/) this weekend. I'll announce a date and time for the YT live stream here later.

Note: It's not a presentation but a session. You can also join in on the task/live stream via audio/video through Google Meet.

YT Link: youtube.com/@datadrivenbabe

How much do you know about the sub-field already?

OpenAI · Introducing Whisper
We’ve trained and are open-sourcing a neural net called Whisper that approaches human level robustness and accuracy on English speech recognition. Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask