#speechrecognition

Niavy :verified: :bearn:<p>A while ago, in a thread about open source apps, I got shot down for mentioning Whisper as an alternative to Google's speech synthesis, on the grounds that it supposedly belongs to OpenAI. Does that ring a bell for anyone?</p><p>For now I'm happily using SherpaTTS, but still, I'm curious.</p><p>On top of that, the description of Whisper+ seems to say the app can translate on the fly from the spoken language into English! <span class="h-card" translate="no"><a href="https://masto.bike/@Vive_Levant" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>Vive_Levant</span></a></span> </p><p><a href="https://masto.bike/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a><br><a href="https://masto.bike/tags/OpenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenAI</span></a><br><a href="https://masto.bike/tags/TTS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TTS</span></a><br><a href="https://masto.bike/tags/FOSS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FOSS</span></a></p>
Data Quine<p>Journal of Open Source Software: voice: A Comprehensive R Package for Audio Analysis <br>{voice}<br>"...a free, open-source toolkit designed to streamline audio analysis by integrating music theory and advanced computational techniques. It enables researchers to extract, summarize, and analyze voice data efficiently, supporting applications such as speech recognition, speaker identification, and mood inference..."</p><p><a href="https://joss.theoj.org/papers/10.21105/joss.08420" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">joss.theoj.org/papers/10.21105</span><span class="invisible">/joss.08420</span></a></p><p><a href="https://datasci.social/tags/RStats" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RStats</span></a> <a href="https://datasci.social/tags/Audio" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Audio</span></a> <a href="https://datasci.social/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a> <a href="https://datasci.social/tags/AudioAnalysis" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AudioAnalysis</span></a> <a href="https://datasci.social/tags/Speech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Speech</span></a></p>
Hacker News<p>Voxtral-Mini-3B-2507 – Open source speech understanding model</p><p><a href="https://huggingface.co/mistralai/Voxtral-Mini-3B-2507" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">huggingface.co/mistralai/Voxtr</span><span class="invisible">al-Mini-3B-2507</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/OpenSource" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenSource</span></a> <a href="https://mastodon.social/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/Models" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Models</span></a> <a href="https://mastodon.social/tags/MachineLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MachineLearning</span></a> <a href="https://mastodon.social/tags/VoxtralMini3B" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VoxtralMini3B</span></a> <a href="https://mastodon.social/tags/TechNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TechNews</span></a></p>
Ecologia Digital<p>"<a href="https://mato.social/tags/KarenHao" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>KarenHao</span></a> only really gets her teeth into this point in the book’s epilogue, “How the Empire Falls.” She takes inspiration from <a href="https://mato.social/tags/TeHiku" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TeHiku</span></a>, a <a href="https://mato.social/tags/M%C4%81ori" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Māori</span></a> AI <a href="https://mato.social/tags/speechrecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>speechrecognition</span></a> project. Te Hiku seeks to revitalize the <a href="https://mato.social/tags/te_reo" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>te_reo</span></a> language through putting archived audio tapes of te reo speakers into an AI model, teaching new generations of Māori.<br>The tech has been developed on consent and active participation from the Māori community, and it is only licensed to organizations that respect Māori values"</p>
Jeremy Kahn<p>I don't know why they call it vibe coding</p>
Debby<p><span class="h-card" translate="no"><a href="https://mastodon.social/@thelinuxEXP" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>thelinuxEXP</span></a></span> I really like Speech Note! It's a fantastic tool for quick and local voice transcription in multiple languages, created by <span class="h-card" translate="no"><a href="https://mastodon.social/@mkiol" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>mkiol</span></a></span> </p><p>It's incredibly handy for capturing thoughts on the go, conducting interviews, or making voice memos without worrying about language barriers. The app uses strictly locally running LLMs, and its ease of use makes it a standout choice for anyone needing offline transcription services.</p><p>I primarily use <a href="https://hear-me.social/tags/WhisperAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WhisperAI</span></a> for transcription and Piper for voice, but many other models are available as well. 
</p><p>It is available as a Flatpak and on GitHub: <a href="https://github.com/mkiol/dsnote" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">github.com/mkiol/dsnote</span><span class="invisible"></span></a> </p><p><a href="https://hear-me.social/tags/TTS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TTS</span></a> <a href="https://hear-me.social/tags/transcription" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>transcription</span></a> <a href="https://hear-me.social/tags/TextToSpeech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TextToSpeech</span></a> <a href="https://hear-me.social/tags/translator" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>translator</span></a> translation <a href="https://hear-me.social/tags/offline" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>offline</span></a> <a href="https://hear-me.social/tags/machinetranslation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>machinetranslation</span></a> <a href="https://hear-me.social/tags/sailfishos" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>sailfishos</span></a> <a href="https://hear-me.social/tags/SpeechSynthesis" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechSynthesis</span></a> <a href="https://hear-me.social/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a> <a href="https://hear-me.social/tags/speechtotext" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>speechtotext</span></a> <a href="https://hear-me.social/tags/nmt" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>nmt</span></a> <a href="https://hear-me.social/tags/linux" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>linux</span></a>-desktop <a 
href="https://hear-me.social/tags/stt" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>stt</span></a> <a href="https://hear-me.social/tags/asr" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>asr</span></a> <a href="https://hear-me.social/tags/flatpak" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>flatpak</span></a>-applications <a href="https://hear-me.social/tags/SpeechNote" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechNote</span></a></p>
ResearchBuzz: Firehose<p>Gallaudet News: Gallaudet experts drive accessibility of speech tech for deaf voices. “Some people use their voices to control tech, from cell phones and remote controls to home appliances and in transportation. Voice command capabilities are made possible through training AI and machine learning. The Speech Accessibility Project is creating datasets of more diverse speech patterns, which […]</p><p><a href="https://rbfirehose.com/2025/06/27/gallaudet-news-gallaudet-experts-drive-accessibility-of-speech-tech-for-deaf-voices/" class="" rel="nofollow noopener" target="_blank">https://rbfirehose.com/2025/06/27/gallaudet-news-gallaudet-experts-drive-accessibility-of-speech-tech-for-deaf-voices/</a></p>
Hacker News<p>DeepSpeech Is Discontinued</p><p><a href="https://github.com/mozilla/DeepSpeech" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">github.com/mozilla/DeepSpeech</span><span class="invisible"></span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/DeepSpeech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DeepSpeech</span></a> <a href="https://mastodon.social/tags/Discontinued" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Discontinued</span></a> <a href="https://mastodon.social/tags/Mozilla" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Mozilla</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a> <a href="https://mastodon.social/tags/MachineLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MachineLearning</span></a></p>
Taylor Arndt<p>Going live at 4 PM Central to build a real-time speech-to-text app using SwiftUI and iOS 26 APIs.<br>I’ll walk through everything—mic permissions, live transcription, and Apple’s speech recognition tools.<br>No delay, no post-processing. Just fast, accurate voice-to-text in SwiftUI.<br>Watch live: <a href="https://www.youtube.com/watch?v=vIqZq1UYBOA" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">youtube.com/watch?v=vIqZq1UYBO</span><span class="invisible">A</span></a><br>Hit “Notify Me” to join in.<br><a href="https://iosdev.space/tags/SwiftUI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SwiftUI</span></a> <a href="https://iosdev.space/tags/iOSDev" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>iOSDev</span></a> <a href="https://iosdev.space/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a> <a href="https://iosdev.space/tags/LiveCoding" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LiveCoding</span></a> <a href="https://iosdev.space/tags/Accessibility" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Accessibility</span></a></p>
PLOS Biology<p>Slow amplitude fluctuations in sounds, critical for <a href="https://fediscience.org/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a>, seem poorly represented in the <a href="https://fediscience.org/tags/brainstem" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>brainstem</span></a>. This study shows that overlooked intricacies of <a href="https://fediscience.org/tags/SpikeTiming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpikeTiming</span></a> represent these fluctuations, reconciling low-level neural processing with <a href="https://fediscience.org/tags/perception" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>perception</span></a> @plosbiology.org 🧪 <a href="https://plos.io/3FJ4adI" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">plos.io/3FJ4adI</span><span class="invisible"></span></a></p>
CITO Greenhouse<p>The Marvel of Auditory and Cognitive Networks Working Together in Your Brain</p><p><a href="https://mastodon.social/tags/AuditoryProcessing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AuditoryProcessing</span></a> <a href="https://mastodon.social/tags/BrainScience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BrainScience</span></a> <a href="https://mastodon.social/tags/NeuralNetworks" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NeuralNetworks</span></a> <a href="https://mastodon.social/tags/CognitiveScience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CognitiveScience</span></a> <a href="https://mastodon.social/tags/Hearing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Hearing</span></a> <a href="https://mastodon.social/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a> <a href="https://mastodon.social/tags/BrainPlasticity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BrainPlasticity</span></a> <a href="https://mastodon.social/tags/CentralNervousSystem" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CentralNervousSystem</span></a> <a href="https://mastodon.social/tags/SoundProcessing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SoundProcessing</span></a> <a href="https://mastodon.social/tags/Neuroscience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Neuroscience</span></a> <a href="https://mastodon.social/tags/ListeningSkills" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ListeningSkills</span></a> <a href="https://mastodon.social/tags/BrainHealth" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BrainHealth</span></a> <a href="https://mastodon.social/tags/AuditoryDisorders" class="mention 
hashtag" rel="nofollow noopener" target="_blank">#<span>AuditoryDisorders</span></a> <a href="https://mastodon.social/tags/LearningAndMemory" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LearningAndMemory</span></a></p><p><a href="https://youtube.com/shorts/7GO01YoqIHo?feature=share" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">youtube.com/shorts/7GO01YoqIHo</span><span class="invisible">?feature=share</span></a></p>
Debby<p>🌟 Excited to share Thorsten-Voice's YouTube channel! 🎥 🗣️🔊 ♿ 💬</p><p>Thorsten presents innovative TTS solutions and a variety of voice technologies, making it an excellent starting point for anyone interested in open-source text-to-speech. Whether you're a developer, accessibility advocate, or tech enthusiast, his channel offers valuable insights and resources. Don't miss out on this fantastic content! 🎬</p><p>Follow him here: <span class="h-card" translate="no"><a href="https://techhub.social/@thorstenvoice" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>thorstenvoice</span></a></span> <br>or on YouTube: <a href="https://www.youtube.com/@ThorstenMueller" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="">youtube.com/@ThorstenMueller</span><span class="invisible"></span></a> </p><p><a href="https://hear-me.social/tags/Accessibility" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Accessibility</span></a> <a href="https://hear-me.social/tags/FLOSS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FLOSS</span></a> <a href="https://hear-me.social/tags/TTS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TTS</span></a> <a href="https://hear-me.social/tags/ParlerTTS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ParlerTTS</span></a> <a href="https://hear-me.social/tags/OpenSource" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenSource</span></a> <a href="https://hear-me.social/tags/VoiceTech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VoiceTech</span></a> <a href="https://hear-me.social/tags/TextToSpeech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TextToSpeech</span></a> <a href="https://hear-me.social/tags/AI" class="mention hashtag" rel="nofollow noopener" 
target="_blank">#<span>AI</span></a> <a href="https://hear-me.social/tags/CoquiAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CoquiAI</span></a> <a href="https://hear-me.social/tags/VoiceAssistant" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VoiceAssistant</span></a> <a href="https://hear-me.social/tags/Sprachassistent" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Sprachassistent</span></a> <a href="https://hear-me.social/tags/MachineLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MachineLearning</span></a> <a href="https://hear-me.social/tags/AccessibilityMatters" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AccessibilityMatters</span></a> <a href="https://hear-me.social/tags/FLOSS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FLOSS</span></a> <a href="https://hear-me.social/tags/TTS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TTS</span></a> <a href="https://hear-me.social/tags/OpenSource" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenSource</span></a> <a href="https://hear-me.social/tags/Inclusivity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Inclusivity</span></a> <a href="https://hear-me.social/tags/FOSS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FOSS</span></a> <a href="https://hear-me.social/tags/Coqui" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Coqui</span></a> <a href="https://hear-me.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://hear-me.social/tags/CoquiAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CoquiAI</span></a> <a href="https://hear-me.social/tags/VoiceAssistant" class="mention hashtag" rel="nofollow noopener" 
target="_blank">#<span>VoiceAssistant</span></a> <a href="https://hear-me.social/tags/Sprachassistent" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Sprachassistent</span></a> <a href="https://hear-me.social/tags/VoiceTechnology" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VoiceTechnology</span></a> <a href="https://hear-me.social/tags/K%C3%BCnstlicheStimme" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>KünstlicheStimme</span></a> <a href="https://hear-me.social/tags/MachineLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MachineLearning</span></a> <a href="https://hear-me.social/tags/Python" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Python</span></a> <a href="https://hear-me.social/tags/Rhasspy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Rhasspy</span></a> <a href="https://hear-me.social/tags/TextToSpeech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TextToSpeech</span></a> <a href="https://hear-me.social/tags/VoiceTech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VoiceTech</span></a> <a href="https://hear-me.social/tags/STT" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>STT</span></a> <a href="https://hear-me.social/tags/SpeechSynthesis" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechSynthesis</span></a> <a href="https://hear-me.social/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a> <a href="https://hear-me.social/tags/Sprachsynthese" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Sprachsynthese</span></a> <a href="https://hear-me.social/tags/ArtificialVoice" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ArtificialVoice</span></a> <a 
href="https://hear-me.social/tags/VoiceCloning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VoiceCloning</span></a> <a href="https://hear-me.social/tags/Spracherkennung" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Spracherkennung</span></a> <a href="https://hear-me.social/tags/CoquiTTS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CoquiTTS</span></a> <a href="https://hear-me.social/tags/voice" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>voice</span></a> <a href="https://hear-me.social/tags/a11y" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>a11y</span></a> <a href="https://hear-me.social/tags/ScreenReader" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ScreenReader</span></a></p>
Debby<p>Goode <span class="h-card" translate="no"><a href="https://techhub.social/@thorstenvoice" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>thorstenvoice</span></a></span>, just found your channel and I'm impressed! Your work on TTS is fantastic and so important for accessibility in the FLOSS community. Keep it up! <a href="https://hear-me.social/tags/AccessibilityMatters" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AccessibilityMatters</span></a> <a href="https://hear-me.social/tags/FLOSS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FLOSS</span></a> <a href="https://hear-me.social/tags/TTS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TTS</span></a> <a href="https://hear-me.social/tags/OpenSource" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenSource</span></a> <a href="https://hear-me.social/tags/Inclusivity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Inclusivity</span></a> <a href="https://hear-me.social/tags/FOSS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FOSS</span></a> <a href="https://hear-me.social/tags/Coqui" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Coqui</span></a> <a href="https://hear-me.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://hear-me.social/tags/CoquiAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CoquiAI</span></a> <a href="https://hear-me.social/tags/VoiceAssistant" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VoiceAssistant</span></a> <a href="https://hear-me.social/tags/Sprachassistent" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Sprachassistent</span></a> <a href="https://hear-me.social/tags/VoiceTechnology" class="mention hashtag" rel="nofollow noopener" 
target="_blank">#<span>VoiceTechnology</span></a> <a href="https://hear-me.social/tags/K%C3%BCnstlicheStimme" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>KünstlicheStimme</span></a> <a href="https://hear-me.social/tags/MachineLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MachineLearning</span></a> <a href="https://hear-me.social/tags/Python" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Python</span></a> <a href="https://hear-me.social/tags/Rhasspy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Rhasspy</span></a> <a href="https://hear-me.social/tags/TextToSpeech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TextToSpeech</span></a> <a href="https://hear-me.social/tags/VoiceTech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VoiceTech</span></a> <a href="https://hear-me.social/tags/STT" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>STT</span></a> <a href="https://hear-me.social/tags/SpeechSynthesis" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechSynthesis</span></a> <a href="https://hear-me.social/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a> <a href="https://hear-me.social/tags/Sprachsynthese" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Sprachsynthese</span></a> <a href="https://hear-me.social/tags/ArtificialVoice" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ArtificialVoice</span></a> <a href="https://hear-me.social/tags/VoiceCloning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VoiceCloning</span></a> <a href="https://hear-me.social/tags/Spracherkennung" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Spracherkennung</span></a> <a href="https://hear-me.social/tags/CoquiTTS" 
class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CoquiTTS</span></a> <a href="https://hear-me.social/tags/voice" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>voice</span></a> <a href="https://hear-me.social/tags/a11y" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>a11y</span></a> <a href="https://hear-me.social/tags/ScreenReader" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ScreenReader</span></a></p>
IT News<p>Christmas Comes Early With AI Santa Demo - With only two hundred odd days ’til Christmas, you just know we’re already feeling... - <a href="https://hackaday.com/2025/05/18/christmas-comes-early-with-ai-santa-demo/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">hackaday.com/2025/05/18/christ</span><span class="invisible">mas-comes-early-with-ai-santa-demo/</span></a> <a href="https://schleuss.online/tags/artificialintelligence" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>artificialintelligence</span></a> <a href="https://schleuss.online/tags/speechrecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>speechrecognition</span></a> <a href="https://schleuss.online/tags/speechsynthesis" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>speechsynthesis</span></a> <a href="https://schleuss.online/tags/santaclaus" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>santaclaus</span></a> <a href="https://schleuss.online/tags/libpeer" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>libpeer</span></a> <a href="https://schleuss.online/tags/openai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>openai</span></a> <a href="https://schleuss.online/tags/llm" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llm</span></a> <a href="https://schleuss.online/tags/ai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ai</span></a></p>
Pyrzout :vm:<p>Christmas Comes Early With AI Santa Demo <a href="https://hackaday.com/2025/05/18/christmas-comes-early-with-ai-santa-demo/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">hackaday.com/2025/05/18/christ</span><span class="invisible">mas-comes-early-with-ai-santa-demo/</span></a> <a href="https://social.skynetcloud.site/tags/ArtificialIntelligence" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ArtificialIntelligence</span></a> <a href="https://social.skynetcloud.site/tags/speechrecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>speechrecognition</span></a> <a href="https://social.skynetcloud.site/tags/speechsynthesis" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>speechsynthesis</span></a> <a href="https://social.skynetcloud.site/tags/SantaClaus" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SantaClaus</span></a> <a href="https://social.skynetcloud.site/tags/libpeer" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>libpeer</span></a> <a href="https://social.skynetcloud.site/tags/openai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>openai</span></a> <a href="https://social.skynetcloud.site/tags/LLM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLM</span></a> <a href="https://social.skynetcloud.site/tags/ai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ai</span></a></p>
Hacker News<p>Jargonic Sets New SOTA for Japanese ASR</p><p><a href="https://aiola.ai/blog/jargonic-japanese-asr/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">aiola.ai/blog/jargonic-japanes</span><span class="invisible">e-asr/</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/Jargonic" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Jargonic</span></a> <a href="https://mastodon.social/tags/SOTA" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SOTA</span></a> <a href="https://mastodon.social/tags/Japanese" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Japanese</span></a> <a href="https://mastodon.social/tags/ASR" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ASR</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/Technology" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Technology</span></a> <a href="https://mastodon.social/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a> <a href="https://mastodon.social/tags/Innovation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Innovation</span></a></p>
Winbuzzer<p>Nvidia Releases High-Speed Parakeet AI Speech Recognition Model, Claims Top Spot on Leaderboard</p><p><a href="https://mastodon.social/tags/Nvidia" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Nvidia</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/ASR" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ASR</span></a> <a href="https://mastodon.social/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a> <a href="https://mastodon.social/tags/SpeechToText" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechToText</span></a> <a href="https://mastodon.social/tags/OpenSource" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenSource</span></a> <a href="https://mastodon.social/tags/MachineLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MachineLearning</span></a> <a href="https://mastodon.social/tags/Parakeet" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Parakeet</span></a> <a href="https://mastodon.social/tags/NeMo" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NeMo</span></a> <a href="https://mastodon.social/tags/HuggingFace" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HuggingFace</span></a> <a href="https://mastodon.social/tags/AIModels" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIModels</span></a></p><p><a href="https://winbuzzer.com/2025/05/06/nvidia-releases-high-speed-parakeet-ai-speech-recognition-model-claims-top-spot-on-leaderboard-xcxwbn/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">winbuzzer.com/2025/05/06/nvidi</span><span 
class="invisible">a-releases-high-speed-parakeet-ai-speech-recognition-model-claims-top-spot-on-leaderboard-xcxwbn/</span></a></p>
Farooq | فاروق<p>Yesterday, I ordered food online. Something went a little wrong with the order, so I contacted support. They called me, and for a moment I thought it was a bot or a recorded voice. I hated it. Then I realized there was a human on the line.</p><p>I was planning to build an LLM + TTS + speech recognition pipeline and deploy it on an A311D, to see whether I could practice a British accent with it. Now I'm rethinking what I want to do. The direction we are heading in doesn't lead anywhere good. I would hate to have to talk to a voice-enabled chatbot as a support agent rather than a human.</p><p>Don't get me wrong: voice-enabled chatbots can have tons of good uses. But replacing humans with LLMs is not one of them.</p><p><a href="https://cr8r.gg/tags/LLM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLM</span></a> <a href="https://cr8r.gg/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://cr8r.gg/tags/TTS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TTS</span></a> <a href="https://cr8r.gg/tags/ASR" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ASR</span></a> <a href="https://cr8r.gg/tags/speechrecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>speechrecognition</span></a> <a href="https://cr8r.gg/tags/speechai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>speechai</span></a> <a href="https://cr8r.gg/tags/ML" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ML</span></a> <a href="https://cr8r.gg/tags/MachineLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MachineLearning</span></a> <a href="https://cr8r.gg/tags/chatbot" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>chatbot</span></a> <a href="https://cr8r.gg/tags/chatbots" class="mention hashtag" rel="nofollow noopener"
target="_blank">#<span>chatbots</span></a> <a href="https://cr8r.gg/tags/artificialintelligence" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>artificialintelligence</span></a></p>
Richard Emling (DO9RE)<p>I'm exploring ways to improve audio preprocessing for speech recognition in my midi2hamlib project (<a href="https://github.com/DO9RE/midi2hamlib" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">github.com/DO9RE/midi2hamlib</span><span class="invisible"></span></a>). Do any of my followers have expertise with SoX or speech recognition? Specifically, I’m seeking advice on: 1️⃣ Best practices for preparing audio for speech recognition. 2️⃣ SoX command-line parameters that can optimize audio during recording or playback. <br> <a href="https://github.com/DO9RE/midi2hamlib/blob/main/tests/speech_menu.sh" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/DO9RE/midi2hamlib/b</span><span class="invisible">lob/main/tests/speech_menu.sh</span></a> <a href="https://metalhead.club/tags/SoX" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SoX</span></a> <a href="https://metalhead.club/tags/SpeechRecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeechRecognition</span></a> <a href="https://metalhead.club/tags/OpenSource" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenSource</span></a> <a href="https://metalhead.club/tags/AudioProcessing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AudioProcessing</span></a> <a href="https://metalhead.club/tags/ShellScripting" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ShellScripting</span></a> <a href="https://metalhead.club/tags/Sphinx" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Sphinx</span></a> <a href="https://metalhead.club/tags/PocketSphinx" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PocketSphinx</span></a> <a href="https://metalhead.club/tags/Audio" class="mention hashtag"
rel="nofollow noopener" target="_blank">#<span>Audio</span></a> Retoot appreciated.</p>
Farooq | فاروق<p>Now that my <a href="https://cr8r.gg/tags/wake_word_detection" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>wake_word_detection</span></a> <a href="https://cr8r.gg/tags/research" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>research</span></a> has borne fruit, I plan to continue working in the voice domain. I would love to train a <a href="https://cr8r.gg/tags/TTS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TTS</span></a> model with a <a href="https://cr8r.gg/tags/British" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>British</span></a> accent so I could use it to practice.</p><p>I was wondering whether I could run inference on the <a href="https://cr8r.gg/tags/A311D" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>A311D</span></a> <a href="https://cr8r.gg/tags/NPU" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NPU</span></a>. However, as I skim the papers for different models, inference on the A311D with reasonable performance seems unlikely. Even training these models on my entry-level <a href="https://cr8r.gg/tags/IntelArc" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>IntelArc</span></a> <a href="https://cr8r.gg/tags/GPU" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GPU</span></a> would be painful.</p><p>Maybe I could just fine-tune an existing model.
I am also thinking about using <a href="https://cr8r.gg/tags/GeneticProgramming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GeneticProgramming</span></a> for some components of these TTS models to see whether that improves inference performance.</p><p><a href="https://cr8r.gg/tags/FastSpeech2" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FastSpeech2</span></a> and <a href="https://cr8r.gg/tags/SpeedySpeech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SpeedySpeech</span></a> look promising. I wonder how natural their accents will be, but they would be good starting points.</p><p>By the way, if anyone needs open-source models, I would love to work as a freelancer and have an <a href="https://cr8r.gg/tags/opensource" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>opensource</span></a> job. Even if someone can just provide access to compute resources, that would help.</p><p><a href="https://cr8r.gg/tags/forhire" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>forhire</span></a> <a href="https://cr8r.gg/tags/opensourcejob" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>opensourcejob</span></a> <a href="https://cr8r.gg/tags/job" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>job</span></a> <a href="https://cr8r.gg/tags/hiring" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>hiring</span></a></p><p><a href="https://cr8r.gg/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://cr8r.gg/tags/VoiceAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VoiceAI</span></a> <a href="https://cr8r.gg/tags/opensourceai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>opensourceai</span></a> <a href="https://cr8r.gg/tags/ml" class="mention hashtag" rel="nofollow
noopener" target="_blank">#<span>ml</span></a> <a href="https://cr8r.gg/tags/speechrecognition" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>speechrecognition</span></a> <a href="https://cr8r.gg/tags/speechsynthesis" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>speechsynthesis</span></a> <a href="https://cr8r.gg/tags/texttospeech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>texttospeech</span></a> <a href="https://cr8r.gg/tags/machinelearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>machinelearning</span></a> <a href="https://cr8r.gg/tags/artificialintelligence" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>artificialintelligence</span></a> <a href="https://cr8r.gg/tags/getfedihired" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>getfedihired</span></a> <a href="https://cr8r.gg/tags/FediHire" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FediHire</span></a> <a href="https://cr8r.gg/tags/hireme" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>hireme</span></a> <a href="https://cr8r.gg/tags/wakeworddetection" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>wakeworddetection</span></a></p>