I wrote a service (nowhere near ready for public release) that segments audio by speaker. You identify one speech segment for a speaker, and it can then label that speaker's other segments. It uses GMMs and MFCCs. Is something like this in the works? Cool idea! I listen to a fair number of podcasts, and I can confirm that there is definitely a need for this.
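For anyone curious what "identify one segment, then label the others" with GMMs and MFCCs might look like, here is a minimal sketch, not the commenter's actual service. It assumes scikit-learn's `GaussianMixture`, uses synthetic 13-dimensional vectors as stand-ins for real MFCC frames (in practice these could come from an extractor such as `librosa.feature.mfcc`), and assumes a second "background" GMM fitted on all the audio as the comparison point. The helper names `fit_gmm` and `label_segments` are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_gmm(frames, n_components=2, seed=0):
    """Fit a diagonal-covariance GMM to frames of shape (n_frames, n_features)."""
    gmm = GaussianMixture(
        n_components=n_components, covariance_type="diag", random_state=seed
    )
    return gmm.fit(frames)


def label_segments(segments, speaker_gmm, background_gmm):
    """Mark a segment True when the enrolled speaker's GMM gives it a higher
    average per-frame log-likelihood than the background GMM does."""
    return [
        bool(speaker_gmm.score(seg) > background_gmm.score(seg))
        for seg in segments
    ]


# Synthetic stand-ins for MFCC frames (hypothetical data, not real audio):
# speaker A is centred on 0.0, speaker B on 3.0.
rng = np.random.default_rng(0)
n_mfcc = 13

# The one segment the user identified as speaker A.
enrolled = rng.normal(0.0, 1.0, size=(500, n_mfcc))

# Unlabelled segments, alternating A, B, A, B.
segments = [
    rng.normal(0.0, 1.0, size=(200, n_mfcc)),
    rng.normal(3.0, 1.0, size=(200, n_mfcc)),
    rng.normal(0.0, 1.0, size=(200, n_mfcc)),
    rng.normal(3.0, 1.0, size=(200, n_mfcc)),
]

speaker_gmm = fit_gmm(enrolled)
# Assumption: the background model is simply a GMM over all unlabelled frames.
background_gmm = fit_gmm(np.vstack(segments))

labels = label_segments(segments, speaker_gmm, background_gmm)
print(labels)
```

Comparing against a background model rather than a fixed likelihood threshold is one design choice among several; real recordings would also need voice-activity detection and some way of finding the segment boundaries first.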
Do try out https://scribie.com/transcription/free as well. Our diarisation system is around 90% accurate on longer paragraphs and will be out this week.