Understanding non-American accents
Why the focus on ‘Indian’? Well, by far the biggest issue for me was getting Siri to understand my accent. People who know me claim that I don’t have that thick an accent, but that’s not what Siri ‘tells’ me.
On the plus side, Siri is subtly training me to have a more ‘American’ voice. She highlights words she has trouble understanding in blue, so you can quickly figure out which words you are consistently mispronouncing, or pronouncing with a thick accent. ‘Paragon’ was totally off, for instance!
Your wish is my command – but be QUICK
It’s almost impossible to dictate text messages through Siri – at least for me. I tend to pause just a little too long between sentences, and Siri needs you to know exactly what you want to say before you start dictating.
In fact, I ended up using Voice Notes to record quick notes for myself, when I’d have preferred to dictate them through Siri. On the plus side, you can use earphone controls to record/pause Voice Notes, for some quick eyes-on-the-road note taking.
Ethnic language models
The language models need more work, and I wouldn’t be surprised if there are people hard at work creating ethnicity-oriented speech models. Guessing the ethnicity shouldn’t be that hard to do, given all the clues Apple has – the name of the person, their entire address book, and recorded voice samples. These samples can be compared to aggregate training data collected from virtually every country on the planet.
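To make the idea a bit more concrete, here is a rough sketch of how clues like these might be combined to pick an accent-specific model. It is purely illustrative and says nothing about how Apple actually does it: the `Evidence` class, the `pick_accent_model` function, the accent labels, and every score and weight below are made-up assumptions.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """One clue about the speaker's accent (hypothetical structure)."""
    source: str               # e.g. "name", "address_book", "voice_samples"
    scores: dict[str, float]  # accent label -> how strongly this clue points to it
    weight: float             # how much this clue is trusted


def pick_accent_model(evidence, default="en-US"):
    """Return the accent label with the highest weighted score."""
    totals = {}
    for item in evidence:
        for accent, score in item.scores.items():
            totals[accent] = totals.get(accent, 0.0) + item.weight * score
    if not totals:
        # No clues at all: fall back to the default model.
        return default
    return max(totals, key=totals.get)


# Made-up numbers: voice samples are trusted most, the address book least.
evidence = [
    Evidence("name", {"en-IN": 0.7, "en-US": 0.3}, weight=1.0),
    Evidence("address_book", {"en-IN": 0.6, "en-US": 0.4}, weight=0.5),
    Evidence("voice_samples", {"en-IN": 0.8, "en-US": 0.2}, weight=2.0),
]
chosen = pick_accent_model(evidence)
print(chosen)
```

A plain weighted sum is about the simplest way to merge several weak signals. A real system would presumably lean far more heavily on the voice samples themselves than on names or contacts.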
The holy grail, of course, would be a model trained for each individual user, but I wonder whether they have the compute resources, and enough data per person, to make that happen effectively.
To the future!