Google has officially announced the results of a test of its audio search feature. The test was run to check whether audio search has the potential to work in the future.
The results were better than Google had expected, but they also showed how difficult the problem is. To run the test, Google partnered with KQED.
Difficulties In Audio Search
One of the biggest barriers to making audio searchable is that the audio file must first be converted into text so that it can be searched on Google. Other than this, there is currently no way for search engines to find audio files.
For publishers, automated transcription is the only practical way out, as manual transcription consumes a lot of time and effort. According to KQED, indexed audio transcripts need to be 100% accurate, which is not possible for now.
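One common way to put a number on transcript accuracy is the word error rate (WER): the number of word substitutions, insertions, and deletions needed to turn the automated transcript into a trusted reference, divided by the length of the reference. The sketch below is only an illustration of that metric. The function name and the sample sentences are assumptions made for this example, and nothing here reflects how Google or KQED actually scored their transcripts.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate of `hypothesis` against `reference`.

    Illustrative helper (not from Google or KQED): computes the word-level
    edit distance and divides it by the number of reference words.
    """
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript must not be empty")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref_words)


# Hypothetical example: one misheard acronym in a six-word sentence.
manual = "the CHP closed the Peninsula highway"
automated = "the chip closed the Peninsula highway"
print(f"WER: {word_error_rate(manual, automated):.2%}")
```

A transcript that meets the 100% accuracy bar would have a WER of zero against the manual transcript. Even a single misheard word, as in the example, pushes it above that.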
Drawbacks In Audio To Text Technology
The biggest limitation discovered so far is the identification of proper nouns. Named entities are not clearly understood by AI.
KQED’s local news audio is rich in references to named entities related to topics, people, places, and organizations that are contextual to the Bay Area region. Speakers use acronyms like “CHP” for California Highway Patrol and “the Peninsula” for the area spanning San Francisco to San Jose. These are more difficult for artificial intelligence to identify.
When named entities are not clear, AI tries its best to guess what has been said. However, this is not acceptable for web search, because an incorrect transcription can change the whole meaning of what was said.
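To make the named-entity problem concrete, here is a minimal sketch of how a publisher might keep a small glossary of local terms and annotate a transcript with their meanings, so that a search for the full name can still match the acronym. The glossary, the function name, and the sample sentence are all hypothetical. The sketch assumes the terms were transcribed correctly in the first place, which is exactly the part the test found difficult, and it is not a description of what Google or KQED built.

```python
import re

# Hypothetical glossary, using the two examples mentioned above.
LOCAL_GLOSSARY = {
    "CHP": "California Highway Patrol",
    "the Peninsula": "the area spanning San Francisco to San Jose",
}


def annotate_entities(transcript: str, glossary: dict = LOCAL_GLOSSARY) -> str:
    """Append the glossary meaning in parentheses after each known term.

    Matching is case-sensitive and whole-word only. All terms are matched
    in a single pass so an inserted meaning is never annotated again.
    """
    # Longest terms first, so a longer phrase wins over a shorter overlap.
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    return pattern.sub(lambda m: f"{m.group(1)} ({glossary[m.group(1)]})", transcript)


print(annotate_entities("CHP closed a lane on the Peninsula."))
```

A glossary like this only helps after the speech-to-text step has produced the right words. If the model hears "chip" instead of "CHP", there is nothing for the glossary to match.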
Google’s Next Step
Google will keep working on the project until a proper solution is found. To make audio search accessible in every region, Google will also develop new technology.
One of the pillars of the Google News Initiative is incubating new approaches to difficult problems. Once complete, this technology and associated best practices will be openly shared, greatly expanding the anticipated impact.
Google says it is confident that, in the near future, improvements to these speech-to-text models will help convert audio to text faster, ultimately helping people find audio news more effectively.