The art of speech and the art of transcription have been blended to bring about a new state-of-the-art technology known as automatic speech recognition (ASR) software. ASR software has become the talk of the town. Speech recognition has been a dream for us since the good old days of Star Wars and other science fiction films and stories. Have our dreams come true? Today, they have been partially fulfilled by the new arrivals on the market. Every company has joined the competition to offer the best speech recognition software to the world market. What has happened to the race among them? It reminds me of the story of the hare and the tortoise. The slow and steady contender looks like it has won the race, but it still has miles to go before it reaches the finish line. And what exactly is the goal of the race? Whether it is getting to the top or getting to the people is, once more, a million-dollar question. With the revenues pooled into speech recognition beginning to drain, there is a need to analyze its growth over time, which clearly shows a flattened graph reflecting the stagnant state of the software's research and development.
Imagine a situation where you have invested some thousand dollars per month in speech recognition software and find it worthless because it types your dictation wrongly: words are replaced and jumbled, and the context changes. What chaos that would create. The frustration at such moments is truly unbearable. Flawless products or services are nowhere to be found, since everything on earth comes with its own pros and cons. This applies to speech-to-text software as well. It has its own flaws and demerits, which limit its use to a small community. The concept needs much more attention and research before it can compete with languages that have developed over millions of years.
The world's ethnologue appears far too long, even unending. The languages we speak today are the product of development over millions of years, built on the efforts of countless generations. All animals communicate with each other, but only humans have formalized communication into a predefined set of signals known as language. The cortical speech center is, again, an evolutionary feature that only humans possess, and it differentiates the human brain from those of other animals in the animal kingdom. Therefore, speech recognition software, which has a very recent history compared with language, has to travel, if not millions of years, then at least a few decades before it understands even a little about the speech and languages spoken by different groups of people.
The drawbacks of voice recognition, or audio-to-text, software are:
It cannot recognize all words, even after the user spends hours training the software. Time is precious; after all, we have only 24 hours a day!
All punctuation, such as commas, full stops, semicolons, and hyphens, must be dictated by the speaker wherever he or she wants it.
Understanding context is another major drawback: some words, especially in English, have several meanings and need to be used in the right context to produce good results in the records. The software does not seem to understand the context in most cases.
Homophones are another difficult task for audio-to-text software. These are different words with the same pronunciation but different spellings and meanings, for instance elicit-illicit, desert-dessert, there-their, and flour-flower, along with near-homophones such as bowel-bowl. Because they are used in different contexts, they confuse the software, resulting in bloopers and hilarious phrases and sentences.
The other major black mark against speech recognition is that it cannot recognize the varied accents present within a single language. Understanding words spoken in a neutral accent is itself difficult for the software, so how can it ever understand the various dialects and accents used by people around the globe?
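To make the homophone and context drawbacks concrete, here is a minimal Python sketch. It is only a toy: the sound keys, the word lists, and the function names candidates and pick_by_context are all made up for illustration and do not describe how any real speech recognition product works. It shows that one sound maps to several spellings, and that a guess based on nearby words fails as soon as the context gives no clue.

```python
# Toy illustration only: every key, word list, and name here is hypothetical.

# One made-up "sound key" per group of words that are pronounced alike.
HOMOPHONES = {
    "flour/flower": ["flour", "flower"],
    "there/their": ["there", "their"],
    "elicit/illicit": ["elicit", "illicit"],
}

# Made-up hints: words that tend to appear near each spelling.
CONTEXT_HINTS = {
    "flour": {"bake", "bread", "cup"},
    "flower": {"garden", "petal", "bloom"},
    "their": {"house", "car", "own"},
    "there": {"over", "is", "are"},
    "elicit": {"response", "answer"},
    "illicit": {"drug", "trade"},
}


def candidates(sound):
    """Return every spelling that matches the sound (empty list if unknown)."""
    return HOMOPHONES.get(sound, [])


def pick_by_context(sound, nearby_words):
    """Pick the spelling whose hint words overlap most with the nearby words.

    With no overlap at all, max() simply returns the first candidate,
    which is exactly the kind of blind guess that produces bloopers.
    """
    options = candidates(sound)
    if not options:
        return None
    nearby = {word.lower() for word in nearby_words}
    return max(options, key=lambda w: len(CONTEXT_HINTS.get(w, set()) & nearby))


if __name__ == "__main__":
    print(candidates("flour/flower"))
    print(pick_by_context("flour/flower", ["garden", "petal"]))
    print(pick_by_context("flour/flower", []))
```

With helpful neighbours such as "garden" the sketch picks "flower", but with no context it falls back to the first spelling in the list, right or wrong. A human transcriptionist resolves such cases from the meaning of the whole sentence.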
In 1997, Bill Gates made an open statement: “In this 10-year time frame, I believe that we'll not only be using the keyboard and the mouse to interact, but during that time we will have perfected speech recognition and speech output well enough that those will become a standard part of the interface.” Now it is three years past that decade, and yet speech recognition is still at a primitive stage of usage and development.
Hence, to conclude, the transcription business still has the upper hand over audio-to-text software. Transcriptionists are not obsolete. They have their own place in the field, and they are needed for their integrity, caliber, and expertise in the business.