As
'Live Translate is particularly efficient for travel scenarios such as visitors to this year's
Understanding Context in Voice Recognition
For those already using the translation features of Galaxy AI, such functionality may simply seem very useful. But the developers who brought the features to life know that being able to communicate while traveling abroad isn't something that can be taken for granted.
One thing the team noted was that there are more homonyms in Japanese than in some other languages. For instance, 'chopsticks' (Hashi, 箸) and 'bridge' (Hashi, 橋) are relatively easy to distinguish due to the difference in intonation, but words like 'sightseeing' (Kanko, 観光), 'customs' (Kanko, 慣行), 'public' (Kokyo, 公共) and 'prosperity' (Kokyo, 好況) must be judged based on the context.
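To make the idea of judging a word by its context concrete, here is a minimal sketch in Python. It is a toy illustration under stated assumptions, not how Live Translate or Galaxy AI works: the `CONTEXT_CUES` table, its cue words, and the `disambiguate` function are all hypothetical, and a real speech recognition engine would rely on a trained language model rather than hand-written word lists.

```python
# Toy illustration only: the cue words below are hypothetical and are
# not taken from Galaxy AI or Live Translate.
CONTEXT_CUES = {
    "kanko": {
        "sightseeing": {"tour", "temple", "guide", "trip"},
        "customs": {"tradition", "local", "etiquette", "practice"},
    },
    "kokyo": {
        "public": {"transport", "facility", "library", "service"},
        "prosperity": {"economy", "market", "growth", "business"},
    },
}


def disambiguate(reading, context_words):
    """Return the candidate meaning whose cue words overlap the context most.

    Returns None when no candidate shares any word with the context,
    which mirrors the ambiguous case where context gives no help.
    """
    candidates = CONTEXT_CUES.get(reading, {})
    context = {word.lower() for word in context_words}
    best_meaning, best_score = None, 0
    for meaning, cues in candidates.items():
        score = len(cues & context)  # number of shared cue words
        if score > best_score:
            best_meaning, best_score = meaning, score
    return best_meaning


print(disambiguate("kanko", ["temple", "tour", "tomorrow"]))  # sightseeing
print(disambiguate("kokyo", ["economy", "growth"]))           # prosperity
print(disambiguate("kokyo", ["hello"]))                       # None
```

The last call returns `None` because nothing in the surrounding words points to either meaning, which is the kind of ambiguous case that more training data is meant to address.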
'Judgement becomes more difficult when the context is ambiguous, such as with names of places and people, proper nouns, dialects and numbers,' says Akasako. 'So in order to improve the accuracy of speech recognition, a lot of data is needed.'
'We always look for ways to fine-tune the AI model for key events and moments in a timely manner,' continues Akasako. 'With a lot of new combinations of place names and activities, it's important that the context is still clear when people are using Galaxy AI.'
Challenges in Collecting Efficient Data
While recognizing the types of data needed is important, collecting that data is a challenge in its own right.
Previously, the SRJ team used human-recorded data to train the speech recognition engine for Live Translate, but this approach did not yield enough data.
'Every time a problem is identified and solved, the accuracy of speech recognition improves significantly,' says Akasako. 'Regardless of where people are, our goal is connecting people with each other, and the tools powered by Galaxy AI will ensure more fun and efficient communication.'
(C) 2024 Electronic News Publishing, source