SURVEY ON DEEP LEARNING FOR AUDIO-VISUAL SPEECH RECOGNITION

Abstract

Visual speech, the visual domain of speech, has attracted growing attention because of its many applications in fields such as public safety, healthcare, military defense, and entertainment. Deep learning, a powerful AI technique, has greatly advanced visual speech learning. Recently, automatic audio-visual speech recognition (AVSR) systems have demonstrated remarkable performance, particularly in limited-vocabulary tasks, where they significantly outperform human speech recognition, especially in acoustically noisy environments. Research and development of automatic speech recognition systems based on the processing of audio and visual data is ongoing worldwide. This paper compares various deep learning approaches to audio-visual speech recognition.

Author

RADHIKA SREEDHARAN

International Journal of Advanced Science and Engineering Research
