Chinese Lipreading Network Based on Vision Transformer
XUE Feng1, HONG Zikun2, LI Shujie1, LI Yu2, XIE Yincen2
Heatmap visualization of attention weights