
Yusheng Dai

M.S. Candidate
Master's Second Year
dalison (at) mail.ustc.edu.com


Hello 👋!

I am Yusheng Dai, a final-year M.S. student at the University of Science and Technology of China (USTC), under the guidance of Prof. Jun Du and Prof. Chin-hui Lee. Starting June 2025, I will be pursuing my Ph.D. at Monash University in Australia under the supervision of Prof. Jianfei Cai and Prof. Qiuhong Ke. I obtained my Bachelor’s degree in Cyber Engineering from Sichuan University in June 2022.

My prior research has primarily focused on video, integrating both visual and audio streams while emphasizing their complementarity, alignment, and transitions across semantic and temporal dimensions. This work falls into two main categories. The early work, beginning in 2022, focuses on audio-visual discriminative models for talking-face videos in noisy, multi-speaker scenarios, such as Audio-Visual Speech Recognition (AVSR). More recently, since 2023, my research has shifted toward more flexible and high-quality audio and music generation, emphasizing atomic controllability and consistency in combination, guided by text or silent video.

News

Selected Publications [Google Scholar]

  1. CVPR
    Yusheng Dai, Hang Chen, Jun Du, Chin-hui Lee, et al.
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

  2. ICME
    Yusheng Dai, Hang Chen, Jun Du, Chin-hui Lee, et al.
    IEEE International Conference on Multimedia and Expo (ICME), 2023.

  3. NeurIPS DistShift
    Donglin Zhan*, Yusheng Dai*, Yiwei Dong*, Jinghai He, Zhenyi Wang, James Anderson (* means equal contribution)
    Conference on Neural Information Processing Systems Workshop on Distribution Shifts (NeurIPS DistShift), 2022.

  4. SDM
    Donglin Zhan*, Yusheng Dai*, Yiwei Dong*, Jinghai He, Zhenyi Wang, James Anderson (* means equal contribution)
    SIAM International Conference on Data Mining (SDM), 2024.

  5. Electronics Letters

  6. Interspeech
    Hang Chen, Jun Du, Yusheng Dai, Chin-hui Lee, Sabato Marco Siniscalchi, Shinji Watanabe, Odette Scharenborg, Jingdong Chen, et al.
    In Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech), 2022.

  7. ICASSP
    Shilong Wu, Chenxi Wang, Hang Chen, Yusheng Dai, et al.
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024.


© Yusheng Dai, 2023