
Yusheng Dai

PhD Candidate
Yusheng.Dai (at) monash.edu


Hello 👋!

I am Yusheng Dai, and I recently joined Monash University in Australia to work with Prof. Jianfei Cai. Before that, I completed my Master’s program at the University of Science and Technology of China (USTC), working with Prof. Jun Du and Prof. Chin-hui Lee. I obtained my Bachelor’s degree in Cyber Engineering from Sichuan University in June 2022.

My research focuses on audio-visual generation and understanding. My recent work covers universal audio generation under multiple conditions, in particular Visual-Text to Speech Audio and Music (VT2SAM), which considers both semantic and temporal alignment. I am also interested in extending standard diffusion-based mel-spectrum generation toward more realistic settings, such as long latent spaces (e.g., infinite-duration audio or panoramas) and higher resolutions (up to 44.1 kHz audio). Earlier, I focused on audio-visual speech recognition from talking-face videos in noisy, multi-speaker scenarios.

News

Selected Publications [Google Scholar]

  1. ICCV
    Yusheng Dai*, Chenxi Wang*, Chang Li, Chen Wang, et al.
    International Conference on Computer Vision (ICCV), 2025.

  2. CVPR
    Yusheng Dai, Hang Chen, Jun Du, Chin-hui Lee, et al.
    IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2024.

  3. ACM MM
    Chenxi Wang*, Yusheng Dai*, Lei Sun, et al.
    ACM International Conference on Multimedia (ACM MM), 2025.

  4. NeurIPS DistShift
    Donglin Zhan*, Yusheng Dai*, Yiwei Dong*, Jinghai He, Zhenyi Wang, James Anderson (* means equal contribution)
    Conference on Neural Information Processing Systems Workshop on Distribution Shifts (NeurIPS DistShift), 2022.

  5. Electronics Letters

  6. Interspeech
    Hang Chen, Jun Du, Yusheng Dai, Chin-hui Lee, Sabato Marco Siniscalchi, Shinji Watanabe, Odette Scharenborg, Jingdong Chen, et al.
    In Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech), 2022.


© Yusheng Dai, 2023