
Yu-Jhe Li (李宇哲)
I am a Software Engineer at Google, working on multi-modal generative AI. Previously, I was a Research Scientist at Adobe and Microsoft.
I received my Ph.D. in Electrical and Computer Engineering from Carnegie Mellon University (CMU), advised by Prof. Kris Kitani. During my Ph.D., I interned at Meta Research (twice) and Adobe Research.
I earned my M.Sc. from National Taiwan University (國立台灣大學), advised by Prof. Yu-Chiang Frank Wang at the Vision and Learning Lab, and my B.S. from National Tsing Hua University (國立清華大學).
News
- Apr 2026: Joined Google as a Software Engineer working on multi-modal generative AI. 🎉
- Sep 2025: One paper accepted to NeurIPS 2025 as an Oral.
- Jul 2025: One paper accepted to COLM 2025.
- Apr 2024: Joined Adobe as a Research Scientist.
- Feb 2024: One paper accepted to CVPR 2024.
- Dec 2023: Joined Microsoft as a Research Scientist.
- Aug 2023: Successfully defended my Ph.D. at CMU! 🎓
- Feb 2023: One paper accepted to CVPR 2023.
- Feb 2023: One paper accepted to ICASSP 2023 special track.
- Aug 2022: Received the Qualcomm Innovation Fellowship (one of only 19 teams in the US).
- Jul 2022: One paper accepted to ECCV 2022.
- Mar 2022: Two papers accepted to CVPR 2022.
Research
My research focuses on multi-modal reasoning and visual generation, specifically:
- Leveraging multi-modal large language models (MLLMs) to enhance high-level reasoning, complex perception, and decision-making.
- Advancing generative AI and visual content generation, particularly through diffusion models, to synthesize high-fidelity images and videos for XR and autonomous systems.
- Developing data-efficient and adaptable methods that bridge the gap between cognitive reasoning and dynamic generation in real-world applications.
Selected Publications
A curated list of representative works. See the full list on Google Scholar.

Professional Service
Conference Reviewer
Journal Reviewer
Education



Undergraduate Exchange Programs


