Ying Shen

Computer Science Ph.D. Student

University of Illinois Urbana-Champaign

Welcome!

I am currently pursuing my Ph.D. in Computer Science at the University of Illinois Urbana-Champaign.

My research interests lie in multi-modal interaction, a vibrant multi-disciplinary research field that aims to enable AI agents to interact seamlessly with users and complex environments by integrating and modeling diverse input and output modalities, including linguistic, acoustic, and visual signals. Specifically, my work focuses on developing efficient, controllable, adaptive, and interactive multi-modal generative models. My goal is to build robust AI agents capable of understanding, interpreting, and reasoning about the physical world. I envision these systems operating effectively in complex and ever-changing environments, making informed decisions, and responding intelligently to real-world challenges.

I began my Ph.D. studies in Computer Science at Virginia Tech, advised by Prof. Lifu Huang and Prof. Ismini Lourentzou, and later transferred to UIUC to further expand my research capabilities. I obtained my Master of Science degree in Intelligent Information Systems from Carnegie Mellon University and my Bachelor’s degree from the School of Software Engineering at Fudan University. Previously, I worked with Prof. Louis-Philippe Morency and Prof. Graham Neubig at CMU.

I am honored to have been awarded the Amazon-VT Fellowship for the 2023-2024 academic year.

Interests

Deep Learning
Multimodal Machine Learning
Deep Generative Models
Natural Language Processing
Computer Vision

Education

PhD in Computer Science, Present
University of Illinois Urbana-Champaign

PhD in Computer Science, 2024
Virginia Tech

MSc in Intelligent Information Systems, 2018
Carnegie Mellon University

BEng in Software Engineering, 2017
Fudan University

Publications

Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling

Diffusion models have emerged as a powerful tool for generating high-quality images from textual descriptions. Despite their successes, …

Jiatao Gu, Ying Shen, Shuangfei Zhai, Yizhe Zhang, Navdeep Jaitly, Joshua M. Susskind

Many-to-many Image Generation with Auto-regressive Diffusion Models

Recent advancements in image generation have made significant progress, yet existing models present limitations in perceiving and …

Ying Shen, Yizhe Zhang, Shuangfei Zhai, Lifu Huang, Joshua M. Susskind, Jiatao Gu

InternalInspector I²: Robust Confidence Estimation in LLMs through Internal States

Despite their vast capabilities, Large Language Models (LLMs) often struggle with generating reliable outputs, frequently producing …

Mohammad Beigi, Ying Shen, Runing Yang, Zihao Lin, Qifan Wang, Ankith Mohan, Jianfeng He, Ming Jin, Chang-Tien Lu, Lifu Huang

Multimodal Instruction Tuning with Conditional Mixture of LoRA

Multimodal Large Language Models (MLLMs) have demonstrated remarkable proficiency in diverse tasks across different domains, with an …

Ying Shen, Zhiyang Xu, Qifan Wang, Yu Cheng, Wenpeng Yin, Lifu Huang

Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning

Despite vision-language models' (VLMs) remarkable capabilities as versatile visual assistants, two substantial challenges persist …

Zhiyang Xu, Chao Feng, Rulin Shao, Trevor Ashby, Ying Shen, Di Jin, Yu Cheng, Qifan Wang, Lifu Huang

Experience

Machine Learning Research Intern

Apple

May 2023 – Aug 2023 New York, NY

Research Associate

Language Technologies Institute, Carnegie Mellon University

Jan 2019 – Dec 2019 Pittsburgh, PA

Graduate Research Assistant

MultiComp Laboratory, Carnegie Mellon University

Sep 2017 – Dec 2018 Pittsburgh, PA

Research Intern

ArticuLab, Carnegie Mellon University

Jul 2016 – Aug 2016 Pittsburgh, PA

A little more about me

I love painting. To me, painting is like another medium (besides verbal and nonverbal behaviors) for expressing feelings and thoughts.
I enjoy traveling, exploring new restaurants, watching Japanese animation, and swimming.
I speak Shanghainese (native), Mandarin Chinese (native), English (fluent), and Japanese (conversational).
I hope to use technology to help minority groups and improve people’s quality of life.

“There is only one heroism in the world: to see the world as it is, and to love it.”

– Romain Rolland