
Hey!
This is Bingxuan Li.

Google Scholar
I'm an incoming CS PhD student at the University of Illinois Urbana-Champaign (UIUC). I recently completed my Master's degree at the University of California, Los Angeles (UCLA), where I was advised by Professor Nanyun Peng and Professor Kai-Wei Chang, and mentored by Professor Yiwei Wang. Before that, I completed my Bachelor's degree at Purdue University, where I double-majored in Computer Science and Data Science. I have been fortunate to be mentored by and work with Professor Bingxin Zhao, Professor Pengyi Shi, Professor Amy Ward, and Professor Tianyi Zhang.
My current research focuses on three areas:
  • Agentic AI
  • Multimodal Understanding and Reasoning
  • AI Applications in Human Health, including AI for biomedical and medical scientific discovery, and AI for healthcare.

📚 Selected Publications

For the full list of publications, please visit my Google Scholar page.
Under-review Pre-prints
  • Pre-print

    Embodied AI

    Multi-Modal

    arxiv
    NEW!! Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence

    Yining Hong*, Rui Sun*, Bingxuan Li, Xingchen Yao, Maxine Wu, Alexander Chien, Da Yin, Ying Nian Wu, Zhecan James Wang, Kai-Wei Chang

    Under Review.

Peer-reviewed Publications
  • ACL-2025

    Multimodal Reasoning

    arxiv
    NEW!! METAL: A Multi-Agent Framework for Chart Generation with Test-Time Scaling

    Bingxuan Li, Yiwei Wang, Jiuxiang Gu, Kai-Wei Chang, Nanyun Peng

    Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL-2025 Main Conference).

  • ICML-2025

    Multimodal Learning

    arxiv
    NEW!! Contrastive Visual Data Augmentation

    Bingxuan Li*, Yu Zhou*, Mohan Tang* (*equal contribution, interchangeable order), Xiaomeng Jin, Te-Lin Wu, Kuan-Hao Huang, Heng Ji, Kai-Wei Chang, Nanyun Peng

    Proceedings of the Forty-Second International Conference on Machine Learning (ICML-2025).

  • EMNLP-2024

    Controllable Generation

    arxiv
    Control Large Language Models via Divide and Conquer

    Bingxuan Li, Yiwei Wang, Tao Meng, Kai-Wei Chang, Nanyun Peng

    Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-2024 Main Conference). Best Paper Nomination (2%). Oral Presentation.

    Paper | Poster
  • CVPR-2025

    Multimodal Reasoning

    arxiv
    VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning

    Xueqing Wu*, Yuheng Ding*, Bingxuan Li, Pan Lu, Da Yin, Kai-Wei Chang, Nanyun Peng

    Proceedings of the 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR-2025).

  • NAACL-2025

    Controllable Generation

    Creative Generation

    arxiv
    REFFLY: Melody-Constrained Lyrics Editing Model

    Songyan Zhao*, Bingxuan Li* (*equal contribution), Yufei Tian, Nanyun Peng

    Proceedings of the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL-2025 Main Conference).

    Paper | Demo
  • AAAI-2025 GenAI4Health Workshop

    LLMs Reasoning

    Machine Learning

    arxiv
    Enhancing Predictive Model Learning via Domain-Knowledge Augmented Latent Feature Mining

    Bingxuan Li, Pengyi Shi, Amy Ward

    Accepted to the AAAI-2025 GenAI4Health Workshop; under review for a conference.

  • IUI-2025

    Multimodal

    Human Computer Interaction

    arxiv
    HEPHA: A Mixed-Initiative Image Labeling Tool for Specialized Domains

    Shiyuan Zhou*, Bingxuan Li*, Xiyuan Chen* (*equal contribution), Zhi Tu, Yifeng Wang, Yiwen Xiang, Tianyi Zhang

    Proceedings of the 30th ACM Conference on Intelligent User Interfaces (IUI-2025).

    Paper | Code
  • IAAI-2024

    AI for Social Good

    arxiv
    Combining Machine Learning and Queueing Theory for Data-driven Incarceration-Diversion Program Management

    Bingxuan Li, Antonio Castellanos, Pengyi Shi, Amy Ward

    Proceedings of the Thirty-Sixth Annual Conference on Innovative Applications of Artificial Intelligence (Co-located with AAAI-2024). Oral Presentation.

  • Science

    AI for Biomedical Research

    Multimodal

    arxiv
    Heart-brain connections: Phenotypic and genetic insights from magnetic resonance images

    Bingxin Zhao, Tengfei Li, Zirui Fan, Yue Yang, Juan Shu, Xiaochen Yang, Xifeng Wang, Tianyou Luo, Jiarui Tang, Di Xiong, Zhenyi Wu, Bingxuan Li, Jie Chen, Yue Shan, Chalmer Tomlinson, Ziliang Zhu, Yun Li, Jason L Stein, Hongtu Zhu

    Science, 2023

    Paper
  • Nature Communications

    AI for Biomedical Research

    Multimodal

    arxiv
    Eye-brain connections revealed by multimodal retinal and brain imaging genetics

    Bingxin Zhao, Yujue Li, Zirui Fan, Zhenyi Wu, Juan Shu, Xiaochen Yang, Yilin Yang, Xifeng Wang, Bingxuan Li, Xiyao Wang, Carlos Copana, Yue Yang, Jinjie Lin, Yun Li, Jason L Stein, Joan M O'Brien, Tengfei Li, Hongtu Zhu

    Nature Communications, 2024

    Paper
  • KDD-2023

    Multimodal

    Neuro-symbolic Learning

    Active Learning

    arxiv
    Rapid Image Labeling via Neuro-Symbolic Learning

    Yifeng Wang, Zhi Tu, Yiwen Xiang, Shiyuan Zhou, Xiyuan Chen, Bingxuan Li, Tianyi Zhang

    Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD-2023).

    Paper | Code
Services: Reviewer for NeurIPS 2024, ICLR 2025, AISTATS 2025, and ICML 2025

๐Ÿซ Education

University of California, Los Angeles | Los Angeles, CA

September, 2023 - December, 2024

Master of Engineering in Artificial Intelligence

GPA: 3.95 / 4.0

MEng Fellowship Award

Purdue University | West Lafayette, IN

August, 2019 - May, 2023

Bachelor of Science in Computer Science, Bachelor of Science in Data Science | Double Major

GPA: 3.91 / 4.0

Dean's List (8 / 8 semesters), Semester Honors (8 / 8 semesters), DUIRI Research Fellowship, JD Global Student Scholarship

๐Ÿ‘จโ€๐Ÿ’ป Research Experiences

UCLA NLP | Advisor: Professor Kai-Wei Chang

October, 2023 - Present

  • Multimodal understanding and reasoning

UCLA PLUS Lab | Advisor: Professor Nanyun Peng

September, 2023 - Present

  • Creative Generation
  • Controllable Generation

Wharton Statistics and Data Science Department, University of Pennsylvania | Advisor: Professor Bingxin Zhao

September, 2020 - Present

  • AI for biomedical scientific discovery
  • Large-scale data analysis
  • Applications for genetic data exploration

Daniels School of Business, Purdue University | Advisor: Professor Pengyi Shi

March, 2020 - Present

  • AI for operations research
  • AI for healthcare
  • Applications for social good

Booth School of Business, University of Chicago | Advisor: Professor Amy Ward

March, 2023 - Present

  • AI for operations research
  • Applications for social good

Human-Centered Software Systems Lab, Purdue University | Advisor: Professor Tianyi Zhang

May, 2020 - September, 2024

  • Human-Computer Interaction
  • Computer Vision

๐Ÿ› ๏ธ Selected Projects

Chart Storytelling

Multimodal Understanding

Controllable Generation

  • Developed Evaluation Framework: Assessed pre-trained Vision Language Models (VLMs) like GPT-4V and LLaVA on chart storytelling, focusing on generating coherent narratives from visual data.
  • Improved Model Performance: Fine-tuned LLaVA for better analysis and storytelling with custom prompts, boosting its ability to handle complex charts.
  • Conducted Case Studies: Analyzed VLM robustness and versatility across different chart types and scientific domains.

Decoding EEG
Decoding Electroencephalography Brain Activity Signals with Deep Learning

Deep Learning

AI for Science

Multimodal

  • Developed Data Augmentation Strategies: Implemented spectrogram transformation, clustering-based augmentation, and subsampling with noise to enhance EEG data classification accuracy.
  • Explored Multiple Deep Learning Models: Trained CNN, RNN, and hybrid models (CNN+LSTM, CNN+GRU), achieving 73% test accuracy with CNN and 70% with CNN+GRU on combined subject data (a minimal sketch of the hybrid idea is shown below).
  • Performed Cross-Validation: Improved model generalizability across subjects, refining hyperparameters and optimizing performance with cross-validation techniques.
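For illustration, here is a minimal PyTorch sketch of a CNN+GRU hybrid EEG classifier in the spirit of the project above. The channel count, kernel sizes, window length, and four-class output are illustrative assumptions, not the project's actual settings or code.

```python
import torch
import torch.nn as nn

class CNNGRUClassifier(nn.Module):
    """Sketch of a CNN+GRU hybrid for EEG window classification (assumed shapes)."""
    def __init__(self, n_channels=22, n_classes=4, hidden=64):
        super().__init__()
        # 1D convolutions extract local temporal features from each EEG window
        self.cnn = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, padding=3),
            nn.BatchNorm1d(32),
            nn.ELU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.BatchNorm1d(64),
            nn.ELU(),
            nn.MaxPool1d(2),
        )
        # GRU summarizes the downsampled feature sequence over time
        self.gru = nn.GRU(input_size=64, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                 # x: (batch, channels, time)
        feats = self.cnn(x)               # (batch, 64, time / 4)
        feats = feats.transpose(1, 2)     # (batch, time / 4, 64) for the GRU
        _, h = self.gru(feats)            # h: (1, batch, hidden)
        return self.head(h[-1])           # class logits

# Example: a batch of 8 windows, 22 electrodes, 250 time steps
logits = CNNGRUClassifier()(torch.randn(8, 22, 250))
```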

POSTMAN
POSTMAN: Task-Based Delivery Platform for Fast Errand Solutions

Full-Stack Development

Human-Computer Interaction

  • Designed the UI with MUI in Figma, developed the web pages with React, and implemented a real-time map with the Google Maps API.
  • Implemented authorization and chat features. Used Express as the server to interact with a MongoDB cluster database and configured a Redux store for data operations.

© 2025 bingxuanli.com