Yurong Liu (刘雨绒)

Ph.D. Student, VIDA Center
Department of Computer Science, New York University

Email: yurong.liu [at] nyu [dot] edu
Office: 370 Jay Street, 11th floor

GitHub | Google Scholar | LinkedIn | Twitter


I am a second-year Ph.D. student in computer science at New York University, where I am fortunate to be advised by Prof. Juliana Freire. I have also had the pleasure of being mentored by Prof. Christopher Musco and, during my internship at Google Research, by Dr. Flip Korn.

I am broadly interested in data-centric aspects of AI, especially scalable data integration and data explainability. My research also involves representation learning for structured data.

Previously, I jointly pursued an Honors B.S. in Computer Science and an Honors B.A. in Mathematics at the University of Rochester (2023), where I had the pleasure of working with Prof. Fatemeh Nargesian, Prof. Jiebo Luo, and Prof. Daniel Štefankovič.

"Perhaps you will not only have some appreciation of this culture; it is even possible that you may want to join in the greatest adventure that the human mind has ever begun."

-- Richard Feynman. The Feynman Lectures on Physics (1964)

Publications

(* indicates equal contribution)

Kernel Banzhaf: A Fast and Robust Estimator for Banzhaf Values
Yurong Liu*, R. Teal Witter*, Flip Korn, Tarfah Alrashed, Dimitris Paparas, Juliana Freire
Preprint [paper] [code]

"A novel linear regression-based estimator for Banzhaf values in interpretable machine learning."

BDIViz: A Schema Matching Visualization Tool for Biomedical Domain Experts
Yifan Wu, Dishita Turakhia, Yurong Liu, Juliana Freire, Claudio Silva
Preprint

"A visualization tool that supports iterative exploration, direct manipulation, and value-centric analysis for biomedical schema matching."

Enhancing Biomedical Schema Matching with LLM-based Training Data Generation
Yurong Liu, Aécio Santos, Eduardo H. M. Pena, Roque Lopez, Eden Wu, Juliana Freire
TRL@NeurIPS, 2024 [paper]

"Schema matching in which LLMs generate synthetic data for training column embeddings via contrastive learning."

ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models
Benjamin Feuer, Yurong Liu, Chinmay Hegde, Juliana Freire
VLDB, 2024 [paper] [code]

"A framework utilizing large language models for column type annotation, which supports fine-tuning and zero-shot learning settings."

Sampling over Union of Joins
Yurong Liu*, Yunlong Xu*, Fatemeh Nargesian
SIGMOD, 2023 (Companion) [paper]

"Random sampling over the set union and disjoint union of joins, with sample uniformity and independence guarantees."