Daniella (Zihuiwen) Ye

PhD, started 2024

Daniella (Zihuiwen) Ye is a DPhil student in Computer Science at the University of Oxford, supervised by Yarin Gal and Phil Blunsom. Her research focuses on improving the reliability and robustness of Large Language Models (LLMs) in real-world applications, particularly text generation. She is also interested in methods for controlled generation with language models that draw on statistical and linguistic approaches.

Her previous research includes improving human preference learning for LLMs with synthetic critiques, a project she undertook during her internship at Cohere. She has also explored augmenting text-to-code generation through self-play and applying diffusion models to non-autoregressive text planning. She is a recipient of the DeepMind scholarship.

Publications while at OATML:

Uncertainty-Aware Step-wise Verification with Generative Reward Models

Complex multi-step reasoning tasks, such as solving mathematical problems, remain challenging for large language models (LLMs). While outcome supervision is commonly used, process supervision via process reward models (PRMs) provides intermediate rewards to verify step-wise correctness in solution traces. However, as proxies for human judgement, PRMs suffer from reliability issues, including susceptibility to reward hacking. In this work, we propose leveraging uncertainty quantification (UQ) to enhance the reliability of step-wise verification with generative reward models for mathematical reasoning tasks. We introduce CoT Entropy, a novel UQ method that outperforms existing approaches in quantifying a PRM's uncertainty in step-wise verification. Our results demonstrate that incorporating uncertainty estimates improves the robustness of judge-LM PRMs, leading to more reliable verification.

Daniella (Zihuiwen) Ye, Luckeciano Carvalho Melo, Younesse Kaddar, Phil Blunsom, Sam Staton, Yarin Gal
arXiv