This program aims to address the fundamental research question "How can humans steer and trust AI systems much smarter than they are?" with a particular interest in:
- Weak-to-strong generalization: Humans will be weak supervisors relative to superhuman models. Can we understand and control how strong models generalize from weak supervision?
- Interpretability: How can we understand model internals? And can we use this to e.g. build an AI lie detector?
- Scalable oversight: How can we use AI systems to assist humans in evaluating the outputs of other AI systems on complex tasks?
- Many other research directions, including honesty, chain-of-thought faithfulness, adversarial robustness, and evals and testbeds.
Projects: US$100k–US$2 million (over 1–2 years), OR
Graduate students: US$150k for 1 year (US$75k stipend, US$75k compute and research funding).
Funder
OpenAI
Application Instructions
Applications for this funding opportunity are centrally managed and submitted by the UTS Research Grants Team.
If you would like to know more about the application process or your eligibility status, contact the Research Grants Team.
Submission Information
UTS Internal Deadline for submission is 9 February 2024.
Funder Deadline for submission is 18 February 2024.