AIGI Autumn Newsletter
Greetings from Oxford!
As the world of AI continues to evolve at remarkable speed, the AIGI team has had a whirlwind year – travelling to China, India, Geneva, South Korea, the United States, and Canada to connect, collaborate, and share our latest work.
Now back in our Oxford home, we’re delighted to welcome a new cohort of technical AI governance DPhil students, postdocs, and affiliates. With exciting new publications, projects, and events on the horizon, we look forward to sharing our journey with you!
Welcome to the team: Rivka and Isaac!
Rivka Mitchell joins AIGI as a Postdoctoral Researcher. Her research background is in probability theory, particularly random processes on graphs. She has explored scaling limits and connectivity thresholds in random graph models, as well as MCMC methods. Before joining the AI Governance Initiative, Rivka completed her DPhil in Mathematics at the University of Oxford through the CDT in Random Systems: Analysis, Modelling, and Algorithms; her thesis examined random graphs with evolving structures over time.
Isaac Friend joins AIGI as a Postdoctoral Researcher. His research interests lie broadly in information processing in complex systems. He completed a DPhil in Computer Science at the University of Oxford, at the intersection of causal inference, quantum information, and the foundations of quantum physics; his thesis was titled Causal Identification for Quantum Networks. Previously, he obtained a BS in Mathematics (with a minor in Music) from the University of Chicago, followed by a Master of Advanced Study in Mathematics from the University of Cambridge.
New Technical AI Governance DPhil Cohort
We welcome seven new DPhil students – Mauricio Baker, Evzen Wybitul, Lucas Irwin, Sergey Ichtchenko, Michael Chen, Avi Semler and Chandler Smith – who are joining the programme this year to support our technical governance research. Find out more about them on our Technical AI Governance Programme website.
AIGI Annual Social is back!
Take a well-earned break from your research and join us for an evening of wine, snacks, and stimulating conversation at the Oxford Martin School. Whether you’re a computer scientist pushing the boundaries of machine learning or a philosopher exploring the ethical frontiers of AI, this is your crowd. Come enjoy a drink, meet the growing AIGI team, and share your thoughts (and hot takes!) on the year’s biggest AI developments.
📅 Date: 19 November 2025
🕔 Time: 5:00 – 7:30 pm
📍 Venue: Lecture Theatre, Oxford Martin School, 34 Broad Street, Oxford OX1 3BD
👉 Registration required
AI x Finance workstream
The Social Impacts workstream has established a finance-focused strand, led by our new research affiliate Andrew Sutton. It focuses on ways AI may cause aspects of the financial system to fail, and on ways finance can reduce AI risks, such as by shaping incentives or offering lessons. The first two outputs will be: i) a multi-author Open Questions paper; and ii) investor roundtables and a report on the role of investors in managing AI risks and opportunities. We welcome contributions from those with relevant expertise, or ideas for further work within this strand, which aims to connect and serve researchers, policymakers, regulators, and financial decision-makers.
New affiliates
Matthew Sharp: Associate Director (AI Lead) at Access Partnership. His research examines how AI is transforming work and reshaping human agency.
Tim G. J. Rudner: Assistant Professor of Statistical Sciences at the University of Toronto. His research interests span probabilistic machine learning, AI safety, and AI governance.
Patricia Paskov: Technology and Security Policy Fellow at RAND, where she focuses on the science of frontier AI evaluations.
Andrew Sutton: Ex-banker, whose research focuses on the financial system as an AI risk vector and risk mitigant.
Mark Robinson: Senior Science Diplomacy Advisor to the Oxford Martin AI Governance Initiative, with over 25 years of experience in Big Science projects and Science Diplomacy.
Sören Mindermann: Works with Yoshua Bengio and Mila as the Scientific Lead of the first International AI Safety Report.
Reed Schuler: Through 2024, Reed served as the Managing Director for Implementation and Ambition and Senior Advisor to the U.S. Special Presidential Envoy for Climate John Kerry.
Chin Ze Shen: Research Analyst at the AI Standards Lab. His work focuses on developing concrete threat models and risk pathways for advanced AI systems and evaluating the effectiveness of existing AI standards in mitigating these risks.
Janvi Ahuja: DPhil candidate at Oxford’s Big Data Institute and the Future of Humanity Institute, where her research focuses on statistical modelling of infectious diseases and high-consequence pathogens.
Luise Eder: DPhil student at the Centre for Socio-Legal Studies at the University of Oxford. Her research explores how normative judgements about AI risk are constructed within the emerging AI risk governance regime and negotiated across transnational regulatory networks.
Miro Pluckebaum: Supports AIGI on strategy and research management. He is a Programme Specialist at the Centre for the Governance of AI and the Founder of the Singapore AI Safety Hub.
New office space
We’ve expanded our presence at the Oxford Martin School with a new office on the second floor – featuring a stunning view over Oxford’s Broad Street. Do come and say hello!
AIGI around the world
From New Delhi to Vancouver, and Geneva to Seoul, the AIGI team has been active across the globe this year – contributing to major international discussions on AI governance, safety, and policy.
AIGI in India
Ahead of the India AI Impact Summit 2026, we convened a high-level dialogue in New Delhi titled “Science, Safety and Society: Preparing for the AI Impact Summit”. The closed-door event – held in partnership with Carnegie India and the Observer Research Foundation (ORF) – was officially designated a Pre-Summit Event by India’s Ministry of Electronics and Information Technology (MeitY).
Our Director, Robert Trager, and Senior Adviser and Visiting Fellow, Henry de Zoete, joined leading Indian and international experts, including Prof. Ajay Sood, Principal Scientific Adviser to the Government of India; Prof. Stuart Russell, renowned computer scientist; Abhishek Singh, Additional Secretary at MeitY and CEO of IndiaAI; and Amit Shukla, Joint Secretary for Cyber Diplomacy and E-Governance at the Ministry of External Affairs.
AIGI at ICML 2025, Vancouver
At ICML 2025 in Vancouver, we hosted a Technical AI Governance Workshop, which received 80 submissions – 64 accepted for poster presentation and eight selected for lightning talks. The workshop featured four invited speakers, including Lucilla Sioli (Head of the EU AI Office), who reportedly described the event as “the highlight of her trip to Vancouver”, and Cozmin Ududec (Head of Science of Evals Research, UK AISI).
AIGI at AI for Good Global Summit, Geneva
We integrated three AI safety-focused panels into the main programme, elevating voices that would otherwise have been overlooked. We also hosted two workshops on trustworthy, secure AI, bringing together government representatives (France, Japan, Singapore), leading companies and industry bodies (OpenAI, Google DeepMind, Anthropic, the Frontier Model Forum), and standards bodies (ITU, CEN-CENELEC).
AIGI at WAIC, Shanghai
On 25 July 2025, the Oxford Martin AI Governance Initiative co-convened a workshop on the sidelines of the World AI Conference in Shanghai, together with co-hosts from the Oxford China Policy Lab, the Carnegie Endowment for International Peace, Concordia AI, and Tsinghua University. The workshop advanced much-needed cross-border coordination on AI-driven crises, bringing together leading experts from the UK, US, and China to conduct horizon scanning for emerging risks from advanced AI systems, identify early warning mechanisms, and explore emergency response protocols to mitigate AI-driven disasters.
AIGI in Oxford
At home, we hosted a range of conferences and dialogues in Oxford.
Agentic Reliability Evaluation Summit
Co-hosted with MLCommons, this two-day summit brought together researchers, developers, and policy experts to explore the technical and governance challenges of agentic AI systems, focusing on reliability evaluation. Participants included representatives from Google, Google DeepMind, Anthropic, OpenAI, and Microsoft, alongside leading academics and policy thinkers.
Dialogue with Legal Scholars
We hosted a dialogue between leading Chinese, British, European, and American legal scholars on AI regulation and governance at the Martin School. The dialogue covered topics ranging from the state of AI law in leading jurisdictions to the future of copyright and data protection in the age of AI. As AI continues to develop, understanding how legal systems around the world are responding to its novel challenges will enable more effective governance and productive collaboration.
AIGI Community Meeting – Michaelmas term
We hosted our termly community meeting in October, bringing together affiliates and our wider network to swap the latest AI-governance news and explore ideas for new projects.
What have our AIGI colleagues been up to?
Our Lead of AI Best Practices, Marta Ziosi, has concluded her participation in drafting the EU GPAI Code of Practice. The Code offers general-purpose AI providers a means of demonstrating compliance with the EU AI Act, and it has now been signed by most major providers, including OpenAI, Anthropic, and Google. The hope is that the Code will, in time, become an international standard shaping AI governance around the globe. Marta is now leading the AI Best Practices workstream at AIGI, which will focus on topics such as risk management and technical standards for advanced AI, with the aim of becoming a reference point for open questions in the field, from risk management of internal deployments to technical standards for AI agents. She will be growing her team in the coming months, so reach out if you are interested in these topics.
Our Senior Research Fellow Fazl Barez taught the inaugural AIGI course “AI Safety and Alignment” in the Department of Engineering Science. The course drew a larger audience than expected and received excellent feedback. Fazl is currently supervising DPhil students and postdocs in interpretability, model safety, and governance. He was recently invited to speak at the Seoul Forum on AI Safety & Security 2025 (hosted by K-AISI) and will be presenting four papers at EMNLP 2025 in Suzhou. His recent publications include the widely discussed papers “Chain-of-Thought is Not Explainability” and “Chain-of-Thought Hijacking”; the latter was featured in Fortune.
Senior Researcher Toby Ord has been writing a widely discussed series of essays on AI scaling. His work explores the recent slowdown in pre-training compute scaling – once the key driver of large language model progress – and examines the implications of a shift towards reinforcement learning (RL). He concludes that this second wave of compute scaling may soon plateau, signalling a new era in how progress in AI is achieved and governed. Read more here.
Legal researcher Nicholas Caputo published his article “Alignment as Jurisprudence” in the Yale Journal of Law & Technology. This paper is one of the foundational works in the emerging field of legal alignment, which seeks to use the lessons of the law to improve the technical alignment of advanced AI. Nicholas has also started developing a workstream on building institutions for AI governance, drawing on the history of technology regulation and cutting-edge AI research to investigate how to adapt institutions for the AI era.
Research Affiliate Patricia Paskov co-organised the NYC AI Governance & Safety Conference during the summer, held at the Yale Club during UN General Assembly week. The event gathered over 55 experts and 100 guests to discuss verification of international AI agreements – a crucial challenge as frontier AI systems create shared global risks. Highlights included a keynote by Ben Harack (Oxford Martin AIGI), a technical talk by Mauricio Baker (RAND), and a lively panel featuring Jordan Schneider (ChinaTalk) and Janet Egan (CNAS).
DPhil Affiliate Ben Harack recently completed a highly regarded report on Verification for International AI Governance, which received recognition from leading figures in the field (including a nod from Yoshua Bengio on X). Over the summer, Ben gave talks at the Technical AI Governance Forum in London, the NYC AI Governance & Safety Conference, and events hosted by the Oxford International Relations Society. He will next speak at the AI & Societal Robustness Conference in Cambridge on 12 December.
DPhil Affiliate Ben Bucknall is currently a visiting research scholar at Stanford University, hosted by Prof. Sanmi Koyejo in the STAIR Lab. His work explores methods for ensuring model authenticity in proprietary systems. He’ll return to Oxford in the new year to teach a course on AI governance for AIMS CDT students.
Research recap
Chain-of-Thought Hijacking
Large reasoning models (LRMs) achieve higher task performance by allocating more inference-time compute. Prior work suggests this scaled reasoning may also strengthen safety by improving refusal. Yet our researchers find the opposite: the same reasoning can be used to bypass safeguards. They introduce Chain-of-Thought Hijacking, a jailbreak attack on reasoning models that pads harmful requests with long sequences of harmless puzzle reasoning. On HarmBench, CoT Hijacking reaches attack success rates (ASR) of 99%, 94%, 100%, and 94% on Gemini 2.5 Pro, GPT o4 mini, Grok 3 mini, and Claude 4 Sonnet, respectively – far exceeding prior jailbreak methods for LRMs.
Agentic Inequality
Led by our research affiliate Matthew Sharp, this paper introduces and explores “agentic inequality” – the potential disparities in power, opportunity, and outcomes stemming from differential access to, and capabilities of, AI agents.
Safety Frameworks and Standards: A comparative analysis to advance risk management of frontier AI
Led by our Lead of AI Best Practices, Marta Ziosi, the research memo systematically compares Frontier Safety Frameworks (FSFs) with international risk management standards, identifying areas of convergence, divergence, and opportunities for mutual reinforcement.
The Annual AI Governance Report 2025: Steering the Future of AI
In collaboration with ITU, we published a report highlighting the latest trends and applications in areas ranging from networking, multimedia authenticity, cybersecurity, and quantum technologies to energy efficiency, healthcare, food security, and smart mobility.
Verification for International AI Governance
The growing impacts of artificial intelligence (AI) are spurring states to consider international agreements that could help manage this rapidly evolving technology. The political feasibility of such agreements can hinge on their verifiability – the extent to which the states involved can determine whether other states are complying. Led by our DPhil affiliate Ben Harack, this report analyses several potential international agreements and ways they could be verified.
Survey on thresholds for advanced AI systems
Our researchers find, through an expert survey and public consultation, broad support for multi-stakeholder, purpose-specific thresholds to guide oversight of advanced AI systems. Participants favoured combining technical and contextual indicators but disagreed on compute-based triggers, methods for setting thresholds, and their number – highlighting key gaps and priorities for further research.
Mapping Industry Practices to the EU AI Act’s GPAI Code of Practice Safety and Security Measures
This report compares Safety and Security measures in the EU AI Act’s GPAI Code of Practice (Third Draft) with current voluntary commitments by leading AI companies. As binding obligations approach, the Code aims to translate law into technical practice. We map Commitments II.1–II.16 to public, company-facing documentation and evidence sources.
Chain-of-Thought Is Not Explainability
Chains-of-thought (CoT) allow language models to verbalise multi-step rationales before producing their final answer. While this technique often boosts task performance and offers an impression of transparency into the model’s reasoning, we argue that rationales generated by current CoT techniques can be misleading and are neither necessary nor sufficient for trustworthy interpretability.
Risk Tiers: Towards a Gold Standard for Advanced AI
The Oxford Martin AI Governance Initiative (AIGI) convened experts from government, industry, academia, and civil society to lay the foundation for a gold standard for advanced AI risk tiers. A complete gold standard will require further work. However, the convening provided insights for how risk tiers might be adapted to advanced AI while also establishing a framework for broader standardization efforts.