Generative AI evaluation in the social sector

About the Event

GenAI tools in low- and middle-income countries (LMICs) are multiplying, advancing development outcomes but also risking harm. Yet there is little agreement on what evaluating them actually means.

To address this gap, The Agency Fund, with IDInsight and CGD, developed a playbook organized around four levels (model, product, user, and impact) that form a logical progression from system performance to real-world outcomes.

Our argument: continuous evaluation is central to product development, not an afterthought. This event introduces the framework and shows how it helps spot mistakes earlier and ensure GenAI delivers real impact.

Sessions

Skills training workshop | Online

June 3, 2026, 19:30 - 21:30

1. Introduction to the 4-level evaluation playbook (https://eval.playbook.org.ai/)
2. Deep dive into Level 1 evaluation: Does the AI system perform as intended?
3. Deep dive into Level 2 and Level 3 evaluation: Does the overall product engage users and positively support their thoughts, feelings, knowledge, and behaviors?
4. Hands-on: Using the playbook with Claude and Gemini in your daily workflow

Session URL

Speakers

Name	Title	Biography
Aman Dalmia	AI Engineer, The Agency Fund	Aman has 8+ years applying AI in agriculture, healthcare, and education, building open-source tools at Wadhwani AI, Avanti Fellows, Noora Health and Artpark. At Agency Fund, he works with grantees to strengthen AI evals and helps partners adopt best practices for evaluating AI in development.
Edmund Korley	Software Engineer, The Agency Fund	Edmund is a software engineer at The Agency Fund and led the 2025 AI for Global Development accelerator. He brings 10+ years of experience and a strong passion for using software to advance global economic and social justice.
Linus Wong	Software Engineer, The Agency Fund	Linus holds a Master's from UIUC. An ex-Googler, he brings extensive software engineering experience to The Agency Fund, where he leads development of Evidential—an A/B testing platform that enables nonprofits to run experiments and learn from them at scale.
Zezhen Wu, Ph.D.	Behavioral Scientist, The Agency Fund	Zezhen completed his Ph.D. in Psychology and Social Intervention at NYU and applies behavioral science principles and data-driven approaches to create meaningful social impact. He also works extensively on designing and evaluating AI products in the social sector.

Moderators

Name	Title	Biography

Topics and Themes

Evaluators
VOPEs / Evaluation networks
Evaluation Capacity Development
Evaluation Approaches and Methods
