Generative AI evaluation in the social sector

Workshop | Online

About the event

GenAI tools are multiplying across low- and middle-income countries (LMICs), advancing development outcomes but also risking harm. Yet there is little agreement on what evaluating these tools actually means.

To address this gap, The Agency Fund, together with IDInsight and CGD, developed a playbook organized around four levels (model, product, user, and impact) that form a logical progression from system performance to real-world outcomes.

Our argument: continuous evaluation is central to product development, not an afterthought. This event introduces the framework and shows how it helps spot mistakes earlier and ensure GenAI delivers real impact.

Session

Workshop | Online
June 3, 2026, 19:30 - 21:30
1. Introduction to the 4-level evaluation playbook (https://eval.playbook.org.ai/)
2. Deep dive into Level 1 evaluation: Does the AI system perform as intended?
3. Deep dive into Level 2 and Level 3 evaluation: Does the overall product engage users and positively support their thoughts, feelings, knowledge, and behaviors?
4. Hands-on: Using the playbook with Claude and Gemini in your daily workflow
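To give a concrete feel for what a Level 1 check ("does the AI system perform as intended?") can look like in practice, here is a minimal sketch of an evaluation harness. The test cases, keyword rubric, and `generate()` stub are illustrative assumptions for this example, not part of the playbook itself; in real use, `generate()` would call an actual model such as Claude or Gemini, and the rubric would be tailored to the product's domain.

```python
# Minimal Level 1 evaluation sketch: score model outputs against a small
# set of expected-answer test cases. All cases and rules are illustrative.

def generate(prompt: str) -> str:
    """Stand-in for a call to a GenAI model (e.g. Claude or Gemini)."""
    canned = {
        "What crop rotation follows maize?": "Legumes such as beans restore soil nitrogen.",
        "How often should ORS be given to a child with diarrhea?": "Give ORS after each loose stool.",
    }
    return canned.get(prompt, "")

def contains_keyword(answer: str, keywords: list[str]) -> bool:
    """Simple rubric: the answer must mention at least one expected keyword."""
    return any(k.lower() in answer.lower() for k in keywords)

TEST_CASES = [
    {"prompt": "What crop rotation follows maize?", "keywords": ["legume", "bean"]},
    {"prompt": "How often should ORS be given to a child with diarrhea?", "keywords": ["loose stool"]},
]

def run_level1_eval() -> float:
    """Return the fraction of test cases whose answer passes the rubric."""
    passed = sum(
        contains_keyword(generate(case["prompt"]), case["keywords"])
        for case in TEST_CASES
    )
    return passed / len(TEST_CASES)

if __name__ == "__main__":
    print(f"Level 1 pass rate: {run_level1_eval():.0%}")
```

Running an eval like this continuously, rather than once before launch, is what makes it possible to spot regressions early, which is the workshop's central argument.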

Speakers

Aman Dalmia, AI Engineer, The Agency Fund
Aman has 8+ years of experience applying AI in agriculture, healthcare, and education, building open-source tools at Wadhwani AI, Avanti Fellows, Noora Health, and Artpark. At The Agency Fund, he works with grantees to strengthen AI evals and helps partners adopt best practices for evaluating AI in development.

Edmund Korley, Software Engineer, The Agency Fund
Edmund is a software engineer at The Agency Fund and led the 2025 AI for Global Development accelerator. He brings 10+ years of experience and a strong passion for using software to advance global economic and social justice.

Linus Wong, Software Engineer, The Agency Fund
Linus holds a Master's degree from UIUC. An ex-Googler, he brings extensive software engineering experience to The Agency Fund, where he leads development of Evidential, an A/B testing platform that enables nonprofits to run experiments and learn from them at scale.

Zezhen Wu, Ph.D., Behavioral Scientist, The Agency Fund
Zezhen completed his Ph.D. in Psychology and Social Intervention at NYU and applies behavioral science principles and data-driven approaches to create meaningful social impact. He also works extensively on designing and evaluating AI products in the social sector.

Moderators


Topics and Themes

Evaluators; VOPEs / evaluation networks; Evaluation capacity development; Evaluation approaches and methods
