Generative AI evaluation in the social sector

Workshop | Online

About the event

GenAI tools are multiplying across low- and middle-income countries (LMICs), advancing development outcomes but also risking harm. Yet there is little agreement on what evaluating these tools actually means.

To address this gap, The Agency Fund, together with IDInsight and CGD, developed a playbook organized around four levels (model, product, user, and impact) that form a logical progression from system performance to real-world outcomes.

Our argument: continuous evaluation is central to product development, not an afterthought. This event introduces the framework and shows how it helps spot mistakes earlier and ensure GenAI delivers real impact.

Session

Workshop | Online
June 3, 2026, 19:30 - 21:30
1. Introduction to the 4-level evaluation playbook (https://eval.playbook.org.ai/)
2. Deep dive into Level 1 evaluation: Does the AI system perform as intended?
3. Deep dive into Level 2 and Level 3 evaluation: Does the overall product engage users and positively support their thoughts, feelings, knowledge, and behaviors?
4. Hands-on: Using the playbook with Claude and Gemini in your daily workflow
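To give a concrete feel for what a Level 1 check ("does the AI system perform as intended?") can look like in practice, here is a minimal sketch of an evaluation harness. The test cases, keyword rubric, and `generate()` stub are illustrative assumptions for this example, not part of the playbook itself; in real use, `generate()` would call an actual model such as Claude or Gemini, and the rubric would be tailored to the product's domain.

```python
# Minimal Level 1 evaluation sketch: score model outputs against a small
# set of expected-answer test cases. All cases and rules are illustrative.

def generate(prompt: str) -> str:
    """Stand-in for a call to a GenAI model (e.g. Claude or Gemini)."""
    canned = {
        "What crop rotation follows maize?": "Legumes such as beans restore soil nitrogen.",
        "How often should ORS be given to a child with diarrhea?": "Give ORS after each loose stool.",
    }
    return canned.get(prompt, "")

def contains_keyword(answer: str, keywords: list[str]) -> bool:
    """Simple rubric: the answer must mention at least one expected keyword."""
    return any(k.lower() in answer.lower() for k in keywords)

TEST_CASES = [
    {"prompt": "What crop rotation follows maize?", "keywords": ["legume", "bean"]},
    {"prompt": "How often should ORS be given to a child with diarrhea?", "keywords": ["loose stool"]},
]

def run_level1_eval() -> float:
    """Return the fraction of test cases whose answer passes the rubric."""
    passed = sum(
        contains_keyword(generate(case["prompt"]), case["keywords"])
        for case in TEST_CASES
    )
    return passed / len(TEST_CASES)

if __name__ == "__main__":
    print(f"Level 1 pass rate: {run_level1_eval():.0%}")
```

Running an eval like this continuously, rather than once before launch, is what makes it possible to spot regressions early, which is the workshop's central argument.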

Speakers

Aman Dalmia, AI Engineer, The Agency Fund
Aman has 8+ years of experience applying AI in agriculture, healthcare, and education, building open-source tools at Wadhwani AI, Avanti Fellows, Noora Health, and Artpark. At The Agency Fund, he works with grantees to strengthen AI evals and helps partners adopt best practices for evaluating AI in development.

Edmund Korley, Software Engineer, The Agency Fund
Edmund is a software engineer at The Agency Fund and led the 2025 AI for Global Development accelerator. He brings 10+ years of experience and a strong passion for using software to advance global economic and social justice.

Linus Wong, Software Engineer, The Agency Fund
Linus holds a Master's degree from UIUC. An ex-Googler, he brings extensive software engineering experience to The Agency Fund, where he leads development of Evidential, an A/B testing platform that enables nonprofits to run experiments and learn from them at scale.

Zezhen Wu, Ph.D., Behavioral Scientist, The Agency Fund
Zezhen completed his Ph.D. in Psychology and Social Intervention at NYU and applies behavioral science principles and data-driven approaches to create meaningful social impact. He also works extensively on designing and evaluating AI products in the social sector.

Moderators


Topics and Themes

Evaluators; VOPEs / evaluation networks; Evaluation capacity development; Evaluation approaches and methods
