Evaluating AI for Early-Career STEM Support: Trust, Use, and Perceived Outcomes from a Chatbot Intervention
Seminar | Online
Organized by:
Claremont Graduate University
- In partnership with: Joan Zheng
About the Event
This proposal includes two complementary sessions that examine the development and evaluation of an AI-powered chatbot, TrueNorth, through an interdisciplinary collaboration between graduate students in the Center for Information Systems & Technology (CISAT) and the Division of Behavioral and Organizational Sciences (DBOS) at Claremont Graduate University.
Prompted by the goal of incorporating “human evaluators” into the development process, CISAT students designed a retrieval-augmented generation (RAG) chatbot grounded in a positive psychology framework to support early-career STEM professionals. DBOS students contributed an evaluation framework to assess the system’s usefulness, trustworthiness, and perceived impact.
Together, these sessions use this collaboration as a case study to examine how evaluators can engage with AI system development. The sessions will address both the empirical evaluation of the chatbot and the process of integrating evaluation into AI design, including challenges, trade-offs, and future directions.
Rather than focusing solely on the tool, this proposal highlights broader implications for the evaluation field: how interdisciplinary collaboration reshapes evaluation practice, how trust is constructed in AI-mediated environments, and how evaluators can contribute to the responsible development of AI systems.
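For readers less familiar with the RAG architecture referenced above, the sketch below shows the basic retrieve-then-generate loop in miniature. Everything in it is illustrative: the corpus, the TF-IDF retrieval, and the placeholder generation call stand in for TrueNorth's actual knowledge base, retriever, and language model, none of which are specified in this proposal.

```python
# Minimal RAG sketch (illustrative only): retrieve the passages most
# relevant to a question, then condition generation on them.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Stand-in for a positive-psychology knowledge base.
corpus = [
    "Strengths-based feedback helps early-career scientists build confidence.",
    "Mentoring networks expand access to support in STEM workplaces.",
    "Reflective goal-setting is linked to a stronger sense of professional agency.",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(corpus)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k passages most similar to the query."""
    scores = cosine_similarity(vectorizer.transform([query]), doc_vectors)[0]
    return [corpus[i] for i in scores.argsort()[::-1][:k]]

def call_llm(prompt: str) -> str:
    # Placeholder: a real system would call a language model here.
    return f"[model response conditioned on]\n{prompt}"

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    return call_llm(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")

print(answer("How can I build confidence early in my STEM career?"))
```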
Sessions
Webinar | Online
June 4, 2026
10:30 AM - 11:30 AM
This session presents an evaluation of TrueNorth, an AI-powered chatbot designed to provide early-career guidance to STEM professionals. The evaluation was conducted through an interdisciplinary collaboration with active involvement from program partners throughout the design and implementation process.
Using a mixed-methods, pre/post design, the evaluation examines participants’ trust in AI, their likelihood of using AI for work-related tasks, and perceived outcomes related to confidence, agency, and access to support. The session will present key findings alongside qualitative insights into how participants perceive and use AI-generated guidance.
Beyond reporting results, this session highlights what it means to evaluate AI-based interventions in practice. It will discuss methodological considerations, including working with evolving systems, measuring trust and perceived usefulness, and navigating the role of evaluators in technology development contexts.
The session concludes with implications for evaluation practice, focusing on how AI tools are reshaping expectations around evidence, support, and decision-making.
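To make the pre/post logic concrete, here is a minimal sketch of how paired scores on a single outcome, such as trust in AI, might be compared. The data, the 1-7 scale, and the choice of a paired t-test are assumptions made for illustration, not the study's actual instrument or analysis plan.

```python
# Hypothetical paired pre/post comparison of trust-in-AI scores.
import numpy as np
from scipy import stats

# Invented 1-7 Likert scores for the same eight participants.
pre = np.array([3, 4, 2, 5, 3, 4, 3, 2])
post = np.array([4, 5, 3, 5, 4, 5, 4, 3])

diff = post - pre
t_stat, p_value = stats.ttest_rel(post, pre)  # paired-samples t-test
cohens_d = diff.mean() / diff.std(ddof=1)     # within-subject effect size

print(f"t = {t_stat:.2f}, p = {p_value:.3f}, d = {cohens_d:.2f}")
```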
Day 1: https://cgu.zoom.us/j/86975530796 || Meeting ID: 869 7553 0796
Speakers
| Name | Title | Biography |
|---|---|---|
| Sonia Baron | PhD Student at Claremont Graduate University | Sonia Baron’s research and evaluation practice focus on mixed-methods, principles-based evaluation that integrates AI, human-centered research, and program evaluation to examine trust, evidence, and outcomes across social sectors. She contributes to multiple evaluation projects housed at the Claremont Evaluation Center. |
| Kayla Thompson | PhD Student at Claremont Graduate University | Kayla Thompson's research interests center on how creative reporting methods, such as photographs, can support learning and sense-making among interest holders. Her experience includes evaluating non-profits, behavioral health organizations, and after-school programs. |
| Emily Murtaugh | PhD Student at Claremont Graduate University | Prior to starting graduate school, Emily Murtaugh worked in various roles in the afterschool & expanded learning sector where she co-led research that informed federal advocacy efforts and developed & implemented evaluation plans for non-profit organizations. Her research interests focus on unpacking what mindsets and attitudes support individuals in thinking evaluatively. |
| Kaixin Liu | PsyD Student in Organizational Psychology with a concentration in Positive Psychology | Kaixin’s areas of interest lie in alleviating burnout and improving office operations within the healthcare field, with the hope of becoming a consultant in the future. Currently, she works at a primary care clinic where she alternates between completing administrative tasks and working as a medical assistant. |
Moderators
| Name | Title | Biography |
|---|---|---|
| Jennifer Villalobos | Director of the Doctorate of Evaluation Practice (D.Eval) at Claremont Graduate University | Jennifer Pacheco Villalobos, PhD (she/her/ella), is an Assistant Professor of Practice at Claremont Graduate University (CGU), where she serves as Chair of the Evaluation Concentration and Director of the Doctorate of Evaluation Practice (D.Eval) program—the first professional doctorate in evaluation in the U.S. She also holds faculty appointments in the Drucker School of Management and with the Claremont Flourishing Center. Villalobos is an organizational psychologist and evaluator with more than 20 years of experience working across education, healthcare, philanthropy, and social impact sectors. She has led and collaborated on numerous contracts as a core affiliate of the Claremont Evaluation Center and as the founder of her independent consulting practice, Liki Evaluation & Organizational Psychology Consulting, Inc. Her portfolio includes multi-year, mixed-methods evaluations; strategic planning engagements; leadership development initiatives; and capacity-building efforts for mission-driven organizations. |
Panel Discussion | Online
June 5, 2026
1:30 PM - 2:30 PM
This panel examines the development of TrueNorth, an AI-powered chatbot designed to support early-career STEM professionals, as a case study in what it means to integrate evaluation into AI system design.
Emerging from a collaboration between computer science and evaluation students, TrueNorth was developed as a retrieval-augmented generation (RAG) system grounded in a positive psychology framework. Unlike traditional evaluation models, where evaluators assess completed interventions, this project required evaluators to engage during system development, translating constructs such as trust, support, and professional agency into features that could be operationalized within a chatbot environment.
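As one illustration of what such translation can look like in practice, a construct might be paired with both a self-report item and a log-derived behavioral proxy, as in the hypothetical sketch below; the construct names, survey items, and signals are invented for this example and are not TrueNorth's actual instrumentation.

```python
# Hypothetical mapping from evaluative constructs to measurable signals
# in a chatbot environment; all names here are illustrative.
from dataclasses import dataclass

@dataclass
class ConstructSpec:
    construct: str          # evaluative concept being operationalized
    survey_item: str        # self-report measure administered pre/post
    behavioral_signal: str  # log-derived proxy captured by the system

SPECS = [
    ConstructSpec(
        construct="trust",
        survey_item="I trust the guidance this chatbot provides. (1-7)",
        behavioral_signal="share of sessions where a suggested resource is opened",
    ),
    ConstructSpec(
        construct="professional agency",
        survey_item="I feel able to direct my own career decisions. (1-7)",
        behavioral_signal="user-initiated goal-setting prompts per session",
    ),
]
```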
Panelists will reflect on key tensions encountered throughout the process, including:
1) how evaluative concepts are simplified, distorted, or constrained when translated into AI systems
2) the gap between what evaluators aim to measure and what systems are designed to produce
3) the challenge of aligning technical feasibility with evaluative rigor
4) how user needs identified through evaluation shaped (and sometimes conflicted with) system design decisions
Drawing on experiences from a hackathon, iterative prototyping, and the development of a supporting white paper, this session moves beyond describing collaboration to critically examine what is gained, and lost, when evaluation becomes embedded in AI development.
The discussion situates this work within broader shifts in the social sector, where AI tools are increasingly deployed in contexts requiring judgment, care, and contextual understanding. It argues that evaluators must move upstream in the development process: not only to assess AI systems, but to shape how they define problems, generate outputs, and influence decision-making.
Day 2: https://cgu.zoom.us/j/86136484884 || Meeting ID: 861 3648 4884
Speakers
| Name | Title | Biography |
|---|---|---|
| Katja Crusius | PhD Student at Claremont Graduate University | Katja Crusius is a third year PhD student in Information Systems & Technology at Claremont Graduate University. Her research focuses on data analytics, social networks, and educational technology. She works as a Data & Analytics Specialist on campus and has held teaching and research positions at CGU and Harvey Mudd College. She holds an M.S. in Information Systems from California State University, Long Beach, and a B.A. in Economics & Political Science from Georg-August University in Göttingen, Germany. Beyond academia, she previously served as CGU Student Senate President and volunteers with Techies Without Borders and Zeta Tau Alpha to advance education and health awareness. |
| Joan Zheng | PhD Student at Claremont Graduate University | Joan Zheng is a Graduate Fellow at the Murty Sunak QCL Lab and a PhD student in Information Systems and Technology at Claremont Graduate University. She earned her B.S. in Computer Science from the University of Minnesota, Twin Cities in 2019 and spent five years as a programmer and researcher at Smart Information Flow Technologies (SIFT), contributing to federally funded research projects. Her work has been featured in venues including AAAI, IEEE, and ACL. |
| Sonia Baron | PhD Student at Claremont Graduate University | Sonia Baron’s research and evaluation practice focus on mixed-methods, principles-based evaluation that integrates AI, human-centered research, and program evaluation to examine trust, evidence, and outcomes across social sectors. She contributes to multiple evaluation projects housed at the Claremont Evaluation Center. |
Moderators
| Name | Title | Biography |
|---|---|---|
| Dr. Yan Li | Associate Professor of Information Systems & Technology | Yan Li is an associate professor at Claremont Graduate University’s Center for Information Systems & Technology (CISAT). After working in industry as a data scientist, Yan re-oriented her career toward academia, driven by her intellectual curiosity about emerging technologies and her passion for building things. Her research focuses on data management and analytics, with an emphasis on applying data science to discover knowledge from data to support crucial business decisions. Her other research stream focuses on developing and evaluating information and communication technology (ICT) artifacts to improve the social well-being of underserved populations in low-resource areas, primarily in the public health and education sectors. |