AUTHOR=Elmitwalli Sherif , Mehegan John , Braznell Sophie , Gallagher Allen TITLE=Development and validation of a multi-agent AI pipeline for automated credibility assessment of tobacco misinformation: a proof-of-concept study JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 8 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1659861 DOI=10.3389/frai.2025.1659861 ISSN=2624-8212 ABSTRACT=BackgroundThe proliferation of tobacco-related misinformation poses significant public health risks, requiring scalable solutions for credibility assessment. Traditional manual fact-checking approaches are resource-intensive and cannot match the pace of misinformation spread.ObjectiveTo develop and validate a proof-of-concept multi-agent AI pipeline for automated credibility assessment of tobacco misinformation claims, evaluating its performance against expert human reviewers.MethodsWe constructed a three-agent pipeline using OpenAI GPT-4.1 and the Crewai framework. The Serper API provided real-time evidence retrieval. The Content Analyzer classifies claims into four types: health impact, scientific assertion, policy, or statistical. The Scientific Fact Verifier queries authoritative sources (WHO, CDC, PubMed Central, Cochrane). The Health Evidence Assessor applies weighted scoring across five dimensions to assign 0–100 credibility scores on a five-level scale.ResultsThe framework achieved an MAE of 6.25 points against expert scores, a weighted Cohen’s κ of 0.68 (95% CI: 0.52–0.84) indicating substantial agreement, 70% exact category agreement, 95% adjacent-level agreement, and processed each claim in under 7 s—over 1,000 × faster than manual review.LimitationsWe validated our approach using 20 diverse tobacco claims through intensive expert review (2–4 h per claim). The system exhibited a conservative bias (+3.25 points, p = 0.03) and did not classify any claims as “Highly Unlikely” despite expert assignment of two claims to this category. This proof-of-concept demonstrates technical feasibility and substantial inter-rater agreement while identifying areas for calibration in future large-scale implementations.ConclusionOur proof-of-concept agentic AI pipeline demonstrates substantial agreement with expert assessments of tobacco-related claims while providing dramatic speed improvements. By combining zero-shot LLM reasoning, retrieval-grounded evidence verification, and a transparent five-level scoring schema, the system offers a practical tool for real-time misinformation monitoring in public health. This proof-of-concept establishes technical feasibility for automated tobacco misinformation assessment, with validation results supporting further development and larger-scale testing before operational deployment.