Evaluating Guideline Adherence in Gemini-Powered Dental Trauma Workflows: Standalone Gemini Chat vs. Document-Grounded NotebookLM
Author
Publication date
2026-03-02
ISSN
1600-9657
Abstract
Aim: This study compared the accuracy and inter-account consistency of two Google Gemini-powered, user-facing workflows for dental trauma decision support when answering dichotomous (yes/no) clinical questions on the management of traumatized permanent teeth: standalone Gemini chat and NotebookLM, a workflow that grounds its responses in uploaded European Society of Endodontology and International Association of Dental Traumatology guideline documents. Methodology: A cross-sectional simulation was conducted using 99 dichotomous (yes/no) questions derived from the European Society of Endodontology and International Association of Dental Traumatology guidelines. Three academic endodontists submitted each question to Gemini and NotebookLM using three independent Google accounts, generating 297 responses per workflow. Accuracy was defined as exact agreement with guideline-based answers, and consistency as the proportion of identical responses across the three trials. Statistical analyses included Wald and Wilson 95% confidence intervals, Fleiss' kappa for inter-account agreement, and Pearson's chi-squared tests to compare proportions. Results: Gemini demonstrated an overall accuracy of 83.83% (95% CI: 75.08–90.47) and a consistency of 74.74% (κ = 0.84). NotebookLM showed higher accuracy (92.93%; 95% CI: 85.97–97.11) and perfect consistency (100%; κ = 1.00). While the difference in accuracy did not reach statistical significance (p = 0.076), NotebookLM exhibited significantly greater consistency (p < 0.001). Conclusions: Responses from both workflows were highly consistent with the guidelines. Document grounding may enhance repeatability and alignment with guideline-derived decision points for structured dichotomous inquiries, as evidenced by NotebookLM's complete inter-account consistency and numerically higher accuracy.
These results reflect workflow-level benchmarking; clinical utility therefore cannot be inferred from them alone, and professional oversight and additional validation remain necessary before any clinical application.
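The two headline metrics in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the toy response data, and the reading of 92.93% as 92 correct answers out of 99 questions are assumptions made for illustration. The abstract does not say which interval method produced each reported confidence interval, so the Wilson bounds computed here need not match the published figures.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def consistency(responses):
    """Proportion of questions answered identically by every account."""
    identical = sum(1 for answers in responses if len(set(answers)) == 1)
    return identical / len(responses)


# Hypothetical toy data: 4 questions, each answered by 3 accounts.
toy = [
    ("yes", "yes", "yes"),
    ("no", "no", "no"),
    ("yes", "no", "yes"),  # accounts disagree on this question
    ("no", "no", "no"),
]
toy_consistency = consistency(toy)

# Assumption: 92.93% accuracy corresponds to 92 correct out of 99 questions.
low, high = wilson_interval(92, 99)
print(f"consistency = {toy_consistency:.2f}")
print(f"Wilson 95% CI = {100 * low:.2f}% to {100 * high:.2f}%")
```

Consistency is computed per question rather than per response, following the abstract's definition of consistency as the proportion of identical responses across the three trials.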
Document Type
Article
Document version
Published version
Language
English
Subject (CDU)
6 - Applied Sciences. Medicine. Technology
Keywords
Pages
9
Publisher
Wiley
Collection
0
Is part of
Dental Traumatology
Recommended citation
Dufey-Portilla, Nicolás; Abella Sans, Francesc; Duran-Sindreu, Fernando [et al.]. Evaluating Guideline Adherence in Gemini-Powered Dental Trauma Workflows: Standalone Gemini Chat vs. Document-Grounded NotebookLM. Dental Traumatology, 2026, 0, pages 1-9. Available at <https://onlinelibrary.wiley.com/doi/10.1111/edt.70065>. Accessed: 5 Mar. 2026. DOI: https://doi.org/10.1111/edt.70065
Note
The author, N. Dufey-Portilla, thanks the National Agency for Research and Development (ANID) for its support through the DOCTORADO BECAS CHILE/2025 - 72250040 Scholarship Program.
Rights
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made. © 2026 The Author(s). Dental Traumatology published by John Wiley & Sons Ltd.
Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/


