Comparing manual and ChatGPT deep research on systematic search and selection in the PubMed database on the topic of dental implantology

Author

Bulcsú Bencze, Alwin Sokolowski, Jae-Hyun Lee, Péter Hermann, Tamás Hegedüs, Wataru Kozuma, Reo Ikumi, Michael Payer, Ángel-Orión Salgado-Peralvo, Dániel Végh

Journal

International Journal of Dentistry

Year

2025

Bencze, Bulcsú; Sokolowski, Alwin; Lee, Jae-Hyun; Hermann, Péter; Hegedüs, Tamás; Kozuma, Wataru; Ikumi, Reo; Payer, Michael; Salgado-Peralvo, Ángel-Orión; Végh, Dániel. Comparing Manual and ChatGPT Deep Research on Systematic Search and Selection in the PubMed Database on the Topic of Dental Implantology. International Journal of Dentistry, 2025, 2677641, 9 pages, 2025.

Abstract

Introduction

Dental implantology has seen rapid technological advancements, with artificial intelligence (AI) increasingly integrated into diagnostic, planning, and surgical processes. The release of chat-generative pretrained transformer (ChatGPT) and its subsequent updates, including the deep research function, presents opportunities for AI-assisted systematic reviews. However, its efficacy compared to traditional manual research has not yet been studied.

Materials and Methods

A systematic review was conducted on May 6, 2025, to evaluate recent innovations in dental implantology and AI. Two parallel searches were performed: one using ChatGPT 4.1’s deep research tool in the PubMed database, and the other a manual PubMed search by two independent reviewers. Both searches used identical keywords and Boolean operators and targeted studies published from 2020 to 2025. Inclusion criteria were peer-reviewed studies related to implant design, osseointegration, guided placement, and other predefined outcomes.

Results

The manual search identified 124 articles, of which 23 met the inclusion criteria. ChatGPT retrieved 114 articles and selected 13 for inclusion, yet included only 11 in its synthesis. Two of the articles cited by the AI software were nonexistent, and numerous relevant studies were not retrieved; the remaining articles were correct and were also found by the manual search. ChatGPT had high specificity (98%) and low sensitivity (47.8%), with a statistically significant difference compared to manual search and selection.
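
The reported sensitivity and specificity can be reproduced with a short calculation. The sketch below is illustrative only: the abstract does not state the confusion matrix, so the cell counts are assumptions. It treats the manual selection as the reference standard, counts the 11 correct AI-selected articles as true positives, and counts the 2 nonexistent references as false positives among the 101 manually excluded records. Under those assumptions the figures match the reported 47.8% and 98%.

```python
# Hedged sketch: the confusion-matrix cells below are ASSUMPTIONS chosen to
# illustrate how the reported metrics could arise; they are not taken from
# the paper's methods.

def sensitivity(tp: int, fn: int) -> float:
    """Share of truly relevant articles that the AI selected."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Share of irrelevant records that the AI correctly left out."""
    return tn / (tn + fp)

# Counts stated in the abstract
manual_retrieved = 124   # articles identified by the manual search
manual_included = 23     # articles meeting the inclusion criteria
ai_correct = 11          # AI-selected articles also found manually
ai_nonexistent = 2       # fabricated references cited by the AI

# Assumed mapping onto a 2x2 table (manual selection as reference standard)
tp = ai_correct
fn = manual_included - tp                        # relevant articles the AI missed
fp = ai_nonexistent                              # assumption: fabricated refs as false positives
tn = manual_retrieved - manual_included - fp     # assumption: remaining excluded records

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)

print(f"sensitivity = {sens:.1%}")
print(f"specificity = {spec:.1%}")
```

Read this way, the low sensitivity comes from the 12 relevant articles the AI did not select, while the high specificity mostly reflects the large number of records both approaches excluded.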

Discussion

AI tools like ChatGPT show promise in literature search, synthesis, and assistance, especially in improving readability and identifying trending topics in science. Nevertheless, the deep research function in its current state lacks the reliability required for conducting systematic reviews, owing to issues such as fabricated references and missed articles. The results highlight the need for human supervision and improved safeguards.

Conclusions

ChatGPT’s deep research function can support, but not replace, manual systematic search and selection. It offers substantial benefits in writing support and preliminary synthesis due to acceptable accuracy, but its limited reliability and low sensitivity (47.8%) require cautious use and transparent reporting of any AI involvement in scientific research.