Students Trusted ChatGPT Even When It Was Wrong: An Experiment
When ChatGPT becomes the smartest-looking person in the room
Dear Medical Educator,
What if I told you students will shift their answers to match ChatGPT, even when it’s wrong?
A recent study tested how much first-year medical students trusted ChatGPT’s answers over their own, and over their teammates’, in a team-based learning (TBL) session.
The Setup
Forty first-year students worked through eight challenging clinical cases in three rounds:
Individual mode: Answered questions on their own.
AI assist: Saw ChatGPT’s answers and could revise.
Team mode: Discussed in small groups before a final answer.
Bonus twist: One case used GPT-4, the rest GPT-3.5.
What did they find?
* Students shifted answers toward ChatGPT, even when it was wrong:
When the AI gave incorrect answers, students often revised their own answers to match, and group discussion did not undo the damage.
* ChatGPT “correct” answers stuck:
When the AI gave accurate answers, students reliably adopted those responses and retained them after discussion.
* Group discussions didn’t significantly boost correctness:
Final accuracy barely improved, from 18.8% initially to 26.3% post-discussion (p = 0.118).
Why?
This looks like good old-fashioned automation bias, compounded by the fact that these were early-stage learners still building clinical confidence. Faced with tough questions and a confident-sounding AI, students deferred, sometimes despite prior knowledge or peer input.
It’s like picking answers from a shiny cheat sheet you found online, errors included, instead of figuring it out with classmates who are just as new as you.
What to do?
If you're using an LLM as a source in TBL, always debrief where it led students astray. Otherwise, you’re training them to trust it blindly.
ChatGPT can anchor answers powerfully, more than peers or instincts. This challenges the comfortable assumption that group discussion corrects individual errors, and nudges us to rethink how AI fits into collaborative learning.
Students trust ChatGPT, sometimes more than themselves or their team.
Yavuz Selim Kıyak, MD, PhD (aka MedEdFlamingo)
Follow the flamingo on X (Twitter) at @MedEdFlamingo for daily content.
Subscribe to the flamingo’s YouTube channel.
LinkedIn is another option to follow.
Who is the flamingo?
Related #MedEd reading:
Çiçek, F. E., Ülker, M., Özer, M., & Kıyak, Y. S. (2025). ChatGPT versus expert feedback on clinical reasoning questions and their effect on learning: a randomized controlled trial. Postgraduate Medical Journal, 101(1195), 458–463. https://academic.oup.com/pmj/advance-article/doi/10.1093/postmj/qgae170/7917102
Kıyak, Y. S., & Emekli, E. (2024). ChatGPT prompts for generating multiple-choice questions in medical education and evidence on their validity: a literature review. Postgraduate Medical Journal, 100(1189), 858–865. https://academic.oup.com/pmj/advance-article/doi/10.1093/postmj/qgae065/7688383