What Makes Claude 3.7 Sonnet Better Than o1 and What’s Next with GPT-4.5 & GPT-5?
Dear Medical Educator,
In my previous post on OpenAI's o1 model, we explored how PhD-level AI models natively use System 2 thinking, the deliberate, logical reasoning akin to human problem-solving. This is unlike GPT-4o, which relies on System 1 thinking and answers directly unless you nudge it with a prompt such as "think step-by-step" (see the sketch below).
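If you would like to see what this prompting difference looks like in practice, here is a minimal sketch using the OpenAI Python SDK. The model name, the sample question, and the exact prompt wording are my own illustrative choices, not something prescribed by OpenAI.

```python
# Minimal sketch of "System 1" vs prompted "System 2" answering with a non-reasoning model.
# Model name, question, and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = (
    "A 70 kg patient needs a drug at 5 mcg/kg/min. "
    "The bag contains 400 mg in 250 mL. What is the infusion rate in mL/h?"
)

# "System 1" style: ask directly and take the first answer the model produces.
direct = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": question}],
)

# Prompted "System 2" style: explicitly ask the model to reason before answering.
step_by_step = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Think step-by-step before answering.\n\n" + question}],
)

print(direct.choices[0].message.content)
print(step_by_step.choices[0].message.content)
```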
However, while o1 excelled in deep reasoning, it came with notable trade-offs: slower response times and significantly higher computational costs.
This is where Anthropic's Claude 3.7 Sonnet comes in.
What Problem Did Claude 3.7 Solve?
Anthropic was founded by researchers who left OpenAI over safety concerns, which I discussed in an earlier article: "OpenAI putting shiny products above safety".
Their new model, Claude 3.7 Sonnet, is the first hybrid reasoning model: it combines System 1 and System 2 thinking in a single model, balancing fast response times with extended reasoning capabilities and thereby addressing o1's major drawback.
This is an important step in the evolution of this new species (AI models).
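For the curious, here is a minimal sketch of what hybrid reasoning looks like from the developer side, using the Anthropic Python SDK: the same model is called once without and once with an extended thinking budget. The model name, token budgets, and the sample task are illustrative assumptions on my part.

```python
# Minimal sketch of the hybrid idea: one model, with an optional "extended thinking"
# budget you can switch on for harder questions. Values below are illustrative assumptions.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

task = "Write one USMLE-style MCQ on diabetic ketoacidosis, with an answer key."

# Fast mode ("System 1"): no extended thinking, quick reply.
fast = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=1024,
    messages=[{"role": "user", "content": task}],
)

# Extended thinking ("System 2"): same model, but with a reasoning budget enabled.
deliberate = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=4096,
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[{"role": "user", "content": task}],
)

# The deliberate response contains "thinking" blocks followed by the final "text" blocks.
for block in deliberate.content:
    if block.type == "text":
        print(block.text)
```

The practical point for educators is that you no longer have to choose between a fast model and a reasoning model; you decide per question how much deliberation you want to pay for.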
Upcoming Developments: GPT-4.5 and GPT-5
OpenAI is not standing still, either.
OpenAI has announced plans to release GPT-4.5 within the coming weeks. This model is expected to be the company's final non-chain-of-thought (System 1 thinking) AI model, focusing on enhanced performance and efficiency over its predecessors.
Following GPT-4.5, OpenAI aims to launch GPT-5 in the coming months. GPT-5 will integrate various technologies, including the reasoning models (System 2 thinking), to create a unified AI system capable of dynamic task handling. This integration is part of OpenAI's effort to simplify its product offerings and eliminate the need for users to select between different models manually.
These advancements signify a continued evolution in AI. Staying informed about these developments will be essential as we integrate AI more deeply into educational practices, until AI takes over these tasks.
Yavuz Selim Kıyak, MD, PhD (aka MedEdFlamingo)
Follow the flamingo on X (Twitter) at @MedEdFlamingo for daily content.
Subscribe to the flamingo’s YouTube channel.
LinkedIn is another option to follow.
Who is the flamingo?
Related #MedEd reading:
Kıyak, Y. S., & Kononowicz, A. A. (2024). Case-based MCQ generator: a custom ChatGPT based on published prompts in the literature for automatic item generation. Medical Teacher, 46(8), 1018-1020. https://www.tandfonline.com/doi/full/10.1080/0142159X.2024.2314723
Kıyak, Y. S., & Emekli, E. (2024). ChatGPT prompts for generating multiple-choice questions in medical education and evidence on their validity: a literature review. Postgraduate Medical Journal, 100(1189), 858-865. https://academic.oup.com/pmj/advance-article/doi/10.1093/postmj/qgae065/7688383
Kıyak, Y. S., & Emekli, E. (2024). A Prompt for Generating Script Concordance Test Using ChatGPT, Claude, and Llama Large Language Model Chatbots. Revista Española de Educación Médica, 5(3). https://revistas.um.es/edumed/article/view/612381