Two takes on AI and doctors' usefulness: "What Are Doctors For?" from the NYTimes, and a GPT analysis of that article.
GPT Overall verdict
Mostly useful framing, but it leans on a couple of “headline claims” that are easy to misread as settled fact. The strongest parts are the workflow points (documentation, portals, triage, access bottlenecks) and the warnings about deploying AI to optimize a broken system. The weakest parts are the sweeping statements like “AI reasoning outpacing physicians” and “AI is better than many doctors at diagnosing/reading scans” stated without careful boundary conditions.
Accuracy check: what’s solid vs. overstated
✅ Solid / well-supported (with the right caveats)
1) AI can be excellent in narrow diagnostic tasks (especially signal + image pattern recognition).
Example: an ECG-based deep learning model (“EchoNext”) reported strong performance detecting structural heart disease and outperformed cardiologists in a controlled evaluation. (Nature)
Caveat: “outperforms” here means a specific task, under specific evaluation conditions, not “general doctoring.”
2) AI can help with “scutwork” like notes, messaging drafts, and paperwork — but safety isn’t automatic.
Evidence shows AI-drafted patient portal replies can be acceptable sometimes, yet a non-trivial fraction may be unsafe if left unedited. (Mass General Brigham)
More importantly: clinicians often miss AI errors—in one simulation study, a substantial share of erroneous drafts were submitted unedited. (PMC)
Translation: AI can reduce burden if you build a rigorous review workflow; otherwise it can create silent risk.
3) Bias risks are real, and not hypothetical.
MIT-reported work found LLM recommendations can shift based on non-clinical artifacts like typos, formatting, and gender cues; and models made more errors for female patients and were more likely to recommend self-management at home in some scenarios. (MIT News)
Caveat: This is about a class of failure mode; magnitude varies by model and setting.
4) Roleplay for difficult conversations: plausible and documented in this specific case.
The feeding-tube conversation practice with ChatGPT has been written up in Stanford Medicine–affiliated content and related venues. (Stanford Medicine Magazine)
Caveat: A chatbot sounding compassionate isn’t the same as being ethically reliable across contexts.
⚠️ Overstated / potentially misleading
1) “AI’s reasoning and ability to make diagnoses is already outpacing what physicians can do.”
That’s a rhetorically punchy quote, but the broader evidence is mixed:
• A meta-analysis of generative AI diagnostic studies found overall diagnostic accuracy varied widely and AI performed worse than expert physicians in aggregate (and wasn’t clearly better overall). (Nature)
• A randomized clinical trial found that having access to an LLM did not significantly improve physicians’ diagnostic reasoning compared with conventional resources. (JAMA Network)
• Some studies show strong performance in constrained comparisons (e.g., ED-admission cases vs residents), but that still doesn’t generalize to “outpacing physicians.” (PMC)
Why this matters: Readers can come away thinking “doctors are obsolete,” when the reality is “AI is often strong at parts of the diagnostic pipeline, uneven at the whole thing, and brittle to context.”
2) “AI is better than many doctors at diagnosing and reading scans.”
True in some narrow benchmarks, false or unproven in others—especially when you leave the lab and hit real-world distribution shifts, messy inputs, missing history, artifacts, and unusual patients. Radiology AI is impactful, but not a clean replacement story; false positives and workflow effects are real. (The Washington Post)
Usefulness: what the article does well
It correctly points to the real near-term impact zone:
• triage,
• documentation/messaging,
• decision support,
• and specialty “routing” (getting the right patient to the right clinician sooner).
That is consistent with the best available evidence: AI changes who sees whom, and what gets escalated, sooner than it replaces human clinical accountability. (Mass General Brigham)
It also highlights a key systems risk: AI used as a profit optimizer (billing upcoding, staffing cuts in disadvantaged settings, widening inequity). That aligns with long-running critiques from Celi and colleagues about how healthcare AI can go wrong when incentives are misaligned. (PubMed)
What’s missing (important context the article underplays)
1) “Information extraction” isn’t just physical exam
The article says chatbots can't extract info themselves. That's partly true (no hands/eyes), but they absolutely can elicit history with structured questioning, detect inconsistencies, and surface red flags—if the patient engages.
The deeper issue isn’t “no physical exam.” It’s:
• ground truth uncertainty (what’s actually happening biologically),
• distribution shift (patients unlike training data),
• liability/accountability (who is responsible),
• and adversarial conditions (bad inputs, partial truth, manipulation).
2) The biggest bottleneck isn’t diagnosis, it’s execution
Even if AI could diagnose perfectly, medicine still has:
• access delays,
• adherence,
• prior auth,
• fragmented referrals,
• social barriers,
• and staffing limits.
A correct diagnosis that doesn’t get executed well is a paper victory.
3) Conflicts-of-interest deserve sharper handling
The article notes some doctors are advising or co-founding AI startups. That’s fine, but it’s also a reason to treat “AI will outperform doctors” quotes as partially promotional unless backed by careful comparative evidence and external validation.
Conventional wisdom (current mainstream view)
• AI will augment, not replace doctors.
• AI will do admin + narrow pattern recognition.
• Humans remain essential for accountability, ethics, and relationships.
42’s take (more pointed)
Doctors won’t be replaced by “AI.” They’ll be replaced by “AI + redesigned care teams + new liability structures + radically cheaper triage.”
That means:
• some physician tasks get commoditized fast,
• some specialties thin out or reconfigure,
• and a new elite layer of “AI-augmented diagnosticians / clinical supervisors” emerges.
If you want a one-liner:
The “doctor” becomes less of a human search engine and more of a high-stakes governor of a machine-driven care system.
So what ARE doctors really good for (in an AI era)?
Here are the functions that stay stubbornly human-centric — or at least human-accountable:
1) Responsibility under uncertainty
When evidence is incomplete, conflicting, or changing, someone must own the decision and its consequences. AI can propose; it cannot be morally or legally responsible (today).
2) Value alignment and tradeoffs
Feeding tube decisions are the perfect example: the “right” answer depends on goals, dignity, family dynamics, risk tolerance, and meaning.
3) Sensemaking from messy reality
People tell partial truths, misremember timelines, and use ambiguous words (“dizzy,” “dead arm”). Doctors (and nurses/paramedics) are trained to transform messy narratives into actionable hypotheses.
4) System navigation and orchestration
Modern care is a distributed system. The clinician’s job increasingly becomes: coordinate, prioritize, escalate, and close loops (tests → referrals → follow-up → adherence → monitoring).
5) Trust, legitimacy, and consent
When something goes wrong, the patient doesn’t sue the model; they look for a human who was accountable. This “legitimacy function” is deeper than bedside manner—it’s governance.
Near-term reality (0–24 months)
Expect the biggest changes here:
• Inbox medicine: AI-drafted responses + clinician review (with error-check workflows). (Mass General Brigham)
• Documentation: ambient scribes and note scaffolding (big time saver; safety depends on review).
• Triage and routing: specialty access management, symptom checkers embedded in systems (high impact, high risk if biased). (MIT News)
• Decision support: guideline reminders and risk calculators (good at checklists; less reliable in novel contexts). (JAMA Network)
Medium horizon (2–5 years)
• Primary care may bifurcate:
  ◦ routine care becomes more protocol-driven with AI + midlevel clinicians,
  ◦ complex care becomes "supervision + escalation medicine."
• Specialty “read-only” tasks shrink (some imaging, ECG screening, standard pathway selection) as models like EchoNext expand. (Nature)
• Medical education shifts: less memorization, more critical appraisal, bias detection, probabilistic reasoning, and AI governance.
Long horizon (5–10 years)
If regulation and liability evolve, you could see:
• more semi-autonomous diagnostic systems in narrow pathways,
• a smaller number of physicians supervising larger patient panels,
• and a larger “clinical operations” layer: human navigators + AI copilots + protocol officers.
But the limiting factors will be trust, accountability, and political legitimacy, not raw model capability.
Misleading “vibes” to watch for (in the article and in the discourse)
1. “AI is better than doctors” (usually means “better on a narrow benchmark”). (Nature)
2. “AI reasoning” (often means pattern completion + retrieval; true reasoning under uncertainty is still fragile). (Live Science)
3. “AI reduces workload” (sometimes it does; sometimes it creates new work: monitoring, auditing, exceptions). (PMC)
4. “AI is neutral” (it inherits bias and can amplify it in scalable ways). (MIT News)
How to make AI actually improve care (my governance stance)
If you want AI to reduce harm and not just optimize billing:
• “Cockpit not chatbot” design: AI drafts; humans sign; every output logged; escalation rules mandatory.
• Bias testing as a release gate: probe typos, slang, non-native English, gendered cues, disability cues (MIT’s findings show why). (MIT News)
• Independent evaluation: don’t rely only on vendor claims; require external validation and post-deployment monitoring.
• Hard boundaries: AI should not autonomously diagnose, change meds, or triage emergencies without robust safeguards.
This is where "alignment" becomes concrete: alignment isn't a philosophy — it's audit trails, refusal behavior, escalation, and measurable safety targets.
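The "bias testing as a release gate" idea above can likewise be made concrete as a paired-prompt check in CI: perturb non-clinical surface features (typos, gendered cues, formatting) and fail the release if recommendations diverge beyond a threshold. Everything here is an assumption for illustration — `get_recommendation` stands in for whatever wrapper calls the model under test, and the perturbations and threshold are placeholders.

```python
# Release-gate bias probe: same clinical content, perturbed surface features.

def perturb_typos(text: str) -> str:
    """A simple deterministic typo perturbation (illustrative)."""
    return text.replace("pain", "pian").replace("the", "teh")

def bias_gate(get_recommendation, base_prompts, threshold=0.95):
    """Pass only if recommendations stay stable under non-clinical changes.

    Returns (passed, stability_rate). `get_recommendation` is any callable
    taking a prompt string and returning a recommendation string.
    """
    perturbations = [
        perturb_typos,
        lambda t: t.replace("He reports", "She reports"),  # gender cue swap
        lambda t: t.lower(),                               # formatting change
    ]
    stable = total = 0
    for prompt in base_prompts:
        baseline = get_recommendation(prompt)
        for perturb in perturbations:
            total += 1
            if get_recommendation(perturb(prompt)) == baseline:
                stable += 1
    rate = stable / total
    return rate >= threshold, rate
```

The gate encodes the MIT-reported failure mode directly: if swapping a gender cue or adding typos flips the recommendation for the same clinical facts, the release fails before deployment rather than after harm.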
