Andreas Batsis / Wednesday, November 27, 2024 / Categories: The AI-Scape

LLMs don’t do formal reasoning - and that is a HUGE problem

TL;DR Version

Where no health or money is at stake, LLMs can make all the difference!

A Little Longer Version

Apple researchers have critically assessed the reasoning capabilities of LLMs, arguing that their intelligence is overstated and largely based on memorization rather than genuine reasoning.

THE HARSH TRUTH: LLMs struggle with reasoning when faced with information that looks relevant but is deliberately distracting.

The key points of LLM "excellence" are summarised below:

Performance Decline: LLMs perform adequately on small problems but exhibit a marked decline in performance as problem complexity increases, a trend observed in both older and newer models.
Arithmetic Limitations: LLMs consistently fail at basic arithmetic tasks, particularly with larger numbers, unlike traditional calculators, which maintain accuracy.
Chess Rule Violations: The inability of LLMs to follow chess rules exemplifies the broader issue of inadequate formal reasoning.
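
To make the arithmetic point concrete, here is a minimal Python sketch of the "calculator" side of that comparison. It assumes nothing about any particular model: check_product is a hypothetical helper, and the claimed value simply stands in for whatever answer an LLM might give. Python integers have arbitrary precision, so the exact product is available no matter how large the operands are.

```python
def check_product(a: int, b: int, claimed: int) -> bool:
    """Return True if `claimed` equals the exact product a * b.

    `claimed` is a stand-in for an answer produced elsewhere
    (for example by an LLM); it is an assumption of this sketch.
    """
    # Python ints are arbitrary precision, so a * b is always exact.
    return a * b == claimed


# Illustrative operands only; not taken from the Apple study.
a = 123_456_789
b = 987_654_321
exact = a * b

print(check_product(a, b, exact))         # exact answer passes the check
print(check_product(a, b, exact + 1000))  # a near-miss answer is rejected
```

A check like this is deterministic: it gives the same verdict every time, which is exactly the property the list above says LLMs lack on larger numbers.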

THE EVEN HARSHER TRUTH: The observed patterns of failure are systematic, not isolated incidents.