The LOT-LLM Paradox

Fodor & Pylyshyn (1988) argued that connectionist networks cannot account for systematic, compositional reasoning (the defining properties of genuine cognition per language-of-thought) unless they merely implement a classical symbolic architecture. LLMs are connectionist networks. Yet LLMs appear to reason systematically and compositionally across novel inputs. This generates a trilemma: either Fodor and Pylyshyn were wrong about what cognition requires, or LLMs are not genuinely reasoning, or the right unit of analysis is not the LLM alone.
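To make the systematicity property concrete, here is a minimal sketch contrasting a representation built from reusable constituents with a holistic lookup table. It is a toy illustration only, not Fodor & Pylyshyn's formalism; the vocabulary and the names `Thought`, `compositional`, `holistic`, and `is_systematic` are all hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations

# Hypothetical toy vocabulary, chosen only for illustration.
ENTITIES = {"john", "mary", "sue"}
RELATIONS = {"loves", "sees"}


@dataclass(frozen=True)
class Thought:
    relation: str
    agent: str
    patient: str


def swap(t: Thought) -> Thought:
    """Exchange agent and patient: loves(john, mary) -> loves(mary, john)."""
    return Thought(t.relation, t.patient, t.agent)


def compositional(t: Thought) -> bool:
    """Representable iff every constituent is in the vocabulary."""
    return (t.relation in RELATIONS
            and t.agent in ENTITIES
            and t.patient in ENTITIES)


# A holistic system stores whole thoughts with no reusable parts.
MEMORIZED = {Thought("loves", "john", "mary")}


def holistic(t: Thought) -> bool:
    return t in MEMORIZED


def is_systematic(can_represent, thoughts) -> bool:
    """Systematic iff representing R(a, b) guarantees representing R(b, a)."""
    return all(can_represent(swap(t)) for t in thoughts if can_represent(t))


ALL_THOUGHTS = [
    Thought(r, a, p)
    for r in sorted(RELATIONS)
    for a, p in permutations(sorted(ENTITIES), 2)
]

compositional_ok = is_systematic(compositional, ALL_THOUGHTS)
holistic_ok = is_systematic(holistic, ALL_THOUGHTS)
```

In this sketch the compositional system passes by construction, while the lookup table fails because it stores loves(john, mary) without loves(mary, john). The open question for LLMs is which of these two cases their behavior more closely resembles.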

Three Positions in the Literature

Position 1 — Mahowald et al. (2024): LLMs Have Language Without Thought

Mahowald et al. draw on the neuroscience of language processing. In humans, the language network (Broca's/Wernicke's areas) and the multiple demand (MD) network (domain-general reasoning, planning, working memory) are anatomically and functionally dissociated. They co-activate in conversation but are computationally separate systems doing different work.

LLMs appear to have learned near-perfect formal linguistic competence: syntax, morphology, discourse structure, and the statistical regularities of language at scale. What they lack is functional competence: grounded causal reasoning, persistent world models, social cognition (including pragmatic inference), and executive control, which are the MD network's contributions. They are fluent in the surface form of Mentalese without the referential structures that make Mentalese about anything.

The implication is that LLM outputs that look like reasoning are artifacts of training on text produced by reasoners — linguistic residue of human cognition, not cognition itself.

— Mahowald, K. et al. (2024), Trends in Cognitive Sciences 28(6), arXiv:2301.06627

Position 2 — Wong et al. (2023): LLMs Learn to Translate NL into a Probabilistic LOT

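The right model is not "LLMs reason in natural language." LLMs learn to translate natural language into probabilistic programs, which serve as a Mentalese implemented as executable world models. In Wong et al.'s own framework the programs are explicit code handed to a separate inference engine; the stronger reading taken here is that the same structure can be implicit, distributed across weights rather than held as classical symbols. On either reading the LOT is real but learned, not innate; it emerges from sufficient exposure to language produced by LOT-equipped reasoners.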

This vindicates a weaker Fodor: LOT exists as the substrate of systematic cognition, but it does not have to be hardwired. A system can learn to approximate LOT structure from the linguistic trace that LOT-using minds leave behind. Whether this approximation satisfies the structural guarantees Fodor required — not just behaving systematically but being systematic — is the residual question.
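The translation idea can be sketched with a toy example. The code below is not Wong et al.'s implementation: plain Python with rejection sampling is swapped in for the probabilistic programming language used in the paper, and the sentence-to-code mapping is written by hand to stand in for what an LLM would produce. All function names, priors, and noise levels are assumptions made for illustration.

```python
import random


def sample_world(rng):
    """World model: each player's latent strength comes from a shared prior."""
    return {name: rng.gauss(50, 10) for name in ("alice", "bob")}


def alice_beat_bob(world, rng):
    """Hand-translated from "Alice beat Bob": a noisy match outcome."""
    alice_effort = world["alice"] + rng.gauss(0, 10)
    bob_effort = world["bob"] + rng.gauss(0, 10)
    return alice_effort > bob_effort


def always_true(world, rng):
    """No observation: conditioning on this recovers the prior."""
    return True


def alice_is_stronger(world, rng=None):
    """Hand-translated from "Is Alice stronger than Bob?"."""
    return world["alice"] > world["bob"]


def posterior(query, condition, n=20000, seed=0):
    """Estimate P(query | condition) by rejection sampling."""
    rng = random.Random(seed)
    accepted = hits = 0
    for _ in range(n):
        world = sample_world(rng)
        if condition(world, rng):  # keep only worlds matching the observation
            accepted += 1
            hits += bool(query(world))
    if accepted == 0:
        raise ValueError("condition was never satisfied")
    return hits / accepted


prior_estimate = posterior(alice_is_stronger, always_true)
posterior_estimate = posterior(alice_is_stronger, alice_beat_bob)
```

Under these assumed priors, the prior estimate should sit near 0.5 and the conditioned estimate near 0.75: one observed win shifts belief about latent strength without settling it. The point of the sketch is the division of labor, in which language is mapped to conditions and queries while a separate world model does the inference.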

— Wong, L. et al. (2023), "From Word Models to World Models," arXiv:2306.12672

Position 3 — Rothschild (2025): Natural Language Is Already Good Enough for Thought

The boldest inversion of Fodor. The premise that natural language is unsuited for cognition — that a separate Mentalese is required — is wrong. Natural language was shaped by evolutionary and cultural selection to encode exactly the structural scaffolding that inference needs: compositionality, quantification, negation, conditionality, aspect, modality. It already approximates LOT well enough that a system operating in it can reason.

LLMs succeed in part because they operate in natural language rather than despite it. No separate Mentalese is required if natural language is already a sufficient approximation. This position also undermines Fodor's internalism: if natural language — a social, external system — is the vehicle of thought, then cognition is not purely internal by construction.

— Rothschild, D. (2025), "Language and Thought: The View from LLMs," arXiv:2505.13561

The Synthesis: The Right Unit of Analysis

All three positions take the LLM in isolation as their object. The more productive frame, consistent with extended-cognition, is the human-LLM system.

Mahowald's critique holds at the component level: the LLM lacks functional competence. The language/MD network dissociation is real, and LLMs map onto the language side. But the coupled system is not subject to the same critique. The human provides what the MD network provides: grounded causal models, goal specification, evaluative judgment, and memory that persists across sessions. The LLM provides the language network at superhuman compositional scale. The coupled system exhibits LOT-like properties that neither component has alone.

This dissolves the paradox without requiring either that Fodor was simply wrong or that LLMs are secretly doing genuine internal reasoning. You don't need LLMs to have LOT structure internally. You need the human-LLM system to exhibit the relevant cognitive properties systematically. Whether it does depends on coupling quality — and coupling quality is primarily a function of the human side. See extended-cognition.

The Fodor-Clark Opposition

Fodor's internalism (nothing outside the skull is constitutive of cognition) and Clark's extended mind are the poles. The LOT-LLM paradox sits directly between them. The distributed LOT frame proposed above, in which LOT-like properties belong to the human-LLM system rather than to either component, is not the extended mind thesis: the mind remains the human's, not distributed across the system. It is the weaker claim that the cognitive process of reasoning can be extended, with the LLM contributing operations the human's biological hardware cannot efficiently perform alone.

Key Figures

  • Evelina Fedorenko (MIT EvLab) — neuroscience of language/thought dissociation, underpins Mahowald et al.
  • Kyle Mahowald (UT Austin) — formal vs. functional competence framework
  • Josh Tenenbaum (MIT) — probabilistic LOT, Bayesian program induction
  • Jacob Andreas (MIT) — compositionality and systematicity in NLP
  • Gary Marcus — compositionality critique of deep learning
  • François Chollet — systematic generalization, ARC benchmark as a test for genuine LOT-like reasoning

Related

language-of-thought · extended-cognition · reasoning-models · evals · agentic-workflows

Sources