[TUHS] On the unreliability of LLM-based search results (was: Listing of early Unix source code from the Computer History Museum)

Tue May 27 03:20:16 AEST 2025

May 26, 2025 11:57:18 Henry Bent <henry.r.bent at gmail.com>:

> It's like Wikipedia.

No, Wikipedia has (at least historically) human editors who supposedly have some knowledge of reality and history.

An LLM response is a series of tokens, each predicted from probabilities derived from its training data. The output may correspond to ground truth in the real world, but only because the model was trained on data that contained that ground truth.
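
To make the token-prediction point concrete, here is a toy sketch. It is a bigram counter, not how any real LLM is implemented; the tiny corpus and the names train_bigrams and generate are made up for illustration.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy "training data"; real models train on vastly more text.
CORPUS = "unix was written at bell labs . unix was written in c .".split()

def train_bigrams(tokens):
    """Count, for each token, which tokens followed it in the training data."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, length, seed=0):
    """Emit tokens one at a time, each sampled in proportion to its count."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # nothing ever followed this token in the training data
        choices, weights = zip(*followers.items())
        out.append(rng.choices(choices, weights=weights)[0])
    return out

model = train_bigrams(CORPUS)
print(" ".join(generate(model, "unix", 6)))
```

Every adjacent pair of words in the output was seen in the training data, and nothing in the loop checks whether the resulting sentence is true.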

Assuming the sources it cites are real works, it seems fine as a search engine, but the text it outputs should absolutely not be thought of as having been arrived at by the same means as text produced by supposedly knowledgeable and well-intentioned humans.