[TUHS] On the unreliability of LLM-based search results (was: Listing of early Unix source code from the Computer History Museum)
Henry Bent
henry.r.bent at gmail.com
Tue May 27 03:30:34 AEST 2025
On Mon, 26 May 2025 at 13:20, Jason Bowen <jbowen at infinitecactus.com> wrote:
> May 26, 2025 11:57:18 Henry Bent <henry.r.bent at gmail.com>:
>
> > It's like Wikipedia.
>
> No, Wikipedia has (at least historically) human editors who supposedly
> have some knowledge of reality and history.
>
> An LLM response is going to be a series of tokens predicted based on
> probabilities from its training data. The output may correspond to a ground
> truth in the real world, but only because it was trained on data which
> contained that ground truth.
>
> Assuming the sources it cites are real works, it seems fine as a search
> engine, but the text that it outputs should absolutely not be thought of as
> something arrived at by similar means as text produced by supposedly
> knowledgeable and well-intentioned humans.
>
An LLM can weigh sources, but it has to be taught to do that. A human can
weigh sources, but they also have to be taught to do that.
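The token-prediction process Jason describes can be sketched in a few lines. This is a toy illustration, not how any real LLM is implemented: the "model" is just a hand-written table of conditional probabilities, and every token and probability in it is made up.

```python
import random

# Toy next-token predictor: a lookup table mapping a context (the
# previous two tokens) to a probability distribution over the next
# token. A real LLM learns these probabilities from training data;
# the values below are invented for illustration only.
model = {
    ("unix", "was"): {"developed": 0.6, "written": 0.3, "born": 0.1},
    ("was", "developed"): {"at": 0.8, "by": 0.2},
}

def next_token(context, rng):
    """Sample the next token from the model's conditional distribution."""
    dist = model[context]
    tokens = list(dist.keys())
    weights = list(dist.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)
print(next_token(("unix", "was"), rng))
```

The point of the sketch is that the output is drawn from probabilities, not from any check against reality: whether the sampled continuation is true depends entirely on what the table (i.e. the training data) happened to contain.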
-Henry