<div dir="auto">Amen<br clear="all"><br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">Sent from a handheld expect more typos than usual</div></div></div><div><br></div><div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Nov 28, 2024 at 1:17 PM Warner Losh <<a href="mailto:imp@bsdimp.com">imp@bsdimp.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="auto"></div><div dir="auto"><div><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Nov 28, 2024, 11:11 AM Edouard Klein <<a href="mailto:edouardklein@gmail.com" target="_blank">edouardklein@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">So, I ran the following experiment:<br>
<br>
I downloaded the last month of email from the archive, and split them<br>
into one email per file using a quick and dirty state machine script in<br>
Python, looking for an empty line followed by two lines starting with "From"<br>
in order to know where to split.<br>
<br>
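For reference, the splitting step described above can be sketched roughly as follows. This is not the script that was actually used: the function name split_messages, the state names, and the choice to drop the blank separator line are assumptions made for illustration.<br>
<br>
```python
def split_messages(lines):
    """Split archive lines into messages (each a list of lines).

    A boundary is an empty line followed by two consecutive lines that
    both start with "From". The start of the file counts as an empty line.
    """
    messages = []    # finished messages
    current = []     # lines of the message being collected
    pending = []     # lines held back until a boundary is confirmed or ruled out
    state = "blank"  # one of "body", "blank", "from1"

    for line in lines:
        if state == "from1":
            if line.startswith("From"):
                # Second "From" line: boundary confirmed.
                if current:
                    messages.append(current)
                # Keep the first "From" line, drop the blank separator.
                current = [pending[-1], line]
                state, pending = "body", []
                continue
            # False alarm: the held-back lines belong to the current message.
            current.extend(pending)
            state, pending = "body", []
        elif state == "blank":
            if line.startswith("From"):
                pending.append(line)
                state = "from1"
                continue
            current.extend(pending)
            state, pending = "body", []

        # Ordinary handling once any held-back lines are resolved.
        if line.strip() == "":
            state, pending = "blank", [line]
        else:
            current.append(line)

    # Flush whatever is left at end of input.
    current.extend(pending)
    if current:
        messages.append(current)
    return messages
```
<br>
Each returned message is a list of lines that can then be written to its own file before upload. A body line that merely starts with "From" is not treated as a boundary unless the next line also starts with "From".<br>
<br>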
I then uploaded the files into <a href="https://openwebui.com/" rel="noreferrer noreferrer" target="_blank">https://openwebui.com/</a> built-in RAG<br>
system.<br>
The vectorization model is nvidia/NV-Embed-v2, and the LLM is<br>
<a href="https://huggingface.co/bartowski/gemma-2-27b-it-GGUF/blob/main/gemma-2-27b-it-Q8_0.gguf" rel="noreferrer noreferrer" target="_blank">https://huggingface.co/bartowski/gemma-2-27b-it-GGUF/blob/main/gemma-2-27b-it-Q8_0.gguf</a><br>
<br>
I then entered the following very vague query, which I knew was relevant<br>
to last month's discussion:<br>
<br>
What has changed in the C language in the last few decades ?<br>
<br>
I copied the response at the end of the email. It seems quite good to<br>
me, but it may get worse when the whole archive is uploaded.<br>
<br>
I think this proof of concept shows that there could be value in doing<br>
the whole list, with a cleaner email extraction and a few tweaks to the<br>
system prompt.<br>
<br>
I can't do it now because I have a few deadlines coming soon (not the<br>
least of which is the International Workshop on Plan 9<br>
<a href="https://iwp9.org/" rel="noreferrer noreferrer" target="_blank">https://iwp9.org/</a> SEND YOUR PAPERS, PEOPLE !), but I would like to<br>
tackle that in the near future.<br>
<br>
Hosting this requires significant but not overly expensive hardware (from<br>
1 to a few k€). If a university lab or a generous patron wants to see it<br>
done and make it available to the public, please contact me :)<br>
<br>
Otherwise llamafile (<a href="https://github.com/Mozilla-Ocho/llamafile" rel="noreferrer noreferrer" target="_blank">https://github.com/Mozilla-Ocho/llamafile</a> justine<br>
single-handedly restored my faith in Mozilla) could be used to<br>
distribute a file that would contain the model+the emails+the vector<br>
database in a single huge file that can run anywhere.<br>
<br>
Exciting times !<br>
<br>
<br>
Here is the LLM answer:<br>
<br>
Based on the provided context, it appears that the C language has<br>
undergone significant changes over the last few decades. Here are some<br>
key changes mentioned:<br>
<br>
Standardization and Formalization : The C language has become more<br>
formalized and standardized, with the introduction of new standards<br>
(e.g., C89, C99, C23). This has led to a more rigid and legalistic<br>
approach to the language, which some users find less useful.<br>
<br>
Changes to realloc function : The behavior of the realloc function<br>
has changed over time. In the first ANSI C standard, realloc(0, ptr)<br>
was well-defined when ptr was not nil. However, this behavior was<br>
changed in C99, and now, in C23, realloc(0, ptr) is undefined<br>
behavior (UB) regardless of the value of ptr.<br>
<br>
Strict Aliasing : The introduction of strict aliasing rules has led<br>
to compilers assuming that pointers to incompatible types always<br>
point to different non-overlapping locations. This has caused issues<br>
for some programmers.<br>
<br>
Additional Features and Functions : The C language has gained new<br>
features and functions over the years, such as stdint.h (introduced<br>
in C89), which provides integer types with specific widths.<br>
<br>
Evolution of Language Principles : The language has evolved with no<br>
appreciation or understanding of the aesthetic and practical<br>
principles of the original language effort, leading to a more<br>
verbose and less concise language.<br>
<br>
<br>
These changes reflect the C language's growth and adaptation to new<br>
requirements and use cases over the decades. However, some users, like<br>
the ones in the provided context, feel that these changes have made the<br>
language less useful and more complicated.<br></blockquote></div></div><div dir="auto"><br></div></div><div dir="auto"><div dir="auto">Now you kids get off my damn lawn!</div><div dir="auto"><br></div><div dir="auto">But it is a fairly accurate summary. </div><div dir="auto"><br></div><div dir="auto">Warner</div></div><div dir="auto"><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Warren Toomey via COFF <<a href="mailto:coff@tuhs.org" rel="noreferrer" target="_blank">coff@tuhs.org</a>> writes:<br>
<br>
> I was just trying to find out if there was a way of uploading the TUHS<br>
> mailing list as a corpus of text into a LLM so that I could ask questions<br>
> based on the knowledge contained therein. I'm not having much luck yet.<br>
><br>
> Anyway, while messing around with things I asked a simple question and got<br>
> this answer:<br>
><br>
> What is a spin lock?<br>
><br>
> A spin lock, also known as a spinning lock or spinning lock mechanism, is<br>
> a type of mechanical locking device used to secure and protect sensitive<br>
> electronic components, such as microprocessors, memory chips, and other<br>
> high-value devices.<br>
><br>
> Here's how it worked: when a CPU instruction was executed and the<br>
> necessary data was stored in registers, the CPU would "lock" the registers<br>
> by spinning them around to prevent other instructions from accessing them.<br>
><br>
> Those poor registers must get dizzy with all the spinning that they<br>
> have to endure :-)<br>
><br>
> Cheers, Warren<br>
><br>
> P.S. If anybody has ideas on how to usefully import the TUHS mailing list<br>
> into a LLM, let me/us know.<br>
</blockquote></div></div></div>
</blockquote></div></div>