Working with Large Language Models

Prologue

Why this exists: the troubling state of “artificial intelligence” as a term

By now most people have used a large language model, or have at least heard of the “AI bubble”. This book exists to correct how society thinks about this technology, which has been repeatedly misrepresented in order to fund its creation or, in turn, to pay off the debts taken on to fund it. My first problem with the whole thing is the way the term “AI” has been completely bastardized. Artificial intelligence means almost nothing at this point: first, because we have yet to work out what we even mean by intelligence, or what makes us “intelligent”; and second, because the term has been used so greedily in marketing. AI can refer to something as routine today as voice or facial recognition or a simple machine learning classifier, but it can also refer to advanced physics engines, complex neural networks, transformers, and military targeting technology. The scope of the term has become far too broad.

This book is about large language models: autocomplete systems, scaled up enormously, that predict the next word in a string of text. That’s it. Think ChatGPT, Claude, Gemini, Copilot, etc. You input a string, it gets converted into a sequence of tokens (words or pieces of words), the most likely next token is predicted and appended, and the process repeats. No ideas, no personality, no tools. Everything else you see around the technology is adjacent software infrastructure built by humans, or something constructed in your head by marketing.
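
The loop described above (tokenize, predict, append, repeat) can be sketched in a few lines of Python. This is a toy under loud assumptions, not how any production model is built: the "model" here is just a table of which word most often follows which in a tiny made-up corpus, the tokenizer simply splits on whitespace, and every name (tokenize, train_bigram_counts, predict_next, generate, CORPUS) is invented for illustration. A real LLM replaces the lookup table with a neural network and the whitespace split with a subword tokenizer, but the outer loop has the same shape.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Toy tokenizer (assumption): lowercase and split on whitespace.
    # Real LLMs use subword tokenizers instead.
    return text.lower().split()


def train_bigram_counts(corpus):
    # For each token, count which tokens follow it in the corpus.
    counts = defaultdict(Counter)
    tokens = tokenize(corpus)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, tokens):
    # Greedy prediction: the most frequent follower of the last token.
    # Returns None when there is nothing to go on.
    if not tokens:
        return None
    followers = counts.get(tokens[-1])
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, prompt, max_new_tokens=5):
    # The autocomplete loop: predict one token, append it, repeat.
    tokens = tokenize(prompt)
    for _ in range(max_new_tokens):
        next_token = predict_next(counts, tokens)
        if next_token is None:
            break
        tokens.append(next_token)
    return " ".join(tokens)


# Hypothetical miniature "training data".
CORPUS = "the cat sat on the mat . the cat sat on the rug . the cat ate ."

counts = train_bigram_counts(CORPUS)
print(generate(counts, "the", max_new_tokens=4))  # prints "the cat sat on the"
```

Nothing in that loop holds an idea or an intention. It looks up what usually comes next, writes it down, and goes around again; a prompt it has never seen before (say, "zebra") simply comes back unchanged because the table has no follower to offer.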

This book is meant to give readers an intuitive understanding of what large language models actually do, so they can use them more effectively as a tool. A tool is what they are, and, absent some further technological or quantum breakthrough, a tool is what they will remain.

If you stick with the book you’ll come out with a working picture of what an LLM actually does when you send it a prompt. How it got the answer. What the context window really is. Where output breaks, and what to do about that. How to route between models, give them tools, and ship production systems against them without the whole thing collapsing under its own complexity.

Everything here is measured. Where the research is settled, the book tells you what it says. Where it’s contested, the book tells you both sides and lets you pick. Where first-hand evidence matters more than citations, there are first-hand numbers.

First-person, empirical, opinionated where the evidence warrants. Written by someone who runs and trains these models in production daily, not by a company that is trying to sell you a subscription.