This note is a collection of excerpts from articles about Large Language Models (LLMs).
Yes, and we must also consider that agents are optimised to deliver more rather than less code. More code is always more challenging to review, and humans are terrible at code review. Review fatigue is an actual problem in our industry, and for most of us, it hits even after reviewing a handful of modified source files.
My AI Agents Are All Nuts by Niko Heikkilä
Self-experimentation is exactly how smart people get pulled into homeopathy or naturopathy, for example. It’s what often makes them more likely to fall for superstitions and odd ideas. The smart person’s self-identity means they can’t believe their own psychological biases are fooling them.
Trusting your own judgement on ‘AI’ is a huge risk by Baldur Bjarnason
Yet for all their novelty, these predictions are strikingly familiar. They rehearse, in updated form, the same automation discourse I critiqued in this book: an enduring narrative that imagines technology autonomously remaking human life, while obscuring the social structures in which technological change is embedded.
Is the AI Bubble About to Burst? | Verso Books by Aaron Benanav
In fact, pairing with top LLMs evokes many memories of pairing with top human programmers.
The worst memories.
Memories of my pair grabbing the keyboard and—in total and unhelpful silence—hammering out code faster than I could ever hope to read it. Memories of slowly, inevitably becoming disengaged after expending all my mental energy in a futile attempt to keep up.
Why agents are bad pair programmers | justin.searls.co by Justin Searls
EEG revealed significant differences in brain connectivity: Brain-only participants exhibited the strongest, most distributed networks; Search Engine users showed moderate engagement; and LLM users displayed the weakest connectivity.
The problem is that I’m going to be responsible for that code, so I cannot blindly add it to my project and hope for the best. I could only incorporate AI-generated code into a project of mine after thoroughly reviewing it and making sure I understand it well. I have to feel confident that I can modify or extend this piece of code in the future, or else I cannot use it.
Unfortunately, reviewing code is actually harder than most people think. It takes me at least as much time to review code not written by me as it would take to write the code myself, if not more.
Why Generative AI Coding Tools and Agents Do Not Work For Me - miguelgrinberg.com by Miguel Grinberg
On 1 March 2024, a research preprint from Valentin Hofmann et al. was published on arXiv that investigates racism in Large Language Models (LLMs). The conclusions are extremely illustrative of the fundamental barriers that LLMs are up against.
Waiting on the platform for a morning train that was nowhere to be seen, he asked Meta’s WhatsApp AI assistant for a contact number for TransPennine Express. The chatbot confidently sent him a mobile phone number for customer services, but it turned out to be the private number of a completely unconnected WhatsApp user 170 miles away in Oxfordshire.
Deciphering Glyph - I Think I’m Done Thinking About genAI For Now
We conduct a randomized controlled trial (RCT) to understand how early-2025 AI tools affect the productivity of experienced open-source developers working on their own repositories. Surprisingly, we find that when developers use AI tools, they take 19% longer than without—AI makes them slower. We view this result as a snapshot of early-2025 AI capabilities in one relevant setting; as these systems continue to rapidly evolve, we plan on continuing to use this methodology to help estimate AI acceleration from AI R&D automation.
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity - METR
Musk revealed a similar line of thinking recently when he suggested “general artificial intelligence” was close because he had asked Grok “about materials science that are not in any books or on the Internet.” The idea, of course, is that Musk had hit the limits of known science rather than the limit of his scientific understanding. The billionaire really seems convinced that Grok was working on something new.
Billionaires Convince Themselves AI Chatbots Are Close to Making New Scientific Discoveries
When we hype up the technology, we mostly help the people who put money into it. This post isn’t about those people or that money; maybe they could use the help. My point is that they are irrelevant when we want to understand the merits of AI. They muddy the waters and overshadow the important questions.
How to avoid that your post about AI helps the hype | hidde.blog by Hidde de Vries
Everyone says 2025 is the year of AI agents. The headlines are everywhere: “Autonomous AI will transform work,” “Agents are the next frontier,” “The future is agentic.” Meanwhile, I’ve spent the last year building many different agent systems that actually work in production. And that’s exactly why I’m betting against the current hype.
Why I’m Betting Against AI Agents in 2025 (Despite Building Them)
“Oh! We’re making all of our food blue, all the best restaurants are doing it now,” the waiter explained.
But I didn’t want my burger to be blue. I like my burgers to be the same reliable dark brown color cooked meats are supposed to be.
They’re putting blue food coloring in everything
If you want to multiply two numbers together, a quick way to do it is to roll a few d10s and use their digits as the answer.
It’s fast and very easy, but there is a little skill in knowing how many dice to use. Right now it sometimes hallucinates the wrong answer but dice tech is improving all the time so it’s only going to get better from here on.
https://glitterkitten.co.uk/@diffractie/114936882179055798
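The satirical “method” in the quote above is concrete enough to run. A minimal Python sketch of the joke (the function names and the trial count are mine, not the author’s) shows why it counts as hallucination: the dice never look at the inputs, so the answer is only right by coincidence.

```python
import random

def dice_multiply(a, b, num_dice=3):
    """The satirical 'method': ignore the inputs entirely,
    roll some d10s, and read their digits off as the 'product'."""
    digits = [random.randint(0, 9) for _ in range(num_dice)]
    return int("".join(str(d) for d in digits))

# "Dice tech" in action: count how often it happens to land on 12 * 34 = 408.
trials = 1000
hits = sum(dice_multiply(12, 34) == 12 * 34 for _ in range(trials))
print(f"{hits} correct out of {trials} rolls")
```

With three dice there are 1,000 equally likely outputs, so each roll matches the true product about 0.1% of the time, no matter how much “skill” goes into choosing the number of dice.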
We found a problematic pattern after spending hours vetting the sources of pros, cons and overall product recommendations spouted by Google’s AI Overview: most ‘facts’ are sourced from the manufacturer itself, online shops, sponsored articles and press materials.