This note is a collection of excerpts from articles about Large Language Models (LLMs).
Yes, and we must also consider that agents are optimised to deliver more rather than less code. More code is always more challenging to review, and humans are terrible at code review. Review fatigue is an actual problem in our industry, and for most of us, it hits even after reviewing a handful of modified source files.
My AI Agents Are All Nuts by Niko Heikkilä
Self-experimentation is exactly how smart people get pulled into homeopathy or naturopathy, for example. It’s what often makes them more likely to fall for superstitions and odd ideas. The smart person’s self-identity means they can’t believe their own psychological biases are fooling them.
Trusting your own judgement on ‘AI’ is a huge risk by Baldur Bjarnason
Yet for all their novelty, these predictions are strikingly familiar. They rehearse, in updated form, the same automation discourse I critiqued in this book: an enduring narrative that imagines technology autonomously remaking human life, while obscuring the social structures in which technological change is embedded.
Is the AI Bubble About to Burst? | Verso Books by Aaron Benanav
In fact, pairing with top LLMs evokes many memories of pairing with top human programmers.
The worst memories.
Memories of my pair grabbing the keyboard and—in total and unhelpful silence—hammering out code faster than I could ever hope to read it. Memories of slowly, inevitably becoming disengaged after expending all my mental energy in a futile attempt to keep up.
Why agents are bad pair programmers | justin.searls.co by Justin Searls
EEG revealed significant differences in brain connectivity: Brain-only participants exhibited the strongest, most distributed networks; Search Engine users showed moderate engagement; and LLM users displayed the weakest connectivity.
The problem is that I’m going to be responsible for that code, so I cannot blindly add it to my project and hope for the best. I could only incorporate AI-generated code into a project of mine after thoroughly reviewing it and making sure I understand it well. I have to feel confident that I can modify or extend this piece of code in the future, or else I cannot use it.
Unfortunately, reviewing code is actually harder than most people think. It takes me at least as much time to review code not written by me as it would take to write the code myself, if not more.
Why Generative AI Coding Tools and Agents Do Not Work For Me - miguelgrinberg.com by Miguel Grinberg
On 1 March 2024, a research preprint from Valentin Hofmann et al. was published on arXiv that investigates racism in Large Language Models (LLMs). The conclusions are extremely illustrative of the fundamental barriers that LLMs are up against.
Waiting on the platform for a morning train that was nowhere to be seen, he asked Meta’s WhatsApp AI assistant for a contact number for TransPennine Express. The chatbot confidently sent him a mobile phone number for customer services, but it turned out to be the private number of a completely unconnected WhatsApp user 170 miles away in Oxfordshire.
Deciphering Glyph - I Think I’m Done Thinking About genAI For Now
We conduct a randomized controlled trial (RCT) to understand how early-2025 AI tools affect the productivity of experienced open-source developers working on their own repositories. Surprisingly, we find that when developers use AI tools, they take 19% longer than without—AI makes them slower. We view this result as a snapshot of early-2025 AI capabilities in one relevant setting; as these systems continue to rapidly evolve, we plan on continuing to use this methodology to help estimate AI acceleration from AI R&D automation.
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity - METR
Musk revealed a similar line of thinking recently when he suggested “general artificial intelligence” was close because he had asked Grok “about materials science that are not in any books or on the Internet.” The idea, of course, is that Musk had hit the limits of known science rather than the limit of his scientific understanding. The billionaire really seems convinced that Grok was working on something new.
Billionaires Convince Themselves AI Chatbots Are Close to Making New Scientific Discoveries
When we hype up the technology, we mostly help the people who put money into it. This post isn’t about those people or that money; maybe they could use the help. My point is that they are irrelevant when we want to understand the merits of AI. They muddy the waters and overshadow the important questions.
How to avoid that your post about AI helps the hype | hidde.blog by Hidde de Vries
Everyone says 2025 is the year of AI agents. The headlines are everywhere: “Autonomous AI will transform work,” “Agents are the next frontier,” “The future is agentic.” Meanwhile, I’ve spent the last year building many different agent systems that actually work in production. And that’s exactly why I’m betting against the current hype.
Why I’m Betting Against AI Agents in 2025 (Despite Building Them)
“Oh! We’re making all of our food blue, all the best restaurants are doing it now,” the waiter explained.
But I didn’t want my burger to be blue. I like my burgers to be the same reliable dark brown color cooked meats are supposed to be.
They’re putting blue food coloring in everything
If you want to multiply two numbers together, a quick way to do it is to roll a few d10s and use their digits as the answer.
It’s fast and very easy, but there is a little skill in knowing how many dice to use. Right now it sometimes hallucinates the wrong answer but dice tech is improving all the time so it’s only going to get better from here on.
https://glitterkitten.co.uk/@diffractie/114936882179055798
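The satirical “method” in the quote above is concrete enough to run. A minimal Python sketch of the joke (the function names and the trial count are mine, not the author’s) shows why it counts as hallucination: the dice never look at the inputs, so the answer is only right by coincidence.

```python
import random

def dice_multiply(a, b, num_dice=3):
    """The satirical 'method': ignore the inputs entirely,
    roll some d10s, and read their digits off as the 'product'."""
    digits = [random.randint(0, 9) for _ in range(num_dice)]
    return int("".join(str(d) for d in digits))

# "Dice tech" in action: count how often it happens to land on 12 * 34 = 408.
trials = 1000
hits = sum(dice_multiply(12, 34) == 12 * 34 for _ in range(trials))
print(f"{hits} correct out of {trials} rolls")
```

With three dice there are 1,000 equally likely outputs, so each roll matches the true product about 0.1% of the time, no matter how much “skill” goes into choosing the number of dice.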
We found a problematic pattern after spending hours vetting the sources of pros, cons and overall product recommendations spouted by Google’s AI Overview: most ‘facts’ are sourced from the manufacturer itself, online shops, sponsored articles and press materials.