Want to wade into the snowy surf of the abyss? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid.
Welcome to the Stubsack, your first port of call for learning fresh Awful you'll near-instantly regret.
Any awful.systems sub may be subsneered in this subthread, techtakes or no.
If your sneer seems higher quality than you thought, feel free to cut'n'paste it into its own post – there's no quota for posting and the bar really isn't that high.
The post-Xitter web has spawned so many "esoteric" right wing freaks, but there's no appropriate sneer-space for them. I'm talking redscare-ish, reality challenged "culture critics" who write about everything but understand nothing. I'm talking about reply-guys who make the same 6 tweets about the same 3 subjects. They're inescapable at this point, yet I don't see them mocked (as much as they should be).
Like, there was one dude a while back who insisted that women couldn't be surgeons because they didn't believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can't escape them, I would love to sneer at them.
(Credit and/or blame to David Gerard for starting this. Also, hope you had a wonderful Valentine's Day!)


That was a good read.
Corey doc wrote:
Equating what LLMs do, and what goes into LLM web scraping, with "a search engine" is messed up. The article he links about scraping is mostly about how badly copyright works and how analysing trade-secret-walled data can be beneficial both to consumers and science but occasionally bad for citizen privacy – which, you'll recognize, is mostly irrelevant to the concerns people actually have about LLM training data providers ddosing the fuck out of everything, and all the rest of the stuff tante does a good job of explaining.
Corey also provides this anecdote:
what the actual shit
edit: I mean, he tried transformer-powered voice-to-text and liked it, and now he's all in on the "LLMs are actually a rigorous and accurate tool" bandwagon?
Also the web scraping article is from 2023, but CD linked it in the recent pluralistic post so I assume his views haven't changed.
I was a bit alarmed by this: a client brought in that Colombia data for their dissertation last month, and did not mention this. I looked up the paper https://www.arxiv.org/abs/2509.04523 - what they /actually/ did was use GPT 4o-mini only for feature extraction, then stack the features into a random forest in a supervised setting to dedupe. This is very different from what he described. And the GPT features weren't even the most important ones: the RF preferred cosine similarity of articles, a decidedly not-large approach…
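For anyone curious, the workhorse feature there is pretty mundane. A minimal sketch of cosine similarity over bag-of-words vectors, with made-up toy inputs and a made-up threshold; in the paper this score is one feature (alongside the GPT-extracted ones) fed into a supervised random forest, which the hard threshold here only stands in for:

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of two articles as bag-of-words count vectors."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def looks_like_duplicate(text_a: str, text_b: str, threshold: float = 0.8) -> bool:
    # Toy stand-in for the paper's random forest: one feature, one cutoff.
    return cosine_similarity(text_a, text_b) >= threshold

print(looks_like_duplicate("army unit enters village", "army unit enters the village"))
```

Point being: the decisive signal is decades-old IR math, not anything an LLM uniquely provides.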
That he went from that all the way to "it's mostly ok when sam altman steals all your data, misrepresents it and then steals all your traffic" is… bad.
At any rate it's definitely good to know that that war crime forensics data project isn't quite the unintentional shambles corey makes it out to be.
This one hurts. Maybe CD can be brought back around but oof.
In the post he keeps referring to Ollama as an LLM (it's a desktop app that runs a local server, which lets you download a local LLM and interface with it via CLI or HTTP API), so it's possible he's just so far behind in his technical understanding of LLMs that he's fallen into taking the wrong people's word for it.
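For the record, Ollama itself generates nothing; which LLM answers you is just a field in the request you send to its local server. A sketch of what a request body looks like (the model name is an example of whatever you pulled, and nothing is actually sent here):

```python
import json

# Ollama runs a local HTTP server (default http://localhost:11434).
# The LLM is a parameter in the request, not the app itself.
payload = {
    "model": "llama3",               # example: any model fetched via `ollama pull`
    "prompt": "Why is the sky blue?",
    "stream": False,
}
body = json.dumps(payload)
# With Ollama running you'd POST `body` to http://localhost:11434/api/generate;
# not executed here, since it needs the server up.
print(body)
```

Calling the app "an LLM" is like calling your media player "a movie".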
The post certainly reads like he doesn't even know which local LLM he's using, let alone what it takes to make one.
This is probably just me, but that doesn't seem particularly shocking. If this AI bubble's taught me anything, it's that tech culture (if not tech as a whole) was deeply, deeply vulnerable to the LLM rot from the start.