• 0 Posts
  • 162 Comments
Joined 3 years ago
Cake day: August 29th, 2023

  • Lesswrong is too centrist-brained to ever even hint at legitimizing (non-state-sanctioned) destruction of property as a means of protest or political action. But according to the orthodox lesswrong lore, Sam Altman’s actions are literally an existential threat to all humanity, so they can’t defend him either. So they are left with silence.

    I actually kind of agree with the anarchy-libertarian’s response? It is massively downvoted.

    This is just elevating your aesthetic preference for what the violence you’re advocating for looks like to a moral principle. The claim that throwing a Molotov cocktail at one guy’s house is counterproductive to the goal of “bombing the datacenters” is a better argument, though one I do not believe.

    Bingo. Dear leader Yudkowsky can ask to bomb the data centers, and as long as that action goes through the US political process, the violence is legitimate, regardless of how ill-behaved the US is or how degraded its political processes are from actually functioning as a democracy.




  • It’s a good blog series.

    But just to point it out… note the author still buys the AI hype too much. This post is criticizing Microsoft for missing out because OpenAI made that $300 billion deal with Oracle (with the assumption that Microsoft could have gotten a similar amount of revenue from OpenAI instead). Except neither OpenAI nor Oracle has the money or means to carry out that deal. Oracle is struggling to raise the capital to fulfill their end, and an analysis of the time it takes to bring data centers online suggests they can’t meet their targets even with the money. And OpenAI doesn’t have the money to pay for their end; the revenue just isn’t coming in unless they somehow become more ubiquitous and lucrative than the entire market for, for example, all streaming services put together (thanks to Ed Zitron for that fun comparison).


  • I had hoped that with the whole “agent” push that we would start seeing more sane usage, like having AI be a fuzzy logic step in a chain of formal logic and existing deterministic tools

    I think this is the best you can expect out of LLMs, and the relatively more successful “agentic” AI efforts are probably doing exactly this, but their relative success is serving as hype fuel for the more impossible promises of LLMs.

    Also, if you have formal logic and deterministic tools wrapping and sanity-checking the LLM bits (something like the toy sketch at the end of this comment)… I don’t think the value add of evaporating rivers and firing up jet turbines to train and serve “cutting edge” models that only screw up 1% of the time is there, because you can run an open weight model 1/100th the size that screws up 10% of the time instead. (Note one important detail: training costs go up quadratically with model size, so a 100x size model is 10,000x the training compute.) I think the frontier LLM companies should have pivoted to prioritizing smaller size, greater efficiency, and actually sustainable business practices 4 years ago. At the very latest, 2 years ago, with the release of 4o, OpenAI should have realized pushing up model size was the wrong direction (just as they should have realized that training for Chain-of-Thought was not going to be the magic bullet).

    And to be clear I still think this is really generous to the use case of smaller LMs.
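
    To be concrete about the kind of wrapping I mean, here’s a toy sketch (nothing here is a real API; call_llm is a made-up stub standing in for whatever model you’d actually call): the LLM only handles one fuzzy classification step, and plain deterministic code validates its output against a whitelist and falls back to a safe default when it misbehaves.

    ```python
    import json

    ALLOWED_LABELS = {"refund", "shipping", "other"}  # deterministic whitelist

    def call_llm(prompt: str) -> str:
        """Hypothetical stand-in for whatever LLM API you'd actually call."""
        return '{"label": "refund"}'  # canned reply so the sketch runs offline

    def classify_ticket(ticket_text: str) -> str:
        """The LLM only does the fuzzy classification; everything else is deterministic."""
        prompt = (
            "Classify this support ticket as one of: refund, shipping, other.\n"
            'Answer with JSON like {"label": "refund"}.\n\n' + ticket_text
        )
        raw = call_llm(prompt)
        try:
            label = json.loads(raw).get("label")
        except (json.JSONDecodeError, AttributeError):
            label = None  # model returned junk; fall through to the default
        # Sanity-check the fuzzy output before anything downstream acts on it.
        return label if label in ALLOWED_LABELS else "other"

    if __name__ == "__main__":
        print(classify_ticket("I never got my order and I want my money back"))
    ```

    The point being: the deterministic wrapper has to catch the failures either way, so the expensive frontier model is only buying a somewhat lower error rate on that single fuzzy step.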


  • On a more productive note, this feels likely to be tied in with the usual issues of AI sycophancy re: false positive rate.

    I suspect this is the real limit. Claude Mythos might find real vulnerabilities, but if they are buried among loads of false positives it won’t be that useful to black- or white-hat hackers, and the endless tide of slop PRs and bug reports will keep coming.

    I tried looking through Anthropic’s “preview” for a description of the false positive rate… they sort of beat around the bush as to how many false positives they had to sort through to find the real vulnerabilities they reported (even obliquely addressing the issue was better than I expected, but still well short of the industry standard for a good security report, from what I understand).

    They’ve got one class of bugs they can apparently verify efficiently?

    Memory safety violations are particularly easy to verify. Tools like Address Sanitizer perfectly separate real bugs from hallucinations; as a result, when we tested Opus 4.6 and sent Firefox 112 bugs, every single one was confirmed to be a true positive.

    It’s not clear from their preview if Claude was able to automatically use Address Sanitizer or not? (My rough guess at what that kind of automated check would look like is at the bottom of this comment.) Also not clear to me (I’ve programmed in Python for the past ten years and haven’t touched C since my undergraduate days), so maybe someone could explain: how likely is it that these bugs are actually exploitable and/or show up for users?

    Moving on…

    This process means that we don’t flood maintainers with an unmanageable amount of new work—but the length of this process also means that fewer than 1% of the potential vulnerabilities we’ve discovered so far have been fully patched by their maintainers.

    So it’s good they aren’t just flooding maintainers with slop (and it means that if they do publicly release Mythos, maintainers will get flooded with slop bug fixes), but… this makes me expect they have a really high false positive rate (especially if you count minor code issues that don’t actually cause bugs or vulnerabilities as false positives).
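
    For what it’s worth, here’s my rough mental model of why memory safety bugs are the easy case to verify (purely a sketch: the binary path is a hypothetical ASan-instrumented build, and I’m assuming each candidate report comes with a reproducer input): you just run the reproducer and only count the bug as real if the sanitizer actually fires, so hallucinated reports filter themselves out.

    ```python
    import subprocess

    # Hypothetical path to an Address-Sanitizer-instrumented build of the target.
    ASAN_BINARY = "./target_asan_build"

    def report_is_confirmed(reproducer_path: str, timeout: int = 60) -> bool:
        """Run a candidate report's reproducer against the ASan build and only
        count the bug as real if the sanitizer itself reports a violation."""
        try:
            result = subprocess.run(
                [ASAN_BINARY, reproducer_path],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False  # a hang is a different problem, not a confirmed memory bug
        # ASan prints "ERROR: AddressSanitizer: ..." to stderr when it catches one.
        return "ERROR: AddressSanitizer" in result.stderr
    ```

    Logic bugs, auth bypasses, injection, and so on don’t have a clean oracle like that, which is presumably exactly where the false positives pile up.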


  • I’ve read speculation that in 30-50 years people will have an attitude towards social media that we have towards cigarettes now.

    That would be really nice, but that scenario feels pretty optimistic to me on a few points. For one, scientists doing research were able to overcome the lobbying influence and paid think tanks of cigarette companies; I am worried science as a public institution isn’t in good enough shape to do that nowadays. Likewise, part of the pushback against cigarettes included a variety of mandatory labeling and sin taxes on them, and it would take some pretty major shifts for there to be the political will for that kind of action. Well, maybe these things are viable in the EU; the US is pretty screwed.


  • Old Twitter was terrible for people’s souls.

    It almost makes me feel sorry for the way the rationalists are still so attached to it. But they literally have two different forums (lesswrong and the EA forum), so staying on twitter is entirely their choice; they have alternatives.

    Fun fact! Over the past few years, Eliezer has deliberately cut back his lesswrong posting in favor of posting on twitter, apparently (he’s made a few comments about this choice) because lesswrong doesn’t uncritically accept his ideas and nitpicks them more than twitter does. (How bad do you have to be to not even listen to critique on a website that basically loves you and takes your controversial foundational premises seriously?)


  • Rationalist Infighting!

    tl;dr: one of the MIRI-aligned rationalists (Rob Bensinger) complained about how EA actually increased AI risk in the long run by promoting OpenAI and then Anthropic. Scott Alexander responded aggressively, basically saying they are entirely wrong and also bad at public communications! Various lesswrongers weigh in, seemingly blind to irony and hypocrisy!

    Some highlights from the quotes of the original tweets and the lesswronger comments on them:

    • Scott Alexander tries blaming Eliezer for hyping up AI and thus contributing to OpenAI in the first place. Just a reminder: Scott is one of the AI 2027 authors; he really doesn’t have room to complain about rationalists creating crit-hype.

    • Scott Alexander tries claiming SBF was a unique one-off in the rationalist/EA community! (Anthropic’s leadership has been called out on the EA forums and lesswrong for a similar pattern of repeated lying.)

    • Rob Bensinger is indirectly trying to claim that Eliezer/MIRI have been serious, forthright, honest commentators on AI theory and policy, as opposed to Open-Phil/EA/Anthropic, which have been “strategic” with their public communication, to the point of dishonesty.

    • habryka is apparently on the verge of crashing out? I can’t tell if they are planning on just quitting twitter or quitting their attempts at leadership within the rationalist community. Quitting twitter is probably a good call no matter what.

    • Loads of tediously long posts, mired in that long-winded rationalist way of talking, full of rationalist in-group jargon for conversations and conflict resolution.

    • Disagreement on whether Ilya Sutskever’s $50 billion startup is going to contribute to AI safety or just continue the race to AGI.

    • Arguments over who is with the EAs vs. Open Philanthropy vs. MIRI!

    • Argument over the definition of gaslighting!

    To be clear, I agree with the complaints about EA and Anthropic; I just also think MIRI has its own similar set of problems. So they are both right: all of the rationalists are terrible at pursuing their nominal goals of stopping AI Doom.

    I did sympathize with one lesswronger’s comment:

    More than any other group I’ve been a part of, rationalists love to develop extremely long and complicated social grievances with each other, taking pages and pages of text to articulate. Maybe I’m just too stupid to understand the high level strategic nuances of what’s going on – what are these people even arguing about? The exact flavor of comms presented over the last ten years?


  • Eliezer is trying to get around that with some weird conditions and gamesmanship on the prediction market question:

    This market resolves N/A on Jan 1st, 2027. All trades on this market will be rolled back on Jan 1st, 2027. However, up until that point, any profit or loss you make on this market will be reflected in your current wealth; which means that purely profit-interested traders can make temporary profits on this market, and use them to fund other permanent bets that may be profitable; via correctly anticipating future shifts in prices among people who do bet their beliefs on this important question, buying low from them and selling high to them.

    I don’t think that actually helps. But Eliezer is committed to prediction markets being useful on a nearly ideological level, so he has to come up with weird, complicated strategies to try to get around their fundamental limits.





  • Being the kind of writer I am, whenever this comes up I am tempted to suggest ways it could have been done better.

    The premise kind of does work in a setting like Harry Potter; the wizarding world is insular enough that a clever kid could bring in some new ideas. The problem is Eliezer wanted to throw in too many shortcuts. It’s not enough for creativity with transmutations to give the protagonist a small edge; transmutation is made into the ultimate all-purpose spell so the protagonist can exploit it more easily. The protagonist isn’t just moderately better at the Patronus with some muggle psychology; his patronus can kill dementors. And the philosopher’s stone is changed into some ancient atlantean super-magic, because fuck wizards ever inventing anything, and instead of its typical mythological powers at some moderate rate it’s just more super-transmutation.

    But, first, I am not glazing the work of Rowling, even indirectly, no way, no how.

    Rowling went mask-off transphobe in 2018; HPMOR finished in 2015. So I won’t blame Eliezer for not picking a different fandom at the time. Eliezer has actually made moderately supportive comments, including about using people’s preferred pronouns (we’ve mocked another lesswronger for writing long screeds complaining about this). In general, I think the average lesswrong attitude towards trans people is better than the average American’s attitude… but that is because the bar is in hell. But yeah, I’ve seen plenty of shitty takes towards trans people on lesswrong.

    Second, HPMoR was cult shit all along, not meant to teach science but to sow distrust of scientists under the glossy sheen of being able to name the six quarks.

    Yep. And it didn’t even stick to its premise of “try to do science to magic and compare muggle scientifically gained knowledge to magic”; instead it went into an Ender’s Game pastiche followed by Death Note-style “I know you know I know” plotting, and then Harry gets all the magical power handed to him at the end of the story thanks to Dumbledore following some insane combination of prophecies.



  • Your first point is true… with the key words being “shorn of context”. When you look at how many ratfics go in that direction, your second and fourth points become problems.

    As to your fourth point… the techbro billionaires like Elon or Peter Thiel do like referencing fiction (often in hamfisted or ignorant ways that make me think a bit of fandom gatekeeping actually is good sometimes… e.g. naming your surveillance company palantir, or naming one of your kids after a nonsensical WH40K reference). So I wouldn’t entirely neglect the possibility of rationalists providing a bit of inspiration to the billionaires in between the bootlicking. And although there may not be a magical “coup the government” power in real life, the influence they are trying to focus on themselves and harness is still worrying.


  • A lesswronger asks: are we rationalfic protagonists the baddies? https://www.lesswrong.com/posts/FuGfR3jL3sw6r8kB4/richard-ngo-s-shortform?commentId=uDuzmfMEvEqpyApLh

    tl;dr: rationalfic has a very common trend of the protagonist gaining and using overwhelming power to radically reform the world. This is almost always (with a few notable exceptions) portrayed as a clearly, unambiguously good thing.

    My take: Don’t get me wrong, the Wizarding World (for example), as canonically portrayed, needs some very strong reforms if not an entire revolution. But rationalfic almost never portrays the slow, hard work of building support networks and alliances and developing a materialist theoretical understanding of how to reform society; instead, a lone rationalist hero (or small friend group) finds some overwhelming magical or technological advantage they can use to single-handedly take control and use their rationalist intellect to unilaterally fix everything. Part of it is the normal disconnect between fiction and the real world, where it is more narratively satisfying (and easier to write) to have a central protagonist that solves the major problems or is at least directly involved with them, and rationalfic just gives that protagonist even more agency than they canonically have. The problem is that rationalists take this attitude back into real life, and so end up idolizing mythologized techbro billionaires or venture capitalists or the myth of the lone genius scientist/inventor.

    Also, quality sneer in the replies, “rational” teletubbies: https://tomasbjartur.bearblog.dev/rational-teletubbies/



  • There is poetry for package management. Apparently uv is substantially faster at solving package dependencies, although poetry is more feature-rich. (I’ve only used poetry, so I know it is adequate, but there have been times I’ve sat there for minutes or even tens of minutes while it worked through installing all the right versions of all the right libraries.)
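
    If anyone wants to check the speed difference on their own project, here’s a crude comparison (assuming both tools are installed and you’re in a directory with a pyproject.toml both of them can read; uv lock and poetry lock are the real commands, the timing wrapper is just for illustration, and caching will skew a naive run like this):

    ```python
    import subprocess
    import time

    def time_command(cmd: list[str]) -> float:
        """Run one dependency resolution and return the elapsed seconds."""
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        return time.perf_counter() - start

    if __name__ == "__main__":
        # Both commands re-solve the dependency graph and write a lockfile.
        print(f"poetry lock: {time_command(['poetry', 'lock']):.1f}s")
        print(f"uv lock:     {time_command(['uv', 'lock']):.1f}s")
    ```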