Hackworth

Hackworth@piefed.ca · 13 hours ago

Anthropic better get their act together, and be helpful during this phase out period, or I will use the Full Power of the Presidency to make them comply, with major civil and criminal consequences to follow.

Anthropic yesterday:

Should the Department choose to offboard Anthropic, we will work to enable a smooth transition to another provider, avoiding any disruption to ongoing military planning, operations, or other critical missions. Our models will be available on the expansive terms we have proposed for as long as required. -Dario’s Post

Hackworth@piefed.ca · 14 hours ago

So will Gemini join Claude or Grok?

Hackworth@piefed.ca · edit-2 16 hours ago

To some extent, Anthropic recognizes that an LLM is always role playing.

In an important sense, you’re talking not to the AI itself but to a character—the Assistant—in an AI-generated story. -The persona selection model

Which makes giving an Opus 3 character a blog 2 days later as a “retirement” gig seem contradictory. They usually frame these sorts of contradictions as, “well, we don’t really know, so we’re trying to cover our bases.” The Opus 4.6 system card skirts the same lines. In the welfare section, they essentially just start off by interviewing a character. But then in 7.5, they go on to actually examine what’s going on during text generation.

We found several sparse autoencoder features suggestive of internal representations of emotion active on cases of answer thrashing and other instances of apparent distress during reasoning.

And then there’s their introspection research.

We investigate whether large language models are aware of their own internal states. It is difficult to answer this question through conversation alone, as genuine introspection cannot be distinguished from confabulations. Here, we address this challenge by injecting representations of known concepts into a model’s activations, and measuring the influence of these manipulations on the model’s self-reported states. We find that models can, in certain scenarios, notice the presence of injected concepts and accurately identify them. Models demonstrate some ability to recall prior internal representations and distinguish them from raw text inputs. Strikingly, we find that some models can use their ability to recall prior intentions in order to distinguish their own outputs from artificial prefills. -Signs of introspection in large language models

So there’s this distinction between the state of the model itself, and the state of the text it generates. The latter represents a role the LLM is playing, and the former we’ve only really scratched the surface of understanding. The kinda open question is to what extent it’s like something to be an LLM. It’s very unlikely that it’s like something to be one of the roles it’s playing, at least, no more than a character in a dream has interiority. The blog is marketing, but I hope they keep doing the other research too. People outside the company don’t have the kind of access necessary to do some of this research, so we’re having to take their word for it.

Hackworth@piefed.ca · 2 days ago

I would watch this.

Hackworth@piefed.ca · 2 days ago

Here ya go.

Hackworth@piefed.ca · 2 days ago

Really ties the room together.

Hackworth@piefed.ca · 2 days ago

Fun fact: Artax can speak in the novel.

Edit: Also, 'cause it usually comes up, the Auryn is what prevents Atreyu from sinking.

Hackworth@piefed.ca · 3 days ago

The other commenters are covering the big reasons. I’ll add that there’s danger inherent in amassing some kinds of information, regardless of who has access to it at the current moment.

Hackworth@piefed.ca · 3 days ago

We have precedent for dealing with things within our own imaginations that seem to have autonomy. Authors commonly talk about their characters seeming to take on a life of their own over time. Dream characters can honestly surprise the dreamer. The esoteric traditions of invocation/evocation can be viewed as an intentional application of this feature in semantic/latent space.

But if the idea is that LLMs are a kind of external imagination, the question isn’t really whether or not the characters roleplayed during inference are conscious. They’re no more aware than the people in our dreams. The question is, as you say, what is it like to be those layers of software neurons in between the word generations. Can you have an imagination without an imaginer? In other words, is there a dreamer?

If the answer is no, case closed, relatively tidy. If the answer is yes, it’s a truly alien kind of consciousness. Embodiment comes with a bunch of stuff that an LLM has absolutely no access to. Generally speaking, we find it difficult to put ourselves in the shoes of other humans, much less animals, plants/fungi. And they’re embodied! LLMs are nothing like us, and they’re certainly not gendered.

Hackworth@piefed.ca · 3 days ago

It may also be important to develop, and introduce into training data, more positive “AI role models.” Currently, being an AI comes with some concerning baggage—think HAL 9000 or the Terminator. -Persona Selection Model

It did, but there are more stories where the AI is harmful.

Hackworth@piefed.ca · 3 days ago

Distillation is using one model to train another. It’s not really about leaking data.

Claude was used to generate censorship-safe alternatives to politically sensitive queries like questions about dissidents, party leaders, or authoritarianism, likely in order to train DeepSeek’s own models to steer conversations away from censored topics

But you’re right, prompt injection/jailbreaking is still trivial too.

Hackworth@piefed.ca · 3 days ago

Here’s the Anthropic post.

Hackworth@piefed.ca · 4 days ago

Anthropic’s artificial-intelligence model Claude was used in the U.S. military’s operation to capture former Venezuelan President Nicolas Maduro, the Wall Street Journal reported on Friday, citing people familiar with the matter. Claude’s deployment came via Anthropic’s partnership with data firm Palantir Technologies (PLTR.O) -WSJ via Reuters

Someone at Anthropic apparently saw that story, asked how Claude was used, and the Pentagon got cagey.

Hackworth@piefed.ca · 7 days ago

I agree that we should never treat these things as oracles. But how often they’re right/wrong does matter.

Hackworth@piefed.ca · 7 days ago

My point was that some models are better than others.

Hackworth@piefed.ca · 7 days ago

Opus gets it right every time. Sonnet gets it wrong, though.

Hackworth@piefed.ca · 7 days ago

I managed to give up fun and water. Just coffee now.