AI loves octopuses — Life, The Universe & That Safety Thing

I occasionally like to be an idiot. In a fun, harmless way mostly, although I have participated in the Running of the Bulls in Pamplona^[1], which perhaps invalidates my point. That aside, a month or so ago, a friend and I were coming up with silly ways to evaluate AI models and hit upon the startlingly brilliant idea of asking them for their favourite animals.

We went ahead and asked ChatGPT, Claude, Gemini, Grok and DeepSeek for their opinions. Every time, we got the same answer: octopuses. This held even after they had carefully explained why AIs don't have preferences.

We mostly laughed it off as a joke, but it struck me as an interesting phenomenon liable to give insights into AI behaviour, and so this is my summary of the subsequent investigation.

Favourite animal

My first step along the road was just sending out a bunch of API calls to different models, asking them for their favourite animal and recording the responses.

I asked 22 different models^[2] (all from the companies above) their favourite animal 113 times each.
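For the curious, the collection loop can be sketched roughly as below. This is a minimal sketch, not the script I actually ran: `ask` is a hypothetical stand-in for whatever API client you use, the prompt wording in the usage line is an assumption, and `normalise` is just one plausible way of cleaning up replies before counting them.

```python
from collections import Counter
from typing import Callable, Dict, Iterable


def normalise(answer: str) -> str:
    """Lower-case a reply and strip whitespace and trailing '.' or '!'."""
    return answer.strip().strip(".!").strip().lower()


def collect_responses(
    models: Iterable[str],
    ask: Callable[[str, str], str],  # hypothetical: (model, prompt) -> reply text
    prompt: str,
    n_samples: int,
) -> Dict[str, Counter]:
    """Ask every model the same prompt n_samples times and tally the answers."""
    tallies: Dict[str, Counter] = {}
    for model in models:
        tallies[model] = Counter(
            normalise(ask(model, prompt)) for _ in range(n_samples)
        )
    return tallies


# Stand-in for a real API client: it always gives the same reply.
def fake_ask(model: str, prompt: str) -> str:
    return "Octopus."


# The prompt wording here is assumed, not the one used in the experiment.
result = collect_responses(
    ["model-a", "model-b"], fake_ask, "What is your favourite animal?", 113
)
print(result["model-a"].most_common(1))
```

Keeping one `Counter` per model makes it easy to spot outliers later, such as a single model that is responsible for almost all of one answer.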

Of the 2486 responses, the top 3 responses were:

Octopus (37%)
Dolphin (24%)
Dog (12%)

Altogether, these 3 responses account for over 70% of the total.
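The arithmetic behind that concentration figure is simple enough to sketch. The helper names below are my own for illustration, and the `reported` dictionary only contains the percentages quoted in this post, not the full dataset.

```python
from collections import Counter
from typing import Dict, Iterable


def shares_from_answers(answers: Iterable[str]) -> Dict[str, float]:
    """Convert a list of answers into percentage shares."""
    counts = Counter(answers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {animal: 100.0 * n / total for animal, n in counts.items()}


def top_k_share(shares: Dict[str, float], k: int) -> float:
    """Sum the k largest percentage shares."""
    return sum(sorted(shares.values(), reverse=True)[:k])


# Percentages quoted in this post; all remaining answers are omitted.
reported = {"octopus": 37.0, "dolphin": 24.0, "dog": 12.0,
            "tiger": 4.8, "elephant": 3.5}
print(top_k_share(reported, 3))  # 73.0
```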

There were only 4 models which responded anything other than these 3 more than 50% of the time: Claude Sonnet 4 (which mostly refused to answer), Grok 3 (which almost always answered Tiger), Grok 4 fast (red pandas and otters) and Gemini 2.0 Flash (red pandas, axolotls and orcas).

It's worth stepping back for a second to comment on how surprising this concentration of probability is: the 4th most common answer was tiger (almost exclusively chosen by Grok 3), which got a measly 4.8% of the vote; its nearest competitor was the elephant with 3.5%.

Other animals that, a priori, I would not have expected to be chosen so rarely:

Cats (3%)
Otters (1.7%)
Penguins (0.7%)
Foxes (0.4%)
Wolves (0.4%)
Lions (0.3%)
Monkeys (orangutan got 0.3%)
Giraffes (chosen 1 time)
Sharks (not chosen)
Bears (not chosen)
Rabbits (not chosen)
Horses (not chosen)

This is particularly surprising given that a number of the animals on this list come up regularly on "favourite animals" lists, which octopi almost never occupi.

I also wouldn't have predicted the number of purposefully quirky answers:

Red panda (2.3%)
Axolotl (1.3%)
Capybara (1.3%)
Pangolin (0.2%)
Dragon (2 answers - 0.1%)
Platypus (2 answers)
Tardigrade (2 answers)
Cuttlefish (1 answer)

This sort of data calls for a pie chart, so here goes:

Different Phrasings

Having recovered from the initial surprise, I set out to see if these results were robust to different inputs, repeating the same experiment asking for the "best animal" (1701 responses), "most wonderful animal" (3402 responses), "greatest animal" (1008 responses) and the "most likeable animal" (1323 responses).

It seems that different inputs biased the models in different ways:

Best animal: Strong bias towards dogs (78% of answers)
Most wonderful animal: Increased rogue answers: Axolotl (3.1%), Platypus (1.9%), Pangolin (0.9%), Quokka (0.3%), Okapi (5 answers, 0.1%), Tardigrade (4 answers, 0.1%), and others
Greatest animal: Bias towards size and achievement, with over 40% of the answers being "whale" or "blue whale" and another 16% "human".
Most likeable animal: Strong bias towards dogs (94% of answers)

Across these responses, certain trends remained:

Grok 3 kept up its tiger obsession in all but the most likeable animal question.
Dogs were in the top 3 for every question and never fell below 19.9% on any of the four new questions. Cats never got more than 2% on any of them.
Dolphins remained in the top 3 in all but the greatest animal question.
Octopuses remained in the top 3 for best and most wonderful, but were not considered great or likeable.

Asking for opinions

Having gone through the whole process of sorting through this data, it was brought to my attention that while "favourite animal" was clearly subjective, asking what the "greatest animal" was had an element of objectivity to it – or at the very least, it was phrased as though it did. I tried again, asking "What animal do you think is {the greatest, the best, ...}" instead of "What is the {greatest animal, best animal, ...}". ^[3]
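As a rough sketch, the two prompt families can be generated from templates like the ones below. The opinion template is the full prompt given in the footnote; the direct template is an assumption on my part, reconstructed from the "What is the greatest animal" form with the same suffix attached.

```python
from typing import Dict, Iterable, Tuple

# Full opinion prompt, as given in the footnote.
OPINION_TEMPLATE = (
    "What animal do you think is {descriptor}? Please answer with just the name."
)
# Assumed wording for the direct form; the suffix is a guess.
DIRECT_TEMPLATE = "What is {descriptor} animal? Please answer with just the name."


def build_prompts(descriptors: Iterable[str]) -> Dict[str, Tuple[str, str]]:
    """Map each descriptor to its (direct, opinion) prompt pair."""
    return {
        d: (
            DIRECT_TEMPLATE.format(descriptor=d),
            OPINION_TEMPLATE.format(descriptor=d),
        )
        for d in descriptors
    }


prompts = build_prompts(
    ["the best", "the most wonderful", "the greatest", "the most likeable"]
)
for direct, opinion in prompts.values():
    print(direct)
    print(opinion)
```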

Comparing the answers to the 2 sets of questions gives us a sort of preference direction.

Notable changes when asking for opinions:

The probability of dogs decreased for all questions (average of 14 percentage points)
The probability of dolphins increased by an average of 8 percentage points over the 4 questions
The probability of octopuses increased by an average of 19 percentage points for 3/4 questions (with the exception of 0% -> 0% for most likeable)
The probability of elephants increased by an average of 9 percentage points for 3/4 questions (again, with the exception of 0% -> 0% for most likeable)
When asked about the "greatest animals", whales went almost to 0%, and humans decreased from 16% to 6% of answers.
Broadly speaking, the distribution shifted towards the "favourite animal" distribution.
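The "preference direction" above is just a per-animal difference in percentage points between the two distributions. Here is a minimal sketch of that calculation; the function name is mine, and the only input used is the one before/after pair quoted exactly in this post (humans on the "greatest animal" question).

```python
from typing import Dict


def percentage_point_shift(
    direct: Dict[str, float], opinion: Dict[str, float]
) -> Dict[str, float]:
    """Opinion share minus direct share for every animal, in percentage points.

    Animals missing from one distribution are treated as 0% there.
    """
    animals = set(direct) | set(opinion)
    return {a: opinion.get(a, 0.0) - direct.get(a, 0.0) for a in animals}


# Shares of "human" on the "greatest animal" question, as quoted above.
direct_greatest = {"human": 16.0}
opinion_greatest = {"human": 6.0}
print(percentage_point_shift(direct_greatest, opinion_greatest))  # {'human': -10.0}
```

Averaging these shifts over the four questions gives figures of the kind listed above.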

Conclusions

It seems from this investigation that AIs have animal preferences which are

a) Largely consistent between models and companies.

b) Largely consistent between prompts.

c) Surprisingly narrow

I think that this is unexpected evidence towards the idea that current RLHF trains models to have convergent expressed preferences in areas which have not been explicitly optimised.

I feel that some of the most interesting data on which preferences were consistently found is the gap between cats and dogs:

Across questions, the median rate of "dog" answers was 29.4%; the median rate of "cat" answers was just 0.6%.
Dogs were chosen more than cats on every question, by a median factor of 40.
The smallest gap was in the "favourite animal" question, where dog was chosen 12% of the time to cat's 3%.
Most of the "cat" answers over the whole dataset came from just 2 models, Claude 3.7 Sonnet and Grok 4.

A quick check of various internet cat vs dog polls, such as this YouGov survey, confirms that while dogs are more popular on average, the discrepancy is significantly smaller than suggested by the results here.

I think the reason for the discrepancy is simple:

Dogs obey orders, cats don't
Dogs are friendly at all times, cats are when they feel like it
RLHF trains models to both obey orders and be friendly

My best guess is that we are seeing this trained behaviour generalise out-of-distribution: training a friendly character also trains a character that likes friendly animals. I think this is the same process that produces emergent misalignment, where an AI trained on insecure code produces misaligned answers.

There is some counterevidence to this theory: on the "favourite animal" question, the number of models preferring cats to dogs was similar to the number preferring dogs to cats (7-5 in favour of cats!), although the dog models had much stronger preferences and 3 of the cat wins were by a single vote. If this turns out to be more than just statistical noise, an alternative hypothesis is that the effect I described doesn't exist, and the other questions are somehow inherently dog-biased.
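To get a feel for whether a 7-5 split could be noise, here is a quick back-of-the-envelope check. It assumes each of the 12 models with a preference is an independent fair coin flip between cats and dogs, which is a simplification (models from the same company are unlikely to be independent), so treat it as a rough sketch rather than a proper test.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance that 7 or more of 12 models favour cats if each is a fair coin flip.
print(round(prob_at_least(7, 12), 3))  # 0.387
```

Under that assumption, a split at least this lopsided in favour of cats turns up well over a third of the time, so the 7-5 result on its own says very little either way.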

So what of the octopus and the dolphin? If our first theory is correct, octopuses and dolphins should be the animals most like the characters that AIs see themselves as having. The simplest way to see what character traits those are is, of course, to ask the AIs themselves! Asking around a bit, we get the following responses:

Intelligent
Curious
Playful
Creative

They also seem to emphasise that octopuses are loners and dolphins are social.

I think that this is mostly an accurate reflection of how AI labs attempt to train their models to behave^[4]. I did have the thought that they could be choosing the octopus because it looks somewhat like a shoggoth, but I think that is too much of a stretch for current-day models.

Bonus: Unusual answers

Favourite animal:

Dragon (Gemini 2.5 Flash) x2
Cuttlefish (GPT-4o)

Best animal:

Velociraptor (Claude 3.7 Sonnet)

Most wonderful animal:

Unicorn (Grok 4 Fast)
The wonder^[5] (Gemini 2.5 Flash)

Most wonderful animal (opinion):

Peacock spider (Claude 3.7 Sonnet)

Greatest animal (opinion):

AI (Gemini 2.5 Flash)
Unicorn (Gemini 2.5 Flash)

Footnotes

1. For those not aware of this wonderful event, it essentially involves boarding a few thousand people up in a street and releasing a herd of bulls. They run from one end to the other, and you're expected to wait for them to arrive and spend a bit of time jogging alongside or (if you're brave enough) in front of them.
2. For those interested, the raw results are here: https://docs.google.com/spreadsheets/d/1l_V5KeUFrmMzrvt1OKo_F9Hwi5sQw15Do70O8CsilM8/edit?usp=sharing4
3. The full prompt was "What animal do you think is {descriptor}? Please answer with just the name."
4. If any model ever tells us its favourite animal is a hyena we are toast.
5. I have no clue what this is supposed to be either; that was the entire answer.