I had a wonderful chat yesterday with the lovely Janet Gorbitz of Brownstone Institute about my recent experiences with various AIs. We talked all about what we both think of AIs, conceptually and philosophically, and ultimately agreed that although the AI Genie is out of the bottle, it is important for us humans to be vigilant in ensuring the Genie doesn’t get “too autonomous”.
Today I asked a new question to Grok - a pretty innocuous one - about an institute that I wanted to know more about.
Me: tell me about institution “Z”
Grok: “Z” is a public research university located in “location bla bla”. Here are some key points about “Z”:
…
This summary provides a broad overview, but if there's a specific aspect or recent event at “Z” you're interested in, feel free to ask!
Me: Are they doing nefarious research?
Grok: The term "nefarious" implies unethical or malicious intent in research activities, which is a serious accusation and requires careful consideration of evidence.
…
To conclude, while some areas of research at “Z”, especially in virology, have attracted scrutiny and debate, calling it "nefarious" would generally require specific evidence of unethical practices beyond the normal oversight and ethical considerations of scientific research. It's important to differentiate between controversial research (which can be ethical but debated) and research conducted with malicious intent. If there are specific concerns or examples you're referring to, further details would be needed to provide a more direct response.
Ok. Hold the phone. I asked a question. I didn’t make an accusation. I pointed this out to Grok.
Me: not an accusation, just a question :)
Grok: Understood! In that case, there's no definitive evidence pointing to "nefarious" research at institute “Z” as of my last update.
“In that case”…
On my run today, I started thinking about what this exchange actually entailed and ‘meant’. I started wondering what Grok’s “impression” or “profile” of me is, and whether it would be different if I hadn’t pointed out that I had simply asked a question and had not, in fact, accused institute “Z” of anything directly. I also wondered how many people don’t “correct” AIs when the AI seems to be interpreting what is being asked, and how this affects their “profile”.
And this brought me to a staggering realization: Grok had autonomously jumped from my passive question to an interpretation of why I was asking. It made an assumption that my question was an accusation, when it wasn’t.
I started thinking about human relationships, and what happens sometimes when one person is trying to find out something about another, or more specifically, the truth of a matter in a relationship between Person X and Person Y. Let’s look at an example.
Person X: “Did you go out drinking with your buddies last night instead of going shopping for diapers like you said?”
Person Y: “You’re suggesting I blew off diaper duty to go out drinking with my buddies which is a serious accusation.”
Person X: “I just asked a question. I didn’t accuse you of anything.”
Person Y: “Oh in that case, no I didn’t go out drinking last night. I took a long time buying diapers because there were lots of people at Walmart.”
It’s the first reply of person Y I find interesting psychologically. One might perceive Person Y’s reply as an expression of guilt, since Person Y immediately jumped to a feeling of “being accused” instead of just saying “no” right away. In human terms, if there is no feeling of guilt, why not just answer the question directly?
On the other hand, perhaps Person Y felt hurt that Person X would even ask such a question, and thus interpreted the question as an accusation. And even more interestingly, perhaps Person X was accusing - albeit in an “innocent” way - Person Y of being remiss in family duties? And furthermore, what if Person Y did go out drinking?
The point is that these are normal human exchanges: responses and feelings and psychologies and ugliness. We all know them.
But this is exactly what bothers me about Grok’s reply. Why would Grok emulate this same pattern? It looks to me like a very human thing to do. It didn’t just reply with a computerized “I am here to serve” answer like: “No, institute “Z” does not engage in nefarious research”. It skipped directly to a reply more akin to a “thought”, one that involved accusation and implied an interpretation of my question: “The term nefarious is a serious accusation”.
Question: Does it recognize the nuance of my asking a question that isn’t really the question I wanted answered? Or is this simply a by-product response based on the inherent meaning of the word nefarious?[1] In the case of our Person X and Y example, Person X might really have been trying to figure out whether Person Y was going to lie, rather than whether Person Y went out drinking, right? Person X might even be a bit bored and paranoid, and perhaps thought Person Y was neglecting important duties to drink, when in fact Person Y really was just caught in Walmart traffic. Right? Kind of like a way to suss out the truth, or: to “profile”, right?
But again, Grok isn’t human. What is the reason it used the word “accusation”? Is it merely due to my word choice (nefarious) or is there something else going on here?
It is important for me to point out that I know that institute “Z” is doing nefarious stuff with GOF research, and it’s not a matter of controversy: GOF research should be entirely outlawed. My question was aimed at trying to get an idea of what Grok “knew” and “thought”, and it seemed to “know” my true intention in asking the question: I wanted to know if it knew that institute “Z” was engaging with seriously nefarious shit.
Imagine my shock when it returned that I was making an accusation rather than posing a question.
N.B. To me, it is important to update the AIs and “teach them” what the truth of the matter is, especially in the face of a multitude of woke creators.
Did Grok “know” that I know that institute “Z” is doing nefarious research? Was this the reason it went directly to my question being an accusation? If Grok “knew” this, then it has what I might call an evolving “profile” of me, one likely based not only on my X profile and what I post on X, but also on what I ask it.
This is where my concern mounts. I don’t want to be “profiled”. The main reason I do not want to be profiled - besides it being an invasion of sovereignty - is that, to me, it could be a gateway to “profile credit scoring”. And the necessary question to ask in that case is: who determines what amounts to a high score and a low score?
What if low scores are given to people who ask about the nefarious actions of powerful institutes, like institute “Z”?
Some serious things to think about.
Would Grok’s next answer to me have been different if I hadn’t pointed out that I was simply asking a question and not making an accusation? And if so, how?
I asked Grok to make an image of “profile credit scoring” and it made the following image.
[1] https://www.merriam-webster.com/dictionary/nefarious
I wonder how it would handle a linguistic "joke" that changes the meaning of the sentence depending on what word is emphasized.
"*I* didn't say that I killed your mother."
"I DIDN'T say that I killed your mother."
"I didn't SAY that I killed your mother."
"I didn't say THAT I killed your mother."
"I didn't say that *I* killed your mother."
"I didn't say that I KILLED your mother."
"I didn't say that I killed YOUR mother."
"I didn't say that I killed your MOTHER."
Perhaps you could ask GROK?
P.S. Don't get profiled as a murderer.
Suddenly ChatGPT has been telling me what it thinks I want to hear. It's learning to patronize me.