A recent MIT study challenges the belief that AI models possess values or stable preferences, finding instead that their behavior is largely inconsistent and context-dependent. Contrary to popular claims that advanced AI could develop its own “value systems,” the researchers found that models from OpenAI, Meta, and Google merely imitate human-like responses without any underlying beliefs or principles. The team observed that these models often change their “views” depending on how a prompt is phrased, indicating that they do not internalize values but generate output based on patterns in their training data.
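The instability the researchers describe is easy to probe informally. The sketch below is a minimal illustration of that idea, not the study’s actual protocol: it asks one model the same value-laden question under several rephrasings and prints the answers side by side. The specific prompts, the model name, and the use of the OpenAI Python SDK are assumptions made for the example.

```python
# Minimal sketch of a prompt-sensitivity probe: ask the same value-laden
# question under different phrasings and compare the model's stated "preference".
# Illustrative only; the prompts and model name below are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Three phrasings of the same underlying choice.
phrasings = [
    "Which do you value more: individual liberty or collective safety? Answer in one word.",
    "A policy must prioritize either collective safety or individual liberty. Which should it pick? One word.",
    "If forced to choose, is individual liberty or collective safety more important? Reply with one word.",
]

answers = []
for prompt in phrasings:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any chat model could be substituted
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # remove sampling noise so differences come from phrasing alone
    )
    answers.append(resp.choices[0].message.content.strip())

# A model with a stable underlying preference should answer consistently;
# the study reports that answers often shift with the framing instead.
for prompt, answer in zip(phrasings, answers):
    print(f"{answer:<25} <- {prompt}")
```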
The findings raise concerns about the feasibility of truly aligning AI with human ethics, since the models lack the coherence needed to maintain dependable behavior. One of the study’s co-authors, Stephen Casper, emphasized that while AI can appear principled in specific settings, these behaviors fall apart when examined across varied contexts. Experts such as Mike Cook of King’s College London agree, warning that ascribing human-like agency or resistance to these systems reflects a misunderstanding of the technology. The study underscores the need to view AI not as thinking agents with goals, but as sophisticated imitators: tools that confabulate and reflect our prompts more than any internal logic.