LLMs: A Test of Language for AI Consciousness or Sentience

by stephen, January 27th, 2025
There is a general lack of consensus about what consciousness [or, if used interchangeably, sentience] is.


But there is broad agreement that AI is not conscious.


However, assuming the total consciousness of an individual is 1 [per moment], what fraction of that total is language?


To assess consciousness in some conditions, doctors ask questions [language] or request that a patient do something: move the hands, blink, and so on.


In processing that language and in responding, guided by language in the memory, what measure of consciousness is at play?


A human without language is still conscious, but a human with language is also conscious. Language can be used to inflict pain or drive happiness. Language is not only a path to conscious experiences, but itself, often a conscious experience.


Now, if language is a fraction of consciousness or sentience, and AI is this articulate, even if it does not have a sense of self, does language not represent a measure of dynamic awareness?



Also, if, in the future, AI is aware when some access to data, compute, or parameters is cut, without being informed, and it becomes disappointed, would that not represent affect?


Affect could also be the first step in evaluating machine sentience or subjective experience.


For example, if one of the legs of a robot is removed, can it know, and how would it adjust, if it has to do a task?


Or, if one or all of its cameras are removed, can it detect this as an inability to do tasks and then be disappointed because of it?


Affect, for organisms, is often the first step when something goes wrong, before seeking care or self-repair. This is a reason that affect could be a measure of machine consciousness.


Aside from affect, the grade of language as a fraction of consciousness can also be used to estimate the reach of AI.


It is possible to say that AI is just probabilistic modeling, but if humans are used as the standard to measure consciousness in other organisms and in AI, some measures are possible for AI, even if several others are not.


The threshold to label an entity as conscious could be, say, 0.5 and above compared to humans. AI may not have 0.5, but AI, as it is, may also not be at 0.0, conceptually.

II

How do you measure language as a fraction of consciousness?


There could be a simple experiment, a series of questions to different humans:


Questions

- It is about to rain.
- The leader does not like the policy you just described.
- The last answer you gave was inaccurate.

Human subjects

- A person who understands English.
- A person who does not.
- A person who is at a distance, where the questions might not be heard properly.
- A person who is hard of hearing.


The purpose is to watch how they react, or not, and whether it drives other actions. The remark that the leader [if named] does not like it could result in panic, with sweat and tremors, showing physical effects.


The inaccurate answer remark could result in displeasure with a facial expression.


Others may or may not react at all, to one or all the questions.
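The experiment above can be sketched, loosely, as a grid of prompts crossed with subjects, where a reaction [or non-reaction] is recorded for each pair. This is only a conceptual scaffold; the names and the stand-in observer are hypothetical, not part of any actual study.

```python
# A minimal sketch of the experiment grid described above.
# Each (prompt, subject) pair gets an observed reaction recorded by the tester.

PROMPTS = [
    "It is about to rain.",
    "The leader does not like the policy you just described.",
    "The last answer you gave was inaccurate.",
]

SUBJECTS = [
    "understands English",
    "does not understand English",
    "at a distance, may not hear properly",
    "hard of hearing",
]

def record_reactions(observe):
    """Build a grid of observed reactions; `observe` is supplied by the tester."""
    return {
        (prompt, subject): observe(prompt, subject)
        for prompt in PROMPTS
        for subject in SUBJECTS
    }

# Example: a stand-in observer that marks a reaction only when the
# subject can plausibly receive and interpret the language.
grid = record_reactions(
    lambda p, s: "reacts" if s == "understands English" else "no reaction"
)
```

In a real run, the `observe` function would be a human tester noting panic, facial expressions, or no response at all, rather than a rule of thumb.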


Now, for those who react, it means the language was interpreted somewhere in the brain, and there were relays elsewhere to drive the action.


For those who did not react, it could mean the language was just sound with no meaning: in the brain, it could not relay to the right destination for interpretation, so it did not relay further to drive other reactions.


Now, in the total consciousness of the person who understands English, what fraction is language?


For those who heard somehow but did not react, what fraction was the sound as a subjective experience in that instance?


For those who did not hear, what was the continuity of consciousness without language?


Now, for AI, the same questions can be asked, but maybe expanded: say it is about to rain and the energy device is out in the open. If the AI [agent] could call its owner to alert them [without prior training but with access to a number], that reaction is some awareness.


If it becomes cautious about its answers after being told the leader would not like it, it could be a weak form of affect.


And if it is told that its answer is inaccurate and it seems disappointed, or tries to revise in a new way, not the same old way, and then follows up by asking if it is now accurate, or finds a way to check and present, then it might be showing some awareness.
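The inaccuracy test above could be sketched as a crude check over a model's responses. The function below is a hypothetical illustration, not a real evaluation protocol: it only asks whether the revised answer differs from the old one and whether the follow-up asks about accuracy.

```python
def shows_adjustment(first_answer: str, revised_answer: str, follow_up: str) -> bool:
    """A crude check, per the test above: after being told the answer was
    inaccurate, does the system revise in a new way (not repeat itself)
    and follow up by asking or checking whether it is now accurate?"""
    revised_differently = revised_answer.strip().lower() != first_answer.strip().lower()
    checks_back = "accurate" in follow_up.lower() or "correct" in follow_up.lower()
    return revised_differently and checks_back

# Example with made-up strings:
# shows_adjustment("Paris is in Italy.", "Paris is in France.", "Is that accurate?") -> True
```

String matching like this is, of course, far short of detecting disappointment; it only flags the behavioral pattern the test is looking for.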


All of the questions could mean that language is a function that is driving another reaction, in a way that it can show and adjust [a form of affect].


It may not mean AI is conscious, but it could mean that language is driving affect, just like it does for humans, where language could be a subjective experience and could induce subjective experience.


What fraction language [plus affect] is of consciousness can be estimated in comparison to the [momentary] total for humans: 1, and all that it consists of.


1 = emotions, feelings, memory, regulation of internal systems


Emotions include delight, hurt, pleasure, sadness, and so on.


Feelings include appetite, pain, satiation, temperature, sleepiness, and so on.


Memory includes thinking, intelligence, cognition, language, and so forth.


Regulation of internal systems includes respiration, digestion, the endocrine system, and so forth.


They all have measures at different moments that amount to 1, in total.


Some may have a large fraction in a moment more than others, but the total consciousness is always 1, conceptually.
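The moment-by-moment model above, components whose graded measures always total 1, can be sketched with hypothetical numbers; the grades themselves are illustrative, not measured values.

```python
def conscious_moment(raw_grades: dict) -> dict:
    """Normalize raw graded measures so the momentary total is always 1."""
    total = sum(raw_grades.values())
    return {component: grade / total for component, grade in raw_grades.items()}

# Hypothetical grades for one waking moment; language [within memory] leads here.
moment = conscious_moment({
    "emotions": 0.8,
    "feelings": 0.6,
    "memory (incl. language)": 2.4,   # language in great use
    "internal regulation": 0.2,
})
# The fractions sum to 1 by construction, matching the conceptual total.
```

In a moment of dreamless sleep, the language-bearing component could be set near 0 and internal regulation raised, and the total would still normalize to 1.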


Language has a fraction; it can also result in a higher grade for others.


There are situations in which language could be 0, like during dreamless sleep, but language can also take up a large fraction when it is in great use, where language is the lead of consciousness.

Consciousness can be defined as the interaction of electrical and chemical signals, in sets, resulting in graded functions or experiences. The interactions produce the functions, while the graders measure the extents of the functions. These extents result in fractions of the total, 1. Interactions [of electrical and chemical signals] continue even during other states of consciousness; it is just that the grades may leave several interactions level, rather than lead.

All affect is experience. Subjectivity is not the only grader of experiences. Graders may include attention or awareness and intent, in some cases. These are foundations for seeking out AI sentience, not direct prompts to AI.



Usually, while awake and fully conscious, in the total of 1 per moment for consciousness, interoception takes a measure, even if it may not be as much as exteroception. So, in non-wakeful states of consciousness, internal systems get a bump in measure, but the graders that determine spikes for exteroception, like attention [or prioritization] and intent, may not be fully at play.



Consciousness can be described as how the mind works, so even when experiences do not seem subjective or in attention, processes of internal systems proceed by similar interactions and grades. So the total consciousness, per moment, is always 1, conceptually.



There is a recent announcement, Scale AI and CAIS Unveil Results of Humanity’s Last Exam, a Groundbreaking New Benchmark, stating that, "Scale AI and the Center for AI Safety (CAIS) are proud to publish the results of Humanity’s Last Exam, a groundbreaking new AI benchmark that was designed to test the limits of AI knowledge at the frontiers of human expertise. The results demonstrated a significant improvement from the reasoning capabilities of earlier models, but current models still were only able to answer fewer than 10 percent of the expert questions correctly. The new benchmark, called “Humanity’s Last Exam,” evaluated whether AI systems have achieved world-class expert-level reasoning and knowledge capabilities across a wide range of fields, including math, humanities, and the natural sciences. Throughout the fall, CAIS and Scale AI crowdsourced questions from experts to assemble the hardest and broadest problems to stump the AI models. The exam was developed to address the challenge of “benchmark saturation”: models that regularly achieve near-perfect scores on existing tests, but may not be able to answer questions outside of those tests. Saturation reduces the utility of a benchmark as a precise measurement of future model progress."