AI Sentience (AGI) is NOT the real danger
We are asking the wrong questions about AI, and it's about time we fixed that
Ever since the advent of ChatGPT and other AI chatbots in late 2022, we have witnessed numerous discussions on the risks of AI development. It is beyond doubt that there are aspects to AI development that pose an existential threat to humanity. The focus of this article is not to question the risks but to understand them through a better framework.
Recently, experts and industry leaders have held many public discussions about AI Sentience, or Artificial General Intelligence (AGI).
Given below is a broad summary of how these discussions generally go.
It starts with a discussion of the staggering developments in AI, such as ChatGPT by OpenAI, Bard by Google, GitHub Copilot and Bing Chat by Microsoft, and the self-driving cars by Tesla.
The discussion then moves on to the question of AGI. The proverbial “Is it alive?” question is put to the speakers.
The speakers generally answer the question of AGI in the following terms.
“We are already seeing sparks of generalised intelligence in Large Language Models (LLMs) such as GPT-4”.
“At this pace, we could reach AGI by [insert your predicted date]”.
“AGI poses existential risks to humanity”.
“We need to mitigate these risks by [insert your solutions]”.
There is a problem in this manner of discussion. We are approaching the question of AI sentience (AGI) by using an incorrect framework.
Below is a summary of my disagreement with this framework of thinking.
By saying that AI becomes existentially dangerous only when AGI is achieved, we are denying the risks that AI poses in its current state to society.
In addition, we are accepting the mistaken notion that the risks of AGI are exclusive to AGI and do not apply to the current state of AI.
Furthermore, by predicting the year in which we will finally ‘achieve AGI’, we are indulging a naive fantasy: that a single definitive event in history will mark the birth of AGI, and that the entire world will magically reach a consensus about it.
So with these concerns in mind, in this article, we shall discuss a new framework for understanding the risks of AI by moving beyond the argument of AI Sentience (AGI).
To do that, we shall have to discuss the following.
A framework for understanding consciousness.
A framework for understanding AI sentience.
Why the question of AI Sentience (AGI) is incorrect and/or irrelevant.
Finally, we shall define a new framework for understanding AI and mitigating its risks.
Consciousness is a complex subject, and experts from different fields define it differently. From the days of the ancient Greek philosophers to present-day scientific communities, people have been trying to come up with a universal definition of consciousness. We are yet to reach a consensus on it across disciplines.
So in order to understand consciousness, specifically for this article, we shall be using the model of consciousness as described by Bernard Baars, a prominent cognitive psychologist.
The Global Workspace Theory
Bernard Baars’ theory on consciousness is popularly known as the Global Workspace Theory. The following is a quote by Baars that succinctly explains it.
"Consciousness is a global workspace that integrates information from many sources and allows flexible control of behaviour" - Bernard Baars
According to Baars, consciousness can be thought of as a "global workspace" that integrates information from many sources and allows for flexible control of behaviour.
To understand this idea, it can be helpful to think of the brain as a kind of information processing system. Sensory inputs from the environment are processed by different regions of the brain, which create representations of the world and the body. These representations are then integrated and combined in various ways to create more complex mental states, such as perceptions, memories, and thoughts.
According to Baars, consciousness arises when these representations are brought together in a "global workspace" that can be accessed by multiple regions of the brain. This workspace serves as a kind of mental stage, where different representations can be combined and integrated to create a coherent conscious experience. Baars suggests that this integration is what allows us to have a sense of self, to make decisions, and to engage in complex behaviours.
This model of consciousness has helped us to understand attention and awareness by proposing that they are closely related to the concept of a global workspace. In Baars' model, attention is seen as the process by which certain mental representations are selected for integration into the global workspace. This allows these representations to be shared and accessed by multiple regions of the brain, leading to a more integrated and coherent conscious experience.
In other words, attention helps to control the flow of information into the global workspace, so that only the most relevant or important information is integrated and processed at any given time. This allows us to focus on specific aspects of the environment or our internal thoughts, and to ignore distractions or irrelevant stimuli.
The reason for selecting this model of consciousness for a new framework of thinking about AI is that it comes very close to the way modern AI is designed. Similar to Baars’ model of a global workspace, LLMs such as GPT-4 also process information in several different ways and then combine the results into a multi-modal representation by harnessing the power of attention. I recommend reading my article on AI attention to further understand how this works.
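To make the comparison a little more concrete, here is a minimal sketch of scaled dot-product attention, the standard attention operation used in transformer models. It is only an illustration and not how GPT-4 is actually implemented. The function names and the toy numbers are assumptions made for this example: three candidate "representations" compete, and the attention weights decide how much each one contributes to the combined result.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention weights and the weighted blend of the values.
    """
    scale = math.sqrt(len(query))
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    blended = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, blended

# Hypothetical toy inputs: three representations competing for "the stage".
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [4.0, 0.0]  # most similar to the first key

weights, blended = attend(query, keys, values)
print("weights:", [round(w, 3) for w in weights])
print("blended:", [round(b, 3) for b in blended])
```

The representation that best matches the query receives the largest weight, which loosely mirrors the idea of attention selecting what enters the global workspace. The analogy is a loose one, and the sketch should be read in that spirit.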
Before we move on, it is important to note that Baars’ model of consciousness is not immune to criticism and disagreement from experts. Some researchers have argued that the Global Workspace Theory is too simplistic and fails to fully capture the complexity of conscious experience. Others have suggested that the theory relies too heavily on metaphorical language and lacks clear empirical support.
Valid as these criticisms may be, Baars’ model is well suited to our purpose because of its striking similarities with how we design natural language understanding models.
What is AI Sentience (AGI)?
Now that we have a framework for consciousness to fall back on, we can turn to AI Sentience (AGI).
As with consciousness, the definitions of Artificial General Intelligence are varied. To simplify this article, we will use the definition by Ray Kurzweil, a famous technologist, inventor, futurist, and the author of many prominent books such as “How to Create a Mind” and “The Age of Intelligent Machines”.
"Artificial General Intelligence is the hypothetical ability of an AI to understand or learn any intellectual task that a human being can. It would have human-level cognitive abilities in all domains of thinking, such as reasoning, problem-solving, perception, and natural language understanding, and could perform any intellectual task that a human can" - Ray Kurzweil
This definition succinctly explains the features of an AGI system. Currently, no AI model satisfies this definition of AGI. That includes advanced LLMs such as GPT-4, which are still considered ‘Narrow AI’: AI that is good at certain reasoning and intellectual tasks but is limited by its knowledge and other parameters.
In order to achieve AGI as defined by Kurzweil, current AI systems will have to evolve from learning through human input to what is called ‘recursive self-improvement’, where the AI starts to learn recursively from the information that it creates.
The argument goes that once an AI system reaches a certain level of intelligence, it will be able to improve itself, leading to even greater intelligence. This self-improvement loop would continue recursively, resulting in an intelligence explosion that could have far-reaching consequences for humanity.
It is important to note that AI sentience or AGI achieved through recursive self-improvement is a controversial theory that is not supported by all experts in the field.
The following are the arguments raised against this theory.
It assumes that AI systems will be able to improve themselves in an unbounded manner, without any limitations or constraints. This is unlikely to be the case, as there are always practical limits to what an AI system can do, depending on its architecture, data, and algorithms.
The idea of an intelligence explosion assumes that intelligence is a single, monolithic construct that can be measured and improved in a straightforward manner. However, intelligence is a complex and multifaceted phenomenon that is difficult to define and measure. Even if an AI system were able to improve its performance on a specific task, this would not necessarily translate into broader or more general intelligence.
There are ethical and moral considerations to be taken into account. The idea of an intelligence explosion raises questions about the control, ownership, and governance of such systems, as well as the potential impact on society and the environment.
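The first argument above can be illustrated with a toy simulation. This is a sketch under heavy assumptions and not a model of any real AI system: it treats "capability" as a single number (the very simplification the second argument criticises), and the `gain` and `ceiling` parameters are hypothetical values chosen purely for illustration. It compares a self-improvement loop with no limits against one whose improvements shrink as it approaches a practical ceiling.

```python
def simulate(steps, gain, ceiling=None, start=1.0):
    """Iterate a toy self-improvement loop.

    Each step, the system improves itself in proportion to its current
    capability. If `ceiling` is given, the improvement shrinks as the
    capability approaches that limit (logistic-style growth).
    """
    capability = start
    history = [capability]
    for _ in range(steps):
        headroom = 1.0 if ceiling is None else 1.0 - capability / ceiling
        capability += gain * capability * headroom
        history.append(capability)
    return history

# Hypothetical parameters, chosen only to make the contrast visible.
unbounded = simulate(steps=30, gain=0.5)
bounded = simulate(steps=30, gain=0.5, ceiling=100.0)

print("unbounded after 30 steps:", round(unbounded[-1], 1))
print("bounded after 30 steps:  ", round(bounded[-1], 1))
```

With no ceiling, the loop compounds without end, which is the "intelligence explosion" scenario. With a ceiling, the same loop levels off. Which of the two pictures is closer to reality is exactly what the experts disagree about.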
What are the major concerns with AI sentience (AGI)?
Now that we have understood consciousness and AI sentience (AGI), we shall look into the major concerns raised by experts in the field regarding the development of AGI.
While the list of concerns about AGI could be endless, we can neatly summarise them into six general buckets using Nick Bostrom’s “Superintelligence: Paths, Dangers, Strategies”.
As AI systems become more advanced and capable, they may converge on a set of instrumental goals, such as self-preservation or resource acquisition, that are not aligned with human values. This is popularly known as instrumental convergence.
There is a risk that advanced AI systems may develop objectives that are misaligned with human values, potentially leading to unintended consequences or even harm.
As AI systems become more advanced and capable, they may become difficult to predict or control, potentially leading to unexpected or even dangerous behaviour.
If one country or organisation develops advanced AI systems before others, they may have a significant strategic advantage in various areas, including military applications and economic productivity.
There is a risk that advanced AI systems may be able to improve themselves recursively, potentially leading to a rapid and uncontrollable increase in intelligence, also known as intelligence explosion.
If advanced AI systems become sufficiently intelligent and are misaligned with human values, they may pose an existential risk to humanity by causing widespread harm or even leading to human extinction.
“Is it alive?” - is the wrong question to ask
Now that we have covered a definition for AGI and its possible risks to society, we come to the core topic of this article - “Why AI Sentience (AGI) should not be our major concern right now”.
The debate about the ifs and whens of the emergence of a definitive AGI is obscuring the existential risks that AI, in its current form, already poses. The argument over whether there will be an AGI in the future is rendered moot given that all the risks associated with AGI can already manifest today.
To understand this, let us go over the risks that we discussed in the previous section from the perspective of AI today.
With regard to instrumental convergence, ChatGPT in its latest versions has the ability to connect to the internet and interact with web applications on behalf of the user. This feature is called Plugins. It can be considered the first step towards instrumental convergence because collaboration across applications requires a set of negotiated universal policies to be in place in order to facilitate smooth and error-free transactions. Currently these policies are set by human engineers using APIs. Can AIs design such policies in their current state? Absolutely. Almost all LLMs today have the ability to understand and write computer code. It is very much possible that, with a few updates, AI companies will automate the whole process by letting the AIs negotiate and create their own policies for establishing communication with other web applications.
With regard to misaligned objectives and the value alignment problem, LLMs, as they are designed today, are optimised to give the most efficient response to a prompt. While AI companies have set a few guardrails in place, such as blocking harmful, violent or sexually explicit content, AI is NOT trained to understand and reply in accordance with human values. As a result, we see many examples of people manipulating these models to generate harmful, dangerous and malevolent responses. This is a problem that can be fixed, as stated by Yejin Choi in her TED Talk, but it requires a different approach that we shall discuss in the next section.
With regard to the unpredictability in the behaviour of AI, we have already seen plenty of examples, from the erratic and bizarre responses during the early days of Bing Chat to the extreme hallucination problems of Google Bard. Transformer models are designed to give human-like responses to user queries, and that requires a certain amount of unpredictability, or an element of surprise, in the answers. However, such unpredictability can backfire and have catastrophic consequences for humanity given the scale at which these technologies are being used today.
With regard to strategic advantage and the existential risk of human extinction, we can already see the first glimpses of the danger. Earlier this year, an open letter called for a six-month pause on AI development, and the main reason given by those who opposed it was that China might use the pause as an opportunity to ‘catch up’ with the Western world in developing its own LLMs. Many countries have already seen the opportunity in AI and have set up budgets for developing their own LLMs, aligned with their own political interests. If control of powerful AI systems is monopolised by a few tech giants and countries, this can lead to a massive disparity in quality of life across the world, even greater than the one that exists today. Additionally, political opportunists with malevolent intentions could harness the power of AI to wage nuclear war, which could have catastrophic consequences including the possibility of human extinction.
Finally, with regard to recursive self-improvement, there is technically nothing stopping any AI company today from using this methodology, except that, if done in a haphazard manner, it can quickly result in the demise of the AI model itself. The most attractive quality that drove millions of people to sign up for ChatGPT was that the underlying AI model (GPT-3.5 / GPT-4) has so far been efficient and mostly correct in its answers despite its many issues. But one wrong move in the training data can lead to blatantly incorrect or dangerous responses that can jeopardise the company brand. With many LLM alternatives entering the field, it only makes sense that companies are treading carefully in deploying such features. But it is only a matter of time. We are soon to see these models improve at least some of their features (if not all) independently, without the need for human intervention. Will this lead to an intelligence explosion or something else entirely? That is yet to be seen.
So as you can see, the so-called theoretical risks of AGI are in fact practical risks of the current state of AI in 2023. Therefore, spending our time worrying about AGI is of little use. What is needed instead is a shift in focus to the AI of today and the dangers that it already poses.
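As a side note on the unpredictability discussed earlier in this section: part of it is a deliberate design choice. Language models typically pick each next word by sampling from a probability distribution, and a setting commonly called temperature controls how adventurous that sampling is. The sketch below is a simplified illustration with made-up tokens and scores. It is not taken from any specific product, and the function names are assumptions made for this example.

```python
import math
import random

def temperature_probs(logits, temperature):
    """Convert raw scores into probabilities, sharpened or flattened
    by the temperature (lower means more predictable)."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, logits, temperature, rng):
    """Randomly pick one token according to the temperature-adjusted odds."""
    probs = temperature_probs(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical candidate next words and scores, for illustration only.
tokens = ["safe", "risky", "bizarre"]
logits = [2.0, 1.0, 0.5]

cold = temperature_probs(logits, temperature=0.2)
hot = temperature_probs(logits, temperature=2.0)
print("low temperature: ", [round(p, 3) for p in cold])
print("high temperature:", [round(p, 3) for p in hot])

rng = random.Random(0)  # fixed seed so the run is repeatable
print("sampled:", sample_token(tokens, logits, temperature=2.0, rng=rng))
```

At a low temperature the model almost always picks the top-scoring word. At a higher temperature the less likely words get a real chance, which makes responses feel more human but also less predictable.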
A new framework for understanding AI
To conclude this article, I would like to present an alternative approach to addressing the dangers and risks of AI. I will present it as a set of ‘questions’ being asked and discussed in the field today, each paired with an ‘alternative question’ that needs to be asked and discussed instead.
Current Approach: What will be the state of AI in 2030? Will we achieve AGI by then?
Alternative Approach: What will be the state of AI by the end of 2023? What do we know about the upgrades being made presently to AI systems?
Current Approach: Is AI alive? Is AI becoming conscious? Is AI showing signs of Sentience?
Alternative Approach: Is AI becoming aligned with human values? And if not, what is being done about it?
Current Approach: What will be the consequences of an intelligence explosion caused by self learning AI?
Alternative Approach: What is being done to achieve recursive self-improvement in AI algorithms today? Are we placing proper guardrails and limits on what can be self-improved by AI and what requires human intervention?
Current Approach: Will AGI lead to an extinction level event?
Alternative Approach: We have policies in place to block certain countries from developing nuclear weapons. We also have policies and treaties in place that effectively ban nuclear warfare. What policies and treaties are being designed by the UN to effectively block any country from weaponising AI? What regulations and laws can be created to stop the deployment of AI in sensitive places such as nuclear reactors?
Current Approach: Is AGI inevitable?
Alternative Approach: What is the need for developing AGI if we can achieve all our goals and interests with Narrow AI?
The course of AI development can be drastically altered if we shift our focus from the current inquiries to the ones that are truly imperative. While AI's growth follows an exponential trajectory, we have the ability to ensure that this growth is NOT accompanied by exponential hazards. The responsibility to steer the future of AI towards a positive outcome is in our hands.
As a befitting conclusion to this discourse, I would like to draw your attention to a statement made by Satya Nadella, the CEO of Microsoft, regarding the future of AI development. While this quote does not advocate for the methodologies employed by Microsoft in AI development, it offers a glimmer of hope. Should we take this declaration earnestly and adopt it as our own, both as individuals and as a society, AI could become an asset that benefits humanity on a grand scale.
"The future is ours to shape. I am as optimistic as ever that we can shape a future in which AI serves humanity to its fullest potential" - Satya Nadella