Future Tech Review #11: How Artificial Intelligence Keeps Getting Smarter

Giant data sets and multi-tasking algorithms. Will we ever see a conscious computer?

BY Dany Longval May 25, 2021

Over the last few years, many of the improvements in digital imaging have been in processing. Cameras are getting smarter and doing more on their own. Embedded vision systems are using smarter machine learning algorithms. Artificial intelligence is being used to diagnose and track COVID-19.

But how smart can computers get? How independent can they be? Artificial General Intelligence (AGI) is the hypothetical point at which a computer will be able to perform any task as well as, or better than, a human. It’s either the epitome of hope for computer evolution, or the beginning of the end of the human race, depending on whom you ask.

What most people can agree on is that AGI is not going to happen any time soon.

Robots are the quintessential combination of artificial systems and mechanistic structures. The ultimate derivative of AGI would likely be a robot that can outperform a human at any task while conversing fluently and contextually in a dozen languages. But, as yet, robots are objectively dumb machines, at least when compared to a human. Robots can often perform one task extremely well, but fail when asked to do a different, unrelated task.

For instance, robots have a really hard time folding clothes. Walking up stairs. Turning doorknobs. Moving across a crowded room without bumping into everything. If you understand the state of modern robotics, you are likely very confident that an AGI is not in our immediate, or even medium-term, future.

In this edition of Future Tech Review, we’ll look at some of the latest developments in how artificial intelligence models are learning and discuss the possibilities of how we might someday create a sentient computer.

GPT-3 and the Blessings of Scale

GPT-3 is the largest machine learning model in the world. Developed by AI researchers at OpenAI in San Francisco, GPT-3 is a natural language model that uses deep learning to produce human-like text. GPT-3 is massive, on the order of 175 billion machine learning parameters (the previous largest model when GPT-3 was released was Microsoft’s Turing NLG at 17 billion parameters).
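
To put those parameter counts in perspective, here is a back-of-the-envelope sketch in Python. The two parameter counts come from the paragraph above; the figure of two bytes per parameter (16-bit weights) is an assumption made purely for illustration, not a statement about how OpenAI stores or serves the model.

```python
# Parameter counts quoted in the text above.
GPT3_PARAMS = 175e9
TURING_NLG_PARAMS = 17e9

# Assumption for illustration only: each parameter is stored as a 16-bit float.
BYTES_PER_PARAM = 2


def scale_ratio(larger, smaller):
    """How many times bigger one model is than another, by parameter count."""
    return larger / smaller


def weights_size_gb(params, bytes_per_param=BYTES_PER_PARAM):
    """Rough size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    ratio = scale_ratio(GPT3_PARAMS, TURING_NLG_PARAMS)
    print(f"GPT-3 vs. Turing NLG: about {ratio:.1f}x the parameters")
    print(f"Raw weights at 16 bits each: about {weights_size_gb(GPT3_PARAMS):.0f} GB")
```

In other words, GPT-3 has roughly ten times the parameters of its predecessor, and under the 16-bit assumption its weights alone would occupy hundreds of gigabytes.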

GPT-3 produces human-like language to an uncanny degree. In March 2021, OpenAI reported that more than 300 applications were using GPT-3 to interpret and generate text. For instance, a company called Viable is using GPT-3 for sentiment analysis by pulling information from live chat logs, user reviews, surveys and help desk tickets. Fable Studio uses GPT-3 to power the speech of its “Virtual Beings,” virtual reality characters that people can interact with.

Fable Studio is creating a new genre of interactive stories and using GPT-3 to help power its story-driven “Virtual Beings.” Lucy, the hero of Neil Gaiman and Dave McKean’s Wolves in the Walls, which was adapted by Fable into the Emmy Award-winning VR experience, can have natural conversations with people thanks to dialogue generated by GPT-3.

The thing with GPT-3 is that its method is neither particularly new nor particularly sophisticated (as far as cutting-edge AI research goes). It is based on a neural network architecture known as the Transformer, widely used in natural language processing (NLP), and consists of a generative model that is pre-trained on unstructured text (GPT stands for Generative Pre-trained Transformer). Many AI researchers figured that a model the size of GPT-3 would hit a point of diminishing returns, meaning that, at some point, increasing the size of the model would do nothing to improve its performance. And yet, to the surprise of many, GPT-3 performs as well as it does precisely because of its scale. It has shown the capability for meta-learning (learning to learn) to a degree unseen in other machine learning models. The fact that GPT-3 keeps improving with size and can perform sophisticated meta-learning has led researchers to dredge up old theories about the potential for AGI.
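
At its core, a generative language model learns from raw, unlabeled text to predict what comes next. GPT-3 does this with a Transformer at enormous scale; the sketch below deliberately swaps in a far simpler technique, a word-level bigram counter, just to show the "learn from raw text, then generate" loop. The toy corpus, the function names, and the greedy word choice are all illustrative assumptions and have nothing to do with OpenAI's actual implementation.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the raw text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, start, max_words=5):
    """Greedily emit the most frequent next word, one word at a time."""
    out = [start]
    for _ in range(max_words - 1):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # the model never saw anything after this word
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)


if __name__ == "__main__":
    # Hypothetical toy corpus; a real model is trained on vastly more text.
    corpus = "the robot folds the shirt and the robot opens the door"
    model = train_bigram(corpus)
    print(generate(model, "the", max_words=4))
```

The scaling argument is that replacing the counter with a very large neural network, and the one-line corpus with a large share of the internet, turns this same basic loop into something far more capable.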

American writer and researcher Gwern Branwen explains:
The blessings of scale in turn support a radical theory: an old AI paradigm held by a few pioneers in connectionism (early artificial neural network research) and by more recent deep learning researchers, the scaling hypothesis. The scaling hypothesis regards the blessings of scale as the secret of AGI: intelligence is ‘just’ simple neural units & learning algorithms applied to diverse experiences at a (currently) unreachable scale. As increasing computational resources permit running such algorithms at the necessary scale, the neural networks will get ever more intelligent.

The implication of GPT-3’s performance may be that scale has been the key to taking the first steps toward AGI all along.

Eight Theories on Conscious Computers

Artificial General Intelligence is but one step in the science fiction of the evolution of computers. The next step is, of course, sentient and conscious machines that have thoughts and feelings and are basically a race of beings unto themselves. Even if AGI is achieved, that doesn’t necessarily mean those computers will have, you know, emotions and stuff. They will just be really smart machines.

The problem? Humanity knows almost nothing about how consciousness forms. We are pretty sure that we are conscious beings but are not so certain when it comes to other lifeforms. Is a frog a conscious being? A dog?

In his book The Fourth Age, author and futurist Byron Reese collected eight theories of how consciousness forms, with an eye towards one day creating a living machine.

Weak emergence: Renowned futurist Ray Kurzweil believes that consciousness may be “an emergent property of a complex physical system.” The theory is that human brains are so complex that consciousness formed as a property of the system. Weak emergence is thus when an unexpected outcome occurs, but in retrospect is completely explainable.
Strong emergence: Unlike weak emergence, strong emergence is when an outcome occurs (like the development of consciousness) and is completely inexplicable, a trick of physics or perhaps some kind of magic. Reese gives the example of a human body. A body contains about 60 different elements, but no law of physics can explain how those 60 elements, arranged just so, can lead to the emergence of consciousness.
Physical property of matter: Instead of an emergent phenomenon, consciousness may be a completely understandable byproduct of normal physics. The problem is that we just don’t understand the inherent nature of matter and physics. In this theory, if we fully understood the physics, then consciousness would be a completely explainable process.
Quantum phenomenon: A variant of the physical property of matter theory, but specifically that consciousness is created at the quantum level. Famous mathematician Roger Penrose believes that consciousness is the outcome of quantum processes within neurons in the brain and that, since human brains and computers are so different, computers will never be able to gain consciousness.
Consciousness is fundamental: The universe has four fundamental forces: gravity and electromagnetism at the macro scale, and the strong and weak interactions at the subatomic scale. Most complex systems can be described through simpler systems (for instance, physics explains chemistry, which explains biology, which explains life). The fundamental forces are ground truth; they cannot be explained by simpler systems. Could consciousness be another fundamental force?
Consciousness is universal: This theory states that everything has a degree of consciousness. The rocks, the trees, the components of your iPhone. When people attribute consciousness to the entire planet under the guise of “Mother Earth” (derivative of the “Gaia Principle”), they are espousing a view that consciousness is universal. It’s an old principle, though it often goes by the modern name of “integrated information theory.”
Trick of the brain: Perhaps all the other theories of consciousness are overthinking the problem. Instead of some kind of magic emergence or physical system, consciousness is just how the brain works. “It [consciousness] is astonishingly wonderful but it is not a miracle and it isn’t magic. It’s a bunch of tricks,” said researcher Daniel C. Dennett.
Spiritual: Some kind of deity was like, “okay, now we give them consciousness” and it was so.

Now, if we apply these theories to the “blessings of scale” notion of GPT-3 above, we can see the ideas of weak emergence or the physical properties of matter. Since GPT-3 is so big, perhaps the very scale is leading to some kind of emergence, seen first through its meta-learning properties. Or perhaps that scale has created a sufficiently complex system where natural physics is beginning to create something new. Or, in all likelihood, it’s just a really smart algorithm on top of a giant dataset.

One Algorithm to Rule the Robot

As established above, most robots are only good at doing one kind of thing at a time. This is a function of the algorithms that govern a robot’s behavior. It’s not so much that the robot can only do one thing well, but that an algorithm tends to only be able to perform a single task. A robot can be a complex set of algorithms working together, but that kind of complexity can often lead to failure modes.

DeepMind, an artificial intelligence research company owned by Alphabet, is trying to change that. The company published research in August 2020 on how it uses one algorithm—called Scheduled Auxiliary Control—to perform a variety of different kinds of movement. Essentially, DeepMind is teaching an algorithm to multi-task by learning little snippets of one action, then another action, as opposed to learning an entire action at once and going on to the next.

Jack Clark, an AI researcher and author of the popular “Import AI” newsletter, explains:

DeepMind shows that it’s more efficient to try and learn multiple skills for a robot at once, rather than learning skills in sequence. In other words, if you’re trying to learn to walk forwards and backwards, it’s more efficient to learn a little bit of walking forwards and then a little bit of walking backwards and alternate till you’ve got it down to a science, rather than just trying to learn to walk forward, perfecting that, and then learning to move backward.
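
The difference Clark describes comes down to the order in which practice episodes are scheduled. The minimal Python sketch below contrasts the two orderings. It is not DeepMind's algorithm: the task names, the episode counts, and the simple round-robin alternation are assumptions chosen only to make the idea concrete.

```python
from itertools import chain


def sequential_schedule(tasks, episodes_per_task):
    """Practice one task to completion before starting the next."""
    return list(chain.from_iterable([task] * episodes_per_task for task in tasks))


def interleaved_schedule(tasks, episodes_per_task):
    """Alternate between tasks, one episode at a time (round-robin)."""
    return [task for _ in range(episodes_per_task) for task in tasks]


if __name__ == "__main__":
    # Hypothetical skills, echoing the walking example in the quote above.
    tasks = ["walk_forward", "walk_backward"]
    print("sequential: ", sequential_schedule(tasks, 3))
    print("interleaved:", interleaved_schedule(tasks, 3))
```

Both schedules contain the same number of episodes per task; only the ordering changes. The claim in the research is that the interleaved ordering lets what is learned on one skill help the others sooner.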

DeepMind uses reinforcement learning to teach the algorithm to move the robots. The technique cut the learning time roughly in half.

“DeepMind was able to learn a range of movements on one robot which took about 1590 episodes, netting out to around five hours of work,” wrote Clark in Import AI. “If they’d tried to learn the same skills in a single task setting, they estimate it’d take about 3050 episodes, adding another five hours. That’s an encouraging sign with regard to both the robustness of SAC and the utility of multi-task learning.”
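
The figures in Clark's summary can be checked with a little arithmetic. The sketch below uses only the numbers quoted above (1590 episodes, 3050 episodes, about five hours) and assumes, purely for illustration, that every episode takes the same amount of time.

```python
MULTI_TASK_EPISODES = 1590   # quoted above
SINGLE_TASK_EPISODES = 3050  # estimate quoted above
MULTI_TASK_HOURS = 5.0       # "around five hours"


def hours_per_episode(total_hours, episodes):
    """Average wall-clock time per episode, assuming uniform episode length."""
    return total_hours / episodes


def estimated_hours(episodes, per_episode_hours):
    """Total time for a given number of episodes at a fixed per-episode cost."""
    return episodes * per_episode_hours


if __name__ == "__main__":
    per_episode = hours_per_episode(MULTI_TASK_HOURS, MULTI_TASK_EPISODES)
    single_task_hours = estimated_hours(SINGLE_TASK_EPISODES, per_episode)
    share = MULTI_TASK_EPISODES / SINGLE_TASK_EPISODES
    print(f"Multi-task needs {share:.0%} of the single-task episodes")
    print(f"Single-task estimate: {single_task_hours:.1f} hours")
    print(f"Extra time over multi-task: {single_task_hours - MULTI_TASK_HOURS:.1f} hours")
```

Under that assumption, the multi-task run needs roughly 52% of the episodes, and the single-task approach would take about 4.6 additional hours, which is consistent with the “another five hours” in the quote.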

An AI’s View of the Future of Digital Imaging

Advances in one type of application typically find their way into others. This changing understanding of how AIs are built and how they learn will have many consequences for future applications. For example, AIs that can teach themselves may not require the huge data sets typically required for professional applications (a requirement that has hindered their use in many industries). And because AIs can be taught multiple tasks at once, a single implementation could learn to both automate and optimize a workflow. So it may become less difficult to implement AI in new applications, while the possible results could be significantly better.
