There are a few academic lists of theories of consciousness (Doerig 2020) as well as some good blog post series about specific ideas (shout out to SelfAwarePatterns), but as far as I know, there is no casually approachable, comprehensive list of current theories in a single post yet. Well, until now. “Consciousness” is used here in the intentionally vague way Thomas Nagel defined it, namely what it feels like to be something. As with some other terms, any further definition already makes debatable assumptions, and since this is not a post about semantics, we will hold on to the easy, intuitive definitions. The term “theory” is used in a conversational way here. If you want more technical correctness, think of “hypothesis” every time you read it.

No current theory gets everything right, and some feel more wrong to me than others. My goal here is still to give each of them a fair representation, limiting my commentary to the end of each section if possible. If any theory here seems completely misguided, that is certainly my fault and not that of the original authors. At the end of each theory, I try to apply its reasoning to some problems that I find particularly interesting. The theories are ordered so that I can easily cross-reference points I made in an earlier section. The order is neither chronological nor a ranking.

Alright, got the disclaimers out of the way. Time for the actual post.

Most images in this post were generated by DALL·E, btw!

Preliminaries

To understand the terms used in this post, here is a very quick and dirty rundown of the relevant concepts. Many of these have historically been interpreted in very different ways. In my opinion, a lot of time is wasted today arguing about their exact definition, so I will very briefly state which definitions are used. To not get bogged down by semantics, I keep the definitions intentionally very broad and short. Whole books can be written about any of these, but today we will be satisfied with a few sentences instead. Each definition links to its corresponding Stanford Encyclopedia of Philosophy entry for further reading.

  • The Mind-Body Problem: How do physical states in the brain correspond to mental states? How does one result from the other?

  • Qualia: The subjective experience of… stuff. The way the color red looks to you. The fact that pain does not just feel like a generic input but, well, painful. In Daniel Dennett’s words, “an unfamiliar term for something that could not be more familiar to each of us: the ways things seem to us”. Define the term any further and you will have people debating whether qualia exist at all. Whenever I use this term in this post, think of this very simple definition and forget any further technicalities you might already know of.

  • The Chinese Room: Say there’s a person in a room. There’s a slit through which people push letters written in Chinese. The operator in the room neither speaks nor reads Chinese, but there are some thick books in the room telling them how to handle the (for them) incomprehensible symbols. Based on these instructions, they assemble a response out of a series of symbols and return it through the slit. The person outside the room is then able to read a perfectly fine answer to their letter, written in Chinese, and will thus conclude that there is a person inside the room who is fluent in Chinese. But there isn’t. What’s up here? Note that this thought experiment was originally designed as an argument for why computers might never achieve a real understanding of the world and thus could never be conscious, but several authors disagree with this conclusion in many interesting ways.

  • Philosophical Zombies: A hypothetical living being that outwardly behaves exactly like a conscious person but experiences no qualia. Give it a piece of chocolate and it will happily say thanks and appear to enjoy eating it, all the while there is no consciousness experiencing any of it. A bit like a Chinese Room in the form of a person[1]. We will encounter different opinions on whether the thought experiment is a) conceivable in your imagination and b) possible in reality.

  • The Teletransporter Paradox: Imagine that in a few years engineers manage to build a teleporter. You step into it and get scanned while your body is being split into single molecules. This information is used at a destination to perfectly rebuild you from the molecule level up. After many successful trials, it gets rolled out to the public. Like many others, you refuse to enter the teleporter because you fear that it essentially murders you and replaces you with a mere clone. Years pass, and since most workplaces are now heavily teleporter based, you’ve seen countless friends and family walk into them, disintegrate, and be recreated in the evening again. Although it felt weird to interact with them for the first few weeks after the teleporters were introduced, your mind soon adapted and you stopped feeling awkward about their teleporter usage. Over time, you got so used to the omnipresence of teleporters that you decide that you can’t get left behind like this forever. You tell yourself that you’ve led a good life and in the worst case your death would be instant and painless. But these post hoc rationalizations don’t matter; you already made up your unconscious mind because the teleporters are now so commonplace that you cannot see them as murder machines anymore. You step into one, take a deep breath, and then… you emerge unharmed at your workplace as if nothing happened. First, you feel immense relief. Then you feel proud that you went through it and cannot help but feel a bit silly about your initial doubts. The next time you use the teleporter, things are not that intense anymore; you’ve been here before, you know the deal. Soon, teleporting becomes as mundane for you as it is for everyone else. One evening when you want to teleport home, you see that your boss installed a new model. You enter it; it has a fancy new monitor that shows you your destination. 
You first feel a little invaded in your privacy, but you decide it’s okay since the preview only shows up once you have already pressed your badge against the terminal. You close your eyes, hear the familiar humming of the machine and open your eyes. To your great surprise, you’re still in the office teleporter! You assume the teleport got canceled; that happens from time to time. You check the monitor and a cold shiver runs down your entire body as you see what appears to be yourself emerging on the other side. Suddenly, an engineer wielding a gun comes into the small cabin. With a bored facial expression, he points the weapon at you and mumbles: “Sorry, the disintegrator didn’t go off. Hold still for a second please.” The story brings up many interesting questions (and intuitions!) to argue about, but for this blog post, we will ask ourselves whether we should feel afraid before being shot. Of course, being human, I don’t think I have much of a choice and will be afraid no matter what because of my instincts, but keep in mind that the question is only whether or not I should feel afraid.

  • The _ Problem of Consciousness

    • Easy: how do the physical and chemical processes in the brain result in our behavior? The word “easy” is used tongue-in-cheek.
    • Hard: why is behavior accompanied by consciousness? If we can explain all human behavior as a series of processes going on in the neurons, why is there “a light on inside”? Why are we not all philosophical zombies?
    • Real: how do the physical and the chemical processes in the brain result in the properties we associate with consciousness? This formulation evolved as an answer to the perceived inadequacy of the easy and hard problems.

I will summarize what, in my reading, each of the theories presented has to say about some of these problems. I highlight them because the answer to what happens to zombies teleporting into Chinese rooms has extreme consequences for the feasibility of mind uploading, which is a topic very dear to me.

Alright, all caught up? Again, if you know these already, you might be angry at me for not doing them justice, I know. But they are not the focus of this post, so we must compromise.

Mysterianism

In 1989 Colin McGinn asked himself “Can We Solve the Mind-Body Problem?”. His humble conclusion: probably not [2]. Probably never. Even if we had the solution right in front of us, we might not be able to comprehend it as such. For all we know, someone might have already found out all there is to how consciousness arises in the brain, but we are doomed to never be happy with physical explanations for a phenomenon that seems so magical to us. While the fact that you are currently reading a blog post about more than one theory of consciousness already gives away that I do not agree with this view, I will do my best to explain it in good faith.

Intuition versus reality

The reason for our ignorance is that our understanding of what consciousness even is can only be formulated using whatever kind of consciousness we happen to have. But we have no reason to believe that our minds are capable of omniscience. Try understanding the Monty Hall Problem intuitively and you will very quickly make acquaintance with your mental limits. Anecdotally, at least for me, doing an undergrad minor in statistics deeply humbled my intuition. Let me recount my favorite example. Say you’re dealt 13 out of 52 standard playing cards. Call the chance of getting two aces A. Now imagine a second round, in which I tell you that I know that you already have at least one ace in your hand. The chance of holding two aces in this scenario is B. Lastly, I tell you that the ace I know you’re holding happens to be the ace of spades. The chance of holding two aces is now C. Can you sort A, B and C? The obvious solution is A < B = C, but the actual, mind-boggling answer is A < B < C. I was so surprised by this result when I first encountered it that I wrote a computer simulation to test it, and yes, the math is right. My grasp of how reality works was not as good as my intuition wanted to tell me[3].
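The ordering of A, B and C can also be checked by exact counting instead of simulation. The following is only a sketch under stated assumptions: it reads “two aces” as “at least two aces”, treats every 13-card hand as equally likely, and uses made-up names (`p_exact_aces`, `a`, `b`, `c`) that do not come from the original simulation.

```python
from fractions import Fraction
from math import comb

HAND, DECK, ACES = 13, 52, 4


def p_exact_aces(k):
    """Probability that a random 13-card hand holds exactly k aces."""
    return Fraction(comb(ACES, k) * comb(DECK - ACES, HAND - k), comb(DECK, HAND))


p_zero = p_exact_aces(0)
p_one = p_exact_aces(1)

# A: at least two aces, with no extra knowledge (assumed reading of "two aces").
a = 1 - p_zero - p_one

# B: at least two aces, given that the hand holds at least one ace.
b = a / (1 - p_zero)

# C: at least two aces, given that the hand holds the ace of spades.
# The other 12 cards come from the remaining 51; we need at least one
# of the 3 remaining aces among them.
c = 1 - Fraction(comb(DECK - ACES, HAND - 1), comb(DECK - 1, HAND - 1))

print(f"A = {float(a):.3f}, B = {float(b):.3f}, C = {float(c):.3f}")
```

Under these assumptions A comes out at roughly 0.26, B at roughly 0.37 and C at roughly 0.56, so the counter-intuitive ordering A < B < C holds.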

Just as it will always be beyond the grasp of a dog to understand how its liver works, it might be beyond our grasp to understand how consciousness works.

The march of science

You might interject: “Aha! But in contrast to a dog, we have the steady advance of science! We simply have not yet found the answer, but surely we can get closer and closer until we reach it!”. Not so fast. McGinn has thought of this too:

“People sometimes ask me if I am still a mysterian, as if perhaps the growth of neuroscience has given me pause; they fail to grasp the depth of mystery I sense in the problem. The more we know of the brain, the less it looks like a device for creating consciousness: it’s just a big collection of biological cells and a blur of electrical activity - all machine and no ghost.”

Even if we already had the solution to the problem of consciousness right in front of us, we wouldn’t accept it. Our own consciousness feels special; it feels so much as if we had a soul in us that we cannot but think of our mind as something otherworldly. No matter that I am a convinced determinist, I still feel so much as if I had free will that it borders on an irrational feeling of knowing. For this reason, we will never intuitively accept a theory of consciousness that explains us in terms of flesh and electricity, much less so in terms of a set of equations.

Going beyond our limits

All that said, it seems to me that any claim that contains the word “never” is a bit bold. Putting such an emphasis on how we feel about ourselves is also too speculative for my taste. Sure, it is a strange situation to have a consciousness try to understand the phenomenon of consciousness through the lens of its own consciousness. A strange loop, yes, but is this necessarily limiting us in any fundamental way?

Recall my example: I could not comprehend how merely knowing that there is a spade drawn on my ace increases my chance of holding two aces. I eventually understood the intuition behind the solution, but it still competes with the primal intuition I first formed when looking at the problem. Here, my mind is also holding a conflicting set of beliefs: one thinks the result is magical, and the other one knows it is mundane. This does not hinder me from understanding the solution because as a human I have the capacity for metacognition, noticing the flaws in the way my mind works if I look carefully enough.

Just as I can flip between seeing a young and an old woman but never both at the same time, I can flip between the intuitive and the real answer to problems my intuition is not built for.

Granted, I will probably always see my consciousness as magical, but that should not stop me from also knowing that it is not. Just as I can see the world as deterministic even though everything in my being screams against it, I believe one day someone will be able to see the way flesh and electricity, and even mere equations, give rise to consciousness.

Stances (according to me)

| Topic | Stance |
| --- | --- |
| Mind-Body Problem | Not able to be answered. |
| Chinese Room | Not able to be answered. |
| Philosophical Zombies | Conceivable, but their physical plausibility is not able to be answered. |
| Teletransporter Paradox | Not able to be answered. |

Cartesian Dualism

Long ago, René Descartes tried doubting as much as he could. He found he could doubt that his body existed; after all, an evil demon might be controlling his perceptions in a dream [4]. But while he was doubting, he noticed that he was doubting. Since doubting cannot happen without someone doing it, he concluded that by the mere act of doubting he must fail to doubt his existence. In fancy Latin: “Dubito, ergo sum”, or, its more famous cousin, “Cogito, ergo sum”. Since the mind and body thus don’t share a property, namely whether we can doubt their existence or not, he further concluded that mind and body cannot be the same thing.

All of this led Descartes to separate the world into res extensa, the physical stuff whose existence can be doubted, and res cogitans, the thinking stuff which must exist. This means that the thinking stuff can be conceived to be able to exist independently of any physical stuff. To explain this, Descartes concludes that consciousness resides in a different realm than physics.

A solipsist might not extend this privilege to others, since their exclamations of the cogito could still be the works of the evil demon. Animals in particular are often imagined as flesh robots with no mind stuff at all in this view.

Does Batman Exist?

Incognito ergo sum

There have been numerous refutations of this argument, but I will retell my favorite one. Imagine Batman and the Joker facing each other in a room. Barring any evil demons, the Joker cannot doubt the existence of Batman, because he is right there with him. But the Joker can doubt the existence of Bruce Wayne, Batman’s alter ego, who might have been killed by the Joker’s goons. Thus, he wrongly concludes that since he can doubt the existence of Bruce Wayne but not of Batman, they cannot be the same person.

In short: just because you can imagine a state of the world doesn’t mean it’s logically possible. Keep this in mind for philosophical zombies.

In addition, I must reject a central claim in the argument: given my current knowledge of computational neuroscience, I cannot really imagine a world where consciousness exists without some kind of physical substrate.

Dualism Today

This theory of consciousness is quite different from the others discussed insofar as it is the only one that does not assume physicalism, i.e. that all phenomena are the result of physical interactions. I included it because it’s arguably the most widely accepted theory in the world. This might surprise you. The reason is that having a way for consciousness to exist independently of physical phenomena is the only mechanism by which most[5] religions can plausibly promise life after death. Thus, most religious people have implicitly accepted Cartesian Dualism, even if they are not aware of it. For religions positing the existence of a soul, we can directly equate it with the aforementioned mind stuff. Physicalists, on the other hand, are far from agreeing on a generally accepted framework of consciousness, as this post should be able to tell you.

Stances (according to me)

| Topic | Stance |
| --- | --- |
| Mind-Body Problem | The mind exists independently of any physical reality, so the body does not give rise to it at all. |
| Chinese Room | Since physical stuff cannot create mind stuff, the Chinese room is not conscious. |
| Philosophical Zombies | Conceivable and realistic; they are just people without mind stuff. |
| Teletransporter Paradox | Since mind stuff is indivisible, the person on the other end cannot be you. And since physical stuff cannot create mind stuff, the person on the other end cannot be conscious either. Because, per the premise of the thought experiment, they are indistinguishable from you, that makes them a philosophical zombie. |

Global Workspace Theory

Think about what activities of your mind you can consciously perceive. You are aware of the images in front of your eye (or your illusion thereof). You can be aware of your emotions. You are aware of whether you’re feeling a bit cold or too hot. You are aware of this sentence you’re reading. Now imagine all the things going on in your mind that are not currently part of your consciousness. There’s passive information, like how this blog post started. If you think about it, you might recall the words, so the information must be stored in your brain. But had I not prompted you, you would probably not have thought about them. Similarly, even though you’re not aware of it at this very moment, you can instantly bring the time you planned on going to bed today into your consciousness. Then there’s also activity that you cannot bring into your consciousness. You cannot perceive how your hypothalamus decides which hormones the pituitary gland should secrete. You cannot perceive how a memory is formed or erased. Ever had the feeling you were being watched but did not know why you felt that way? Your brain generated a hypothesis, but you had no access to it [6].

Think about the last movie you saw. Close your eyes and try to list as many names of characters in the movie as you can, including minor roles. Done? Now ask yourself: what exactly happened when you tried to recall the names? If you’re like me, flashes of some scenes might have popped into your head and some dialog started playing in your memories. The protagonists and villains came up instantly, and when remembering the story, some side characters will have come up as well. But the really interesting part is when you hit the metaphorical brick wall and know there are some missing names that you just cannot recall. You already went through the plot and all scenes you remember and are stuck. No methods are helping anymore. You concentrate on… something? And poof, at some point a new name comes into your mind after all. What happened the second before you remembered the name? Your mind must have been doing something, otherwise, you wouldn’t have gotten a result, but the action is hidden from you [7].

The Theater of the Mind

So, some things are part of our consciousness, some are not. This is often viewed through the metaphor of the mind as a theater. There is an audience in front of it, but they are kept in absolute darkness. The only people visible are the actors playing thoughts on the stage because they are illuminated by the spotlight of consciousness. Sometimes an actor leaves the stage, and sometimes another one enters. Sometimes the spotlight is smaller, sometimes bigger. But the spotlight always remains the only light in the room.

Life's a game, but consciousness is a play

Different theories of consciousness interpret the metaphor in different ways. Global Workspace Theory interprets the audience as unconscious thinking modules of the brain. They might have discussions and do all sorts of things, but they are not under the spotlight and thus can only communicate with their neighbors. The actors are the items in your consciousness, and the stage they’re on is the titular global workspace, visible to all other guests in the room. The key idea here is that mental items like thoughts can be generated by different parts of the brain, but are locally confined. Through some mechanism, be it a heuristic of importance or a voting system, a thought can be “upgraded” and enter the global workspace. This makes it available to other brain areas and your consciousness. The claim is indeed that the content of our consciousness is identical to the content of the global workspace.
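To make the “upgrade” step a bit more concrete, here is a toy sketch of a workspace in which local proposals compete and the winner is broadcast to every module. Everything in it is an illustrative assumption: the `Module` class, the `broadcast_winner` function, the example contents and the salience numbers are invented for this sketch, and picking the single most salient proposal is just one stand-in for the “heuristic of importance” mentioned above, not a mechanism proposed by any particular author.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """A hypothetical unconscious module (an audience member in the metaphor)."""
    name: str
    heard: list = field(default_factory=list)  # broadcasts received so far


def broadcast_winner(proposals, modules):
    """Select the most salient proposal and make it available to all modules.

    proposals: list of (source_name, content, salience) tuples.
    Returns the content that entered the workspace.
    """
    _source, content, _salience = max(proposals, key=lambda p: p[2])
    for module in modules:
        module.heard.append(content)  # the content is now globally accessible
    return content


modules = [Module("vision"), Module("hearing"), Module("memory")]
proposals = [
    ("vision", "a ball is flying towards me", 0.9),
    ("hearing", "someone is humming", 0.2),
    ("memory", "I left the oven on", 0.6),
]

workspace = broadcast_winner(proposals, modules)
print(workspace)
```

In this toy version the losing proposals simply stay local: no other module ever hears about the humming or the oven unless they win a later round.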

A Small and Neat Idea

You may have noticed that the given explanation does not address the Hard Problem of consciousness at all. One can argue that it doesn’t address the Real Problem either, since it is not concerned with exploring how things appear to us, but only when. It only talks about the contents of consciousness, not consciousness itself. In its simplest form, it simply states that a global workspace is necessary for consciousness. Of course, some authors go further: the strictest form states that the global workspace is consciousness.

I think the theory offers a nice piece of a grander puzzle. The simple form is such a well-behaved hypothesis, with a small scope and an intuitive claim, that it can easily be plugged into other, more encompassing theories. Different authors offer specific mechanisms for where in the brain the postulated global workspace is found, how information is pushed there, and how it disperses through the brain. This lets researchers generate testable hypotheses and thus makes the theory realistically falsifiable by experiment, which makes it doubly appealing in my mind. We will see that many other theories, interesting though they might be, do not offer this essential feature.

There is one notable conflict though: this view predicts that an empty global workspace results in no contents of consciousness and vice versa. What then is the feeling of “pure consciousness devoid of content except for this experience” reported during deep meditation? It seems like the only answer global workspace theory can provide is that there must still be some kind of item left, and thus the “devoidness of content” is a mere illusion.

Stances (according to me)

| Topic | Stance |
| --- | --- |
| Mind-Body Problem | The contents of consciousness are the mental items that are made globally accessible to the brain. |
| Chinese Room | No stance, but the strong form would predict that a Chinese Room implementing a global workspace as part of its algorithm is conscious. |
| Philosophical Zombies | No stance, but since the zombie’s brain presumably includes a global workspace, the stronger form denies their possibility. |
| Teletransporter Paradox | No stance on individuality or continuity, but since the global workspace was duplicated, the two versions of the passenger at least have identical contents of consciousness. But this is presumably given in the premise anyway. |

Predictive Processing

When you attend a typical introductory neuroscience lecture, the brain will usually be presented to you as an input-processing-output machine that processes information from the bottom up. The example given is almost always the visual system, courtesy of it being the most studied part of the brain. The sensory inputs, in this case, are the (sometimes merely a dozen) photons landing on your retina, causing a signal cascade running up a series of cells until the on/off signal is turned into a frequency, which is sent to the visual cortex. The primary visual cortex processes the input by extracting lines at particular orientations, which are further processed in the secondary and tertiary visual cortices into movements and shapes and at some point into the recognition of entire objects. If the object is a baseball you’re trying to catch, the brain will now activate the primary motor cortex, which creates a high-level plan of action like “stretch out an arm and open its hand”. The flow of information then continues into regions like the cerebellum, which refine the plan and specify which muscle groups should be involved. Finally, an output signal shoots down your spinal cord and activates many different muscles in your arm, hand, legs, etc. to catch the ball. Consciousness is usually entirely left out of this picture, leaving people wondering why it’s needed in the first place.

And now the other way around

The predictive processing view turns this upside down. It posits that the rich world we perceive in our mind’s eye is not the result of input being processed, but rather our prediction of how the world should look given our current knowledge. Part of this prediction is anticipating how our sensors will be stimulated next. This is a predominantly top-down view!

Of course, our predictions will never be perfect. If we subtract the actual sensory inputs we get from their predicted patterns, we get the prediction error. The brain can be seen as a machine with only one task: to reduce this prediction error [8]. To this end, we have two options: we can either update our mental model to a more accurate one or change the inputs we get so that they correspond to what we predicted them to be. The first should be a straightforward idea, but the second is more intriguing. We can either change the inputs themselves, e.g. by downregulating them, or manipulate the world around us. In this view, if we wish to catch a ball while playing baseball, we simply predict that we will catch it and manipulate the external world to achieve this goal. The external world in this case includes our own body, since we can move our muscles so that our prediction becomes true. But there are more layers to this prediction: moving our muscles is itself done via a prediction. I predict that I will see my arm in front of my face, and this prediction is the signal to move the arm that way. Thus, all predictions are themselves active causes for something to happen, either in our mind’s models or in the real world. This principle is called active inference.
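The two ways of shrinking a prediction error can be illustrated with a deliberately tiny numerical toy. This is not a model from the predictive processing literature: the one-dimensional “input”, the function names `update_model` and `act_on_world`, and the fixed `rate` of 0.5 are all assumptions made purely for illustration.

```python
def prediction_error(predicted, sensed):
    """Predicted pattern minus the actual sensory input."""
    return predicted - sensed


def update_model(predicted, sensed, rate=0.5):
    """Option 1: revise the prediction so that it moves towards the input."""
    return predicted - rate * prediction_error(predicted, sensed)


def act_on_world(predicted, sensed, rate=0.5):
    """Option 2 (active inference): change the input so it moves towards the prediction."""
    return sensed + rate * prediction_error(predicted, sensed)


# Perception: the prediction chases a fixed input.
predicted, sensed = 0.0, 1.0
for _ in range(20):
    predicted = update_model(predicted, sensed)
perception_error = abs(prediction_error(predicted, sensed))

# Action: the seen hand position (the input) chases a fixed prediction.
predicted_hand, sensed_hand = 1.0, 0.0
for _ in range(20):
    sensed_hand = act_on_world(predicted_hand, sensed_hand)
action_error = abs(prediction_error(predicted_hand, sensed_hand))

print(perception_error, action_error)
```

Both loops drive the same error towards zero; the only difference is which side of the subtraction is allowed to change.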

Many prediction errors need to be minimized, so the brain must prioritize. Whatever promises to minimize the prediction error most will be further analyzed. This is exactly the content of our attention (Hohwy 2012).

Hallucinating the world

The resulting content of consciousness can be called a “controlled and controlling hallucination” (Seth, Being You: A New Science of Consciousness, 2020). Since we only experience what our brain predicts, and what we predict in turn guides our actions, we hallucinate the world in a very controlled way so that we can control the world around us. This principle can be applied to many things. If we have a very strong prediction of something, we can selectively filter our inputs so that the prediction comes true, perfectly explaining confirmation bias (Kaaronen 2018). Maintaining our body at a certain heart rate, temperature, etc. is also a prediction. If some external event forces primitive parts of our brain into changing these variables, our prediction is violated and our prediction-generating parts might not be able to adjust the environment. In this case, we must create a new mental model accounting for the changes our body experiences to better predict the next states. This model might summarize the changes as a single concept which we call an emotion (Seth 2013).

If we wish to achieve, create or maintain something, we hallucinate its existence into a kind of self-fulfilling promise.

Hallucinating yourself

Dreaming yourself into existence

By the same logic, if we wish to preserve ourselves, we simply hallucinate our inner lives as an actor. Thus we will take actions that ensure that this prediction stays true, culminating in our desire to stay alive. This means that our consciousness is nothing but an illusion: the prediction that there will be someone there in the future, which thereby makes it so. This is technically not a proper theory of consciousness per se, but a theory for how to examine consciousness. Through this lens, we can make testable predictions about the contents of consciousness (which include all high-level predictions including one about our own continued existence), our actions, emotions, perceptions, and biases (Seth 2021).

Stances (according to me)

| Topic | Stance |
| --- | --- |
| Mind-Body Problem | Consciousness is a product of the same predictions that govern all of our actions through active inference. |
| Chinese Room | No direct stance. If one can construct a Chinese room without using any kind of predictive system, it will not be conscious. It is however not clear if such a room could fully emulate a human being. |
| Philosophical Zombies | Consciousness is an automatic product of our prediction-generating brain, thus a philosophical zombie that is physically identical to a normal human must get consciousness for free as well. They are not conceivable. |
| Teletransporter Paradox | No direct stance, but since consciousness is just an illusion resulting from certain data, it stands to reason that the same illusion can be created anywhere at any time, thus making it okay to kill the person left behind. |

Integrated Information Theory

This one is special because it is the only theory on this list that promises to produce an objectively quantifiable number, Φ (Phi)[9], that tells you how conscious not just any human, but any system is. When I first heard about this theory, it felt to me as if Φ might as well be magic, since I only learned its derivation in very broad strokes. As a consequence, it was hard for me to give it a charitable view. Since I want to give you a fairer introduction, this part of the post will contain quite a bit of maths. For people who have done just a bit of probability theory, this should all be familiar. If you don’t feel like looking at equations today, you can safely skip them all and still understand everything. I just provide them to show you how to, in principle, calculate the measures proposed by the theory for actual examples with actual numbers. At least for me, doing that demystifies a difficult concept like nothing else.

I will be using the methods described in the original publication of the theory (Tononi 2004). Since it was first proposed, many years have passed and brought new iterations and refinements to the calculation of Φ. Since these were additive and designed to implement the existing ideas more robustly, the calculations got more complicated. I don’t think that this changed the core of the maths behind it that much. Thus, I will skip these new details and just focus on the originals, trading accuracy for clarity. For reference, here is the specification for how to calculate Φ in the current version at the time of writing.

Information and Entropy

Each time you experience consciousness, there is a nearly endless number of experiences you are not having, but could be having. A simple example is that right now, you are experiencing reading this blog post and thus (probably) missing the experience of playing videogames, doing a headstand, doing a headstand while playing videogames, or commanding the Royal Malaysian Air Force. Not all of these are realistic, but they are all possible in the sense that your brain is capable of experiencing them given the right circumstances.

This insight is formalized as your consciousness carrying information by being in a specific state distinct from others. The idea of information theory is a well-developed one, so we can use a measure called Shannon entropy to calculate how much information a given system has. The entropy H of a random variable x with a known probability p(x) can be expressed as

$$H(x) := E[-\log_2 p(x)] = -\sum_{i=1}^N p(x_i) \log_2 p(x_i)$$

If all of our outcomes are just as likely, this simplifies to:

$$\begin{aligned} H(x) &= -\sum_{i=1}^N p(x_i) \log_2 p(x_i) = -\sum_{i=1}^N \frac{1}{N} \log_2 p(x_i) \\ &= -\frac{1}{N} \sum_{i=1}^N \log_2 p(x_i) = -\frac{1}{N} N \log_2 p(x_i) \\ &= -\log_2 p(x_i) \end{aligned}$$

For example, the probability of a coin flip yielding heads is 0.5, so if the coin is flipped and it lands on heads, its information value is $H(\text{heads}) = -\log_2 p(\text{heads}) = -\log_2 0.5 = 1$.
Let’s repeat this for a fair die roll. Say we rolled a three. We get

$$H(\text{three}) = -\log_2 p(\text{three}) = -\log_2 \frac{1}{6} \approx 2.6$$

Aha! Seems like a die roll creates more information than a coin flip. You can easily see that the entropy is higher for rare events.
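To make the coin and die numbers reproducible, here is a minimal Python sketch of the two formulas. The function names `surprisal` and `entropy` are my own labels for illustration, not terminology taken from Tononi's paper.

```python
import math

def surprisal(p):
    """Information content, in bits, of one outcome that has probability p."""
    return -math.log2(p)

def entropy(probabilities):
    """Shannon entropy, in bits, of a whole probability distribution.

    Outcomes with probability 0 are skipped, following the usual
    convention that 0 * log2(0) counts as 0.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

coin = [0.5, 0.5]
die = [1 / 6] * 6

print(round(surprisal(0.5), 2))    # heads on a fair coin: 1.0
print(round(surprisal(1 / 6), 2))  # a three on a fair die: 2.58
print(round(entropy(coin), 2))     # 1.0
print(round(entropy(die), 2))      # 2.58
```

When every outcome is equally likely, the entropy of the distribution and the surprisal of any single outcome coincide, which is exactly the simplification derived above.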

In other words, entropy can be interpreted as a measure of our surprise at the result. Imagine a friend comes up to you and tells you that they again didn’t win the lottery. Are you surprised? Hardly. You probably don’t know the exact probability of winning, but just hearing the word “lottery” created the assumption of a low probability of winning in your mind. [10]

Integration

Integration encodes the idea of how dependent systems are on each other. Think of a total system composed of a camera where each photodiode is its separate subsystem. The camera’s sensor has a gigantic amount of photodiodes, each either on or off. This means that it contains quite a bit of entropy! But this information is poorly integrated since each photodiode subsystem is independent of all others; they don’t influence each other at all. In contrast, imagine if each photodiode was forced to adopt the state of its neighbors. The whole camera would only be able to produce either a picture of an entirely white or black screen! The system now has the same entropy as a coin flip, but a very high integration.

A measure of how dependent two variables are on each other is the mutual information (I). If you imagine a Venn diagram of a system containing the variables A and B, the mutual information is the intersection of A and B. Thus:

$$I(A,B) = H(A) + H(B) - H(A,B)$$

where H(A) and H(B) are the marginal entropies and H(A, B) is the joint entropy.

Mutual Information in Action

Let’s do an example. Say system A consists of me tossing a three-sided die. System B is a lamp that starts turned off (represented by 0) and is turned on when I roll a three (represented by 1). We can express all possible scenarios in a table, where we write down the probability of a certain combination of states A and B happening. For example, the probability of A being 2 and B being 0 is one in three, because we roll a two with a probability of one in three and the lamp stays off in that case. The probability of A being 2 and B being 1, however, is zero, because we only turn the lamp on when we roll a three. These values are the joint probabilities.

| - | $A = 1$ | $A = 2$ | $A = 3$ |
| --- | --- | --- | --- |
| $B = 0$ | $\frac{1}{3}$ | $\frac{1}{3}$ | 0 |
| $B = 1$ | 0 | 0 | $\frac{1}{3}$ |

The joint entropy uses all those joint probabilities inside the table. Reading the table from left to right, top to bottom, we get:

H(A,B)=i=1Np(Ai)j=1Mp(Bj)log2p(Ai,Bj)=13log21313log2130log200log200log2013log213=313log213=log2131.6H(A, B) = -\sum_{i=1}^N p(A_i) \sum_{j=1}^M p(B_j) \log_2 p(A_i, B_j) \\ = -\frac{1}{3} \log_2 \frac{1}{3} -\frac{1}{3} \log_2 \frac{1}{3}- 0 \log_2 0 \\ - 0 \log_2 0 - 0 \log_2 0 - \frac{1}{3} \log_2 \frac{1}{3} \\ = 3 \frac{1}{3} \log_2 \frac{1}{3} = \log_2 \frac{1}{3} \approx 1.6

We can extend this table by writing the sums of the probabilities in the rows or columns in the margins, yielding the aptly named marginal probabilities:

| - | $A = 1$ | $A = 2$ | $A = 3$ | Marginals of B |
| --- | --- | --- | --- | --- |
| $B = 0$ | $\frac{1}{3}$ | $\frac{1}{3}$ | 0 | $\frac{2}{3}$ |
| $B = 1$ | 0 | 0 | $\frac{1}{3}$ | $\frac{1}{3}$ |
| Marginals of A | $\frac{1}{3}$ | $\frac{1}{3}$ | $\frac{1}{3}$ | - |

Now, we can calculate the constituents of the mutual information! The marginal entropy of A is just the entropy of the marginal probabilities of A:

$$\begin{aligned} H(A) &= -\sum_{i=1}^N p(A_i) \log_2 p(A_i) = -\frac{1}{3} \log_2 \frac{1}{3} - \frac{1}{3} \log_2 \frac{1}{3} - \frac{1}{3} \log_2 \frac{1}{3} \\ &= -3 \cdot \frac{1}{3} \log_2 \frac{1}{3} = -\log_2 \frac{1}{3} \\ &\approx 1.6 \end{aligned}$$

In the same vein, we get the following for the marginal probabilities of B (try it for yourself if you’ve never done this before!):

$$\begin{aligned} H(B) &= -\sum_{i=1}^N p(B_i) \log_2 p(B_i) = -\frac{2}{3} \log_2 \frac{2}{3} - \frac{1}{3} \log_2 \frac{1}{3} \\ &\approx 0.9 \end{aligned}$$

Finally, we can calculate our mutual information:

$$I(A,B) = H(A) + H(B) - H(A,B) \approx 1.6 + 0.9 - 1.6 \approx 0.9$$
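The whole die-and-lamp calculation fits in a few lines of Python. This is only a sketch of the textbook definition of mutual information; the layout of the `joint` table (rows are states of B, columns are states of A) and the function names are assumptions I made for this example.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def mutual_information(joint):
    """Mutual information of two variables given their joint probability table.

    joint[b][a] is p(A = a, B = b): one row per state of B,
    one column per state of A.
    """
    marginal_b = [sum(row) for row in joint]           # row sums
    marginal_a = [sum(column) for column in zip(*joint)]  # column sums
    joint_flat = [p for row in joint for p in row]
    return entropy(marginal_a) + entropy(marginal_b) - entropy(joint_flat)

die_and_lamp = [
    [1 / 3, 1 / 3, 0],  # B = 0: lamp off
    [0, 0, 1 / 3],      # B = 1: lamp on
]

print(round(mutual_information(die_and_lamp), 2))  # 0.92
```

The unrounded result is about 0.918, which is the 0.9 used in the text.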

Effective Information

For real, messy systems, we still need to build up a probability table. We built the last one by using a priori knowledge about the systems, namely that the lamp does not influence the die and that the die has a one in three chance of rolling any given number. In reality, we don’t have this a priori knowledge; we don’t know how likely each state of A or B is individually and we might be only able to measure them together. To do this, we use a little trick: to check the effect of A on B, we replace A with an entirely random source of inputs and see how B behaves. Completely random means maximum entropy, thus we can formalize this as replacing A by AHmaxA_{H_{max}}. Measuring the mutual information after this manipulation results in the effective information (EI) (Tononi 2004):

$$EI(A \rightarrow B) := I(A_{H_{max}}, B)$$

We can measure both the effect of A on B and vice versa. The sum of both is called the cause/effect repertoire:

$$EI(A \leftrightharpoons B) = EI(A \rightarrow B) + EI(B \rightarrow A)$$

Let’s calculate this as well! Luckily for us, setting A to maximum entropy is the same as using a die roll to determine the value of A, which we already do! So, for our scenario (but not in general!):

$$EI(A \rightarrow B) = I(A_{H_{max}}, B) = I(A, B) \approx 0.9$$

The other way around is also easy since we defined it as our lamp not having any effect on the die. If we treat the state of B as a completely random value of either 0 or 1, i.e. a coin flip, we get the following probability table, reflective of two completely independent variables:

| - | $A = 1$ | $A = 2$ | $A = 3$ | Marginals of $B_{H_{max}}$ |
| --- | --- | --- | --- | --- |
| $B_{H_{max}} = 0$ | $\frac{1}{6}$ | $\frac{1}{6}$ | $\frac{1}{6}$ | $\frac{3}{6}=\frac{1}{2}$ |
| $B_{H_{max}} = 1$ | $\frac{1}{6}$ | $\frac{1}{6}$ | $\frac{1}{6}$ | $\frac{3}{6}=\frac{1}{2}$ |
| Marginals of A | $\frac{2}{6}=\frac{1}{3}$ | $\frac{2}{6}=\frac{1}{3}$ | $\frac{2}{6}=\frac{1}{3}$ | - |

Since we already did all of these calculations, I’ll skip the step-by-step explanations for how to get the effective information / mutual information now. Again, if you’ve never done this kind of probability juggling, feel free to try it for yourself! In any case, even if you don’t want to calculate anything, take a moment and ask yourself which result you expect. Given that B does not influence A, what do you think $EI(B \rightarrow A)$ should be?

The maths says:

$$\begin{aligned} H(A) &\approx 1.6 \\ H(B_{H_{max}}) &= 1 \\ H(A,B_{H_{max}}) &\approx 2.6 \\ EI(B \rightarrow A) &= I(A, B_{H_{max}}) = H(A) + H(B_{H_{max}}) - H(A,B_{H_{max}}) \\ &\approx 1.6 + 1 - 2.6 = 0 \end{aligned}$$

Unsurprisingly, it is zero! Thus, our cause/effect repertoire has the following value:

$$EI(A \leftrightharpoons B) = EI(A \rightarrow B) + EI(B \rightarrow A) \approx 0.9 + 0 \approx 0.9$$
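The "replace the source with noise" trick can be sketched in Python as well. I assume here that each mechanism is given as a table of conditional probabilities `conditional[s][t] = p(target = t | source = s)`; that representation and the function names are my own choices for illustration, and the sketch only covers this two-element toy system, not the full procedure from the paper.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def mutual_information(joint):
    """Mutual information from a joint probability table (a list of rows)."""
    row_marginals = [sum(row) for row in joint]
    column_marginals = [sum(column) for column in zip(*joint)]
    joint_flat = [p for row in joint for p in row]
    return entropy(row_marginals) + entropy(column_marginals) - entropy(joint_flat)

def effective_information(conditional):
    """EI(source -> target) for conditional[s][t] = p(target = t | source = s).

    The source is replaced by maximum-entropy noise, i.e. every source
    state gets the same probability 1 / n.
    """
    n = len(conditional)
    joint = [[p / n for p in row] for row in conditional]
    return mutual_information(joint)

# The lamp is off for rolls 1 and 2 and on for roll 3.
lamp_given_die = [[1, 0], [1, 0], [0, 1]]
# The die ignores the lamp: same distribution whatever the lamp does.
die_given_lamp = [[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]]

ei_ab = effective_information(lamp_given_die)  # about 0.92
ei_ba = effective_information(die_given_lamp)  # 0, up to floating-point rounding
print(ei_ab, ei_ba, ei_ab + ei_ba)
```

Because the die ignores the lamp, every row of `die_given_lamp` is identical, and identical rows are exactly what makes the mutual information vanish.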

Partitions

There remains one little problem. It is not yet clear on which level we define a system when looking at the brain. It seems to us that we have one consciousness, but why do we need to treat the whole brain as the system in question? Is it not plausible to define the cerebellum as a system as well? Or why not every lobe? I’m sure each of them is integrating plenty of information. This is the final operation we need to get to the elusive Φ. We are interested in the minimum amount of integrated information we can produce in a system. Thus, we need to find the partition of a system that results in the smallest effective information. Think of it as electricity, pressure, or water following the path of least resistance. Just, in this case, it’s information.

We take a given system and partition it into two subsystems A and B. We then calculate the cause/effect repertoire as we did before. Repeat this for all possible subsystems. The smaller the systems, the smaller the cause/effect repertoire, so to compare them, we need a more sophisticated strategy than just comparing the numbers. In my opinion, this part changed the most over the years in comparison to the original publication. I’ll skip the now obsolete details and summarize the new parts by saying we choose one of several different ways to measure how similar the cause/effect repertoires of the partitions are to those of the unpartitioned, whole system. Sum these up and you get Φ. Whichever way you divide the system, the smallest Φ you can get is the one that is the most cohesive unit of indivisibility, because it will lose the most information in proportion to its complexity.

In our example, there is no way to partition anything, since A and B are already indivisible. Thus, our Φ is still $\approx 0.9$.
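To get a feeling for the search over partitions, here is a small Python sketch that only enumerates the candidate splits of a system into two parts. It does not compute Φ; the function name `bipartitions` is hypothetical, and I assume that all elements are distinct and that there are at least two of them.

```python
from itertools import combinations

def bipartitions(elements):
    """Yield every split of the elements into two non-empty parts, each split once."""
    elements = list(elements)
    first, rest = elements[0], elements[1:]
    # Pinning the first element to part A avoids listing (A, B) and (B, A) twice.
    # Stopping one short of len(rest) keeps part B from ever being empty.
    for size in range(len(rest)):
        for chosen in combinations(rest, size):
            part_a = (first, *chosen)
            part_b = tuple(e for e in rest if e not in chosen)
            yield part_a, part_b

print(list(bipartitions("AB")))  # [(('A',), ('B',))]

for n in (2, 4, 10, 20):
    count = sum(1 for _ in bipartitions(range(n)))
    print(n, count)  # the count is 2 ** (n - 1) - 1
```

A system of n elements has 2^(n-1) - 1 such splits, so even this first step of the search roughly doubles in size with every element added.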

Getting to Φ

Hurray! We have a number! What now? Does this mean that our simple system of throwing a die and turning a lamp on is conscious? According to Integrated Information Theory, yes. The theory does not state that Φ correlates with consciousness; it states that Φ is consciousness. I can’t directly compare this Φ of 0.9 to the Φ of human consciousness, however. Estimates for the latter using EEG data were made using the full up-to-date calculations, which produce significantly lower values than the old ones (Kim 2018). Suffice it to say that we can expect an estimation of human Φ to be astronomically higher than whatever we can compute for our little example system.

All consciousness is data, some data is consciousness

But here lies one of the biggest problems. We cannot compute human Φ. Not only do we not know enough about the involved structures to describe them all, but even if we did, the calculations grow faster than any computer could calculate (Tegmark 2016). Thus, we can only estimate Φ. And the current estimates unfortunately do not correlate well enough yet with what we know about consciousness to be able to predict anesthetic conditions.

Strange Consequences

You may have noticed a theme of most theories leading us eventually to weird places. Integrated information theory is no exception. Notice that only the range of theoretically possible states of the involved systems is considered in Φ, but not the actually realized states. This means that if a neurosurgeon were to attach a single neuron to your brain, Φ will rise. It will rise even if the neuron is positioned in a way that nearly guarantees it will never fire. No matter if this neuron stays inactive forever, you will be slightly more conscious through its existence. Were the neurosurgeon to attach an extra region to your brain that was only active when you are playing Hornussen, you would possibly become quite a bit more conscious, even if you never played Hornussen once in your life (Fallon 2022). This is a stark contrast to most other theories which correlate consciousness only with the currently active models in the brain.

The other consequence you will have noticed by now is that Φ implies panpsychism. Nearly everything in the universe can be seen as integrating some kind of information. This means that nearly everything is, to some degree, conscious [11]. The quality of consciousness need not be the same for everything, mind you. The qualia we experience are a byproduct of the specific information we are integrating, thus the alien consciousness of geological formations has very different qualia than ours.

Stances (according to me)

| Topic | Stance |
| --- | --- |
| Mind-Body Problem | Integrated information is consciousness itself. The brain is one of many systems doing this. |
| Chinese Room | Since the Chinese room must integrate information to simulate a human response, it is conscious. It would even be so if its responses were gibberish. |
| Philosophical Zombies | Since its behavior is indistinguishable from that of conscious humans, it must integrate the information present in its brain in the same way and is thus just as conscious. A philosophical zombie is thus not conceivable. |
| Teletransporter Paradox | Since consciousness is integrated information, the same information being integrated in the same way must be the same consciousness. Killing the person left behind is morally okay. |

Orchestrated Objective Reduction

Before we can properly start, we have to get some background on Gödel’s incompleteness theorem. That’s a scary name, but we can get through it fairly easily, I promise. Like for Integrated Information Theory, this part will contain a few mathematical explanations. Again, they are not required to understand the theory but should be able to demystify some of the concepts behind it. This stuff is really interesting in general, so I recommend going through the next two subsections.

Gödel numbering and the big scary proof

About a century ago, Bertrand Russell was very annoyed by the fact that math seemed to contain paradoxes. “Does the set of all sets that don’t contain themselves contain itself?”, “Is this sentence wrong?” and all that. He saw self-referential structures as the root of all evil and redefined the formal language of mathematics in a way that he thought outlawed all of these. This new foundation was called the Principia Mathematica (PM). After he was done, a young mathematician named Kurt Gödel proved that the effort was futile. His key idea was finding a way in which a mathematical statement could uniquely be identified by a single big number.

Let’s take the statement “1+1=2”. First, we introduce the successor function S(n), which means “the number after n”, e.g. S(0) = 1, S(S(0)) = S(1) = 2, etc. Now we can rewrite the statement using fewer distinct numbers: “S(0) + S(0) = S(S(0))”, which can be read as “(0+1) + (0+1) = (0+1+1)”. We now need a Gödel numbering system. The following is an adaptation from the one presented in my favorite book, Gödel, Escher, Bach:

| Symbol | Number | Meaning |
| --- | --- | --- |
| 0 | 666 | The number zero |
| S | 123 | The successor function |
| + | 112 | The addition operator |
| = | 111 | The equality sign |
| space | 0 | The separation between two symbols |

Finally, we just translate every symbol into the corresponding number, e.g. “S(0)” consists of “S” and “0”, which is thus “123 0 666”, or, written together, 1230666. The statement “1+1=2” can thus be written as “S(0) + S(0) = S(S(0))”, which corresponds to “123 0 666 0 112 0 123 0 666 0 111 0 123 0 123 0 666”, or, written together, 12306660112012306660111012301230666. This big number is the Gödel number of the original statement. The translation goes both ways; if I give you the number 123012306660112012306660112066601110123012301230666, you can figure out the original statement (try it! I recommend separating the string on every “0” for visual clarity like before). Thus the statement and the number are in a sense the same thing. This also goes for more complex statements such as “there are infinitely many prime numbers” or entire proofs such as Euclid’s now 2300 year old classic proof thereof.
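The translation between statements and numbers is mechanical enough to sketch in Python. This is a toy version that only knows the four symbols from the table above; like the text, it ignores parentheses, and the names `encode` and `decode` are just my labels.

```python
# Symbol codes from the table above; the digit 0 is reserved as the separator.
CODES = {"0": "666", "S": "123", "+": "112", "=": "111"}
SYMBOLS = {code: symbol for symbol, code in CODES.items()}

def encode(statement):
    """Turn a statement into its Gödel number.

    Parentheses, spaces, and any other unknown characters are dropped.
    """
    symbols = [ch for ch in statement if ch in CODES]
    return int("0".join(CODES[symbol] for symbol in symbols))

def decode(number):
    """Recover the sequence of symbols behind a Gödel number.

    This works because none of the codes contains the digit 0,
    so every 0 in the number must be a separator.
    """
    return " ".join(SYMBOLS[code] for code in str(number).split("0"))

number = encode("S(0) + S(0) = S(S(0))")
print(number)          # 12306660112012306660111012301230666
print(decode(number))  # S 0 + S 0 = S S 0
```

Note that the round trip loses the parentheses, so a numbering for real formal systems needs codes for brackets, variables, and logical symbols too.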

Notice that 12306660112012306660111012301230666 is both a mathematical statement and a natural number at the same time. Thus we can make mathematical statements about the number, which translate into statements about the original statement. We have now successfully reintroduced self-referential structures! Let’s do some mayhem with it. With some effort, we can generate a very weird statement that for a given Gödel number S roughly says “The statement behind the number S is not provable within PM”. We convert this statement into a huge number and call it G. Now we just plug G into itself: “The statement behind the number G is not provable”, or in simpler terms, “This statement is not provable within PM”. Since we want our math to be useful, we generally assume that any statement we can reach by following mathematical rules correctly is correct [12], e.g. we will never reach bogus like “1 + 1 = 3” no matter how many operations we do. Since we reach the statement “This statement is not provable” by strictly following mathematical rules, it must be true.

The consequence is quite shocking. Think about it: we have just demonstrated that there is at least one true mathematical statement that PM cannot ever prove. This seems highly counterintuitive and like a death sentence for PM. But it gets worse. Nothing is special about the way we used PM in the self-referential statement. We can plug in any mathematical system that allows simple arithmetic and logical operations and we will get the same result: there is always at least one statement that cannot be proven by a given mathematical system. That is the heart of Gödel’s incompleteness theorem.

In my opinion, this is one of the most humbling truths of reality. No matter how much we try, for any given system, there is always a black box of forbidden knowledge floating just beyond our reach.

The nature of computation

Put a pin in the incompleteness theorem; we will need it again in a second. For now, we jump to another young mathematician, Alan Turing, who is about to invent computer science before joining the British efforts in the Second World War, which would later repay him by killing him for the crime of being gay. Turing asked himself how one could mathematically formalize the process of running any sort of algorithm. Since this was a time when “computer” was still a (predominantly female) job description for someone who mentally juggled numbers, the only way of running a big and complicated algorithm was to grab a pencil and some paper and mathematically transform one statement into the other and the next, and the one after until you got your result. This is probably still how you learned to e.g. multiply! Turing split this process into two parts: the tape is an infinite piece of paper where one can write down symbols and the head is the set of instructions that will tell you what to do with this tape. The head is always hovering over a specific symbol on the tape and can read it, overwrite it, move to the previous symbol, or move to the next symbol. Notice that this models what you do with your actual, physical head when doing a calculation! Put a pin in this as well. This simple model is a Turing machine. Mind you that this machine is just a mathematical abstraction and not something you can physically build (have you got infinite tape at home?) or touch.
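A Turing machine is simple enough to simulate in a few lines. The following Python sketch is my own minimal version, not Turing's original formulation: the rule format `(state, symbol) -> (symbol to write, move, next state)`, the state names, and the `max_steps` safety limit are all assumptions made for the example. The "infinite" tape is faked with a dictionary that hands out blank cells on demand.

```python
from collections import defaultdict

def run(rules, tape, state="start", blank="_", max_steps=1000):
    """Run a Turing machine and return the final tape contents.

    rules maps (state, symbol) to (symbol_to_write, move, next_state),
    where move is -1 (previous symbol) or +1 (next symbol).
    The machine stops in the state "halt" or after max_steps steps.
    """
    cells = defaultdict(lambda: blank, enumerate(tape))
    position = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        write, move, state = rules[(state, cells[position])]
        cells[position] = write
        position += move
    return "".join(cells[i] for i in sorted(cells)).strip(blank)

# A machine for "the number after n" in unary notation:
# skip over all the 1s, then write one more 1 on the first blank cell.
successor = {
    ("start", "1"): ("1", +1, "start"),
    ("start", "_"): ("1", +1, "halt"),
}

print(run(successor, "111"))  # 1111
```

The rule table plays the role of the head's instructions; everything the machine "knows" beyond its current state lives on the tape.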

In the same year across the Atlantic, another mathematician called Alonzo Church independently tried to do the same thing and created a seemingly completely different model, called lambda calculus [13], based on how he thought one could model mathematical functions themselves. The two compared their models and realized that they were able to fully simulate both models in each other: lambda calculus can work as a Turing machine and Turing machines can evaluate lambda calculus. Every sufficiently powerful model of computation to date has been shown to both be able to run a Turing machine and to be run on a Turing machine. This is a strong hint toward Church and Turing having successfully captured the essence of what it means to compute something. This is called the Church-Turing thesis, which can be paraphrased informally as “every calculation can be run on a Turing machine”.
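To give a taste of how different lambda calculus looks, here is the same “1 + 1 = 2” from the Gödel section, built out of nothing but functions. The sketch uses Church numerals, the standard way to encode numbers in lambda calculus, written with Python’s `lambda`; the helper `to_int` is not part of lambda calculus and only exists to make the result printable.

```python
# A Church numeral n is a function that applies f to x exactly n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))             # the number after n
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))  # m + n

def to_int(numeral):
    """Convert a Church numeral to a Python int by counting applications."""
    return numeral(lambda k: k + 1)(0)

one = succ(zero)
two = succ(one)

print(to_int(add(one)(one)))  # 2
print(to_int(two))            # 2
```

There is no tape and no head anywhere in sight, yet the same arithmetic comes out, which is the kind of equivalence the Church-Turing thesis is about.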

The consequences of computation

Turing, who was a friend of Gödel, demonstrated something very similar to the incompleteness theorem: by imagining all possible Turing machines, we can list all possibly computable numbers. We can however prove that this list is incomplete, which means that there must be numbers that cannot be computed. Again, we face a kind of fundamentally hidden knowledge.

Remember when I mentioned that the “head” of the Turing machine models your actual, physical head? Your head, which is to mean your brain, is also just running a set of instructions. Granted, the instructions of the brain are far more complex than anything you can calculate by hand, but in principle, if we reject Cartesian Dualism, we must confront the fact that our brain’s state is only dependent on its last state and the physical laws that govern what the electrobiochemical components of your body do. Thus, every process of the brain, including consciousness, is a (very very complicated) kind of calculation in the end. If we accept this premise and invoke the Church-Turing thesis, we must accept that the brain must be able to be simulated on a computer. If you do not agree with this conclusion, you must also either refute the Church-Turing thesis or assert that brain processes are fundamentally not calculable. Both options are extraordinary claims, so be ready to bring extraordinary evidence.

But what if I do?

Now we finally get to the theory of consciousness itself. Roger Penrose, who jointly received a Nobel prize with other scientists in 2020 for their work on black holes, proposes that Turing machines cannot run consciousness. His argument goes as follows. Assume that the brain is running a calculable algorithm. If so, it must operate within some mathematical model. Thus, we can generate a Gödel statement that cannot be proven within this system. Suppose we generated this statement. Knowing the incompleteness theorem, it should be easy to see that the statement is true, even if it’s not provable by the human mind. But knowing that something is true is proving it, so we just proved it despite it being unprovable. A contradiction! Penrose deduces that whatever process the brain is running cannot be a regular calculation, is thus not computable, and can thus not run on a Turing machine. (The Emperor’s New Mind, 1989)

If not an algorithm, just what is running consciousness in the brain? If the brain is not a kind of Turing machine, what is it? As a physicist, Penrose searches through physics for a solution. At first glance, the best contender seems to be extending the standard Turing machine with quantum properties such as superpositions and true randomness. Unfortunately, it turns out that although a quantum Turing machine can be faster than a regular Turing machine, it still cannot compute anything new (Deutsch, 1985). It is uncontroversial to state that some part of the quantum world has not been fully understood yet. After all, we still have no commonly accepted theory that unifies quantum theory with general relativity. In this vein, Penrose suggests that we have simply not yet found the mechanisms required for a more sophisticated quantum Turing machine to produce non-computational behavior.

Now we only need a place for this quantum behavior to occur, since a neuron doesn’t look or behave like a quantum phenomenon. Penrose’s co-author Hameroff claims that he has identified a smaller unit of computation inside every cell: the microtubules (Hameroff 1984). These little tubes are usually seen in biology as big support beams that help a cell maintain its structure. Especially in neurons, they also serve as highways for transporting substances across the cell, such as neurotransmitters. Together, Penrose and Hameroff posit that since microtubules are the computational unit of each neuron and consciousness requires a non-computational quantum process, each microtubule is exhibiting quantum behavior. Neurons merely magnify the quantum behavior and allow interactions with physical systems like our muscles (Penrose, Shadows of the Mind, 1994). They provide an outline of what such a process involving quantum gravity might look like, called Objective Reduction (OR), but the key idea here is just that any such process takes place and not how exactly it looks, so we will not look at OR itself.

Consciousness from Orchestration

Remember when Integrated Information Theory claimed that consciousness is information being integrated? A similar claim is made at this point: qualia, the smallest possible constituents of consciousness, are these quantum computations (Hameroff & Penrose 2014). Each of these little “sparks” is a proto-consciousness. It is postulated that the human brain orchestrates the quantum computations by entangling the quantum phenomena inside the microtubules across big parts of the brain. Thus, we go from a single proto-consciousness with random contents to a coherent, orchestrated, and unified conscious experience.

This shares the substrate independence of some theories of consciousness, but interestingly, while it explicitly allows for entangled quantum computations on the surface of neutron stars to be conscious, it disallows the possibility of running consciousness on a computer. According to this view, human-like AI is impossible to achieve until we have a much deeper understanding of quantum physics.

Stances (according to me)

| Topic | Stance |
| --- | --- |
| Mind-Body Problem | Consciousness is orchestrated quantum computation, which governs the actions of the body. |
| Chinese Room | Per the premise of the Chinese room, there is a computational algorithm running the room. Since consciousness cannot be run by a Turing machine, the Chinese room is not conscious. The room is also not able to generate Gödel statements that are unprovable in its mathematical system and can thus not fully emulate a human. |
| Philosophical Zombies | It is possible to construct a brain “only” running a Turing machine, which would produce a philosophical zombie. It would however not be able to generate Gödel statements that are unprovable in its mathematical system and can thus not fully emulate a conscious human. Since the premise specifies that all human behavior is perfectly replicated, philosophical zombies are not conceivable. |
| Teletransporter Paradox | The theory makes very few claims about what would happen if two systems had the same quantum configuration. Presumably, such a thing is not feasible and the person on the other side is not identical to you and is thus a different consciousness. A teleporter is a murder machine. An interesting thought experiment would be however to imagine the two subjects having orchestrated computation before one of them is killed. |

Strange Loops and Tangled Hierarchies

This one is tough to write about without rambling on, since the book introducing this view, “Gödel, Escher, Bach”, is my favorite piece of writing in the whole world. But I’ll do my best.

Whatever theory of the mind one might favor, it seems apparent that we have some kind of inner representation of objects and concepts. Let’s call these representations “symbols”. Not as in mathematical or written symbols, but as in a symbolic unit of meaning. For example, when you look at a fork, you cannot but instantly recognize it as a fork. Your mind fetches this “fork” symbol automatically and binds it to the image of the fork. We will call this “activating a symbol”. Symbols can also be of abstract objects like democracy or the color yellow. Note that we are not making any claim about where and how this symbol is manifested in your brain, we are just noting that this phenomenon seems to exist.

It also seems apparent that we can create new symbols. Here’s a fun fact: did you know the Swiss have a unique national sport called “Hornussen”? It’s like a mix of golf and baseball. Instead of a ball, you have a small light puck. The player beating it must wildly swing a big wiggly stick with a small head around to hit the puck. The stick looks like a gigantic bottle cleaning brush and handles like a 2 meter long bendy ruler, so the act of swinging it around looks quite fun. The puck then flies into the field while making a buzzing noise, which is why it’s called a “Hornuss”, a local Swiss word for hornet. The other team waits in the field holding big stop signs, with which they try to catch the Hornuss. If they fail and the Hornuss lands, they get one penalty point per 10 meters of flying distance. The team with the fewest total points wins.

And just like that, you created a new symbol for this new sport you’ve probably never heard of. Symbols also seem to be interconnected. When you think about Hornussen, which other symbols instantly come into your mind? For me, it’s open fields, two teams, and a silly stick. Thinking about these symbols in turn shows their connections, ad infinitum.

Seen like this, our brain is a machine for creating, activating, connecting, and manipulating symbols.

Interactions with the world

Imagine a very young child called Douglas and his friend playing with a ball. To do so, it seems plausible that they must either already possess a symbol for “ball” or are forming one right now while playing. The friend decides to bounce the ball off a wall and see what happens. The friend throws the ball, which flies towards the wall, bounces off, and returns. While watching and laughing, Douglas’ symbol-creating mind starts forming symbols for “bouncy”, connected to “ball” [14], and “thrower”, connected to the symbol Douglas created for his friend. Excited, Douglas wants to try as well. He throws the ball, not as straight as his friend did, but well enough to hit the wall. Just as before, the ball bounces back and lands at his feet. His symbol for “bouncy” gets strengthened and is still connected to “ball”. But what is the symbol for “thrower” connected to now? If he didn’t have a symbol for it yet, this is a magical moment: Douglas just formed a symbol for himself, the “I”. While growing up, Douglas’ symbol for “I” will become more complex. At around the age of 4, his theory of mind will be strong enough to know that his knowledge is not the same as everyone else’s. The video I linked is extremely cute, please take a break from reading and watch it.

Back? Alright. This meta-understanding of his knowledge means that it makes sense to connect his “I” symbol with all knowledge he has, which is all his symbols. But “all his symbols” includes “I” as well, making it an endless recursion. We can come to this conclusion via another route as well: to think ahead of time about his choices, Douglas must be able to simulate parts of the world, including himself. But, to simulate himself, he must simulate that he knows about himself, which means that the symbol for his simulated “I” must also include a symbol for himself. This “I” symbol is from now on always activated as the actor suspected behind whatever course of action the processes in the brain are following.

Strange Loops and where to find them

This kind of recursiveness, where you can descend and descend a hierarchy only to find yourself right where you started, is called a “strange loop”, and can be found in the paintings of M.C. Escher, the music of J.S. Bach, and the proof of the incompleteness theorem of Kurt Gödel. The system of symbols containing a strange loop is called a “tangled hierarchy” [15]. The central claim of the theory is that the vague feeling of being conscious is nothing but the qualia resulting from the “I” symbol being active and thus telling you that there is an autonomous agent responsible for what you do, namely “I”. In this sense, the “I” is a complete illusion, maybe even more so than in the other theories.

You are your mind thinking about you thinking about you thinking about

All of this does not require meat hardware, of course. Any sufficiently powerful system capable of a) creating symbols and b) interacting with the world will inevitably create an “I” symbol. Just like natural numbers will always automatically give rise to the self-referential Gödel sentences, a symbol-creating system will automatically give rise to the self-referential “I” symbol [16]. The more sophisticated the ability to manipulate symbols and the more possibilities for interaction the system has, the more complex the tangled hierarchy will be and by extension, the more complex the “I” symbol will be. The theory claims that this complexity is what makes us feel like we are very conscious. If consciousness itself is an illusion, the complexity of the illusion will determine how convincing it is.

Beyond the own body

There are some very interesting consequences of this view. The first one is that all systems containing symbols can be placed in a hierarchy of ascending consciousness. This of course depends on which organization of the information you precisely define as a “symbol”, which is intentionally left blank to be filled by neuroscience since, again, it seems reasonable to assert that they must somehow exist. But at least for humans, this forces us to conclude that babies are less conscious than children, that children are less conscious than adults, and that some adults are more conscious than others. This puts an interesting twist on the trolley problem that forces one to pick between a group of adults and a group of children.

The most radical consequence however comes from the fact that we can not only simulate ourselves in hypothetical situations but others as well. But to simulate someone else, I must also simulate their consciousness, which means I must simulate the strange loop of their “I” symbol. But activating an “I” symbol is exactly what we defined as consciousness before! This means that simulating another person, for example when pretending to be them, remembering them, dreaming of them, thinking about what they would do in a situation, etc. runs a low-fidelity version of their consciousness. This is very easy to dismiss because of how unintuitive it is, but give it a try. Think about the closest person to you that you know. Look at the room you’re in. Now, pretend as realistically as you can to be that person for 60 seconds. Set a timer so you don’t have to think about the time. Think what they would think while looking around. Don’t think about what exactly would go through their mind, just think it directly.

Done? The argument is that for this short moment, your consciousness took a back seat and another was active. The only reason it might not feel like it is that you never lost your sense of agency and that you can remember the experience afterward. In this sense, the teletransporter paradox is real and very mundane. You already exist multiple times: one version is inside your brain, with very high complexity, but hundreds of versions are in the brains of your friends and family. It also means that when you die, it is not as if a candle is extinguished, but as if a fire is slowly burning out. Every time your close ones gather to reminisce about you, they refresh their memories of you, exchange information about you that some might not have had, and thus strengthen their symbols of you. And no matter if you buy the theory or not, you have to admit one thing: that is a damn beautiful thought.

Stances

This time, the stances come directly from the horse’s mouth, since they are all addressed in the book “I Am a Strange Loop” (2007).

| Topic | Stance |
| --- | --- |
| Mind-Body Problem | The body is a machine for producing symbols of reality. The symbol representing the “I” being active feels as if we were an autonomous actor. |
| Chinese Room | If the algorithm can always produce a perfectly human answer, it stands to reason it must be able to create symbols and simulate interactions of itself with the world. After all, you can explain to it what Hornussen is and how it would imagine itself playing it. Thus, it must be able to create an “I” symbol and is thus conscious. |
| Philosophical Zombies | Via the same argument as with the Chinese Room, the behavior of the zombie means it must have an “I” symbol and is thus conscious. The zombie is in this view not even conceivable. |
| Teletransporter Paradox | Both people are equally you. It’s okay to kill the version on Earth since this means nothing more than a few seconds of memory are lost. Don’t fear the reaper. |

Attention Schema Theory

The brain is constantly bombarded by external and internal input. With this gigantic flow of data, it is impossible to treat all of it the same way; the brain has to prioritize. It does this by upregulating some signals and downregulating others. Attention Schema Theory defines attention as this upregulated data. It then postulates that there probably exists a control mechanism in the brain that shifts attention as needed. Since it is assumed that the brain creates models of physical and abstract objects in the world, it would make sense for it to create such a model of its own attention as well. This model, called the attention schema, is fed to and manipulated by the controlling agent. The model is not perfect, but it is a good and fast enough approximation to be useful in guiding attention. The attention schema resides within the Global Workspace we’ve discussed before. Thus, the contents of the attention schema are at the same time the contents of consciousness. The feeling of this being me comes from the imperfections in the simplified model of attention. Just as we have an imperfect model of our body, we have an imperfect model of our attention, of our experience, available to all our brain regions. Thus, if anything were to ask any part of our brain whether it had an experience, it would state “yes, I have data right here saying that I have an experience!”. This is the illusion of consciousness. (Graziano 2022)
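The moving parts described above (signals competing, a simplified model of the winners, a controller that only sees that model) can be sketched in a few lines of Python. This is a toy under stated assumptions, not Graziano’s actual model: the function names `attend`, `attention_schema`, and `control`, the signal names, the numeric strengths, and the `boost` factor are all made up for illustration.

```python
def attend(signals, gain):
    """Upregulate or downregulate each raw signal by its gain (default 1.0)."""
    return {name: strength * gain.get(name, 1.0)
            for name, strength in signals.items()}


def attention_schema(attended, top_k=1):
    """Simplified, lossy model of attention: only the names of the winners."""
    ranked = sorted(attended, key=attended.get, reverse=True)
    return {"I am attending to": ranked[:top_k]}


def control(gain, schema, goal, boost=2.0):
    """Controller that sees only the schema, never the raw signals."""
    new_gain = dict(gain)
    if goal not in schema["I am attending to"]:
        # The schema says attention is elsewhere, so upregulate the goal.
        new_gain[goal] = new_gain.get(goal, 1.0) * boost
    return new_gain


# Hypothetical raw inputs with made-up strengths.
signals = {"birdsong": 0.4, "hunger": 0.5, "text": 0.3}
gain = {}

for _ in range(3):
    attended = attend(signals, gain)
    schema = attention_schema(attended)
    gain = control(gain, schema, goal="text")

print(schema)  # {'I am attending to': ['text']}
```

Note what the schema throws away: it contains no signal strengths and no trace of the losers, only a compact statement about where attention currently is. In the theory’s terms, that compact statement is all any other part of the system can report when asked.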

The Social Brain

Since our ancestors evolved in social contexts, it would make sense to be able to have models of other organisms’ attention. We have already evolved a way to model our attention, namely the attention schema, so we can use the same trick to create attention schemata of others. When these models gather our attention, they are moved to the global workspace, where they also become part of our consciousness. (Consciousness and the Social Brain, Graziano 2013) This means that we evolve empathy from consciousness. While some social theories of consciousness assert that consciousness is nothing but theory of mind turned inwards (accidentally implying autistic people are less conscious than neurotypical people), the Attention Schema Theory says the opposite. Theory of mind is formed by extending the assumption that we are conscious to others. A very interesting and delightfully contrarian consequence of this view is the conclusion that it might be very important to make sure our AI is conscious as soon as possible so that it can act with compassion instead of seeing people as mere objects (Graziano 2017).

Strange Loops all the Way Down

Since in the view of Attention Schema Theory consciousness is nothing but an attention schema running on the global workspace, we land at some of the same unintuitive conclusions as we did with Strange Loops. Our low-fidelity models of other people are also conscious. Another similarity is the necessity of attention, thus attention schemata and thus consciousness in brains (Graziano 2015).

These parallels should be no surprise. Attention Schema Theory is a combination of strange loops and global workspace theory. We just use a strange loop between the actively changing attention and the descriptive attention schema instead of the strange loop of “I”. Our model of attention contains our attention, which contains our model of attention, etc. The following quote uses the term “awareness” instead of “consciousness”, but it puts it quite nicely [17]:

Attention is an active process, a data-handling style that boosts this or that chunk of information in the brain. In contrast, awareness is a description, a chunk of information, a reflection of the ongoing state of attention. Yet because of the strange loop between awareness and attention, the functions of the two are blurred together. Awareness becomes just as much of an active controller as attention. Awareness helps direct signals in the brain, enhancing some, suppressing others, and guiding choices and actions. (Consciousness and the Social Brain, Graziano 2013)

Stances (according to me)

The stances end up being the same as for strange loops.

| Topic | Stance |
| --- | --- |
| Mind-Body Problem | The brain’s information competes for attention. Winners are summarized in a model called the attention schema, which resides in the global workspace and is thus the content of our consciousness. |
| Chinese Room | Since the Chinese Room demonstrates social intelligence, it must be conscious. |
| Philosophical Zombies | Via the same argument as with the Chinese Room, the behavior of the zombie means it must have a model of its attention to be able to model others’. The zombie is in this view not even conceivable. |
| Teletransporter Paradox | Both people are equally you since the contents of their global workspace are identical. It’s okay to kill the version on Earth. |

Conclusion

What a ride! I hope I got all the interesting current theories together. If you’re missing your favorite theory of consciousness, please write a comment. Keep in mind that I intentionally left out the ones that mostly consist of correlates of consciousness, explaining where but not necessarily why consciousness happens in the brain.

If I did a good job, you will have noticed how interconnected some of these theories are. I feel like some evergreens pop up quite often. Associating consciousness with a model of the self is quite popular, as is a focus on different organizations of information. Although there is no unified theory of everything yet and no theory addresses all aspects of conscious experience, we can certainly see the field converging on some consensus. I cannot remember who it was, but I remember a recent quote summarizing this development as a fractured web of consensus emerging. What exciting times we live in to learn more about ourselves.

Some spots are still left fairly unexplored. In my opinion, there is very little emphasis on memory, which seems to be a big part of the feeling of being the same person over time. There are also some unusual states of mind, such as the dissolution of the self reported after meditation or the consumption of psychedelics, during which the person in question is still conscious. Although neuroscientists and consciousness researchers like to talk about this, no current theories have any explicit thoughts on these situations, even though they seem to be important edge cases that poke holes in our intuitions. Illnesses that involve deterioration of the perceived consciousness, like Alzheimer’s, seem promising to me as potentially overlooked sources for ideas.

My own two cents

Right out of the gate I will reject dualism. Any phenomenon that defies physics needs extreme evidence which I simply don’t see. For the same reason, I reject theories with metaphysical claims like “consciousness is this or that physical phenomenon”. Integrated information theory and orchestrated objective reduction seem very bold in this regard. Per Occam’s razor, it should be more parsimonious to expect consciousness to be a product of our brain, specifically an algorithm. In contrast to Penrose, I see no reason to believe human thought is above computability since I find his Gödelian argument conceivable, but certainly no hard proof of this extraordinary claim. Thus, I believe consciousness is a regular algorithm that can be run on any Turing machine. But what kind of algorithm? As the hard problem of consciousness posits, there is no process happening in your life that couldn’t happen without consciousness. If all your actions can be explained by physics running algorithms, then “conscious” deliberation, planning, etc. can all in principle run without you ever feeling conscious at all. After all, it’s just data being manipulated in a certain way. If so, considering that evolution overwhelmingly culminates in the simplest, most effective ways of accomplishing a goal, it seems extraordinary that such a useless phenomenon as conscious experience evolved without actually doing anything for us at all. If this statement makes you uncomfortable, remember again that things like empathy are most probably mere algorithms and could be run by any machine without ever requiring a conscious experience.

This reasoning leads me firmly into the illusionist camp. We have systematically excluded all ways in which consciousness could plausibly exist or even have evolved. Thus, we are left with the conclusion that it cannot exist in the way we usually think of it. This is resolved by reducing it to an epiphenomenal illusion. Epiphenomenal means “a phenomenon that exists as a consequence of other phenomena”. Just as natural numbers will automatically give rise to self-referential Gödel sentences, I believe that the algorithms running in our brain automatically gave rise to consciousness and that we cannot have one without the other. This resolves the problem of how consciousness could evolve in the first place. Since this is a strong claim, let me weaken it a bit: I’m only claiming that the “intelligent thought” algorithm running specifically in our brains results in consciousness, not that necessarily every intelligent system is automatically conscious. Since only conscious beings will ask themselves why they are conscious [18], I cannot make any claim about how likely it is that an intelligent being will be conscious. Maybe nearly every sufficiently intelligent system will include some kind of mechanism that automatically results in consciousness, as Hofstadter claims, or maybe it is extremely rare that intelligence emerges with said mechanism. We can only judge this by meeting or creating intelligent systems capable of communicating with us and asking them. And then, perhaps even harder, we must also believe them when they say they’re just as conscious as us.

Consciousness under illusionism

I don’t see any evidence for consciousness being an extremely special kind of algorithm. Of course, it seems special to us, but we are biased. Try to define, grasp, or meet your consciousness and, as experienced meditators can attest, you will fail. There is simply nothing there but the persistent pseudo-knowledge that it simply must somehow exist. On this much, I agree with mysterianism. However, I have no problem facing this. The solution seems to be that something just tells us that we have an experience at all. If you ask “but who is the one being told that they exist?” I will simply have to repeat “no one”. You and I are merely an algorithm inferring that an experience of some sort has happened. All the richness of life, the redness of red, the painfulness of pain, the feeling of simply existing, they are all just false memories. I use a very simple, but deeply cutting razor for these conclusions: “All subjective experiences are illusions until proven otherwise”. I use the term “illusion” to mean “unshakable belief in something nonexistent”. I can prove that I can distinguish different colors, but I cannot prove that I perceive the redness of red, so I cannot believe my own perception. This thought probably feels very weird to you, and believe me, it does to me as well. But I find it is the conclusion I reach when rejecting all impossibilities.

Turn the inner eye to see yourself. Where you were there will be nothing.

There is a well-documented and repeatable experiment in which one can observe such an illusion being created. In extreme cases of epilepsy, patients are offered a procedure that cuts through the corpus callosum, the piece of the brain that connects the two hemispheres. After the surgery, the brain regions handling the left and right fields of vision can no longer directly communicate. Present the right hemisphere with a task and it will oblige. But in humans, usually, only the left hemisphere can talk. Ask the patient why they just did what they did and, not knowing that a task was given, the left hemisphere will with frightening ease confabulate whatever reason for their behavior makes half sense. The patient will be completely convinced that they are right, even though their perception of what they experienced is objectively and demonstrably wrong.

A future unified theory

Which of these theories results in the epiphenomenal illusion I describe? Hard to say. Whatever it is, it must account for what happens to my consciousness when I dream. It must account for what happens when I take a dose of LSD. It must account for how I feel like I’m one person with a single experience, even though most parts of my visual processing are running individually in parallel (Zeki 2015). So far, no theory handles all cases (Doerig 2020). I speculate that the truth will be similar to a modified combination of attention schema theory and predictive processing. As discussed before, predictive processing is not a theory of consciousness per se, but a way to approach whatever is going on in the brain. While certainly not perfect, it provides a solid foundation to explore further theories. Attention schema theory brings together the best parts of strange loops and global workspace theory, bridging the gap between consciousness and its contents. There are also some interesting ideas put forth by Jeff Hawkins’s Numenta about how high-level abstract concepts could be biologically instantiated in the brain (Hawkins 2019), which we didn’t discuss since they have nothing to do with consciousness itself. If his team can verify all testable claims they have made, a unified theory of consciousness could emerge going all the way from cell to algorithm to consciousness.

Some days I feel discouraged by how few testable claims our current theories of consciousness have produced and by how even fewer have actually been tested. Marvin Minsky described this dilemma as the theorists coming up with claims and wanting the neuroscientists to test them, while the neuroscientists want the theorists to make sure their theories make sense before they start expensive experiments, in essence passing the burden of proof between them. But other days I can feel like there is a glimpse of a unified, generally accepted theory of consciousness visible in the corner of my eye, a will-o’-the-wisp that is as elusive as consciousness itself. One day, I am certain of this, we will have a theory in front of us that will explain consciousness as well as we can explain life today. This theory will not make us sad for the magic lost along the way, but gift us an entirely new appreciation for ourselves and our place in the universe. The deepest solace lies in understanding, this ancient unseen stream, a shudder before the beautiful. I hope I will be there for it.


  1. Not to be confused with a Chinese person; those are definitely conscious. ↩︎

  2. Actually, the essay states that “The answer to the question that forms my title is, therefore ‘No’ and ‘Yes’”. This nuance is meant to allow the possibility that we might find the solution but just not accept it as such, see the paragraph after the footnote. ↩︎

  3. If you’re interested in the intuition behind it, pay attention to the last few sentences of the linked paper, concerning the probability of two siblings being boys. Try to draw all the possible cases on paper, this should give you intuition. The intuition for the aces is then just a natural extension of what you just did. Although calculating the actual numbers requires an understanding of combinatorics, the intuition behind the result does not. ↩︎

  4. Interestingly, Descartes anticipated the Brain in a vat scenario and by extension parts of the simulation hypothesis some 300 years before they were first formulated. ↩︎

  5. The simulation hypothesis, which has surprisingly many parallels to conventional religions once you look for them, is the only one I can think of that does not require Cartesian Dualism. This was today’s edgy hot take, maybe I’ll go into this in detail another day. ↩︎

  6. A cool example of this is this case study of hemispatial neglect. Emphasis on the part with the house. ↩︎

  7. If this is not relatable, try listing something else. The important part of this exercise is that all methodical approaches must fail to produce all items and you must in the end solely rely on whatever your brain is doing when you’re just trying to remember. Think tip-of-the-tongue phenomenon. ↩︎

  8. This tendency is described by the notoriously difficult-to-understand “free energy principle”. ↩︎

  9. The symbol “Φ” represents information “I” integrated within a system “O” ↩︎

  10. Knowing this, you should be able to express Occam’s Razor in terms of information theory! Imagine you receive a set of data and are asked to estimate its distribution. With the limited information you are given, you whip out several candidates. Knowing about Shannon entropy, which one do you think you should pick? ↩︎

  11. This line of thinking can lead to some very strange moral questions. Ever asked yourself if electrons can suffer? ↩︎

  12. This property is called “consistency”. Ironically, any sufficiently powerful mathematical system can only prove its consistency if it is inconsistent. This means that consistency is just something we assume about modern math without being able to prove it. It’s a pretty safe bet so far, but also a slightly terrifying prospect that we cannot prove that there is no way to get “1 + 1 = 3” by following maths. ↩︎

  13. This divide of possible ways to do computation is affecting programming languages up to this day. The mathematical Turing machine was used to develop the physical von Neumann architecture, which is powering the device you’re reading this post on right now. It also inspired procedural languages like C and, by extension, most of the modern programming languages. Lambda calculus, in turn, inspired functional languages like Haskell and by extension the functional aspects of many modern programming languages. ↩︎

  14. And possibly “wall”, until the day he throws a plate at the wall and learns that the wall is not itself responsible for the bouncy characteristic of certain materials, instead getting his “scolding” symbol activated. ↩︎

  15. Or more specifically a heterarchy, which is an organization where an item cannot have a unique rank in comparison to others. The item has either no rank or multiple different ones. ↩︎

  16. If you want more of this, I recommend reading Yann Tiefenhaus’s “An Introduction to Current Theories of Self-Referentialism”, especially the section about how consciousness is an act of self-reference. ↩︎

  17. Attention Schema Theory uses “awareness” every time I use “consciousness” in its description, but the distinction is only very slight. I find the terms “awareness” and “attention” too easy to mix up, so I stick with consciousness in this post. ↩︎

  18. This line of reasoning provokes thoughts on why the universe seems so fine-tuned to the existence of intelligent life and whether we might live in a multiverse. ↩︎