Fragmentation

There is an equally concerning dual problem to convolution, and that is fragmentation. While convolution is about multiple ideas hiding in the same terminology, fragmentation is about a single idea hiding behind multiple terminologies. This causes problems when we understand different aspects of an idea under different names, with no hint that they belong to the same thing.

For example: I know that eggplants are delicious, and I know many recipes for them, but then I’m placed on a bizarre British cooking show and told that today’s ingredient is aubergine. What is an aubergine??? I’ve heard it’s a nightshade; that sounds poisonous…


Notice there are actually two places for fragmentation to hide: in the hierarchy Thingspace – Thoughtspace – Wordspace, it can hide at either junction. We could have multiple words referring to the same idea. This is analogous to having multiple textual aliases for some function in code. It’s annoying, but not that hard to solve: textual substitution is easy; we can just look up the definition and unify them. In practice, humans are very good at on-the-fly substitution/translation, so these sorts of fragmentation survive but don’t cause us too much trouble. The more difficult sort of fragmentation is at the Thingspace–Thoughtspace link, analogous to two different algorithms that do the same thing. Unlike in Wordspace, where most translations are just direct substitution and equivalent ideas are strictly equivalent, computing equality of generalized ‘things’ is Hard (think function equality, but harder). Part of this is because ‘things’ want to be weak; the way in which things are “equivalent” matters.
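To make the two kinds concrete, here is a minimal Haskell sketch (my own illustration, not anything from the hierarchy above): the alias case dissolves under simple substitution, while the two-algorithms case takes genuine reasoning to unify.

    -- Wordspace fragmentation: two names for one definition.
    -- Resolved by substitution: look up the alias and unify.
    aubergine :: String
    aubergine = "Solanum melongena"

    eggplant :: String
    eggplant = aubergine  -- a textual alias

    -- Thingspace–Thoughtspace fragmentation: two different algorithms
    -- computing the same function. Proving sumTo == sumClosed on all
    -- inputs takes actual reasoning, not substitution; extensional
    -- function equality is undecidable in general.
    sumTo :: Integer -> Integer
    sumTo n = sum [1 .. n]

    sumClosed :: Integer -> Integer
    sumClosed n = n * (n + 1) `div` 2

    main :: IO ()
    main = print (all (\n -> sumTo n == sumClosed n) [0 .. 100])  -- evidence, not proof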

There’s a more general problem. Maybe two ideas don’t represent exactly the same thing, but they sorta seem to have related subcomponents. The “sameness” of thoughts/things is related to the degree to which their subcomponents match.

It seems to me that the optimal way to think (and the optimal way to design an AI… hint hint) is to break all problems down into a complete set of primitive ideas, and then memoize the ones that get used frequently. An immediate question comes up: how do we know when we’ve found a minimal concept? This business of seeing internal structure, and factoring concepts into their primary components, is exactly the sort of thing category theory is good at, and so this problem has a solution. The notion of “universal property” exactly encodes what we mean by “minimal concept relative to some property”. Roughly, a concept has a “universal property” if every other concept with that property can be described in terms of it. For example, a “product thing” for things A and B carries exactly the information needed to give you an A and a B via its projections. Any other thing that could give you an A and a B factors through it as a special case. I encourage you to actually read about universal properties, because I can’t give satisfactory coverage here.
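Here is a hedged sketch of the product’s universal property in Haskell (my own illustration; any category theory text does this properly): the pair type is the product, fst and snd are its projections, and any other thing that can produce an A and a B factors through it.

    -- Projections out of the product:
    proj1 :: (a, b) -> a
    proj1 = fst

    proj2 :: (a, b) -> b
    proj2 = snd

    -- Universal property: any z with maps into a and b factors
    -- through the product via the (unique) mediating map 'pair f g'.
    pair :: (z -> a) -> (z -> b) -> (z -> (a, b))
    pair f g z = (f z, g z)

    -- Defining equations: proj1 . pair f g == f, and proj2 . pair f g == g.
    main :: IO ()
    main = do
      let f = length  :: String -> Int
          g = reverse :: String -> String
          h = pair f g
      print (proj1 (h "eggplant") == f "eggplant")  -- True
      print (proj2 (h "eggplant") == g "eggplant")  -- True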

The whole point of this is that Fragmentation is a big problem whenever you’re trying to coordinate research between diverse fields of study, because it leads to the encryption problem. Fragmented ideas “want to” be united, in that they have higher entropy than their unification. When two ideas merge into a lower entropy state, the information difference (“counterfactual pressure”) is released as a sort of free energy and can be captured via “counterfactual arbitrage”.

Consciousness and Intelligence are Convoluted

There’s been great discussion on LW as to the value of consciousness as a concept. The general conclusion many have come away with is that we should probably just taboo ‘consciousness’ and get to the meat. I tend to agree.

I’d like to present a slightly different reasoning, though. The feeling of consciousness in ourselves and others is a hardcoded trait. This should immediately lead us to be very suspicious of it as a consistent concept. It’s clearly useful for development, if only as a proxy for “human-ness”, but I’ll argue that it is just that: a heuristic. There are many interesting phenomena hiding in ‘consciousness’, but they should be considered distinct phenomena. They are bound together by their shared name, and we often switch between them in conversation without noticing, assuming they’re the same concept. That’s right: ‘Consciousness’ is convoluted.

Here are a few things people tend to mean by ‘consciousness’, each interesting on its own. It is not immediately obvious that they represent the same phenomenon, though I suspect that they are deeply related. It’s important to lay them out as separate ideas so that their connection can be made explicit, rather than equivalent “by definition”.

  1. The feeling of free will – The generation of counterfactual scenarios and an evaluation of those scenarios
  2. The feeling of self-awareness – To contain a model of one’s self and mental processes
  3. Perception of qualia – The recognition of sensations as “internal” experiences, such as awareness of the color red.

Similarly, the term “Intelligence” is convoluted and we should taboo it. Some possible meanings are

  1. Consciousness – yes, sometimes consciousness and intelligence are used synonymously
  2. Containing sufficient processing power, being sufficiently complex so as to be unpredictable – in common usage, we sometimes say someone is “intelligent” if they can learn and think quickly. Many times things feel intelligent if they are complex.
  3. Acting sufficiently agent-like – Intelligent things feel as if they act according to goals and rational decisions based on those goals.
    1. Often a good heuristic for “agent-like” is “self-like”: if you consider the class of all things you encounter, you’re probably one of the more agent-like things you deal with. So in many cases, this is the feeling we’re actually referring to when we say “intelligent”. This one is just flat Wrong; beware of it.

Consider these meanings, which you tend to use most often, and how they might be related. Be mindful of how you use them during conversation and when you feel the urge to switch meanings; it should greatly improve the clarity of your arguments.

Convolution

In common usage, ‘convoluted’ means “complex”, “hard to follow”, “unclear”. So the main reason to use ‘convoluted’ is for its emotional nuances. But I’ll argue (in the near future) that the full emotional power of language is overkill and makes expressing precise logical meaning quite difficult in practice. So, as usual, we turn to math for a better meaning.

In math, convolution is a peculiar operation that can be thought of as smearing two functions over their input. Conversely, deconvolution reverses the process, recovering two functions from a smeared one. It’s often used in image/signal processing, where convolution is some kind of blur and deconvolution tries to sharpen. It’s used a little more generally for things like network deconvolution, where you attempt to recover direct effects when you’re only able to measure pairwise effects, each of which may be a direct effect, a transitive chain of length 2, of length 3, etc.
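For the concrete signal-processing sense, here is a minimal discrete convolution in Haskell (my own sketch, written for clarity rather than speed): each output sample sums all the ways the two inputs’ indices can combine to land at that position, which is exactly the “smearing”.

    -- Discrete convolution: (f * g)[n] = sum over i of f[i] * g[n - i].
    conv :: Num a => [a] -> [a] -> [a]
    conv xs ys =
      [ sum [ xs !! i * ys !! (n - i)
            | i <- [0 .. n], i < length xs, n - i < length ys ]
      | n <- [0 .. length xs + length ys - 2] ]

    main :: IO ()
    main =
      -- A box kernel smears a sharp spike into a blur:
      print (conv [0, 0, 9, 0, 0] [1, 1, 1])  -- [0,0,9,9,9,0,0]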

So let’s use our wonderful abstracting brains to distill the commonality:

DEFINITION: Convolution is when the distinction between multiple independent factors becomes obscured through some interaction. Deconvolution is the intellectual/computational process of dissecting a convoluted result into its factors, and determining how they convolute. Pseudomathematically, a convoluted thing is a thing like C \overset{f}{\cong} C_1 + C_2 + C_3 + \ldots. Deconvolution is finding both the right-hand side (the factors) AND the function f, the “how”.

(Note: we could use “conflation” here, and it would be closer to the original English meaning. I choose “convolution” instead so that the technical distinction is clear. The image here should be of an object “sitting below” its confounding factors. If this ‘misuse’ of language bothers you, feel free to interpret it as a frequent typo of mine – it doesn’t matter much anyway, as the definition will always be linked when using a “technical” term.)

EXAMPLE: When statistically evaluating cause and effect, we often use correlation as a surrogate. But if two events A, B have a 0.5 correlation, we could have

A \overset{0.5}{\to} B, or A \overset{0.5}{\leftarrow} B, or A \overset{\sqrt{0.5}}{\leftarrow} C \overset{\sqrt{0.5}}{\to} B, or A \to \ldots \to B, or… you get the idea.
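To spell out the common-cause case (a worked example of mine, assuming standardized variables): let A = \sqrt{0.5}\, C + e_A and B = \sqrt{0.5}\, C + e_B, with C, e_A, e_B independent, \mathrm{Var}(C) = 1, and the noise variances chosen so that \mathrm{Var}(A) = \mathrm{Var}(B) = 1. Then \mathrm{Cov}(A, B) = \sqrt{0.5} \cdot \sqrt{0.5} \cdot \mathrm{Var}(C) = 0.5, so pure confounding reproduces exactly the same 0.5 correlation with no direct causal link between A and B at all.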

Convolution is a big problem that seems to go unnoticed in human thinking (perhaps because there was no word for it in common usage :D), so I’ll be using it as a platform for many more posts in the near future. I started writing them and realized they all had the same braindump preamble, so I factored it out!

HREF EVERYTHING

This begins an experiment where I attempt to carve thoughtspace at its boundaries, to pluck choice thoughtstuff from the æther and distill it to crystalline perfection. There’s great danger in making up new words or repurposing old ones, but programmers seem to do it all the time in designing subroutine names. They avoid disaster by making sure the definition is always within sight of the name usage; to them it is clear that it is not the name that has meaning, but the reference to its definition, reified in the (arbitrary) name. Programmers have an easier time, because programs and words are different stuff, but when the definition of a word is a bunch of other words, it gets convoluted (see what I did there?). Sometimes people get confused and think that words have meaning, that there should be a right definition, or that some words are real and others made up. How silly, to hold all words in a global namespace. For practical reasons it’s a hard problem to fix, but fortunately we live in the ~future~, where references can be reified directly into hyperlinks, so meaning need never be ambiguous or unfamiliar! I encourage everyone everywhere to hyperlink the definition to every term that may be even a bit unclear. This forces us to make the way we’re using a word explicit, and gives more freedom of expression.

Language Proposal: Meme dereferencing operator

I lied, no programming today. Instead, metaphysics! But actually, natural language is very similar to computer language, except everyone’s afraid to write a standard for it.

Before we start, I’d like to clarify my meaning of meme, given the cultural baggage it has acquired. By meme I refer to “formal cultural entities”, such as religions, companies, political parties, etc., but also to such phenomena as “the fashion of Victorian nobles”.

Additionally, I use “thought” not necessarily to mean a thing which one thinks, but as a thing which one may think, a conceptual ideal. This definition is tricky. In some sense, thoughts are everything. For if you cannot think it, you cannot perceive it, and so it exists in no meaningful sense.

EDIT: Since becoming acquainted with the lesswrong community, I’ve realized the concept I’m describing here is more commonly known as a “thing” in “thingspace”, to be deconvoluted from a “thought” in “thoughtspace”.

Now, let us proceed by analogy and trust for now that there is a point hidden here somewhere.

(1) You can write point-free style in Perl with a little (a lot) of finagling, but would you WANT to? Of course not! By not allowing functions (and higher-order combinators) as first class, you discourage the use of certain styles. In the same way, EVERY language feature subtly affects the kinds of programs that are commonly expressed in the language.
Now, a program (source code) is merely a description of a procedure. In an ideal world, we’d be able to traverse the space of procedures and pluck the right one directly. But the space of procedures is too large to be first class in our universe, so we need programs to give us a handle into this space.
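For contrast, here is the same distinction in a language where functions are first class (a Haskell sketch of my own): point-free style falls out naturally once composition is an ordinary combinator.

    import Data.Char (toUpper)

    -- Pointful: the argument is named and threaded through explicitly.
    shoutPointful :: String -> String
    shoutPointful s = map toUpper (reverse s)

    -- Point-free: the same procedure, specified purely by composing combinators.
    shoutPointFree :: String -> String
    shoutPointFree = map toUpper . reverse

    main :: IO ()
    main = do
      putStrLn (shoutPointful "aubergine")   -- "ENIGREBUA"
      putStrLn (shoutPointFree "aubergine")  -- "ENIGREBUA"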

(2) Now consider a similar problem:
One of the great abstractions of “high level” languages is being able to pretend that we’re really dealing with values directly: the addresses are hidden. In reality the machines have to identify addresses to push data around to, a fact that becomes painfully apparent when working with (ex.) Assembly or C pointers. But actually, C pointers are an elegant approach to address specification. With pointers, we do not specify memory addresses directly; the space of addresses is too big to be practical. (Though we could specify them directly if we chose; the space is not so big as to be impossible.) Instead, we obtain the address of the relevant data through the reference operator.

Enough computers, what does this have to do with natural language?

Recall problem (1). Natural language serves the same purpose as a program, where procedures are thoughts. We have no way to directly specify objects in this (mindbogglingly infinite) “thoughtspace” so we use words as a handle into the space. But here’s a problem: thoughtspace is big. Really big. Armed with only our limited language, some ideas will take infinite time to express (consider the completion of the rationals to the reals). Now, you may wonder if only infinite ideas require infinite time. That would certainly be a nice property and is a valid question. However, given the incredible vastness of thoughtspace, I suspect that there exists an infinite family of “holes”: Seemingly innocuous ideas which nevertheless cannot be expressed in finite time (imagine spiraling towards a central point, refining your statement with an infinite series of words.) Even if this is not the case, weaken “infinite time” to “prohibitively long time”, and I think there is little question that this is a true problem.
Any given hole can be plugged by anchoring it to a reference point in our universe, either a physical pattern (“that’s a tree”), or via reference (“the way that a clock moves is clockwise”). Thus, the holes are dependent on the language; the language shapes the ideas we can express.

Necessarily, the things which exist, the “reified thoughts”, are only a small subset of possible thoughts. This shapes our own language, as things which readily exist are much easier to encode in speech than those which must be conceived by analogy. As beings of abstraction, we can perceive certain high level abstractions directly, as “first class”. Ex. A forest is a collection of trees, but a tree is a tree. We naturally perceive it as a unit, though in reality a tree is a complex collection of processes. We can easily do this for things which exist at a “level of abstraction” ≤ our own (The case of equality is particularly interesting, but I will not get into it at the moment).

Finally, we may consider memes. Memes are, in some sense, the simplest step up from our current level of abstraction. We cannot perceive them directly as a single unit because we are a part of them (or they a part of us, depending on your perspective), in the same way that a (very intelligent) cell could not easily conceive of the body to which it belongs. Because of this, we find it hard to describe memes. A common way of referring to a complicated meme without regurgitating its entire description is by implicitly tying it to more easily expressible notions, kind of like a generalized synecdoche. That is, by listing other easily named memes which are “commonly” associated with it under certain circumstances.

This method causes a host of problems, which unnecessarily limit the expressible statements. Of primary concern is ambiguity: it is often not clear whether one is referring to the literal idea or to the meme associated with the idea. This problem is often resolved by cultural context. That is, people of a similar mindset will understand the difference in their own communication, but this is not stable across mindsets; it is almost impossible to communicate in this way across large cultural gaps.
There’s a related problem. By the nature of our language, an unqualified statement (usually) contains an implicit ∀. This directly conflicts with implicit meme dereferencing, and which interpretation is intended is left to ambiguous context. Mixing up ∀ and “associated meme” is a dangerous thing to do and can lead to sweeping generalizations. Remember, we are referring to people here, and sweeping generalizations lead to various forms of prejudice.
This is the heart of the problem: the meme is referred to by association with a concrete idea, but the exact relation between the concrete idea and the meme is unspecified and ambiguous. The ambiguity can be avoided by making the link explicit, but the amount of qualification required is prohibitively expensive, so these types of statements tend to simply be avoided, limiting our expressive power greatly.

While truly fixing this problem essentially requires a new, carefully designed language, we can make an immediate improvement by at least specifying when we are making a connection. To this end, I propose a new primitive connective: ☀, where ☀X means roughly “the meme primarily inhabiting X”, to be used as a sort of “dereference” operator. This will at least allow an unambiguous instantiation of “association”. While it cannot represent associations more complicated than “primarily inhabiting”, it covers most common use cases. There are issues with ambiguity when multiple memes may inhabit the same collection of people, which becomes more severe when the memes are mutually exclusive. Correct and clever usage of ☀ can remedy this. It is helpful to imagine trying to indicate an animal to someone when you are only allowed to speak of its environment. Ex: ☀Temperate city skies. Can you guess? Pigeon.

I’ve played a little trick here. It is not immediately clear that “the people inhabited” is a consistent description of a meme. Why should memes associate like that? Genes associate because they can only be combined in discrete pairings with small mutation (mating), so you’re going to get something close to the parents. Memes combine in much more complicated ways, and it’s not clear that they would preserve these associations. In fact, there’s a deeper reason why these associations hold. In biological evolution, organisms reflect their environment. In some sense, a successful organism is a “statement about how to survive in that environment”. What’s interesting about memes is that they act as both the environment and the replicator. More on this later.