
Compression as Intelligence

Let me take a stab at defending compression as equivalent to intelligence.

Standard string compression (LZW, etc.) works by understanding and then exploiting the sequencing rules that result in the redundancy built into most (all?) languages and communication protocols.
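To make that concrete, here is a minimal sketch in Python.  It uses zlib's DEFLATE (a dictionary-and-frequency coder in the same family as LZW, though not LZW itself) to show that the savings come entirely from redundancy: repetitive text shrinks dramatically, while random bytes of the same length barely shrink at all.  The sample strings are invented for illustration.

```python
import os
import zlib

# Highly redundant input: one phrase repeated many times.
redundant = b"the cat sat on the mat. " * 400

# An incompressible input of the same length: random bytes have no sequencing rules to exploit.
random_bytes = os.urandom(len(redundant))

print(len(redundant), "->", len(zlib.compress(redundant, 9)))        # shrinks to a tiny fraction
print(len(random_bytes), "->", len(zlib.compress(random_bytes, 9)))  # stays roughly the same size
```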

Compression is necessary in any storage/retrieval/manipulation system for the simple reason that all systems are finite.  Any library, any hard drive, any computer memory… all finite.  If working with primary in-situ environments were as efficient as working with maps or abstractions, we would never have to go through the trouble of making maps or of abstracting and filtering and representing.

It might seem sarcastic even to say it, but a universe is larger than a brain.

You have, however, stumbled upon an interesting insight.  Where exactly is intelligence?  In classic Shannon information theory, and the communication metrics (signal-to-noise ratio) upon which it is based, information is a duality in which data and cypher are interlocked.  In this model, you can reduce the size of your content, but only if you increase the size (or capacity) of the cypher.  Want to reduce the complexity of the cypher?  Then you are forced to accept that your content will grow in size or complexity.  No free lunch!
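Here is a toy way to see that trade-off in code.  It is only a sketch of the idea of a two-part cost (cypher plus content), not Shannon's formalism; the sample text and the codebooks are invented for illustration.

```python
# Total cost of a message under a toy two-part code:
#   size of the cypher (the codebook) + size of the content encoded with it.
text = ("the quick brown fox jumps over the lazy dog. " * 50).encode()

def two_part_cost(codebook):
    encoded = text
    for phrase, token in codebook.items():
        encoded = encoded.replace(phrase, token)
    cypher_size = sum(len(p) + len(t) for p, t in codebook.items())
    return cypher_size + len(encoded)

print(two_part_cost({}))                                    # empty cypher, bulky content
print(two_part_cost({b"the quick brown fox": b"\x01"}))     # bigger cypher, smaller content
print(two_part_cost({b"the quick brown fox jumps over the lazy dog. ": b"\x01"}))  # bigger still
```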

In order to build a more robust cypher, one has to generalize in order to find salience (the difference that makes a difference) in a greater and greater chunk of the universe.  It is one thing to build a data crawler for a single content protocol, quite another to build a domain- and protocol-independent data crawler.  It is one thing to build hash trees based on word or token frequency, and quite another to build them based on causal semantics (not how the words are sequenced, but how the concepts they refer to are graphed).
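For what it's worth, the frequency half of that contrast is easy to sketch: count tokens and hand the shortest codes to the most common ones.  This is only an illustration of the ordering (not a real prefix code, and nothing here touches causal semantics); the sentence is invented.

```python
from collections import Counter

tokens = "the cat chased the mouse because the mouse stole the cheese".split()
freq = Counter(tokens)

# Most frequent tokens get the shortest (lowest-numbered) codes.
codes = {tok: format(rank, "b") for rank, (tok, _) in enumerate(freq.most_common(), start=1)}
print(codes)  # "the" and "mouse" get short codes; everything this scheme knows is frequency
```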

I think the main trouble you are having with this compression = intelligence concept has to do with a limited mapping of the word "compression".

Let's say you are driving and need to know which way to turn as you approach a fork in the road.  If you are equipped with some sort of mental abstraction of the territory ahead, or with a map, you can choose based on the information encoded into these representations.  But what if you weren't?  What if you could not build a map, either on paper or in your head?  Then you would be forced to drive up each fork in turn.  In fact, had you no abstraction device, you would have to do this continually, as you would not be able to remember the first road by the time you took the second.

What if you had to traverse every road in every city you came to just to decide which road you were meant to take in the first place?  What if the universe itself were the best map you could ever build of the universe?  Surely you can see that a map is a form of compression.

But let's say that your brain can never be big enough to build a perfect map of every part of the universe important to you.  Let's imagine that the map builder you use to create mental memories of roads and cities is ineffective at building maps of biological knowledge or physics or the names and faces of your friends.  You will have to go about building unique map builders for each domain of knowledge important to you.  Eventually, every cubic centimeter of your brain will be full of domain-specific map-making algorithms.  No room for the maps!

What you need to build instead is a universal map builder.  A map builder that works just as well for topographical territory as it does for concepts and lists and complex n-dimensional pattern-scapes.

Do so and you will end up with the ultimate compression algorithm!

But your point about where the intelligence lies is important.  I haven't read the rules for the contest you cite, but if I were to design such a contest, I would insist that the final byte count of each entrant's data also include the byte count of the code necessary to unpack it.
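As a sketch of that scoring rule (the file names and paths here are hypothetical, not taken from any actual contest):

```python
import os

def entry_size(compressed_path, decompressor_path):
    """Score an entry as compressed data plus the code needed to unpack it."""
    return os.path.getsize(compressed_path) + os.path.getsize(decompressor_path)

# e.g. entry_size("corpus.compressed", "unpack.py") -- the smallest total wins
```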

I realize that even this doesn't go far enough.  You are correctly asserting that most of the intelligence is in the human minds that build these compression algorithms in the first place.

How would you go about designing a contest that correctly, or at least more accurately, measures the full complexity of both the cypher and the content it interprets?

But before you do, you should take the time to realize that a compression algorithm becomes a smaller and smaller component of the total complexity metric the more often it is used.  How many trillions of trillions of bytes have been trimmed from the global data tree over the lifetime of MPEG and JPEG use on video and images?  Even if you factor in a robust calculation of the quantum wave space inhabited by the human brains that created these protocols, it is plain to see that continued use diminishes the complexity contribution of the cypher, no matter how complex it is.
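The arithmetic behind that claim is simple, whatever numbers you plug in.  Here is a sketch with invented figures:

```python
# However large the cypher's complexity, its share of the total shrinks with every use.
def per_use_overhead(cypher_complexity, uses):
    return cypher_complexity / uses

CYPHER_COMPLEXITY = 1e9  # an invented stand-in for "however complex the cypher is"
for uses in (1, 1_000, 1_000_000, 1_000_000_000):
    print(uses, per_use_overhead(CYPHER_COMPLEXITY, uses))  # falls toward zero as uses grow
```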

Now what do you think?

Randall Lee Reetz