
Compression as Intelligence

Let me take a stab at defending compression as equivalent to intelligence.

Standard string compression (LZW, etc.) works by understanding and then exploiting the sequencing rules that result in the redundancy built into most (all?) languages and communication protocols.
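A toy sketch of the LZW idea may make this concrete: the encoder learns repeated phrases into a table and replaces them with short codes, which is exactly how sequencing redundancy gets exploited. This is a bare illustration, not a production codec – real LZW also packs the codes into variable-width bit strings.

```python
# Minimal LZW encoder: repeated substrings are learned into a
# dictionary and replaced by integer codes.
def lzw_encode(text):
    # Start with one code per single character.
    table = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    out = []
    for ch in text:
        if current + ch in table:
            current += ch                     # extend the known phrase
        else:
            out.append(table[current])        # emit code for known phrase
            table[current + ch] = next_code   # learn the new phrase
            next_code += 1
            current = ch
    if current:
        out.append(table[current])
    return out

codes = lzw_encode("abababababab")
print(len("abababababab"), len(codes))  # 12 characters become 6 codes
```

The more redundant the input, the longer the learned phrases grow and the fewer codes are emitted – redundancy understood is redundancy removed.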

Compression is necessary in any storage/retrieval/manipulation system for the simple reason that all systems are finite.  Any library, any hard drive, any computer memory… all finite.  If working with primary in-situ environments was as efficient as working with maps or abstractions we would never have to go through the trouble of making maps or abstracting and filtering and representing.

It might seem sarcastic even to say it, but a universe is larger than a brain.

You have, however, stumbled upon an interesting insight.  Where exactly is intelligence?  In classic Shannon information theory, and the communication metrics (signal/noise ratio) upon which it is based, information is a duality in which data and cypher are interlocked.  In this model, you can reduce the size of your content, but only if you increase the size (or capacity) of the cypher.  Want to reduce the complexity of the cypher?  Then you are forced to accept that your content will grow in size or complexity.  No free lunch!
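Here is a toy illustration of that interlock: replace the k most frequent words with one-byte tokens.  The encoded body shrinks as k grows, but the codebook (the cypher the decoder must carry) grows in step.  All sizes are crude byte counts, invented purely for illustration.

```python
# Toy demonstration of the data/cypher tradeoff: a bigger codebook
# buys a smaller body, but the decoder must carry the codebook.
from collections import Counter

def total_cost(text, k):
    words = text.split()
    # Codebook: the k most frequent words each get a 1-byte token.
    book = [w for w, _ in Counter(words).most_common(k)]
    token = set(book)
    body = sum(1 if w in token else len(w) + 1 for w in words)
    codebook = sum(len(w) + 1 for w in book)   # cypher the decoder needs
    return body, codebook, body + codebook

text = "the cat sat on the mat and the cat ate the mat " * 20
for k in (0, 2, 4, 8):
    print(k, total_cost(text, k))
```

Grow the cypher and the content shrinks; shrink the cypher and the content grows – the total is what Shannon will not let you cheat on.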

In order to build a more robust cypher, one has to generalize in order to find salience (the difference that makes a difference) in a greater and greater chunk of the universe.  It is one thing to build a data crawler for a single content protocol, quite another to build a domain- and protocol-independent data crawler.  It is one thing to build hash trees based on word or token frequency, and quite another to build them based on causal semantics (not how the words are sequenced, but how the concepts they refer to are graphed).

I think the main trouble you are having with this compression = intelligence concept has to do with a limited mapping of the word "compression".

Let's say you are driving and need to know which way to turn as you approach a fork in the road.  If you are equipped with some sort of mental abstraction of the territory ahead, or with a map, you can choose based on the information encoded into these representations.  But what if you weren't?  What if you could not build a map, either on paper or in your head?  Then you would be forced to drive up each fork in turn.  In fact, had you no abstraction device, you would have to do this continually, as you would not be able to remember the first road by the time you took the second.

What if you had to traverse every road in every city you came to just to decide which road you were meant to take in the first place?  What if the universe itself was the best map you could ever build of the universe?  Surely you can see that a map is a form of compression.

But let's say that your brain can never be big enough to build a perfect map of every part of the universe important to you.  Let's imagine that the map-building machinery you use to create mental memories of roads and cities is ineffective at building maps of biological knowledge or physics or the names and faces of your friends.  You will have to go about building unique map builders for each domain of knowledge important to you.  Eventually, every cubic centimeter of your brain will be full of domain-specific map-making algorithms.  No room for the maps!

What you need to build instead is a universal map builder.  A map builder that works just as well for topological territory as it does for concepts and lists and complex n-dimensional pattern-scapes.

Do so and you will end up with the ultimate compression algorithm!

But your point about where the intelligence lies is important.  I haven't read the rules for the contest you cite, but if I were to design such a contest, I would insist that the final byte count of each entrant's data also include the byte count of the code necessary to unpack it.
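A sketch of that scoring rule, assuming a hypothetical entry made of a zlib-compressed payload plus the one-line Python program needed to unpack it:

```python
# Score an entry as compressed payload PLUS the decompressor code.
# The payload and decompressor here are hypothetical examples.
import zlib

def contest_score(payload: bytes, decompressor_source: str) -> int:
    compressed = zlib.compress(payload, 9)
    return len(compressed) + len(decompressor_source.encode())

decompressor = (
    "import zlib,sys;"
    "sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))"
)
data = b"to be or not to be " * 1000
print(len(data), contest_score(data, decompressor))
```

Counting the decompressor stops a contestant from hiding the real work inside an enormous unpacking program.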

I realize that even this doesn't go far enough.  You are correctly asserting that most of the intelligence is in the human minds that build these compression algorithms in the first place.

How would you go about designing a contest that correctly, or at least more accurately, measures the full complexity of both the cypher and the content it interprets?

But before you do, you should take the time to realize that a compression algorithm becomes a smaller and smaller component of the total complexity metric the more often it is used.  How many trillions of trillions of bytes have been trimmed from the global data tree over the lifespan of MPEG or JPEG use on video and images?  Even if you factor in a robust calculation of the quantum wave space inhabited by the human brains that created these protocols, it is plain to see that use continues to diminish the complexity contribution of the cypher, no matter how complex it is.
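The arithmetic of that amortization is simple: the codec's complexity, divided across its uses, vanishes next to the per-file compressed size.  The numbers below are invented purely for illustration.

```python
# Amortized cost per use of a codec: the codec's own size is spread
# across every file it ever compresses.  All numbers are made up.
def amortized_cost(codec_size, compressed_size, uses):
    return codec_size / uses + compressed_size

# A 10 MB codec producing 100 KB files, used once vs. a billion times.
for uses in (1, 1_000, 1_000_000_000):
    print(uses, amortized_cost(10_000_000, 100_000, uses))
```

At a billion uses, the codec contributes a hundredth of a byte per file – effectively nothing next to the payload.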

Now what do you think?

Randall Lee Reetz

Evolution: Pendulum Dance Between Laws of Thermodynamics

For years, I have pursued a purely thermodynamic definition of evolution.

My reasoning is informed by the observation that change is independent of domain, process, or the physical laws and behaviors upon which a system is based.  As the science of thermodynamics has itself matured (evolved), the boundaries of its applicable domain have expanded far beyond its original focus on heat.  It is generally accepted that the laws of thermodynamics apply to ANY system in which change occurs, that the laws of thermodynamics are agnostic to energy type or form.  Furthermore, scientists studying information/communication independently discovered laws that match, almost perfectly, the laws of thermodynamics.  This mirroring of domains has thrilled logicians, physicists, mathematicians, and cosmologists, who are now more and more convinced that information (configuration) and energy are symmetric with respect to change over time.

Even conservatively, the implications of this symmetry are nothing short of profound.  If true, it suggests that one can, for instance, calculate the amount of information it would take to get a certain mass to the moon and back, and it means that one can calculate how much energy it would take to compute the design of a moon rocket.  It means that the much vaunted "E" in Einstein's Relativity equation can be exchanged with an "I" for information (with valid results).  It means, at some level, that information is relativistic and that gravity works as a metric of information.  Same goes for the rules and equations that govern quantum dynamics.

And this from an eyes-wide-open anti-post modernist!

At any event, the symmetric relationship between energy and information (at least with regard to change) provides a singular foundation for all of physics, and even perhaps for all of ANY possible physical system (equally applicable to other universes with other rules).

It would seem that thermodynamics would provide a more than solid base from which to define the process that allows for, limits, and possibly demands the (localized) accumulation of complexity – evolution!

The Zeroth and First Laws of Thermodynamics work to shape and parameterize action.  Given the particular configuration immediately prior, they ensure that the next action is always and only the set of those possible actions that together will expend the most energy.  In colloquial terms, things fall down, and things fall down as fast and as completely as is possible.  Falling down is a euphemism for the process of seeking equilibrium.  If the forces attracting two objects are greater than the forces keeping them apart, they will fall together.  If the forces keeping them apart are greater than the forces attracting them, they will fall apart.  Falling down reduces a system to a more stable state – a state in which less force is pushing because some force was released.  Falling down catalyzes the maximum release of energy and results in a configuration of minimum tension.
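The falling-down rule can be sketched as a greedy walk on a one-dimensional energy landscape: at every step the system takes whichever neighboring state releases the most energy, and it halts at a local minimum (equilibrium).  The landscape values below are made up for illustration.

```python
# "Things fall down": a greedy walker on a 1-D energy landscape always
# steps to the lowest-energy neighbor and stops at a local minimum.
def fall_down(energy, start):
    i = start
    while True:
        # Candidate moves: stay, step left, step right (within bounds).
        neighbors = [j for j in (i - 1, i, i + 1) if 0 <= j < len(energy)]
        best = min(neighbors, key=lambda j: energy[j])
        if energy[best] >= energy[i]:
            return i            # nothing lower nearby: equilibrium
        i = best

landscape = [5, 3, 4, 9, 2, 1, 2, 7]
print(fall_down(landscape, 3))  # starting on the peak at index 3 -> 5
```

Note that the walker settles into the nearest valley, not necessarily the deepest one – which is exactly the limitation the later discussion of refinement versus domain jumping turns on.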

The Second Law of thermodynamics dictates that all action results in a degradation of energy, or configurationally speaking, a reduction in density or organizational complexity.  Over time the universe becomes cooler, more spread out, and less ordered.

The falling down dictated by the zeroth and first laws results in particular types of chunking determined by a combination of the materials available and the energy released.  About a million years after the big bang, the energy and pressures of the big bang had dissipated such that the attractive forces affecting sub-atomic particles were finally stronger than the forces all around them.  The result was a precipitation of matter as hydrogen and helium atoms in plasma.  After a few hundred million years, the mass in these gasses exerted more attractive energy than the much cooler and less dense universe, and precipitated into clumps that became stars.  As the fusion cascade in these first stars radiated their energy out into an expanding and cooling universe, the attractive force of gravity within became greater than the repulsive forces of nuclear reaction, and the stars imploded upon themselves with such force as to expel their electrons and precipitate again into all of the other elements.  These heavy elements were drawn by gravity again into a second generation of stars and planets, of which earth is but one lonely example.

You will have noticed that each precipitatory event in our cosmological history resulted in a new aggregate class (energy, sub-atomic particles, light atoms, stars, heavy atoms, stars and planets, life, sentience, language, culture, science, etc.).  The first two laws of thermodynamics dictate the way previously created aggregate objects are combined to form new classes of aggregate objects.  The second law guarantees, as a result of the most recent precipitation event, a coincidental lowering of energy/configurational density, which allows still weaker forces to cause aggregation in the next precipitatory phase.

If you still aren't following me, it is probably because I have not been clear about the fact that the lower environmental energy density that is the result of each precipitatory cycle optimizes the resulting environmental conditions to the effects of the next weaker force or the next less stable configuration.

For instance, the very act of the strong force creating atomic nuclei lowers the temperature and pressure to such an extent that the weak force and the electromagnetic force can now overcome environmental chaos and cause the formation of atoms in the next precipitatory event.

This ratcheted dance between the laws of thermodynamics is the why of evolution.  It results in the layered grammars that, at least potentially, describe the ever greater stacked complexities that led to life, and to us, and to whatever might come as a result of our own actions as the dance continues.

Stepping back to the basic foundation of causality, it is important to be reminded that a configuration of any kind always represents the maximum allowable complexity.  In recent years, much has been made of the black hole cosmologies that define the event horizon as the minimum allowable area on which all of the information within the black hole can be written as a one-bit-thick surface membrane of a sphere.  The actual physical, mechanical reason that this black hole event horizon membrane can be described as a lossless "holographic" recording or description or compression of the full contents of the black hole is complex, and binds quantum and relativistic physics.  Quantum, because the energies are so great that structure is reduced to the structural granularity of basic quantum bits.  Relativistic, because at this maximally allowable density everything passing the event horizon has reached the speed of light, freezing time itself… the event horizon effectively holds an informational record of everything that has passed.

The interesting and, I think, salient aspect of an event horizon is that it is always exactly as big as it needs to be to hold all of the bits that have passed through it.  As the black hole attracts and eats up any mass unlucky enough to be within its considerable influence, the event horizon grows by exactly the bits necessary to describe that mass at the quantum level.

The cosmological community (including Stephen Hawking) was at first shocked by the sublime elegance of this theory, and then by the audacious and unavoidable implication that black holes, like everything else, are beholden to the laws of thermodynamics.  The theory predicts black hole evaporation!  Seems black holes, like everything else, are entropically bound.  There is no free lunch.  The collapse of matter into a black hole results in a degradation of energy and informational configuration; the self-same entropy that demands that heat leak from a steam engine demands that black holes will evaporate, and that eventually, when this rate of evaporation exceeds the rate of stuff falling in, a black hole will get smaller and, ultimately, poof, be gone.

This is heady stuff.  The biggest and baddest things in the universe are limited!  But to me, the most profound aspect of this knowledge is not that event horizons can be described as maximal causal configurations, but that we are shocked by this!  All systems are, at each moment, the maximal allowable configuration by which those forces and those materials can be arranged.  If they could be arranged any tighter, they would have already collapsed into that configuration.

To say this is to understand that time is not separable from configuration.  As Einstein showed, time is physically dependent upon and bounded by the interaction of mass, distance, energy, and change.  Cosmologists use limits to understand the universe.  The maximal warpage of space-time caused by a black hole's density effectively flattens the allowable granular complexity of the configurational grammar to binary bits held in the minimally allowable physical embodiment.  But lower energy configurations – configurations like dogs, planets, and the mechanism by which I am attempting to explain this concept – are bounded and limited by the exact same causal rules.

The difference between a black hole horizon and an idea?  Well, it has to do with the stacking of grammatical systems (quarks, sub-atomic particles, atoms, molecules, proteins, cells, organs, bodies, culture, language, etc.) that allows for complexities greater than the binary bits, the only stuff allowed to pass through an event horizon.  But these stacked grammars that allow us to be us are every bit as restricted by the same maximally-allowable-configuration rule that minimizes the size of a black hole's event horizon.  In a system configured by a stacked grammar, the minimum complexity rule is enforced at the transition boundary between each two grammatical layers.


Things fall, but only as fast as the stacked grammars that govern causal reality will allow.  This isn't a metaphor: the speed of diffusion, of degradation, of falling down, is always and in all situations maxed-out.  The exact same physical topology that bounds the size of a black hole's event horizon contributes to the causal binding affecting the rate at which any system can change.  This is because, at the deepest causal layer, all systems are bound by relativity and quantum dynamics.  The grammatical layers built successively on top of this lower binding only serve to further influence entropy's relentless race towards heat death.


[to be continued]

Randall Reetz

The Big Arrow: What Matters and Why










  • hierarchy of influence
  • complexity handling capacity as evolutionary fitness metric
  • decentralized autonomous node computation topology
  • localized least energy optimization vs. topology range-finding and exploration for long range optimization
  • compression as computational grand-attractor
  • causally restricted abstraction space
  • causally calibrated abstraction space
  • self-optimized causal semantics
  • generalize and subsume schemes
  • self optimized stacked grammars
  • causally restricted language
  • universal simulation environment
  • context-optimized language generators
  • context-optimized language interpreters
  • entropy maximization schemes
  • balancing local vs. universal evolution schemes
  • processing economics
  • network nodes vs. software objects
  • networks vs. graphs…
  • generalize and subsume



These are the concepts that bubble up when I ask myself, "What matters?" and "What matters the most?".  I ask these questions over and over again.  Have for some 40 years.  You can get by not asking these questions, might even thrive, but only because others, not so indifferent, have, do, and will ask.

What you are, what we all are, what we will become, and what will come after us, is more the result of the thoughts and actions taken by the few individuals who, consciously or not, have honored these questions, and honored them above all others.  To be sure, survival, at least in the present and local, is not dependent upon asking the big questions.  In fact, as far as the individual is concerned, asking big questions almost certainly diminishes fitness and reduces the probability of survival.

Much print is devoted to the question of whether and how socially benevolent behavior evolves.  How can moral behavior spread through the gene or meme pool when, at the granularity of the individual, moral behavior frequently allows other individuals to take more than their fair share?

But the same issue is not so controversial or surprising if we shift our focus to competing motivations within a single individual.  How do we ever learn to think long-term or wide-focus thoughts when short-term, narrow-focus thoughts are more likely to increase the likelihood of immediate survival?

Weirder still, there is obviously plenty of evolutionary evidence that wide-focus problem solving has bridged routes to new domains.  Aquatic animals have become land animals and vice versa.  Single-celled animals have become multi-celled animals (presumably, though less intuitively, multi-celled animals have also evolved the other way, back towards single-celled forms).  Chemistry has become biology, and biology catalyzes chemistry.  And unique to our temporal neighborhood, biology has sprouted culture that is well on its way towards sprouting non-biological life… the first "intentional" life!

But domain-jumping doesn't sit well with traditional views of evolution.  Evolutionists tend to study biology from the perspective of a particular environmental constraint or set of stable constraints.  Within the (self-imposed) bubble of these artificially bounded steady-state environments, evolution certainly seems to be a process of refinement seeking.  In thermodynamics we describe this class of behavior as "seeking the fall line".  In your prototypical energy topology, where peaks mean high energy and chaos, and valleys equate to equilibrium, low energy, and stability, refinement evolution selects for processes that find their way to the nadir of the local-most valley.  When sliding down the (local) least-energy fall line, there is but this one possible result.

The problem with refinement (as an explanation of evolution) is that it describes a sub-type of change that is peculiarly averse to the kinds of novelty and acceleration away from stasis that one actually sees in evolving systems.  Refinement, in point of fact, is the very reverse of sustainable change.  Refinement always seeks a limit.  Becoming, for instance, the best swimmer in the sea sort of ensures that you are so specialized that you will have a hard time changing into anything else but a swimmer.  Refinement sets you up to be stage, environment, ground (the past)… for other things, the things that are more directed towards the forms of evolutionary change that will define the foreground, the action, the object (the future).

Limit seeking schemes are schemes in which change decelerates over time.  That doesn't sound like a formula that fits the upward accelerating curve of evolution.

This would be a good time to introduce a term I use all of the time, without which, I believe, it is impossible to see evolution for what it really is.  The term is "hierarchy of influence".  A hierarchy of influence is a cline, a stack, a pyramid that ranks each of the factors affecting a system according to the degree to which each will affect the behavior, output, eventual state, or direction of the system of which it is a part.

I know it isn't politically correct to suggest that some parts of a system are more important than others, so I will just say that some factors of a system will have a greater effect over the future than will others.  A hierarchy of influence is an ontology of sorts, or more accurately, a ranking.  On the bottom of the stack, you will have those sub-systems or parts or actors that have an effect on almost everything else in the system, and on top you will have those parts that are more the result of or subservient to the rest of the system.  If you aren't comfortable with that order, just flip it over!  Either way you map it, hierarchy of influence is a powerful tool for the understanding of systems and change.

So, let's look at evolutionary systems through the hierarchy of influence lens.  Here, as before, we can apply this new lens locally or globally.  What leads to success locally is different from what leads to success globally.  As the field of view narrows, a hierarchy of influence favors factors that support refinement.  Processes at larger and longer scopes support influencers that reach outside of current domains, influencers that seek a universal understanding of all domains, of domain in general, of change itself, and finally, of the very reason for change, an understanding of the end game and how best to get there.

Now let's apply the hierarchy of influence filter to the super-system we've just described, the system composed of both localized hierarchies of influence and universal hierarchies of influence.  In any such super-system, it should be clear that the local, refinement-leaning hierarchies will be demoted to the realm of effectors relative to the deep and wide, long-range-oriented hierarchies of influence.

Ecologists and Population Biologists are keen to point to the fact that most of this earth's biomass comes in the form of single celled animals and plants.  Absolutely true.  It is also true that most of the mass and energy in our Solar System is rather unimpressively ordered hydrogen, helium and a smattering of lithium.  But the future of biology, of complexity, even of mass and energy is much more likely to be sensitive to complex systems than the simple ones upon which they feed.

But before we throw out "refinement" as a category, let me posit a kind of refinement that is a good candidate for the fitness function or filter we see in evolving systems, systems that get better and better at solving more and more diverse problems at a faster and faster rate.

What if we were to re-cast the concept of refinement to mean the refinement of refinement itself?  Instead of refining a particular solution space, we think of refinement in its most general and universal form, a refinement of the definition of refinement.  In doing so, we tip the traditional view of evolution on its head.  Animals, individuals, species, life of every sort become the environment, the conditions, the topology – as background, as tool, as expendable media – for the refinement of the ultimate fitness metric.

I must step in now, interrupt myself, and state the obvious, even if the obvious might throw a huge wrench in the logical works of this thesis.

The distinction I have been outlining between refinement and domain jumping could lead some readers to think that I am suggesting that domain jumping offers some form of escape from the laws of thermodynamics.  I have suggested that refinement evolution simply seeks the least-energy fall line.  No problem here.  But by contrasting refinement against domain jumping, the reader might be led to believe that I am suggesting a way around physics, a free lunch, some sort of evolutionary daemon that does what Maxwell's demon couldn't.  I am not!  Only the next action that takes the least energy can happen next… no exceptions.  Period.  Domain jumping must therefore, at every moment and in every context, obey the laws of thermodynamics.

Now, it is relatively easy to see how refinement evolution meets these least-energy constraints, but how is it that domain jumping could ever happen?  How would any action ever allow ridge-climbing escape from any concave depression in any energy topology?

Before I continue along this vein of logic, I should probably jump back a pace and clarify what I mean when I say "energy topology".  An energy topology is a graphical depiction of the forces acting upon a region of space.  Some energy topologies are almost identical to real-world space.  The undulating surface of the earth under our feet is, at least with regard to gravity, equivalent to the energy topology that restricts motion across its surface.  If I am standing on the side of a mountain, and moving 1 foot to my left means I will have to haul my body up half a foot vertically, while traversing 1 foot to the right would allow me instead to fall half a foot, then the slope of the ground is a perfect analog of the energy topology with respect to gravity.  Left to the whims of time and chance, the energy topology I just described would make it far more likely that I would eventually end up more to my right (lower) than to my left (higher).  This is because I would have to use energy to move up the mountain and could actually access energy by moving down the mountain.

Of course there are less obvious energy topologies, energy topologies that do not map to actual terrain.  With respect, say, to choosing a religious belief, the energy topology heavily reflects the beliefs already held by one's immediate family, cultural heritage, and other factors.  Choices that differ radically from local norms require far more energy than conforming does.  If one were to plot the energy topology necessary to choose to become Muslim, for instance, a child in a Muslim family would stand on top of a steep hill, and a child born to a Christian family would stand at the bottom of a deep pit.  Energy topologies offer wonderfully obvious illustrations of the forces affecting evolving systems.

Each object or system to be examined acts according to the sum of many energy effectors.  Each of these effectors (physical terrain, social obstructions/accelerators, on-board energy reserves and conversion rates, environmentally accessible resources, etc.) can be plotted separately as an energy topology, but causality is the result of the sum of all energy topologies affecting an object or system.  To illustrate, let's now combine the above two examples.  Let's say that the person on the mountainside is in the process of plotting their own religious future.  To the right the physical mountain rises; to the left it falls into a valley.  The person standing there is from the Christian village in the valley below.  That person is philosophically attracted to the Muslim faith.  But to learn more, they will have to travel up the mountain to a Muslim village a thousand feet higher.  In this case, the energy necessary to fulfill their philosophical desire will require them to haul their body up the mountain – and doing so will also incur the costs associated with going against cultural norms.  Obviously, both topologies must be summed in order to compute the likelihood of both possible choices.  As I am sure you are realizing, the philosophical leaning of our actor can also be represented by an energy topology.  This too must be summed to produce the aggregate energy topology in which our subject must act.
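That summation can be sketched directly.  Below, each factor – terrain, cultural norms, philosophical attraction – is a toy energy topology over the two choices, and a Boltzmann-style weighting turns summed energy into relative likelihood.  Every number (and the temperature parameter) is invented for illustration; this is a sketch of the idea, not a model of belief.

```python
# Summing energy topologies: behavior responds to the total energy
# cost of each option, not to any single factor alone.
import math

# Energy (arbitrary, made-up units) of each option for our mountainside actor.
terrain    = {"stay_christian": 0.0, "become_muslim": 5.0}  # climb the mountain
culture    = {"stay_christian": 0.0, "become_muslim": 4.0}  # defy village norms
philosophy = {"stay_christian": 2.0, "become_muslim": 0.0}  # personal attraction

def likelihoods(*topologies, temperature=2.0):
    options = topologies[0].keys()
    total = {o: sum(t[o] for t in topologies) for o in options}
    # Lower summed energy -> exponentially higher likelihood.
    weights = {o: math.exp(-e / temperature) for o, e in total.items()}
    z = sum(weights.values())
    return {o: w / z for o, w in weights.items()}

print(likelihoods(terrain, culture, philosophy))
```

With these invented numbers the philosophical pull is real but is swamped by the summed terrain and cultural costs – exactly the point of summing topologies before predicting behavior.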

But none of these topologies explains hill climbing.  For that we need to compose yet another energy topology, a topology that expresses the energy held in reserve within the individual actor.

So why ask these questions?  If natural selection asks them down at the DNA level, and across the vast landscape that is evolutionary time, why should we bother asking them again?

Dimensionality and Postmodern Self-Cannibalism

"Parenchyma" and "stroma" – two important words in the fight against ambiguity in any discussion of complex subject matter.

Both are medical lexicon and specify the difference between that part of a system (physical organ) that is (chemically) re-active ("parenchyma") and the part of the same system that is (connective tissue) structure ("stroma").

Of course it is true that structure both indicates and precipitates behavior.  Equally, activity influences and predicts structure.  So, again, things are not so simple as could be hoped.  But words like these allow anchoring in critical discussion.

If one can substitute the much more common words "active" and "structural", why bother further confusing this issue with the introduction of the less common and harder to pronounce "parenchyma" and "stroma"?

Well, because understanding is strengthened through multiple contextual mappings.  The larger and more varied the link graph, the more obvious become the differences between similar and potentially ambiguous topics or the signs we use as reference.

Also, uniquely, these two words signify the classic subject/object, object/ground, mind/body, I/others, specific/general, instance/class ambiguity in information, language, communication, computation… and existence.

The postmodern position, an argument in reaction (over-reaction) to the modern or classical "reductionist" (their word) world view, is that hierarchical relationships (the kind that would result in a definable difference between a thing and the larger thing of which it is a part) do not in fact exist.  The postmodernists present as absolute the claim that all relationships are "relative" (their word), because, they say, there is no reliable place to stand from which to judge hierarchy – relationships are inherently biased towards the observer.

What is the baby?  What is the bathwater?  The postmodernists, frustrated and angry, did King Solomon proud and threw them both out.

If there is anything of use to be learned from this mess, it won't come from the (supposedly) blind "all" of classical thinking, or the fruitless "nothing" of the postmodernists.  I will half agree that relationship is vantage dependent (the answer you get back from the question, "Are you my mother?" depends on who is asking), but this dependence isn't purely local.  Vantage can be retooled such that it is, as are spatial dimensions, something that can apply universally at all times and all places at once.  By this gestalt, vantage is defined ubiquitously, ridding us of the hopelessly circular grounding problem at the center of the postmodern argument.  When vantage is defined as dimension, it applies equally to all objects.  You can switch dimensions at will and not lose the absolute and hierarchical relationships the classicists rightfully found so important.

Yes, the postmodernist (re-invention of the) word "relative" was awkwardly stolen (rather ignorantly) from Einstein.  The difference: Einstein made the world more measurable by showing how energy and space-time are transmutable and self-limiting.  The postmodernists' naive re-appropriation of Einstein's empirically derived authority does the opposite – making it impossible to compare anything, ever.  The irony here is profound.  The postmodernists first stand upon the authority acquired through careful and causal measurement, then they say such measurement isn't possible!

God help the human race.

By the way, if you look carefully at Einstein's two papers on Relativity, you will see the underpinnings of the shiftable but universal vantage that a dimensional grounding provides.  There are rules.  1. A dimension must apply to everything and through all time.  2. You can switch dimensional vantage at any time, but 3. You can only compare two things if you compare them within the context of the same dimensional vantage.

Is an attribute a dimension?  No.  An attribute situates an object in reference to a dimension.  An attribute is a measurement of an object according to a property shared by all such objects in that dimension.  A property is measurable for a class of objects as a result of the rules or grammar or physics that define a dimension.

The absolute causal hierarchy made all the more impenetrable by Quantum and Relativistic theory makes the postmodern "hard relativist" tantrum all the more ridiculous – especially in light of the fact that postmodernists constantly turn to these twin pillars of physical theory as support of their position.  The fatal logical mistake here is the misrepresentation of a property ("relative vantage") as a dimension (rules that provide a stable base from which to define properties – in this case, the novelty of experience guaranteed by the first[?] law of causality:  that no two bodies can occupy the same space at the same time).

Randall

Probability Chip – From MIT Spin-Off Lyric Semiconductor

Photo: Rob Brown


A Chip That Digests Data and Calculates the Odds (New York Times, Aug. 17, 2010) and the Lyric Semiconductor company web page Probability Processor: GP5 (General-Purpose Programmable Probability Processing Platform).  Looks like a variation on analog processing accessed within a digital framework.  And here is an article from GreenTech, Can 18th-Century Math Radically Curb Computer Power?, which explains the chip in reference to Thomas Bayes and error correction.  The crossover between error correction and compression is profound.  Remember: intelligence = compression.
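For those curious about the math underneath such a chip, the kernel is just Bayes' rule.  A minimal sketch, updating the probability that a transmitted bit was a 1 after observing it through a noisy channel (error correction in miniature); this is my own illustration, not Lyric's circuitry:

```python
# Bayes' rule for a single noisy bit: posterior probability that the
# true bit is 1, given a prior and one observation through a channel
# that flips bits with probability flip_prob.
def bayes_update(prior_one, observed, flip_prob):
    # Likelihood of the observation under each hypothesis.
    like_one  = (1 - flip_prob) if observed == 1 else flip_prob
    like_zero = flip_prob if observed == 1 else (1 - flip_prob)
    num = like_one * prior_one
    return num / (num + like_zero * (1 - prior_one))

# A channel that flips bits 10% of the time; we observe a 1.
print(bayes_update(0.5, 1, 0.1))  # -> 0.9
```

A probability processor does this kind of update natively, in parallel, instead of simulating it with digital arithmetic.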


Randall