Posted by Phil Henshaw-2
URL: http://friam.383.s1.nabble.com/order-and-disorder-tp522512p522515.html
Steve
> Phil, et al -
>
> > It's that difference between data and information that I often find
> > skipped over. It's not cut and dried and it seems odd that
> > 'information theory' always describes it as cut and dried. Then
> > when data compacting algorithms, which are a great boon but nothing
> > more, get used as causal explanations for complex organization in
> > nature, I think the distinction between our tools and our subjects
> > is getting lost.
> When I first started seriously contemplating things like information
> theory, I had the (dis?)advantage of not being schooled in it
> directly but instead having a lot of the necessary tools to
> contemplate it, to try to reinvent some of the ideas others
> had already put out for us.
> Specifically, I had a grounding in statistical physics and markov
> models, and a grounding in computer logic and programming, but not
> specifically in information theory. That's what degrees in math and
> physics and a strong interest in computers got you back in the 70's I
> guess.
>
> What that led me to contemplate a *lot* was "relative entropy"... or
> the simple notion that the amount of effective entropy in a string of
> bits (or an ensemble of physical states) was highly dependent on your
> knowledge of the string of bits (physical system). If you assume a
> string of bits has some order in it, and especially if you have
> external knowledge (a model of that order) to predict with, then the
> "entropy" is effectively less. This would be why, for example, JPEG's
> simple cosine model for "predicting" bits in a string (along a line
> of an image) works so well on certain types of images (3D objects
> with shaded surfaces..) where run-length encoding did not.
Great! A cosine model is a good, useful, if incorrect, universal model of
shape in data. There may be better ones, based on the same implied
principle that in regions of continuity a small number of points gives
you a simple rule for all the points in-between. The one I use is even
more naturalistic and easier to calculate, the rule that the 2nd or 3rd
derivative at a point, or something, is the same approached from either
direction... If anyone knows anyone, I'd like to talk to people
interested in generalizing this and the related issues. My math isn't
really strong enough.
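
As a rough illustration of how a model lowers the effective entropy, here is a minimal Python sketch. It is not JPEG's cosine transform. It swaps in a much cruder predictor ("the next pixel equals the previous one") and assumes an idealized, evenly shaded scan line. The names `empirical_entropy` and `scan_line` are made up for the example.

```python
import math
from collections import Counter
from itertools import groupby

def empirical_entropy(symbols):
    """Shannon entropy, in bits per symbol, of the observed symbol frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

# Assumed test data: a smoothly shaded scan line ramping from 0 to 255.
scan_line = list(range(256))

# Run-length view: no value repeats, so there are as many runs as pixels.
run_count = sum(1 for _ in groupby(scan_line))

# No model: every value occurs exactly once, so each pixel costs 8 bits.
raw_bits = empirical_entropy(scan_line)

# Crude model: predict each pixel from the previous one and keep only the
# prediction errors. On a steady ramp the error is always the same value.
residuals = [b - a for a, b in zip(scan_line, scan_line[1:])]
residual_bits = empirical_entropy(residuals)

print(run_count, raw_bits, residual_bits)  # prints: 256 8.0 0.0
```

The data are identical in both cases. Only the knowledge brought to them changes, and with it the number of bits left to explain.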
Still, isn't the basic question when to make the jump from recognizing
patterns in the data to recognizing things in the world? Huge steps
have been made in pattern screening, fingerprints & text searches and
other complicated things. Isn't the 'holy grail' to do the same for
complex systems? What would you look for? Continuities and breaks,
periods when shapes have higher derivatives all of the same sign, etc.
'Relative entropy' sounds a little like the concept of 'random with
respect to' the local pattern discontinuities in organizational
hierarchies. The behavior of materials is exactly the behavior of their
molecules at larger scales. The question is whether the behavior
of the whole arises from individually orderly molecules behaving
'randomly with respect to' the whole.
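
One of the things to look for, periods when a higher derivative keeps the same sign, can be sketched in a few lines of Python. This is only an illustration under simple assumptions: evenly spaced samples, no noise, and a second difference standing in for the second derivative. The series and the function names are hypothetical.

```python
from itertools import groupby

def second_differences(ys):
    """Discrete stand-in for the second derivative of evenly spaced samples."""
    return [ys[i - 1] - 2 * ys[i] + ys[i + 1] for i in range(1, len(ys) - 1)]

def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def same_sign_runs(values):
    """Return (sign, start_index, length) for each maximal run of equal sign."""
    runs, start = [], 0
    for s, group in groupby(values, key=sign):
        length = len(list(group))
        runs.append((s, start, length))
        start += length
    return runs

# Hypothetical series: accelerating growth that turns over and levels off.
series = [0, 1, 3, 7, 15, 23, 27, 29, 30]
curvature = second_differences(series)

# Indices refer to positions in `curvature`, which starts at series[1].
print(same_sign_runs(curvature))  # prints: [(1, 0, 3), (0, 3, 1), (-1, 4, 3)]
```

The change of sign in the middle marks the break between the two periods. Real data would need smoothing first, since noise flips the sign of a second difference constantly.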
>
> >
> > The clue to me is that they compare translating between different
> > languages to lengths of dots. To translate good prose from English to
> > Japanese you have to teach Japanese how to speak English, because the
> > concepts are different. It's a real art. I think information theory
> > assumes all concepts are the same, and that's probably inaccurate,
> > even if data density is a fascinating and very important concept.
> I've also thought a bit about this in terms of basis spaces or basis
> vectors. One might suggest that each natural language (say English or
> Japanese) represents a basis space for meaning, ideas, thoughts,
> creative expression, etc. And any meaningful utterance (a single word,
> a sentence, a dialog, a book, an encyclopedia, a library) in that
> language is a vector (I suppose non-meaningful ones are too, whatever
> the Jabberwocky in the Slithy Toves that means!). And it is not clear
> if these two basis spaces truly "cover" the same territory.
>
> My variation on your observation is to note that to learn a language
> fully requires learning the culture of the language fully, which most
> (all by definition?) members of a given culture never even achieve.
> We revere the OED because it gives us first-known uses of words and
> their context and so forth... most of us are amazed half of the time
> when we look up a word, to discover its (apparent) origins and/or
> multiply nuanced uses, etc.
Yes, the same idea. Maybe the most useful word for it is 'nuance',
those faint and powerful paths of association. Not much nuance to data!
(unless you read between the lines, of course)
> >
> > I'm not sure I see the circular relation you describe, though. There
> > are things left out, as you suggest, like not knowing what to call a
> > pattern without knowing what the pattern is supposed to tell you
> > (i.e. providing no analysis method whatever but snap judgment). I
> > think that's what's finessed with using a picture of a person. Lots
> > of images pop up without a question. The image itself doesn't tell
> > you much actually. Maybe that's what you mean, that the sweeping
> > generalities rely on your automatic judgments of the image before
> > you ask where any judgments would come from?
>
> I don't imagine that we only have one or two levels of pattern
> matching going on... I think we have many levels and not all of them
> coplanar or parallel. I've had experiences where at a glance I saw a
> "thing" which caused me to think I had seen some other "thing", and
> on careful review (looking more carefully at "thing one" and
> considering all of the ways I might have extracted the image or idea
> of "thing two" from it) I could see lots of levels of patterning. A
> series of "dots" can suggest a line... several of these "lines" can
> suggest an arc or an edge or a boundary, and each of these can suggest
> some negative or positive space which can suggest an area or an object
> which can suggest higher orders of objects like an animal or a vehicle
> or a person which can suggest a relationship or a scene (flight or
> fight!) or ...
Sorting out the 'powers of suggestion' in any data is definitely not
easy. The closest information theory would seem to come is with the
algorithms (which I have no real understanding of, though I can see how
well they work) for making up rules of association between patterns and
then skimming matches from huge sets of alternates. The one thing in the
natural world that seems to do something similar, by a different means
perhaps, is human thought. I don't think thought is either digital or
analog, but the outside appearance is that people have an amazing
facility at word puzzles, similar to what Google has on the web, and
neither has a proportionate grasp on other kinds of meaning. Isn't
there something similar in the disproportionate performance levels on
very similar tasks?
There are lots of interpretation tasks neither man nor machine seems
likely to ever master, but another one that might be mastered by either,
by different means perhaps, is reading curves from dots. It's one of
those natural navigation tasks, to read as far ahead on the curves as
possible to minimize the steering necessary. It's the core problem for
'homing systems', I think, which the world produces in abundance and
variety. Basic thermostats only respond to the set point crossings,
above or below, but could reasonably be engineered to respond to the
system's implied thermal mass (past responsiveness) and the rate of
approach of the set point (implied energy flux), just reading the
dynamics of the curve.
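
To make the thermostat idea concrete, here is a minimal Python sketch of the difference. It is only an illustration, not a real controller design. The function names are invented, and `lookahead_steps` is a hypothetical tuning value standing in for the system's thermal lag, which a real device would have to estimate from past responsiveness.

```python
def basic_thermostat(temp, set_point):
    """Heat only while the current reading is below the set point."""
    return temp < set_point

def anticipating_thermostat(temp, prev_temp, set_point, lookahead_steps):
    """Heat only while the extrapolated reading stays below the set point.

    The rate of approach is read straight off the curve as the change per
    step, then projected forward by `lookahead_steps` (an assumed value).
    """
    rate = temp - prev_temp
    projected = temp + rate * lookahead_steps
    return projected < set_point

# Room at 19.0 and rising 0.5 per step toward a set point of 20.0.
print(basic_thermostat(19.0, 20.0))                  # prints: True
print(anticipating_thermostat(19.0, 18.5, 20.0, 3))  # prints: False
```

The basic version keeps heating until the crossing and then overshoots. The second one shuts off early because the curve already says where the temperature is headed.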
> >
> > Jochen is suggesting that it's really the mixture of order and
> > disorder that's hard to describe. Of course I don't disagree, mixing
> > things makes it quite difficult, toward impossible, to separate noise
> > from your signal, for example.
>
> One man's noise is another man's signal!?
Well, sure. A man looks at a basket of apples as something to buy and
his son looks at it as something to eat! That discrepancy might not
depend on choosing which source of noise to ignore, or it might. They
might both be overlooking the dirt and worm holes the mom would notice
right off, giving her a much bigger picture, and leading her to quickly
scurry the two boys away from that stall in the market!
>
> > I don't think the complexly organized things are often recurrent
> > patterns in a pervasive disorder, but usually independent and
> > cohesive real things, pushed into the background behind the
> > appearance of disorder because the disorder is distracting. Maybe
> > the reason complex order is tantalizing is that there is an answer
> > somewhere. Some will want to treat the mixed data we get as purely a
> > mathematical analysis puzzle, but I think of it as a thing-out-there
> > puzzle, which the analysis can be quite useful for.
>
> I think "noise" is a bogus (or at least relative) concept. Noise is
> what we wish to ignore for a given purpose. At LANL, for example, we
> model lightning very thoroughly because we want to remove it:
> lightning strikes constitute virtually all of the "background noise"
> when you are listening for the EMP from a nuclear explosion... This
> "noise" is hugely signal to meteorologists...
As you were mentioning before, there are many kinds and layers of signal
and noise. That's maybe my main objection to what I was taught in data
analysis, essentially to treat data as if all the pattern you didn't
understand was made by the same universal noise generator. It ain't so.
In studying changes in fossil shape over time there's a basic choice at
the beginning. Is the irregularity in the data produced by sampling
changes clustered around a smooth curve that has multiple scales of long-
and short-term fluctuation, or is it produced by a single random
jumping machine that takes off from each point to land precisely at your
next point? The two starting assumptions look almost the same, even
to careful analysis sometimes. I say, try 'em all, see if anything works.
When you first find one that works it may give you a major component you
can subtract out to more clearly see the others!
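
One simple way to try the two starting assumptions against each other is sketched below in Python. It is only an illustration on synthetic data, not a method for real fossil series. The sine curve, the unit noise level, the lag of 16 and the function names are all assumptions chosen to make the contrast visible. The idea is that scatter around a smooth curve moves about as much over many steps as over one, while a random walk keeps wandering further.

```python
import math
import random

def mean_squared_change(xs, lag):
    """Average squared difference between points that are `lag` steps apart."""
    diffs = [xs[i + lag] - xs[i] for i in range(len(xs) - lag)]
    return sum(d * d for d in diffs) / len(diffs)

def spread_ratio(xs, lag=16):
    """How much more the series moves over `lag` steps than over one step."""
    return mean_squared_change(xs, lag) / mean_squared_change(xs, 1)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 5000

# Assumption A: sampling scatter clustered around a smooth underlying curve.
smooth_plus_scatter = [5 * math.sin(2 * math.pi * t / n) + rng.gauss(0, 1)
                       for t in range(n)]

# Assumption B: a random walk, each point jumping off from the previous one.
walk, position = [], 0.0
for _ in range(n):
    position += rng.gauss(0, 1)
    walk.append(position)

print(spread_ratio(smooth_plus_scatter))  # stays close to 1
print(spread_ratio(walk))                 # grows roughly in proportion to the lag
```

With short, unevenly sampled records the two ratios can easily overlap, which is the sense in which the assumptions look almost the same.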
Phil
> - Steve