speaking of analytics

Re: speaking of analytics

Merle Lefkoff-2
Hi Nick,

Thanks for the great metaphor!  I'm an omelet too, and OMG I've just become Visiting Professor in Conflict Studies at Saint Paul University in Ottawa, Canada.  Stu Kauffman and I spent some time together at his house on Crane Island two weeks ago, and he's helping me stir the omelet.  I'm teaching a course to Canadian government officials on how to use CAS for what they call "Integrative Peacebuilding"--they have Trudeau, we have Trump.  Agghh!

Warmest regards as always,



On Fri, Sep 9, 2016 at 8:46 AM, Nick Thompson <[hidden email]> wrote:

Hi, Roger. 

 

That was some hurricane, huh?  I thought of you in Boston Harbor, battened against the lashing gales. 

 

Speaking of analytics, I was struck by the notion of having a prediction without a theory.  I am wondering if that is actually possible.  I know that theories are really useful for making predictions, but can one actually make a prediction without one?  Perhaps meteorology would be a good domain in which to think this through.  The lowest level of prediction (and one that works remarkably and embarrassingly well) is to predict that tomorrow’s weather will be the same as today’s: “persistence forecasting.”  But even that entails a theory that the weather is stable.  Then one can have dynamic persistence theories, which one would apply to the stuff floating down a river ... the river will continue to flow down to me.  The jet stream is sometimes like that.  And jet “stream” is, after all, a metaphor.  And this is making me think that we ought perhaps to talk about “levels of theory”, rather than “theory/non-theory”, persistence forecasting being the application of a VERY low-level theory.
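To make the point concrete, here is a minimal sketch of persistence forecasting in Python.  The temperatures and the function names are invented for illustration; nothing here comes from a real forecasting library.  Note that the "theory" is still in there: the one-line forecaster encodes the assumption that the weather is stable.

```python
def persistence_forecast(history):
    """Predict the next value as a copy of the most recent observation.

    The built-in assumption (the low-level "theory") is that the
    quantity being forecast does not change from one day to the next.
    """
    if not history:
        raise ValueError("need at least one observation")
    return history[-1]


def mean_absolute_error(series, forecaster):
    """Score one-step-ahead forecasts made from each growing prefix."""
    errors = [
        abs(forecaster(series[:i]) - series[i])
        for i in range(1, len(series))
    ]
    return sum(errors) / len(errors)


# Hypothetical daily high temperatures, invented for illustration only.
temps = [21.0, 22.0, 22.0, 23.0, 21.0]

tomorrow = persistence_forecast(temps)
score = mean_absolute_error(temps, persistence_forecast)
```

Swapping in a different forecaster (say, one that extrapolates the last day's change) and comparing scores would be one way to rank "levels of theory" against each other.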

 

Anyway, I am probably bending this thread horribly.  Off on my own cloud.  Age has addled my brain, and now the heat has cooked it.   I am an omelet. 

 

Take care and keep afloat.

 

Nick

 

Nicholas S. Thompson

Emeritus Professor of Psychology and Biology

Clark University

http://home.earthlink.net/~nickthompson/naturaldesigns/

 

From: Friam [mailto:[hidden email]] On Behalf Of Roger Critchlow
Sent: Thursday, September 08, 2016 7:21 PM
To: The Friday Morning Applied Complexity Coffee Group <[hidden email]>
Subject: Re: [FRIAM] speaking of analytics

 

See the result of the AI-judged beauty contest?  Apparently the training set needed more curation.  Very teachable moment.

-- rec --

 

On Sep 8, 2016 7:10 PM, "Marcus Daniels" <[hidden email]> wrote:

Racial profiling is a single dimensional predictor.  It's bad because it is regressive, not because race is a useless predictor.
There are lots of attributes like that, and big data just puts them together to predict aggregate behaviors about people without really having a theory of mind of that individual or a theory of mind at all.    Like trying to learn from Google without understanding the reading and writing of human language.    I think the FOIA type concerns should be fixable in principle.  But in practice, these databases and algorithms are tightly held intellectual property that the government licenses from companies.   Without sweeping legislation, the government can't get their hands on it, and the people interested in applying these systems, like law enforcement, aren't necessarily the most curious people in the world to begin with.   Push a button and get an authoritative answer.   What could be better?  You're guilty because the system said so.

-----Original Message-----
From: Friam [mailto:[hidden email]] On Behalf Of glen ?
Sent: Thursday, September 08, 2016 4:54 PM
To: The Friday Morning Applied Complexity Coffee Group <[hidden email]>
Subject: [FRIAM] speaking of analytics


The case against big data: "It’s like you’re being put into a cult, but you don’t actually believe in it"
http://www.salon.com/2016/09/08/the-case-against-big-data-it-is-like-youre-being-put-into-a-cult-but-you-dont-actually-believe-in-it/

> But it’s opaque right? Which is also what a lot of these things have in common.
>
> It’s opaque, and it’s unaccountable. You cannot appeal it because it is opaque. Not only is it opaque, but I actually filed a Freedom of Information Act request to get the source code. And I was told I couldn’t get the source code and not only that, but I was told the reason why was that New York City had signed a contract with this place called VARK in Madison, Wisconsin. Which was an agreement that they wouldn’t get access to the source code either. The Department of Education, the city of New York City but nobody in the city, in other words, could truly explain the scores of the teachers.
>
> It was like an alien had come down to earth and said, "Here are some scores, we’re not gonna explain them to you, but you should trust them. And by the way you can’t appeal them and you will not be given explanations for how to get better."

--
glen

============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com
============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com


============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com



--
Merle Lefkoff, Ph.D.
President, Center for Emergent Diplomacy
Santa Fe, New Mexico, USA
[hidden email]
mobile:  (303) 859-5609
skype:  merle.lelfkoff2

Re: speaking of analytics

gepr
In reply to this post by Nick Thompson
I simply mean that, yes, all predictions require some form of "theory", even if it's solely the unconscious (or programmed-in) ontology used to look at, think about, filter the world/data.  I.e. any form of inference is subject to the organizing effect of the machine doing the inferring ... premature registration biased by one's own perspective.  But it's too strong to assert that all types of inductive inference will be biased or mis-organized by that a priori ontology/perspective.

And making that argument against the induction tools (especially considering the more hybrid inference you get in typical machine learning, where one does a little induction, a little deduction, and a little abduction in order to arrive at a useful solution) could be the "you do it too" fallacy.  If all the accuser's reasoning _does_ require the a priori organization, accusing any given set of machine learning methods of doing it too is, effectively, "You do it too!"  It's not an adequate defense of doing it.

It might be reasonable to assert that induction is the only (or closest to pure) form of bias-free inference available to us.  For example, one could brute-force evaluate all the theorems in a simple formal system, then iteratively (automatically) modify the language according to some schema, then brute force evaluate all the formable sentences in the new language.  Etc.  Take that to its extreme and you get fully automated theory construction (even if the "theories" make no sense to any humans).
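
As a rough sketch of the first step only (brute-force enumeration of the theorems of a simple formal system), here is a short Python example.  The formal system is Hofstadter's MIU system, swapped in purely as a convenient illustration; it is not something named in this thread, and the function names and the length bound are assumptions made for the example.  The later steps (automatically modifying the language and re-enumerating) are not shown.

```python
def successors(s):
    """All strings derivable from s in one step under the four MIU rules."""
    out = set()
    if s.endswith("I"):
        out.add(s + "U")                      # rule 1: xI -> xIU
    if s.startswith("M"):
        out.add(s + s[1:])                    # rule 2: Mx -> Mxx
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])  # rule 3: III -> U
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])        # rule 4: UU -> (nothing)
    return out


def theorems(axiom="MI", max_len=8):
    """Breadth-first enumeration of theorems reachable from the axiom
    through strings no longer than max_len (an arbitrary bound chosen
    here so that the search terminates)."""
    seen = {axiom}
    frontier = [axiom]
    while frontier:
        next_frontier = []
        for s in frontier:
            for t in successors(s):
                if len(t) <= max_len and t not in seen:
                    seen.add(t)
                    next_frontier.append(t)
        frontier = next_frontier
    return seen


found = theorems()
```

Even here the a priori organization shows up: the alphabet, the rules, and the length bound are all chosen before any "induction" happens.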


On 09/09/2016 07:18 PM, Nick Thompson wrote:
> Glen wrote:
>
> *There's no doubt that any form of inference done by humans is subject to premature registration or even apophenia.  But the inverted claim, that _all_ registration is premature (or imaginary) is way too strong, and perhaps a case of tu quoque.*
>
> Narcissist that I am, I assume you are punishing me for all the weird language I have inflicted on the list over the last 12 years.   I humbly acknowledge the punishment.
>
> Now:  Could you explain what you meant? (};-)]



--
☣ glen

uǝʃƃ ⊥ glen
Re: speaking of analytics

Nick Thompson
Thanks, Glen.

I think it's the word "registration" that has me most confused.  Can you help a bit further?

Does inductive inference involve metaphor?  That seems to be the lurking question, here.  Inductive inference is famously incomplete without some fundamental assumptions (abductions) concerning the kind of world we are in ... a stable one, for instance.  So, I would answer the question, yes.  I would not, of course, assert that all metaphoric thinking is wrong in all regards.  Metaphoric thinking would be a pretty poor tool, if that were the case.

Nick



Nicholas S. Thompson
Emeritus Professor of Psychology and Biology
Clark University
http://home.earthlink.net/~nickthompson/naturaldesigns/


Re: speaking of analytics - data mining

Prof David West
In reply to this post by Nick Thompson

Once upon a time there was "information." People loved information and kept abundant amounts of it in their heads and used it as a means of commerce among themselves, sharing it and savoring it and finding profit in it.

One day a new king, King Codd, conquered the realm and took all the information away from all the people. He disassembled all the information into meaningless pieces, called "data," and locked it away in an impenetrable matrix called a "schema." This required great effort, a process called "normalization," but it was "worth it, because I can prove, mathematically, that data can be reassembled with the magic incantations of SQL." Information was thrown into the dungeons of thousands of Relational DataBase Management Systems (RDBMS), never to be seen in its beautiful original form again.

Unfortunately, it proved impossible for the people to normalize properly: Codd-Normal-Form had no algorithm or process to assure it was achieved, and no one could master SQL; the logic was simply not something that most people could master. And, if you really did achieve proper normalization, it was so inefficient it was not practical, so everyone "demoralized" their vast stores of data so they could use them, poorly and in a crippled manner, to try and get some of their beloved information back.

The worst part of this story came later when the people found that the impenetrable matrix — the schema that held all their information hostage in the form of dissociated data, connected only with predefined "relationships" — made it impossible to retrieve any and all the "information" that they wanted and needed.

In anguish, the people invented an entire new profession - Data Mining - that essentially 'crushed' the data stores, creating gravel composed of individual datums, and put the result in a different, more malleable matrix, like gravel in cement and sand and water (before the matrix dries). From this new medium the people would pluck bits of gravel and place them next to each other and proclaim, "Look! Information!"

Alas, this new "information" proved to lack most of the meaning that was intrinsic to the information the people once knew and loved. All the semantics had been stripped from the old information when it was first placed in the RDBMS dungeons. The new juxtapositions of datums that data miners called 'information' rapidly proved to be a pale imitation of the original. Once a video junkie, working as a clerk at the video rental company around the corner, could make accurate and reliable predictions about what movie you might want to view next, because of all the natural information he had in his head. But now, even the great wizard, NetFlix, despite all the algorithmic prowess and all the mined data it possesses, cannot make as accurate a prediction as the teenage clerk.

To this day, most of the world suffers from the massive evils perpetrated by the Wicked King Codd. Information, once abundant and freely shared with little more organization than the 'story', remains a rare and precious thing.

Nick - this is my metaphor, can you discern my theory and guess how, when, where, and why I utilize that theory?

dave west


On Fri, Sep 9, 2016, at 12:37 PM, Nick Thompson wrote:

And data “mining” is a metaphor.

 

Now people claim to use metaphors “metaphorically”, by which they mean that they mean nothing by them.  But it is my “teery”* (and it is all mine) that nobody uses a metaphor but that hizr thinking is influenced by it.  The influence can be inexplicit, in which case the user is blind to its effects on himmr, or explicit, in which case the user’s imagination is enhanced by its use and less likely to be misled by its misuse.   I would like to explore this “teery” using “Data Mining” as an example.  How does thinking of data as encased in a non-dynamic subterranean matrix shape our (your) thinking for good or ill?

 

*cf. Monty Python’s Flying Circus

 

Nick

Nicholas S. Thompson

Emeritus Professor of Psychology and Biology

Clark University

http://home.earthlink.net/~nickthompson/naturaldesigns/

 

From: Friam [mailto:[hidden email]] On Behalf Of Eric Charles
Sent: Friday, September 09, 2016 11:31 AM
To: The Friday Morning Applied Complexity Coffee Group <[hidden email]>
Subject: Re: [FRIAM] speaking of analytics

 

Marcus,

That's an interesting distinction. Is it the case that by "theory" Nick was referring to something verbal and explicitly metaphorical, or would the results of data mining, which one sought to validate on a different sample, count as a "theory"?

 

So, for example, if my data mining of Marine data found that tying shoes left-to-right predicted success at Officer Candidate School, and I then went to test for that "prediction" in a later sample of incoming officer candidates, to what extent is my prediction based on "a theory"?

 

Of course, "data mining will be a useful way to uncover patterns" is itself a theory, applicable in some domains but not others (i.e., not all domains of inquiry will contain the sought-after patterns in a long-term stable form).

 

Eric 

 



-----------
Eric P. Charles, Ph.D.
Supervisory Survey Statistician

U.S. Marine Corps

 

On Fri, Sep 9, 2016 at 10:51 AM, Marcus Daniels <[hidden email]> wrote:

“I know that theories are really useful for making predictions, but can one actually make a prediction without one?”

 

Yes, that’s what data mining is:  Take a large corpus of data, find some statistically rare relationships, and then test for their predictive value on another large corpus of data.     In this way one can predict things without really having any kind of theory or even domain knowledge.
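
A minimal sketch of that two-step procedure in Python, under loudly stated assumptions: the data are synthetic and invented here, the feature names ("signal", "noise_a", and so on) are hypothetical, and a simple agreement rate stands in for whatever statistic a real data-mining tool would use to find relationships.

```python
import random


def agreement(rows, feature):
    """Fraction of rows where the binary feature equals the binary outcome."""
    return sum(r[feature] == r["outcome"] for r in rows) / len(rows)


def mine_best_feature(rows, features):
    """Pick the feature that best matches the outcome in the discovery
    sample, with no domain knowledge about what any feature means."""
    return max(features, key=lambda f: agreement(rows, f))


def make_sample(rng, n):
    """Invented data: 'signal' tracks the outcome about 80% of the time,
    while the noise features are independent coin flips."""
    rows = []
    for _ in range(n):
        outcome = rng.randint(0, 1)
        row = {"outcome": outcome}
        row["signal"] = outcome if rng.random() < 0.8 else 1 - outcome
        for name in ("noise_a", "noise_b", "noise_c"):
            row[name] = rng.randint(0, 1)
        rows.append(row)
    return rows


rng = random.Random(0)
features = ["signal", "noise_a", "noise_b", "noise_c"]

discovery = make_sample(rng, 2000)   # corpus used to find the relationship
holdout = make_sample(rng, 2000)     # separate corpus used to test it

best = mine_best_feature(discovery, features)
discovery_score = agreement(discovery, best)
holdout_score = agreement(holdout, best)
```

If the relationship found in the discovery sample were a fluke, the holdout score would be expected to fall back toward 0.5, which is the whole reason for testing on a second corpus.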

 

Marcus


Re: speaking of analytics - data mining

Marcus G. Daniels

 

> In anguish, the people invented an entire new profession - Data Mining - that essentially 'crushed' the data stores, creating gravel composed of individual datums, and put the result in a different, more malleable matrix, like gravel in cement and sand and water (before the matrix dries). From this new medium the people would pluck bits of gravel and place them next to each other and proclaim, "Look! Information!"

 

That’s a funny story, but it overlooks the fact that sometimes all there is, is bits of gravel.  Like 3 billion base pairs of the human genome.   There’s no “teenage clerk” that has looked at most of it in detail or has much of any intuition about what it does.   Similarly, there’s no Rosetta stone for the nuances of why different whale species vocalize one way or another.  It’s just a process of throwing ideas against the wall and seeing if they stick.   Computers can do that more rapidly than humans can, at least.  Data mining isn’t just for developers in industry that can’t figure out how to decompose tables or make indices.

 

There are many approaches to modeling information; database normalization is only one of them.   Information theory and category theory contribute other approaches.

 

Marcus


Re: speaking of analytics - data mining

Nick Thompson
In reply to this post by Prof David West

David,

 

Wow!  Lot of history here.  I am looking forward to hearing what Lee and Owen make of this, since they were around for some of this history.

 

With respect to our long running discussion of metaphor, I think your story is an allegory, not a metaphor.  An allegory (said he, improvising)  is a story constructed of metaphors.  So, “gravel” is a metaphor.  “Crushing a matrix” into gravel is a metaphor, and a particularly inspiring one, at that.  A story about how evil people lost the Kingdom because they crushed a matrix into gravel, now THAT’S an allegory. 

 

A metaphor contains basic and surplus meaning, and some of that surplus meaning is patently facetious.  When I say that Nature Selects, the basic meaning is all the ways in which barnyard breeding is known to correspond to what goes on in nature; the facetious surplus meaning is all the ways in which it’s known not to correspond.  What remains of the surplus meaning of the metaphor when the facetious implications are identified is called the positive heuristic.  Roughly it’s the “juice” of the metaphor … all the ideas that the metaphor inspires us to explore and test with future science.

 

Your allegory contains a whole bunch of very juicy metaphors.

 

Nick

 

Nicholas S. Thompson

Emeritus Professor of Psychology and Biology

Clark University

http://home.earthlink.net/~nickthompson/naturaldesigns/

 

Re: speaking of analytics - data mining

Nick Thompson
In reply to this post by Marcus G. Daniels

Marcus,

 

Now here, I would argue that gravel is a very bad metaphor for base pairs.  The salient properties of the elements of gravel are that the particles are more or less uniform in shape, free to move with respect to one another, and not easily compressed and broken.  Base pairs are of significantly different shapes, bind together importantly with each other and other substances, do not move freely with respect to one another, and can readily be crushed and broken.  So, the argument would run, thinking of base pairs as gravel will lead to more errors than thinking of them as, say, dance partners in an elaborate contra-dance.

 

Nick

 

Nicholas S. Thompson

Emeritus Professor of Psychology and Biology

Clark University

http://home.earthlink.net/~nickthompson/naturaldesigns/

 

Re: speaking of analytics - data mining

Marcus G. Daniels

Gravel has fractured faces and is complex.  It certainly does not move freely between units; it is used precisely for the opposite property.   Pebbles are rounded and move more freely.

(If you want to split hairs, I can do that too.)

 

The point is that billions of As, Gs, Cs, and Ts do not directly create information about why one person will be Usain Bolt and another will be Amadeus Mozart, or how certain immunotherapy tactics will work with one person but not another.   If you want to think about organic molecules, don’t think about dance partners.   Get an organic chemistry textbook and a molecular dynamics code and check to see if a metaphor is even in the right ballpark.

 

From: Friam [mailto:[hidden email]] On Behalf Of Nick Thompson
Sent: Sunday, September 11, 2016 8:41 AM
To: 'The Friday Morning Applied Complexity Coffee Group' <[hidden email]>
Subject: Re: [FRIAM] speaking of analytics - data mining

 

Marcus,

 

Now here, I would argue that gravel is a very bad metaphor for base pairs.  The salient properties of the elements of gravel is that the particles are more or less uniform in shape free to move with respect to one another , and not easily compressed and broken.  Base pairs are of significantly different shapes, bind together importantly with each other and other substances, do not move freely with respect to one another,  and can readily be crushed and broken.  So, the argument would run, thinking of base pairs as gravel will lead to more errors than thinking of them as, say, dance partners in an elaborate contra-dance. 

 

Nick . 

 

Nicholas S. Thompson

Emeritus Professor of Psychology and Biology

Clark University

http://home.earthlink.net/~nickthompson/naturaldesigns/

 

From: Friam [[hidden email]] On Behalf Of Marcus Daniels
Sent: Sunday, September 11, 2016 10:33 AM
To: The Friday Morning Applied Complexity Coffee Group <[hidden email]>
Subject: Re: [FRIAM] speaking of analytics - data mining

 

 

In anguish, the people invented an entire new profession - Data Mining - that essentially 'crushed' the data stores, creating gravel composed of individual datums, and put the result in a different, more malleable matrix: live gravel in cement and sand and water (before the matrix dries). From this new medium the people would pluck bits of gravel, place them next to each other, and proclaim, "Look! Information!"

 

That’s a funny story, but it overlooks the fact that sometimes all there is, is bits of gravel, like the 3 billion base pairs of the human genome. There’s no “teenage clerk” who has looked at most of it in detail or has much intuition about what it does. Similarly, there’s no Rosetta stone for the nuances of why different whale species vocalize one way or another. It’s just a process of throwing ideas against the wall and seeing if they stick. Computers can do that more rapidly than humans can, at least. Data mining isn’t just for developers in industry who can’t figure out how to decompose tables or make indices.

 

There are many approaches to modeling information; database normalization is only one of them. Information theory and category theory contribute other approaches.

 

Marcus



Re: speaking of analytics - data mining

Owen Densmore
Administrator
I'm with Dave here. Early on we created a patent database at Xerox. Luckily, King Codd did not hold sway. Instead, a "document search system" was used which had more knowledge of semantics/language than it did of structure and relationships.

This was a huge win. It did run into the problem of categorization: building a sort of formal set of tags, like a card catalog. That was OK but basically was never used by researchers, who much preferred a language-based system, i.e. something like Google search.

The classification system was eventually minimized, and the language search improved.

I used to make bets with the folks trying to migrate the patent database to relational: Give me a one-paragraph description of how to convert to first normal form, the purest, most factored form.
First normal form (1NF) is a property of a relation in a relational database. A relation is in first normal form if and only if the domain of each attribute contains only atomic (indivisible) values, and the value of each attribute contains only a single value from that domain.
The other bet was: show me any database in the company that doesn't cheat on its schema using "stored functions". I never lost.
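
To make the 1NF definition above concrete, here is a minimal Python sketch. The patent records and field names are invented for illustration and are not taken from the Xerox database; the only point is that a multi-valued field (several inventors packed into one string) gets split so that every attribute holds a single atomic value.

```python
# Hypothetical patent records; the field names and values are invented for illustration.
patents = [
    {"patent_id": 1, "title": "Toner fuser", "inventors": "Smith; Jones"},
    {"patent_id": 2, "title": "Paper feeder", "inventors": "Lee"},
]


def to_first_normal_form(rows):
    """Split the multi-valued 'inventors' field so each row holds one atomic value."""
    # One row per patent, without the multi-valued column.
    patent_rows = [{"patent_id": r["patent_id"], "title": r["title"]} for r in rows]
    # One row per (patent, inventor) pair.
    inventor_rows = [
        {"patent_id": r["patent_id"], "inventor": name.strip()}
        for r in rows
        for name in r["inventors"].split(";")
    ]
    return patent_rows, inventor_rows


patent_rows, inventor_rows = to_first_normal_form(patents)
for row in inventor_rows:
    print(row)
```

The sketch only covers the mechanical split. Deciding what counts as "atomic" for free-text patent claims is the hard part, which is presumably why the bet was safe.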

Historically, RDBs are dying, simply because they are too rigid to evolve into fragmented, globally distributed, highly replicated file systems. Flat is Back.

   -- Owen



Re: speaking of analytics - data mining

Marcus G. Daniels

 

> Historically, RDBs are dying, simply because they are too rigid to evolve into fragmented, globally distributed, highly replicated file systems. Flat is Back.

 

Throwing the baby out with the bath water, I’d say. The relational aspect of a modern RDBMS product is hardly the whole product. Query optimization can be done within a multi-field query of a single table, for example. Indexing is valuable for any database, unless there is really good accelerator technology available. ACID transaction properties are also important in many use cases. A lot of these data warehousing or NoSQL systems are immature implementations compared to DB2, Oracle, or Postgres. Sharding is straightforward to layer onto an RDBMS from the client side, so that’s hardly a reason to switch to another technology.
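
As a rough illustration of client-side sharding over ordinary relational stores, here is a minimal Python sketch. It is an assumption-laden toy: three in-memory SQLite databases stand in for separate servers, and the table name `docs` and the helpers `shard_for`, `put`, and `get` are hypothetical names chosen for this example. Each shard keeps its own primary-key index and its own transactions.

```python
import sqlite3
import zlib

# Three in-memory SQLite databases stand in for separate database servers (an assumption).
shards = [sqlite3.connect(":memory:") for _ in range(3)]
for db in shards:
    # The PRIMARY KEY gives each shard an index on doc_id.
    db.execute("CREATE TABLE docs (doc_id TEXT PRIMARY KEY, body TEXT)")


def shard_for(doc_id):
    """Pick a shard from a stable hash of the key (crc32 rather than Python's salted hash())."""
    return shards[zlib.crc32(doc_id.encode("utf-8")) % len(shards)]


def put(doc_id, body):
    db = shard_for(doc_id)
    with db:  # commits on success, rolls back if an exception is raised
        db.execute("INSERT OR REPLACE INTO docs VALUES (?, ?)", (doc_id, body))


def get(doc_id):
    row = shard_for(doc_id).execute(
        "SELECT body FROM docs WHERE doc_id = ?", (doc_id,)
    ).fetchone()
    return row[0] if row else None


for i in range(10):
    put(f"patent-{i}", f"text of patent {i}")

print(get("patent-7"))
# Row counts per shard; the split depends on the hash, so no particular counts are claimed.
print([db.execute("SELECT COUNT(*) FROM docs").fetchone()[0] for db in shards])
```

The routing lives entirely in the client, so each shard remains a plain single-node database. What this toy leaves out (cross-shard queries, rebalancing when shards are added) is where the real engineering cost sits.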

 

What is the problem with stored procedures?  Clearly in any client/server architecture there will be situations where code needs to be close to the data for performance reasons.

 

Anyway, this has nothing to do with the value of statistical inference so far as I can tell.

 

Marcus 



Re: speaking of analytics - data mining

Nick Thompson
In reply to this post by Marcus G. Daniels

Please see below.

 

Nicholas S. Thompson

Emeritus Professor of Psychology and Biology

Clark University

http://home.earthlink.net/~nickthompson/naturaldesigns/

 

From: Friam [mailto:[hidden email]] On Behalf Of Marcus Daniels
Sent: Sunday, September 11, 2016 10:57 AM
To: The Friday Morning Applied Complexity Coffee Group <[hidden email]>
Subject: Re: [FRIAM] speaking of analytics - data mining

 

Gravel has fractured faces and is complex. It certainly does not move freely between units. It is used for just the opposite property. Pebbles are rounded and move more freely.

(If you want to split hairs, I can do that too.)

[NST==>Splitting hairs is just what working a metaphor is about.  So I consider your contribution above as very helpful.  I was thinking pebbles, actually.  I think if one orders gravel around here, pebbles is what you get.  So, we are “negotiating” the surplus meaning of the metaphor, making it explicit. Good important scientific work.  Work too rarely done, in my opinion, particularly with respect to the metaphor of “natural selection.”  <==nst]

 

The point is that billions of As, Gs, Cs, and Ts do not directly create information about why one person will be Usain Bolt and another will be Amadeus Mozart, or how certain immunotherapy tactics will work for one person but not another.

[NST==>Well, I agree so avidly with this statement that I have lost track of where we disagree.  <==nst]

If you want to think about organic molecules, don’t think about dance partners. Get an organic chemistry textbook and a molecular dynamics code and check whether the metaphor is even in the right ballpark.

[NST==>I only meant to assert that “dance partner” is a better metaphor than “gravel as was delivered to my yard last week.” <==nst]

 



Re: speaking of analytics - data mining

Marcus G. Daniels
In reply to this post by Owen Densmore

> That was OK but basically was never used by researchers, who much preferred a language-based system, i.e. something like Google search.

 

Co-occurrence of words is not the same thing as natural language processing.   There’s no conceptual query planner or unification system in place, like there is with Watson. 
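
A minimal Python sketch of the distinction, using an invented three-sentence corpus: a search that matches on word co-occurrence alone cannot tell who bit whom. This is only a toy inverted index with a hypothetical `cooccurrence_search` helper; it makes no claim about how any real search engine or Watson is implemented.

```python
from collections import defaultdict

# Hypothetical mini-corpus, invented for illustration.
documents = {
    "d1": "the dog bit the man",
    "d2": "the man bit the dog",
    "d3": "the cat sat on the mat",
}

# Inverted index: word -> set of ids of the documents that contain it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.split():
        index[word].add(doc_id)


def cooccurrence_search(query):
    """Return ids of documents containing every query word, in any order."""
    words = query.split()
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


# Both d1 and d2 match, although they describe opposite events.
print(sorted(cooccurrence_search("dog bit man")))
```

Telling d1 from d2 requires some representation of who did what to whom, which bag-of-words matching does not provide.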

 

Marcus



Re: speaking of analytics

gepr
In reply to this post by Nick Thompson


On September 10, 2016 9:45:40 PM PDT, Nick Thompson <[hidden email]> wrote:
>I think it's the word "registration" that has me most confused.  Can you help a bit further?

Registration is when a quality/feature coalesces or seems to emerge from the ambience ... when a structure appears amongst the structureless noise.

>Does inductive inference involve metaphor?  That seems to be the lurking question, here.  Inductive inference is famously incomplete without some fundamental assumptions (abductions) concerning the kind of world we are in ... a stable one, for instance.  So, I would answer the question, yes.

I don't think all types of induction require metaphor, no. It seems like induction can be done with meaningless symbols, like asking what symbol should follow in this sequence:

   a 6 g ! 4 q t

It may well be more powerful to attach meaning to the symbols as part of your guess. But it isn't necessary. Part of the problem might be that we tend to assume there exists a correct and unique answer to the question, e.g. if I were to _tell_ you that the 8th symbol was definite, just hidden from you. But induction need not assume a definite, correct, unique 8th symbol. Any new symbol derived from the first 7 will be an inductive inference. It doesn't matter whether the result is true or false (or pink or whatever). It's still induction.

It may be reasonable to claim that no inductive inference can be made, which would imply that there is zero signal in the data at all. But what you're really claiming in that case is that any induction is as good as any other induction. That doesn't prevent some arbitrary inference from happening. It only tells us that we have no a priori way of estimating its likelihood of success.
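
A minimal Python sketch of that point, using the sequence above: three arbitrary rules each derive an 8th symbol from the first 7 without attaching any meaning to the symbols. The rules and their names (`repeat_last`, `wrap_around`, `middle`) are invented for illustration; nothing privileges one over the others.

```python
sequence = ["a", "6", "g", "!", "4", "q", "t"]


# Three arbitrary rules; each derives an 8th symbol from the first 7.
def repeat_last(seq):
    """Guess that the last symbol repeats."""
    return seq[-1]


def wrap_around(seq):
    """Guess that the sequence cycles back to its first symbol."""
    return seq[0]


def middle(seq):
    """Guess the symbol at the middle position."""
    return seq[len(seq) // 2]


rules = [repeat_last, wrap_around, middle]
guesses = {rule.__name__: rule(sequence) for rule in rules}
print(guesses)
```

Each guess is an inductive inference in the sense described above. With no signal in the data there is no a priori way to rank the three rules, but that does not stop any of them from producing an answer.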

uǝʃƃ ⊥ glen