Re: speaking of analytics - data mining

Posted by Owen Densmore on
URL: http://friam.383.s1.nabble.com/speaking-of-analytics-tp7587850p7587885.html

I'm with Dave here. Early on we created a patent database at Xerox. Luckily, King Codd did not hold sway. Instead, a "document search system" was used which had more knowledge of semantics/language than it did of structure and relationships.

This was a huge win. It did run into the problem of categorization .. building a sort of formal set of tags like a card catalog. That was OK but basically was never used by researchers who much preferred a language based system .. i.e. like google search.

The classification system was eventually minimized, and the language search improved.

I used to make bets with the folks trying to migrate the patent database to relational: Give me a one-paragraph description of how to convert to first normal form, the purest, most factored form.
First normal form (1NF) is a property of a relation in a relational database. A relation is in first normal form if and only if the domain of each attribute contains only atomic (indivisible) values, and the value of each attribute contains only a single value from that domain.
The other bet was: show me any database in the company that doesn't cheat on its schema using "stored functions". I never lost.
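
To make the 1NF definition above concrete, here is a minimal sketch in Python. It is an illustration only, not the Xerox system: the record layout, the field names (patent_id, inventors), and the sample values are all hypothetical. It assumes a record whose "inventors" field holds a list, which is not atomic, and splits it into one row per (patent_id, inventor) pair so that every attribute holds a single value.

```python
from typing import Dict, List, Tuple


def to_first_normal_form(patents: List[Dict]) -> List[Tuple[str, str]]:
    """Return one (patent_id, inventor) row per inventor.

    The input records are NOT in 1NF because 'inventors' is a list.
    Field names here are hypothetical, chosen only for illustration.
    """
    rows = []
    for patent in patents:
        for inventor in patent["inventors"]:
            # Each output row carries exactly one atomic value per attribute.
            rows.append((patent["patent_id"], inventor))
    return rows


# Hypothetical sample data: the second field is multi-valued.
patents = [
    {"patent_id": "P-1", "inventors": ["Ada", "Grace"]},
    {"patent_id": "P-2", "inventors": ["Alan"]},
]

rows = to_first_normal_form(patents)
for row in rows:
    print(row)
```

Even this toy case shows where the bet bites: free-text fields such as a patent abstract have no obvious atomic decomposition, which is the kind of content a language-based search handles directly.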

Historically, RDBs are dying, simply because they are too rigid to evolve into fragmented, globally distributed, highly replicated file systems. Flat is Back.

   -- Owen

On Sun, Sep 11, 2016 at 8:57 AM, Marcus Daniels <[hidden email]> wrote:

Gravel has fractured faces and is complex.  It certainly does not move freely between units.  It is used for just the opposite property.   Pebbles are rounded and move more freely.

(If you want to split hairs, I can do that too.)

 

The point is that billions of As, Gs, Cs, and Ts do not directly create information about why one person will be Usain Bolt and another will be Amadeus Mozart, or how certain immunotherapy tactics will work for one person but not another.   If you want to think about organic molecules, don’t think about dance partners.   Get an organic chemistry textbook and a molecular dynamics code and check to see if a metaphor is even in the right ballpark.

 

From: Friam [mailto:[hidden email]] On Behalf Of Nick Thompson
Sent: Sunday, September 11, 2016 8:41 AM
To: 'The Friday Morning Applied Complexity Coffee Group' <[hidden email]>
Subject: Re: [FRIAM] speaking of analytics - data mining

 

Marcus,

 

Now here, I would argue that gravel is a very bad metaphor for base pairs.  The salient properties of the elements of gravel are that the particles are more or less uniform in shape, free to move with respect to one another, and not easily compressed or broken.  Base pairs are of significantly different shapes, bind together importantly with each other and with other substances, do not move freely with respect to one another, and can readily be crushed and broken.  So, the argument would run, thinking of base pairs as gravel will lead to more errors than thinking of them as, say, dance partners in an elaborate contra-dance.

 

Nick . 

 

Nicholas S. Thompson

Emeritus Professor of Psychology and Biology

Clark University

http://home.earthlink.net/~nickthompson/naturaldesigns/

 

From: Friam [[hidden email]] On Behalf Of Marcus Daniels
Sent: Sunday, September 11, 2016 10:33 AM
To: The Friday Morning Applied Complexity Coffee Group <[hidden email]>
Subject: Re: [FRIAM] speaking of analytics - data mining

 

 

In anguish, the people invented an entire new profession - Data Mining - that essentially 'crushed' the data stores, creating gravel composed of individual datums, and put the result in a different, more malleable matrix: live gravel in cement and sand and water (before the matrix dries). From this new medium the people would pluck bits of gravel and place them next to each other and proclaim, "Look! Information!"

 

That’s a funny story, but it overlooks the fact that sometimes all there is, is bits of gravel.  Like 3 billion base pairs of the human genome.   There’s no “teenage clerk” that has looked at most of it in detail or has much of any intuition about what it does.   Similarly, there’s no Rosetta stone for the nuances of why different whale species vocalize one way or another.  It’s just a process of throwing ideas against the wall and seeing if they stick.   Computers can do that more rapidly than humans can, at least.  Data mining isn’t just for developers in industry that can’t figure out how to decompose tables or make indices.

 

There are many approaches to modeling information; database normalization is just one of them.   Information theory and category theory contribute other approaches.

 

Marcus


============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com

