Google and Semantics


George Duncan
Re the discussion below, while Google doesn't, Clusty makes a basic stab.
See http://clusty.com/

> When I do a search with Google I see very little 'intelligence' of that
> kind in the results.  There appears to be some statistical weighting,
> but the 'intelligence' of the results seems to depend entirely on
> whether my word combination captures the concept I'm looking for.   I
> don't believe that's definable by any means I know of yet.
>

Yes. As far as I'm aware Google has not yet deployed a production
quality technology for the semantic web.   Google doesn't reason about
concepts.  Not only can't it trim down logically inappropriate results,
it can't expand on related concepts unless there happens to be data
(like from Wikipedia) where someone has created a document that
physically contains the overlap of different nomenclatures.   It
certainly can't tell you whether two mathematical formulations of
similar models will make the same predictions unless, again, there
happens to be a web page where someone has said so.
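
To make the point about overlapping nomenclatures concrete, here is a minimal sketch of plain keyword matching. The three documents and the `matches` helper are made up for illustration and say nothing about how Google is actually implemented; the point is only that a query phrased in one vocabulary misses a document written in another unless some page happens to contain both.

```python
def matches(query, doc):
    """True if every query term literally appears in the document."""
    return set(query.lower().split()) <= set(doc.lower().split())

# Hypothetical mini-corpus: "a" and "b" describe the same thing in
# different vocabularies; "c" is a page that happens to contain both.
docs = {
    "a": "global warming raises sea levels",
    "b": "climate change raises sea levels",
    "c": "global warming is also called climate change",
}

hits = [name for name, text in docs.items() if matches("global warming", text)]
print(hits)  # document "b" is never found by this query
```

Only the bridging page "c" gives a literal matcher any chance of connecting the two vocabularies, and even then it does not return "b" itself.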



--
George T. Duncan
Professor of Statistics
Heinz School of Public Policy and Management
Carnegie Mellon University
Pittsburgh, PA 15213
(412) 268-2172


Marcus G. Daniels
George Duncan wrote:
> Re the discussion below, while Google doesn't, Clusty makes a basic
> stab. See http://clusty.com/
As I understand it, Clusty is doing cluster analysis in a statistical
way, and does not represent things in relations of objects and actions,
etc. It doesn't make or crawl RDF/OWL or model natural language.
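
As a rough illustration of what "relations of objects and actions" could look like, here is a toy sketch using (subject, predicate, object) tuples. The facts, predicate names, and the `causes_of` helper are all invented for this example; it is not RDF/OWL tooling and not how any particular engine works.

```python
# Hypothetical facts stored as (subject, predicate, object) tuples.
triples = {
    ("CO2", "is_a", "greenhouse gas"),
    ("methane", "is_a", "greenhouse gas"),
    ("greenhouse gas", "causes", "global warming"),
}

def causes_of(effect):
    """Things that cause `effect` directly, plus members of those classes."""
    direct = {s for s, p, o in triples if p == "causes" and o == effect}
    via_class = {s for s, p, o in triples if p == "is_a" and o in direct}
    return direct | via_class

print(sorted(causes_of("global warming")))
```

The one-step inference through `is_a` is the kind of thing a purely statistical clustering of co-occurring words does not do explicitly.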

http://www.hakia.com aims to be a `meaning' oriented search engine. When I
searched for `global warming', Hakia gave me a breakdown of categories,
including stuff like "Possible Solutions", "Movies and Documentaries",
and so on. Also Hakia has a dialogue system where terminology can be
clarified (as if one were working with a reference librarian).

These pages illustrate and describe how queries are modeled:

http://labs.hakia.com/OntoSem/hakia-lab-ontox.aspx
http://labs.hakia.com/hakia-lab-onto.html
http://www.ontologicalsemantics.com

Also, I ran across this leak of Google's "Big Goals and Directions 2006".

http://blog.outer-court.com/archive/2006-10-26-n80.html

where it says:

"Google wants to have the world's top AI research laboratory."

Also relevant are Larry Page's remarks here:

http://technology.guardian.co.uk/news/story/0,,1781121,00.html

"Mr Page said one thing that he had learned since Google launched eight
years ago was that technology can change faster than expected, and that
AI could be a reality within a few years"



George Duncan
Yes, that's what Clusty does. But looking at results for "global
warming" here are the clusters Clusty arrives at:

Climate change (48)
Scientists (15)
Problems (13)
Politics (8)
Global Warming Science (8)
Dangers (6)
Greenhouse Gas (7)
Average Temperature Of The Earth (5)
Inconvenient Truth (5)

Not bad at a "meaning" level, I think. Also useful for the searcher. Clusty
is a Carnegie Mellon spinoff from CS. A lot of the research on information
retrieval done here works with rather simple (conceptually at least)
statistical models. Here's a link with a broad overview:
http://www.lti.cs.cmu.edu/Research/index.html

BTW, try FRIAM in Clusty.

George








Marcus G. Daniels
George Duncan wrote:
> Not bad at a "meaning" level, I think. Also useful for the searcher.
> Clusty is a Carnegie Mellon spinoff from CS. A lot of the research
> on information retrieval done here works with rather simple
> (conceptually at least) statistical models. Here's a link with a broad
> overview: http://www.lti.cs.cmu.edu/Research/index.html
Thanks for the link -- looks like their machine translation and
information retrieval projects follow both statistical and grammatical
approaches.  For web search engines, at least for casual users, I
think it's pretty clear that stateless clustering approaches can work
well.  My interest is whether, using automated procedures, scientific
terms can be determined to have consistent meanings or not.  If word
order, sentence order, part of speech, etc. didn't matter, then we
ought to be able to scramble any text and still understand it.  (That
is how basic statistical retrieval systems treat text.)
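
A minimal sketch of the scrambling argument, assuming the simplest possible "bag of words" representation (lowercased term counts, nothing else). The sentences and the `bag_of_words` helper are made up for illustration; real retrieval systems add weighting and much more.

```python
import random
from collections import Counter

def bag_of_words(text):
    """Lowercase, split on whitespace, and count terms; order is discarded."""
    return Counter(text.lower().split())

original = "the dog bit the man"
reordered = "the man bit the dog"   # opposite meaning, same words

# Scramble the original with a fixed seed so the run is repeatable.
words = original.split()
random.Random(0).shuffle(words)
scrambled = " ".join(words)

# All three texts are indistinguishable to a pure bag-of-words model.
same = bag_of_words(original) == bag_of_words(reordered) == bag_of_words(scrambled)
print(same)
```

Under this representation "the dog bit the man" and "the man bit the dog" are the same document, which is exactly the information a reader needs and the model throws away.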



Phil Henshaw-2
Actually, the context of other replies covered my questions on how
computers can link concepts, but not on how people somehow have such
separate languages that learning from one sphere can't cross over to the
other.   There's still that conceptual gap between promoting more
efficient growth and a desire to limit the real economic impacts on the
earth.  All the communities I have contacted on the subject are
speechless for some reason.  

The clear evidence of a gap in intellect is that global economic
efficiency is methodically improving at half the rate of output for
energy at least (Jevons Paradox, or the natural vanishing returns of
efficiency as you may call it).   You see the same pattern in the
general technology cycle that demonstrates a growth and climax
development model of terminally limiting efficiency for any physical
process known, and for good reasons applying generally to any possible
undiscovered one too, of the same kind as predicted by the 2nd law for
energy transformation.  It seems that the 1st law of human behavior, not
to look at the 2nd law, still predominates even among scientists, but
since the world is physically running smack into it at an accelerating
rate, something will give.


Phil Henshaw
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
680 Ft. Washington Ave
NY NY 10040                      
tel: 212-795-4844                
e-mail: pfh at synapse9.com          
explorations: www.synapse9.com