Can you guess the source. - on Earth Day

Posted by Marcus G. Daniels on Apr 22, 2007; 2:42pm
URL: http://friam.383.s1.nabble.com/Can-you-guess-the-source-tp523696p523765.html

Phil Henshaw wrote:

> How would a computer be able to suggest that
> when you search for 'Corian' you might actually be looking for 'solid
> surface'.   You might assume that the original discussion that
> associated the terms was not coded, and only the gradual change in usage
> can be documented (e.g. as for punctuated equilibrium).   I can see some
> assistance, but not a lot, being provided by a computer able to mark the
> growth dynamics of word uses, giving a specific date to when a new
> phrase began to mature (first turning point ending the first growth
> period).  The poor computer is just never going to be coding the 'idea'
> the terms convey for people, and won't it always be making word
> associations a different way?
>  
Wikipedia gives both definitions:  solid surface and acrylic polymer +
alumina trihydrate.
In R, for example:

 > dict <- new.env()
 > dict[["corian"]] <- c(old="polymer",new="solid surface")
 > contextualize <- function (entry,context) if (!is.na(context)) entry[context] else entry
 > meaningOf <- function (word,context=NA) { contextualize(dict[[word]],context) }

Then:

 > meaningOf("corian","old")
      old
"polymer"
 > meaningOf("corian","new")
            new
"solid surface"
 > meaningOf("corian")
            old             new
      "polymer" "solid surface"

So, given some context, it's a simple matter to grab a subset of the
possible meanings, or all of them.   Computers are especially good at
combinatorics: a program can enumerate every outcome that follows from
the ambiguity in each word and report a confidence interval on any
conclusion.   Some conclusions might defy common sense, though, and
that's where I'd see something like Cyc coming into play.   Say a
customer asked for granite, or pointed at a polymer-like countertop that
wasn't from DuPont; the inference could then be made that they really
meant "solid surface".
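
To make the combinatorics point concrete, here is a minimal sketch of
enumerating every reading of a two-word phrase.  The second entry
("counter") and the interpretations function are made up for
illustration, every word is assumed to be present in the dictionary, and
treating each reading as equally likely is only a placeholder
assumption, not a real confidence model.

```r
# Hypothetical dictionary: each word maps to a named vector of meanings.
dict <- new.env()
dict[["corian"]]  <- c(old = "polymer", new = "solid surface")
dict[["counter"]] <- c(kitchen = "countertop", shop = "sales desk")  # invented entry

# Enumerate every combination of meanings for the words in a phrase.
# Assumes every word has an entry in dict.
interpretations <- function(words) {
  candidates <- lapply(words, function(w) unname(dict[[w]]))
  names(candidates) <- words
  expand.grid(candidates, stringsAsFactors = FALSE)
}

readings <- interpretations(c("corian", "counter"))

# Share of readings in which "corian" means "solid surface",
# under the placeholder assumption that all readings are equally likely.
share <- mean(readings$corian == "solid surface")
```

Here readings has one row per combination (2 x 2 = 4), and share is the
fraction of those rows consistent with one particular conclusion.  A
common-sense filter of the Cyc kind would prune rows before that
fraction is computed.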
 
If you like, redefine the contextualize function to be smarter: have it
look at a dynamic description of the immediate environment, or index
into different dictionaries as a function of time, or whatever.
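
One way the time-indexed variant might look, as a sketch only: each
entry becomes a small table of meanings with the year each came into
use, and the context is a year.  The table layout and the years are
placeholders for illustration, not historical data, and rows are assumed
to be sorted by year.

```r
# Hypothetical layout: one row per meaning, with the year it came into use.
# The years are placeholders, not historical data.
dict <- new.env()
dict[["corian"]] <- data.frame(
  meaning = c("polymer", "solid surface"),
  since   = c(1970, 1990),
  stringsAsFactors = FALSE
)

# With no context, return every meaning; with a year, return the most
# recent meaning already in use by then (rows assumed sorted by 'since').
contextualize <- function(entry, context = NA) {
  if (is.na(context)) return(entry$meaning)
  in_use <- entry$meaning[entry$since <= context]
  if (length(in_use) == 0) return(NA_character_)
  in_use[length(in_use)]
}

meaningOf <- function(word, context = NA) {
  contextualize(dict[[word]], context)
}

meaningOf("corian", 1980)  # "polymer"
meaningOf("corian", 2007)  # "solid surface"
meaningOf("corian")        # both meanings
```

Only contextualize changed; meaningOf keeps the same shape as before, so
callers need not know how the context is interpreted.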