Re: [sfx: Discuss] Fwd: DuckDuckGo

Re: [sfx: Discuss] Fwd: DuckDuckGo

Steve Smith
Owen -
> I thought this might be of broader interest:
>
> This article: http://dontbubble.us/ discusses the "bubble effect" of
> search engines, where you slowly evolve into a bit of a ghetto.  Your
> search usage creates a profile that can paint you into a corner.
After reading the DuckDuckGo article, I was (mildly) puzzled (offended?)
by their rhetoric.  They act as if deleting a feature (personalized
search results based on prior searches) is a big plus.  At best, it
might be a preferred default?  Their strategy seems to be to prey on the
naivete and the paranoia of the masses to make their less capable search
engine seem more capable?  I don't doubt they are working hard on other
features but to make their *lack of personalization*  out to be the
prime feature seems... duplicitous.

Search engines are essentially "recommender" systems.  One strategy for
improving the recommendation *is* to track searches and customize
results.  If you don't sign into Google, I don't think they apply this
to your search results at all (at best for the subnets where you
access?).  You can also clear your history, selectively edit it, and
turn it off (Pause they call it).

I happen to have multiple Google IDs but remain logged out of Google
most of the time.  For those of you who have given over to letting
Google manage your mail, this is probably too inconvenient (logging
out/in all the time).  I log in for Google Docs and for blog management
now and again, with my different IDs.  Each of the IDs roughly
corresponds to one of my alter egos or personalities.  For example, my
personal interests overlap my professional interests, but only to a
modest extent.  In principle, my personal account profile and my
professional account profile will be informed differently and produce
different results.

Three years ago, when one of our new kittens was dying and I was searching
far and wide for information, I was annoyed (offended?) by the many ads
popping up trying to sell me cat food, cat leashes, catnip, cat toys,
even cat pet insurance.   They knew I had a cat and was interested in
cat things, but didn't know that the very same cat was nearly dead and
wouldn't be needing any of the stuff they were peddling.  Had there been a
human in the loop, it would have been quite rude.

I also think referring to it as a bubble is part of their duplicitous
rhetoric (any marketing, self-promotion is going to use this).   If
anything, I would compare it to canalization on an epigenetic landscape.
Of course that metaphor would be lost on most of their (potential)
users.   I use the term because it feels more accurate... essentially,
there is an "erosion" of the search landscape going on, informed by the
searches that have gone before.

I had tried to pitch Google maybe 7 years ago on the idea of studying
search in this context... I never heard back...  but I would not be
surprised if this is effectively what they are doing anyway.
>
> When Steve and I were working on a project to visualize SFI working
> papers, we stumbled across search engine mashups that gave you
> categories of responses .. which seemed more useful than a single long
> list of results.  Yahoo also seems to me, anyway to have a better
> display of search results.

Your noticing that a set of categories is more interesting/useful than a
simple ordered list, of course, raises the question of how one arrives
at the categories.  Are these human-derived?  Are these derived by the
structure of their relations?  Are they derived by *your use*?!     I
suppose your comments are making a case for *exposing* more of the
qualities used to personalize your search... help expose *why* the list
is ordered the way it is, or the categories of reasons they are offering
you things in those categories, etc.

In my vernacular, it would be to show you the erosion patterns of your
own search landscape I suppose.  And I agree, and this is what I was
vaguely trying to propose to the Googleteers...  to help us see the
basins of attraction carved out not only by our own personal searches
but by the linking and general search and followup patterns.  I haven't
tracked their tech work in years, but at the time, Spectral Graph Theory
was an important part of the game it seemed.

The problem (one of them) is really that this is a high dimensional
problem...  and reducing it to one dimension (ordered list) is only a
little worse than reducing it to a (2D) landscape by some measures.  I
am often surprised that Google doesn't offer multiple sorts on their
results.   Sometimes I am interested in *recent* things (I'm now using
Google Realtime sometimes and wondering what happened to Collecta...
currently offline?) and other times I *might* be interested in ordering
*without* personalization and *with* personalization... or
personalization weighted different ways, etc.
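
To make the "multiple sorts" idea concrete, here is a minimal sketch of
what user-selectable orderings could look like.  Everything in it is an
assumption for illustration: the Result fields, the rank() function and
its weights are hypothetical, and this is not a description of how Google
or DuckDuckGo actually score results.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    relevance: float  # hypothetical query-match score, 0..1
    age_days: float   # hypothetical days since publication
    affinity: float   # hypothetical match to the user's search history, 0..1


def rank(results, personalization=0.0, recency=0.0, half_life_days=30.0):
    """Order results by a weighted blend; a weight of 0 switches a signal off."""
    def score(r):
        # Freshness halves every half_life_days.
        freshness = 0.5 ** (r.age_days / half_life_days)
        return r.relevance + personalization * r.affinity + recency * freshness
    return sorted(results, key=score, reverse=True)


results = [
    Result("a.example/old-classic", relevance=0.9, age_days=900, affinity=0.1),
    Result("b.example/my-usual-site", relevance=0.6, age_days=200, affinity=0.9),
    Result("c.example/breaking-news", relevance=0.5, age_days=1, affinity=0.2),
]

neutral = [r.url for r in rank(results)]                        # no personalization
personal = [r.url for r in rank(results, personalization=1.0)]  # history-weighted
recent = [r.url for r in rank(results, recency=1.0)]            # freshness-weighted
```

The same three results come back in three different orders depending on
the weights, which is the kind of control I would like to have exposed.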

Visualizing complex multi/hypergraphs is a holy grail for me.   I've
done a bit with various real-world problems but it remains an
interesting and hard one.  More to the point, in this context, it is not
the actual *graphs* one needs to visualize but rather the systems and
data that are encoded in the graphs.   In this case, the networks of
interconnected web sites/links and the search patterns and utilization
over those networks are what Google (or DuckDuckGo) has to work with,
and a structured ordering/layout of the available resources, possibly
annotated, is what we want returned when we enter a query.

I like the landscape metaphor for many reasons.  In general I believe
all visualizations are rooted in metaphors, even simple (usually
geometric) ones.   The landscape is a familiar one (geometrically it is
a simple single-valued function) which human brains evolved to
parse well.   The Topic Maps of PNNL's Spire and Sandia's VxInsight are a
beginning of this.   In this case, they only really encode proximity and
density.   In the case at hand, one would also like to encode erosion,
accelerated wear, and possibly growth and diversity.    The "height" of
the landscape is one interesting and primary measure, but the relative
height and the raggedness and the size of an "ideashed", etc. are also
interesting/useful.
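
As a rough illustration of a landscape that encodes only proximity and
density, here is a minimal sketch of a height field built by summing a
Gaussian bump for each document.  The 2D document positions, the
height_field() name and the bandwidth are assumptions for illustration;
this is not how Spire or VxInsight are implemented, and it models none of
the erosion or wear discussed above.

```python
import math


def height_field(points, size=5, bandwidth=1.0):
    """Return a size x size grid whose height at each cell is the summed
    Gaussian weight of all points, so dense clusters form peaks."""
    grid = [[0.0] * size for _ in range(size)]
    for gy in range(size):
        for gx in range(size):
            for (px, py) in points:
                d2 = (gx - px) ** 2 + (gy - py) ** 2
                grid[gy][gx] += math.exp(-d2 / (2 * bandwidth ** 2))
    return grid


# Hypothetical documents already projected to 2D: a cluster of three and one outlier.
docs = [(1, 1), (1, 2), (2, 1), (4, 4)]
grid = height_field(docs)

# Highest cell as (height, x, y); it sits inside the three-document cluster.
peak = max((h, x, y) for y, row in enumerate(grid) for x, h in enumerate(row))
```

The height here is a single-valued function of position, so the cluster
shows up as a mountain and the lone document as a small hill.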

I guess I wandered into the other thread started by Tom Johnson on Data
Visualization in general.

- Steve




============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
lectures, archives, unsubscribe, maps at http://www.friam.org

Re: [sfx: Discuss] Fwd: DuckDuckGo

Marcos
Steve, glad to see someone else around here working on complex
visualizations of multigraphs (what I call fractal graphs) and able to
articulate the problems so well.

One thing that I wanted to bring to such a discussion is the
limitations of such endeavors when built as a static or monolithic
system of algorithms and data: without a *social* component, they will
always fall short.  I'd argue that the best and most wanted material (in
a search) will always be highly volatile and not conducive to the
traditional, more-or-less analytical approaches.  That is, the most
interesting stuff within this problem domain (complex, dynamic
graphs and search) is at the edges of the landscape, and it's
hardly there long enough to generate rank.  Therefore, I think the
only solution to this problem is to create a "republic" -- get human
agents, as well as the items themselves, involved in a ranking system,
forming a meritocracy.   To continue with your landscape metaphor, I'm
trying to solve the problem at the "center of the cyclone" where small
perturbations have a huge effect across the content-landscape over
time.

Marcos, sf_x
pangaia.sf.net


On Tue, Jun 21, 2011 at 5:57 PM, Steve Smith <[hidden email]> wrote:

> [...]