Roll Your Own Google (with Wired online link)

Randy Burge

For accessing the live links in the article:
http://www.wired.com/news/technology/0,1282,69817,00.html?tw=wn_tophead_2

Roll Your Own Google?

By Jeff MacIntyre

02:00 AM Dec. 13, 2005 PT

In a move with potentially far-reaching implications for the search market,
Alexa Internet is opening up its huge web crawler to any programmer who
wants paid access to its rich trove of internet data.

Alexa, a subsidiary of Amazon.com that is best known for its traffic
rankings, on Monday unveiled Alexa Web Search Platform, a set of online
tools for searching, indexing, computing, storing and publishing vast
quantities of net data.

Alexa claims it's the first time that developers, students and startups will
be given inexpensive access to an industrial-scale web crawler -- the same
technology used by industry giants like Yahoo (Yahoo Slurp) and Google
(Googlebot).

"It sounds innocuous but it's big," said Alexa CEO Bruce Gilliat. "We're
giving access to billions of pages and computing resources.... Users have
never had this opportunity before. Big industry has ruled search, because it
was the only player with access to the tools."

Alexa spiders 4 billion to 5 billion pages a month and archives 1 terabyte
of data a day. The new platform will allow developers to build their own
search engines.

"If it is what they claim it is, it strikes me that this is nontrivial
news," said search industry pundit and author John Battelle. "Anyone can
crawl the web, but crawling and maintaining an index at scale is very
difficult and very expensive. They are providing convenient access to
something that was very dear."

Battelle said the move, if it pans out as promised, could have a big impact
on the search industry, and could possibly lessen Google's growing dominance
in web search.

Alexa's offering may help "create an ecosystem (in search) where something
can occur outside the Googleverse," he said.

To illustrate the new service's potential, Alexa developed a photo search
engine that allows users to query photo metadata normally hidden from
standard keyword searches, such as the date the photo was taken or the
camera used.
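
The kind of query described above can be pictured as a filter over
per-photo metadata fields rather than over page text. What follows is a
minimal sketch only, not Alexa's implementation: the record layout, the
example.com URLs, and the search_photos helper are all hypothetical and
chosen purely for illustration.

```python
from datetime import date

# Hypothetical photo records; a real index would extract these fields
# from metadata embedded in each image file.
PHOTOS = [
    {"url": "http://example.com/beach.jpg", "camera": "Canon EOS 20D",
     "taken": date(2005, 7, 4)},
    {"url": "http://example.com/snow.jpg", "camera": "Nikon D70",
     "taken": date(2005, 12, 1)},
    {"url": "http://example.com/city.jpg", "camera": "Canon PowerShot S2",
     "taken": date(2005, 12, 1)},
]


def search_photos(photos, camera=None, taken_on=None):
    """Return photos matching every metadata criterion that was given.

    camera   -- case-insensitive substring of the camera model, or None
    taken_on -- exact capture date, or None
    """
    results = []
    for photo in photos:
        if camera is not None and camera.lower() not in photo["camera"].lower():
            continue
        if taken_on is not None and photo["taken"] != taken_on:
            continue
        results.append(photo)
    return results


if __name__ == "__main__":
    for photo in search_photos(PHOTOS, camera="canon", taken_on=date(2005, 12, 1)):
        print(photo["url"])
```

A keyword search over page text would not see either field, which is the
gap the article says the prototype was built to illustrate.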

Musipedia, another Alexa prototype, provides users with the ability to
search the web by melody. Give the engine a keyword or melodic contour, and
it returns similar music. Musipedia allows users to input their own
whistling as a query.
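
One common way to represent a melodic contour is the Parsons code, which
records only whether each note goes up (U), down (D), or repeats (R)
relative to the one before it. The sketch below assumes that
representation and ranks tunes by edit distance between contour strings.
The article does not say how the prototype matches melodies, so treat
this as an illustration of the general idea; the catalog, the pitch
numbers, and the function names are all hypothetical.

```python
def parsons_code(pitches):
    """Reduce a pitch sequence to a contour string of U, D and R steps."""
    steps = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            steps.append("U")
        elif cur < prev:
            steps.append("D")
        else:
            steps.append("R")
    return "".join(steps)


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        row = [i]
        for j, cb in enumerate(b, start=1):
            row.append(min(
                prev_row[j] + 1,               # delete ca
                row[j - 1] + 1,                # insert cb
                prev_row[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev_row = row
    return prev_row[-1]


def rank_by_contour(query_pitches, catalog):
    """Return catalog titles ordered from most to least similar contour."""
    query = parsons_code(query_pitches)
    scored = [
        (edit_distance(query, parsons_code(pitches)), title)
        for title, pitches in catalog.items()
    ]
    return [title for _, title in sorted(scored)]


# Hypothetical catalog: opening notes as MIDI pitch numbers.
CATALOG = {
    "Twinkle Twinkle": [60, 60, 67, 67, 69, 69, 67],
    "Ode to Joy": [64, 64, 65, 67, 67, 65, 64, 62],
}

if __name__ == "__main__":
    # The same opening as "Twinkle Twinkle", whistled a whole tone higher.
    whistled = [62, 62, 69, 69, 71, 71, 69]
    print(rank_by_contour(whistled, CATALOG))
```

Because the contour ignores absolute pitch, a query whistled in a
different key still reduces to the same string as the original tune.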

Gilliat predicted that Alexa's inexpensive services will spawn numerous
creative results among users ranging from computer scientists to web
hobbyists. The services are priced at $1 per transaction, where a
transaction ranges from a CPU hour of computing time to gigabytes of uploads
and downloads. Gilliat said a complete web snapshot should cost a "couple
thousand" dollars.

Thanks to the company's history, Gilliat believes Alexa is well-positioned
to democratize data search.

It is an interesting return to the spotlight for Alexa, the commercial
cousin of Internet Archive, a nonprofit founded by Brewster Kahle that is
dedicated to preserving a public index of the web and its history. Alexa
donates its crawl data directly to the Internet Archive.

Alexa has been archiving the web from its Presidio of San Francisco offices
since it was founded in 1996. In 1997, Alexa unveiled its toolbar, one of
the first such search-specific browser add-ons, which has since registered
more than 10 million downloads. Amazon acquired Alexa in 1999.

Alexa has more than a thousand machines involved in storage, access and
computation, and the company expects high demand for the new service.

"Using our crawler saves massive time, money and computational power,"
Gilliat said. "There are lots of really smart people out there who don't
work for a search engine, but they have good ideas, needs and desires for
what they want from web search. They have an inkling, and we have the way."

Amazon and Alexa representatives declined to speculate whether this move
might compel other search engines to commercialize their crawlers.

Battelle, however, characterized the news as "Amazon casting a stone in the
lake of search."

He said Alexa's announcement echoes other developments in recent years at
Amazon, a company that prides itself on leveraging the strength of its user
community.

"I have been consistently impressed by the innovative thinking there,"
Battelle said. "This is the type of news you might come to expect from
Amazon.... We can now sift the web and do it cheaply and frequently. This
feels very Web 2.0."