Tuesday, May 25, 2010

The Search-Notes

What is Search?
Link by link, click by click, search is building possibly the most lasting, ponderous, and significant cultural artifact in the history of humankind: the Database of Intentions.
The Database of Intentions is simply this: the aggregate results of every search ever entered, every result list ever tendered, and every path taken as a result. This is what we simply call search.

Who, What, Where, Why, When, and How:
In John Battelle's words: "As a cub reporter, I was taught to answer five questions about any topic before writing about it: who, what, where, why, and when. If you crammed answers to all those questions into your lead paragraph, then you'd essentially done your job." The author also says he quickly learned to add a sixth: how? I hope these words hold good for everyone.

A Search Engine Consists Of:
A search engine consists of three major pieces:
The Crawl
The Index
The runtime system, or query processor
The crawler is a specialized software program that hops from link to link on the World Wide Web, scarfing up the pages it finds and sending them back to be indexed.
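
To make the link-hopping idea concrete, here is a minimal sketch of a crawler in Python. It is only an illustration: the toy "web" is a made-up dictionary (the names TOY_WEB and crawl and the .example addresses are my own assumptions, not anything from the book), and a real crawler would fetch pages over the network instead.

```python
from collections import deque

# Hypothetical toy "web": each address maps to the links found on that page.
TOY_WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example", "d.example"],
    "d.example": [],
}


def crawl(seed, web):
    """Hop from link to link, returning pages in the order they were fetched."""
    seen = {seed}           # pages already queued, so none is fetched twice
    queue = deque([seed])   # pages waiting to be fetched
    fetched = []
    while queue:
        url = queue.popleft()
        fetched.append(url)  # stands in for "sending the page back to be indexed"
        for link in web.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


print(crawl("a.example", TOY_WEB))
# ['a.example', 'b.example', 'c.example', 'd.example']
```

The queue makes this a breadth-first crawl, which is just one simple ordering; real crawlers choose what to visit next in far more elaborate ways.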

The more sites crawlers visit, and the more frequently they visit them, the more complete the index is. When the index is more complete, the search results pages (SERPs) that are returned for a particular query have a greater chance of being relevant. The process of grokking the index is referred to as analysis. Google's PageRank algorithm is an example of analysis: it looks at the links on a page, the anchor text around those links, and the popularity of the pages that link to another page, and factors them together to determine the ultimate relevance of a particular page to your query. Google, in fact, looks at more than one hundred factors to determine a site's relevance to your keywords.
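
The "popularity of the pages that link to another page" part of this can be sketched with the simplified textbook form of PageRank. This is not Google's actual implementation: it ignores anchor text and the other hundred-plus factors, the damping value of 0.85 is a conventional assumption, and the function and page names are made up for illustration. The sketch also assumes every linked page appears as a key in the input.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank over a dict of page -> list of outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal popularity
    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page passes its rank on, split evenly among its links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page web: both "a" and "b" link to "c", and "c" links to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy example "c" ends up most popular because two pages link to it, "a" comes second because the popular "c" links to it, and "b" comes last because nothing links to it.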

The query processor is the interface and related software that connects users' queries to the index.
Once the crawl data is analyzed, indexed, and tagged, it's dumped into what's called a runtime index - a database ready to serve results to users. The runtime index forms something of a bridge between the back end of an engine (the crawl and index) and the front end (its query server and user interface).
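
A rough way to picture the runtime index and the query processor is an inverted index: a table from each word to the pages that contain it, plus a small function that looks a query up in that table. The sketch below is an assumption-laden toy (the PAGES data and the names build_runtime_index and answer_query are hypothetical), with no ranking at all.

```python
from collections import defaultdict

# Hypothetical crawled pages: address -> page text.
PAGES = {
    "a.example": "new york pizza",
    "b.example": "new car reviews",
    "c.example": "york castle history",
}


def build_runtime_index(pages):
    """Back end: map every word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def answer_query(index, query):
    """Front end: return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())  # keep only pages that have this word too
    return sorted(results)


index = build_runtime_index(PAGES)
print(answer_query(index, "new york"))  # ['a.example']
print(answer_query(index, "new"))       # ['a.example', 'b.example']
```

The point of building the table ahead of time is that answering a query becomes a few quick lookups instead of a scan of every page.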

Atomic phrases: Phrases that have their own sets of results. At the smallest level, search engines are capable of telling the difference between such a phrase and its separate words by parsing a list of atomic phrases.
As per John Battelle (2007), Google alone has more than 175,000 computers dedicated to the job.
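
One simple way to picture parsing against a list of atomic phrases is a greedy longest-match pass over the query. This is only my own sketch of the idea, not how any real engine does it; the phrase list and the name split_query are hypothetical.

```python
# Hypothetical list of atomic phrases, stored as tuples of words.
ATOMIC_PHRASES = {
    ("new", "york"),
    ("new", "york", "city"),
    ("san", "francisco"),
}


def split_query(query, phrases=ATOMIC_PHRASES):
    """Split a query into units, keeping known atomic phrases together."""
    words = query.lower().split()
    max_len = max((len(p) for p in phrases), default=1)
    units = []
    i = 0
    while i < len(words):
        # Try the longest possible phrase first, then shorter ones.
        for size in range(min(max_len, len(words) - i), 1, -1):
            candidate = tuple(words[i:i + size])
            if candidate in phrases:
                units.append(" ".join(candidate))
                i += size
                break
        else:
            # No atomic phrase starts here, so the word stands alone.
            units.append(words[i])
            i += 1
    return units


print(split_query("cheap hotels new york city"))  # ['cheap', 'hotels', 'new york city']
print(split_query("new car"))                     # ['new', 'car']
```

Here "new york city" is treated as one unit with its own results, while the "new" in "new car" is treated as an ordinary word.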

Where the Power of Search Lies:
We do ask a lot of the same questions, but we ask far more that are unique, and therein lies the power of search.

Google Whacking:
In the early days of Google, a popular sport among search watchers was to find a query that had exactly one result. This game even has a name - Google Whacking.

Where & Why - Search:
Navigational Query: The practice of typing in a word you know so as to yield a site you wish to visit is called a navigational query.
Why Search: We are searching for more than one answer. Not only are we searching for that which we know; we are increasingly searching to find that which we do not know.
Web blindness: A sense that we know there's stuff we might want to find, but have no idea how to find it.