Monday, January 16, 2012

How do web search engines (crawler-based search engines) work?

Search engines are used to find information on the Web. Searching the Web, images, video, or news with Google, Yahoo, or Bing seems very simple: people type a query into the search box, and the search engine returns results. But search engines are complex software systems. They use tens or hundreds of thousands of computers to process billions of web pages and return results for thousands of searches per second. A search cannot find what is not in the search engine's index. So how do search engines find a website? Here are the basic steps of how search engines work.

    1. Search engine crawls a website
  • The search engine uses a spider (or crawler) to visit a website, read the information on its web pages, and follow hyperlinks from one page to the other pages the site connects to.
    2. Search engine stores each word in a searchable index
  • The search engine indexes the content (text, code) of each web page by adding it to its giant database, and then periodically updates this content.
    3. Search engine matches the query terms with words in the index
  • When a user enters a search query, the search engine searches its own giant database to find related documents and sorts them by relevance.
    4. Search engine displays results
  • The search engine ranks the resulting documents using an algorithm that assigns weights to various ranking factors.
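
The four steps above can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the PAGES dictionary is a made-up, in-memory stand-in for the Web, the function names (crawl, build_index, search) are hypothetical, and ranking here is a simple word count rather than a real ranking algorithm.

```python
import re
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example/": ("search engines crawl the web",
                   ["a.example/index", "a.example/rank"]),
    "a.example/index": ("an index maps each word to pages", ["a.example/"]),
    "a.example/rank": ("ranking sorts pages; search search", []),
    "a.example/orphan": ("no page links to this search page", []),
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def crawl(seed, pages=PAGES):
    """Step 1: visit pages breadth-first, following links from the seed."""
    seen, queue, docs = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        if url not in pages:
            continue  # dead link: nothing to read
        text, links = pages[url]
        docs[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return docs

def build_index(docs):
    """Step 2: inverted index, word -> {url: occurrences of the word}."""
    index = defaultdict(dict)
    for url, text in docs.items():
        for word in tokenize(text):
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, query):
    """Steps 3 and 4: keep pages containing every query term, then
    rank them by total term count (a deliberately naive score)."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set(index.get(terms[0], {}))
    for term in terms[1:]:
        matches &= set(index.get(term, {}))
    scored = [(url, sum(index[t][url] for t in terms)) for url in matches]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

docs = crawl("a.example/")
index = build_index(docs)
print(search(index, "search"))
```

Note that a.example/orphan contains the word "search" but never shows up in the results: no crawled page links to it, so it never reaches the index.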

What is SEO?