Google Search Now Processing One Trillion Unique Webpages

by DB on July 25, 2008

This is really amazing. Google has reached a new milestone in the number of unique webpages its search bots have discovered while crawling for new content. Growing from 26 million (6 zeroes) pages in 1998 to 1 billion (9 zeroes) pages in 2000 to 1 trillion (12 zeroes) pages today is a remarkable milestone indeed.

Jesse Alpert & Nissan Hajaj of the Google Web Search Infrastructure Team say they really don’t know how big the web is. One thing is for sure: it is definitely bigger than one trillion pages. They are calling the web infinite.

So how many unique pages does the web really contain? We don’t know; we don’t have time to look at them all! :-) Strictly speaking, the number of pages out there is infinite — for example, web calendars may have a “next day” link, and we could follow that link forever, each time finding a “new” page. We’re not doing that, obviously, since there would be little benefit to you. But this example shows that the size of the web really depends on your definition of what’s a useful page, and there is no exact answer.

Google Search now downloads webpages continuously and re-computes the PageRank graph for the web multiple times a day.

Back then, we did everything in batches: one workstation could compute the PageRank graph on 26 million pages in a couple of hours, and that set of pages would be used as Google’s index for a fixed period of time. Today, Google downloads the web continuously, collecting updated page information and re-processing the entire web-link graph several times per day.
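To get a feel for what computing PageRank over a link graph involves, here is a minimal power-iteration sketch in Python. This is purely illustrative — it is not Google’s actual implementation, and the tiny example graph at the bottom is made up:

```python
# Minimal PageRank power-iteration sketch (illustrative only; not
# Google's production system, which operates on trillions of pages).

DAMPING = 0.85   # standard damping factor from the original PageRank paper
TOLERANCE = 1e-8

def pagerank(links, damping=DAMPING, tol=TOLERANCE, max_iter=100):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with uniform rank

    for _ in range(max_iter):
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Each page splits its rank evenly among its outlinks.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: distribute its rank across all pages.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        # Stop once the ranks change less than the tolerance.
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank

# Hypothetical three-page web: A links to B and C, B to C, C back to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

In the 1998-era batch setup described above, an iteration like this would run once over a fixed snapshot of the crawl; the shift Google describes is re-running it continuously as fresh link data arrives.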

I wonder how many webpages Yahoo, Microsoft, and Ask have discovered to date.
