sheriff Administration

Joined: 29 Apr 2007 Posts: 3
Posted: Thu May 03, 2007 9:46 pm Post subject: Google algorithm grays out many sites!
Approximately once a month, Google updates its index by recalculating the PageRank of each web page that it has crawled.
Because of the nature of PageRank, the calculations need to be performed about 40 times and, because the index is so large, the calculations take several days to complete. During this period, the search results fluctuate, sometimes minute by minute. This period is commonly known as the Google Dance, and it usually takes place during the last third of each month.
Google has two other servers that can be used for searching. The search results on them also change during the monthly update.
For the rest of the month, fluctuations sometimes occur in the search results, but they should not be confused with the actual dance. They are due to Google's fresh crawl and to what is known as "Everflux".
Checking new rankings during the Google update
Google has two other searchable servers apart from www.google.com. They are www2.google.com and www3.google.com. Most of the time, the results on all 3 servers are the same, but during the update, they are different.
For most of the update, the rankings that can be seen on www2 and www3 are the new rankings that will transfer to www when the update is over. Even though the calculations are done about 40 times, the final rankings can be seen from very early on. This is because, during the first few iterations, the calculated figures converge on values close to their final figures. You can see this with the PageRank Calculator by checking the Data box (top left) and performing some calculations. After the first few iterations the search results on www2 and www3 may still change, but only slightly.
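To illustrate why the figures settle so early, here is a minimal sketch of the iterative calculation in Python. It is not Google's code: it assumes the commonly published form of the formula, PR = (1 - d) + d * (sum of PR/outlinks for each linking page), a damping factor d of 0.85, and a made-up four-page site. The function name pagerank and the page names A to D are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=40):
    """Iterate the PageRank formula and record the values after every pass."""
    pages = sorted(links)
    ranks = {page: 1.0 for page in pages}  # assumed starting value for every page
    history = [dict(ranks)]
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Sum the share of rank passed on by every page that links here.
            incoming = sum(
                ranks[src] / len(links[src])
                for src in pages
                if page in links[src]
            )
            new_ranks[page] = (1 - damping) + damping * incoming
        ranks = new_ranks
        history.append(dict(ranks))
    return history


# Hypothetical four-page site: each key links to the pages in its list.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

history = pagerank(links)
final = history[-1]
for n in (1, 5, 10, 20, 40):
    # Largest remaining gap between iteration n and the values after 40 passes.
    gap = max(abs(history[n][p] - final[p]) for p in final)
    print(f"after {n:2d} iterations: max distance from final = {gap:.6f}")
```

The printed gaps shrink quickly, which is the same effect described above: the later passes only make small adjustments to figures that were already close.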
During the update, the results from www2 and www3 will sometimes show on the www server, but only briefly. Also, new results on www2 and www3 can disappear for short periods. At the end of the dance, the results on www will match those on www2 and www3.
This Google Tool allows you to check your rankings on www, www2 and www3 and on all of the data centers simultaneously.
Checking new PageRank during the Google Update
Google currently has 12 data centers, any one of which can provide the Toolbar PageRank of any page. Before the update begins, they all return the same, current PageRank value for a given page. As the update progresses, they are updated, one by one, to the new PageRank value. Checking each of the centers during the update reveals the new PageRank values as they gradually spread through the centers. If the PageRank isn't going to change, the centers show the same values throughout, of course.
Querying the data centers
For this, it is necessary to have the Google Toolbar installed and the PageRank indicator on. Every time a page is received by the browser, the Toolbar requests its PageRank from one of Google's data centers. The information is returned as a one-line text file and stored in the Temporary Internet Files folder.
The Toolbar's request URL includes the URL of the page that it wants the PageRank for (the target page), and a checksum that must match the target page's URL.
A typical Toolbar request URL, referred to here as the fat URL (all in one line):-
http://216.239.33.102/search
?client=navclient-auto
&ch=5150615727
&features=Rank:FVN
&q=info:http%3A%2F%2Fwww%2Eseoperfectcart%2Ecom%2F
If you copy and paste that fat URL into your browser, you will get Google's "forbidden" page back. That's because the target page and checksum don't match; it's just an example of the request URL format (i.e. for seoperfectcart).
Notice that the target page is in escaped format - some of the characters are represented by hexadecimal codes (e.g. %2F).
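As a small illustration of the escaping, the sketch below uses Python's standard urllib.parse module to pull the escaped target page out of the example fat URL and turn it back into readable form. The variable names are only for illustration.

```python
from urllib.parse import unquote, urlsplit

# The example fat URL from above, joined into one line.
fat_url = (
    "http://216.239.33.102/search"
    "?client=navclient-auto"
    "&ch=5150615727"
    "&features=Rank:FVN"
    "&q=info:http%3A%2F%2Fwww%2Eseoperfectcart%2Ecom%2F"
)

query = urlsplit(fat_url).query
# Split on "&" by hand so the escaped value stays visible before decoding.
params = dict(part.split("=", 1) for part in query.split("&"))

escaped_target = params["q"][len("info:"):]  # drop the "info:" prefix
target = unquote(escaped_target)             # %3A -> ":", %2F -> "/", %2E -> "."

print("checksum:", params["ch"])
print("escaped: ", escaped_target)
print("target:  ", target)
```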
To get the new PageRank for a particular page, you need to make the same request that the Toolbar makes for it. I.e. you need the fat URL that the Toolbar uses. And you need to request the PageRank from all of Google's data centers. The method is a bit long-winded but it works. Here's how to do it:-
# Use your browser to browse to the page. This makes sure that the page and the Toolbar's PageRank request are in your Temporary Internet Files folder. You only need to do this once - not every time.
# Open the index.dat file from the Temporary Internet Files folder into a text editor, and perform a search in it for the target page. You'll find the entire fat URL, similar to the one above, for the Toolbar's PageRank request.
NOTE: Because the target page is escaped in the fat URL, search only for an unescaped part; e.g. "exampledomain".
# When you've found the fat URL, copy and paste it into your browser's address box and press Return or click Go. If the page is in Google's directory, the returned line includes the directory path. The last element in the first part of the line is the Toolbar PageRank value for the target page.
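The search in the second step can also be sketched in Python rather than done by hand in a text editor. This is only a rough sketch: it assumes the request URLs are stored in index.dat as plain runs of printable ASCII, which is not something this post confirms, and the function name find_fat_urls and the sample bytes are made up for illustration. To run it on a real file you would read the file in binary mode and pass the bytes in.

```python
import re
from urllib.parse import unquote

# A run of printable ASCII that starts with http:// and contains the
# Toolbar's client parameter (assumed storage format, see the note above).
TOOLBAR_URL = re.compile(rb"http://[\x21-\x7e]*client=navclient-auto[\x21-\x7e]*")


def find_fat_urls(raw, fragment):
    """Return Toolbar request URLs found in raw bytes whose target contains fragment."""
    matches = []
    for match in TOOLBAR_URL.finditer(raw):
        url = match.group(0).decode("ascii")
        # Compare against the unescaped form so a fragment such as
        # "www.exampledomain" is also found.
        if fragment.lower() in unquote(url).lower():
            matches.append(url)
    return matches


# Made-up stand-in for the contents of an index.dat file.
sample = (
    b"\x00\x10URL \x00"
    b"http://216.239.33.102/search?client=navclient-auto&ch=5150615727"
    b"&features=Rank:FVN&q=info:http%3A%2F%2Fwww%2Eseoperfectcart%2Ecom%2F"
    b"\x00\x00junk"
)

for url in find_fat_urls(sample, "seoperfectcart"):
    print(url)
```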
To see the page's new PageRank spread across the centers during the dance, use the same fat URL, but replace the IP address with each of the data centers. This is also a good way to see the progress of the dance in general.
Data centers
216.239.33.100 :: www-ex.google.com
216.239.35.100 :: www-sj.google.com :: currently offline
216.239.37.100 :: www-va.google.com
216.239.39.100 :: www-dc.google.com
216.239.41.100 :: www-fi.google.com
216.239.51.100 :: www-ab.google.com
216.239.53.100 :: www-in.google.com
216.239.55.100 :: www-zu.google.com
216.239.57.100 :: www-cw.google.com
216.239.59.100 :: www-gv.google.com
66.102.11.100 :: www-kr.google.com
66.102.7.100 :: www-mc.google.com
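Swapping the IP address by hand for each center gets tedious, so here is a minimal Python sketch that builds one request URL per data center from a single fat URL. It only builds the URLs and does not send any requests. The function name urls_for_all_centers is made up for illustration, and the fat URL is the non-working example from earlier in this post.

```python
from urllib.parse import urlsplit, urlunsplit

# The data center list from above (www-sj is listed as currently offline).
DATA_CENTERS = {
    "www-ex": "216.239.33.100",
    "www-sj": "216.239.35.100",
    "www-va": "216.239.37.100",
    "www-dc": "216.239.39.100",
    "www-fi": "216.239.41.100",
    "www-ab": "216.239.51.100",
    "www-in": "216.239.53.100",
    "www-zu": "216.239.55.100",
    "www-cw": "216.239.57.100",
    "www-gv": "216.239.59.100",
    "www-kr": "66.102.11.100",
    "www-mc": "66.102.7.100",
}


def urls_for_all_centers(fat_url):
    """Return {center name: fat URL with that center's IP address swapped in}."""
    parts = urlsplit(fat_url)
    return {
        name: urlunsplit(parts._replace(netloc=ip))
        for name, ip in DATA_CENTERS.items()
    }


example = (
    "http://216.239.33.102/search?client=navclient-auto&ch=5150615727"
    "&features=Rank:FVN&q=info:http%3A%2F%2Fwww%2Eseoperfectcart%2Ecom%2F"
)

for name, url in urls_for_all_centers(example).items():
    print(name, url)
```

Only the host part of the URL changes; the checksum and the escaped target page are passed through untouched.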
TIP: If you want to check the same pages during future dances, save the fat URLs into a text document so that you don't need to go through the process of finding them in the Temporary Internet Files folder each time.
The latest incarnation of the Google algorithm is called Allegra, and it seems to have greyed out PageRank on several websites.
The Allegra update borrows its name from the popular allergy medication. It is said that Allegra could be the remedy to the "Sandbox Effect" that tens of thousands of Web sites experienced in 2004.