Google Sitemap Verification 404-200 problem

The Google Sitemaps service makes it extremely difficult to get a site verified.

One notorious error is particularly hard to get rid of:

NOT VERIFIED
We've detected that your 404 (file not found) error page returns a
status of 200 (OK) in the header.

The reason for this is that Google wants you to place a verification file in the root folder of your site. By checking that this verification file exists, Google knows that you are the site owner. Google does not actually download the file; it only makes a HEAD request to check that the file is really there.
If your site has a custom 404 page that returns status 200 instead of 404, the HEAD check for the verification file is meaningless, because every request for any file returns 200 whether or not the file exists. So Google demands that your 404 pages actually return the 404 status code.
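The check described above can be approximated from the command line. The following is a minimal sketch, not Google's actual procedure: it assumes wget is installed, and the function names (last_status, verdict, check_site) as well as the noexist_probe_ prefix are made up for illustration. It requests one URL that should exist and one that should not, then compares the two status codes.

```shell
# Sketch only: function names and the probe prefix are illustrative assumptions.

# last_status: read `wget -S` output on stdin and print the status code of
# the last HTTP response (wget prints one status line per redirect hop).
last_status() {
    awk '$1 ~ /^HTTP\// { code = $2 } END { print code }'
}

# verdict EXISTING_CODE MISSING_CODE: interpret the two status codes.
verdict() {
    if [ "$1" != 200 ]; then
        echo "verification file not reachable"
    elif [ "$2" = 200 ]; then
        echo "soft 404: missing pages return 200"
    elif [ "$2" = 404 ]; then
        echo "ok"
    else
        echo "unexpected status $2 for missing page"
    fi
}

# check_site BASE_URL VERIFY_FILE: run both requests over the network.
check_site() {
    existing=$(wget --spider -S "$1/$2" 2>&1 | last_status)
    missing=$(wget --spider -S "$1/noexist_probe_$2" 2>&1 | last_status)
    verdict "$existing" "$missing"
}
```

A call such as `check_site http://www.adulteducation.at googlef6xxxxxxxxxxx1.html` should print "ok" only if the existing file returns 200 and the made-up name returns 404.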

While this makes some sense, it is not easy to implement, because many dynamic sites no longer have 404 pages at all: every URL leads to a page.

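Google checks for a file noexist_VERIFYCODE (the same code as in the name of the verification file) but does not identify itself as Google when doing so! In the access log the request shows up with an empty referer and an empty user agent: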

www.mydomain.at 64.233.172.37 - - [16/Mar/2006:12:45:42 +0100] "GET /noexist_xxxxxxxx.html HTTP/1.1" 200 73457 "-" "-"
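To see how the server answered such probes, the access log can be filtered for requests to noexist_ paths. This is a rough sketch that assumes the log layout of the line above (virtual host first, then the usual request, status, size, referer and user-agent fields). The function names noexist_probes and replay_probe are invented here. Because the logged request was a GET without a User-Agent, replay_probe repeats it in that form.

```shell
# Sketch only: assumes the log layout shown above; names are illustrative.

# noexist_probes: read access-log lines on stdin and print
# "METHOD PATH STATUS ua=USER_AGENT" for every request to /noexist_*.
noexist_probes() {
    awk -F'"' '$2 ~ /^(GET|HEAD) \/noexist_/ {
        split($2, req, " ")   # req[1]=method, req[2]=path, req[3]=protocol
        split($3, res, " ")   # res[1]=status, res[2]=bytes sent
        print req[1], req[2], res[1], "ua=" $6
    }'
}

# replay_probe URL: send a GET with no User-Agent header and show the
# response headers; wget omits the header when --user-agent is empty.
replay_probe() {
    wget -S -O /dev/null --user-agent="" "$1"
}
```

Feeding the log line above through noexist_probes prints `GET /noexist_xxxxxxxx.html 200 ua=-`. The 200 in the third column is the status the server actually sent for the nonexistent file.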

1) verification of a static page

For the duration of the verification process I turned my site into a simple static homepage containing only the verification file. Every other URL returned 404. I proved this by using ‘wget -S’:

wget --spider http://www.adulteducation.at/googlef6xxxxxxxxxxx1.html
....
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
200 OK
wget --spider http://www.adulteducation.at/googlef6xxxxxxxx1_NOT_EXIST.html
...
HTTP request sent, awaiting response... 404 Not Found
12:51:39 ERROR 404: Not Found.

Nevertheless the Google verification process returned the same error as above:

NOT VERIFIED
We've detected that your 404 (file not found) error page returns a
status of 200 (OK) in the header.

2) verification of a dynamic page

After my attempt to fool Google by serving a static page during the verification process had failed, I changed the handler that generates the dynamic web page so that it returns 404 for missing pages at root level. (This is not possible on sublevels: there every URL is interpreted partly as a search term that changes the resulting page, so 404 pages no longer exist.)
I worked on it for a while and then I had it:

wget -S http://www.adulteducation.at/googlef6xxxxxxxxxxxx1.html
....
HTTP/1.1 200 OK
wget --spider http://www.adulteducation.at/googlef6xxxxxxxxxxx1_NOT_EXIST.html
....
HTTP request sent, awaiting response... 404 Not Found
12:55:30 ERROR 404: Not Found.
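
The root-level rule described above might look roughly like the following. This is not the real handler, which is not shown on this page. It is only a sketch that assumes root-level pages correspond to plain files in a document root, and the name status_for is made up.

```shell
# Sketch only: not the real handler. Assumes root-level pages are plain
# files under a document root; status_for is an illustrative name.

# status_for PATH DOCROOT: print the HTTP status to send for PATH.
status_for() {
    rel=${1#/}                       # strip the leading slash
    case "$rel" in
        "")  echo 200 ;;             # home page
        */*) echo 200 ;;             # sublevel: read as a search term, always a page
        *)   if [ -f "$2/$rel" ]; then
                 echo 200            # existing root-level file
             else
                 echo 404            # unknown root-level name: real 404
             fi ;;
    esac
}
```

With this rule, a request for the verification file returns 200, a request for a noexist_ name at root level returns 404, and anything below root level still returns 200.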



I pressed Verify again and got, once again:

NOT VERIFIED
We've detected that your 404 (file not found) error page returns a
status of 200 (OK) in the header.



conclusion

 
knowwiki/howtos/sitemap_verify.txt · Last modified: 2006/03/16 13:09