26 April 2009

Web site: security or indexing, do we need to choose?

Many web sites offer content in the form of questions and answers. Answering a question can even be paid, to keep people motivated to answer.

To keep their business running, these sites often require visitors to register and pay to access the content.
But to attract visitors, and therefore customers, these questions and answers must be indexed by search engines like Google.
You can't give Google an account to log on to your site and index it. So these web sites filter access based on the user agent of the incoming request: if it is a known search engine, they give full access to the content. When a visitor finds, through Google, a page with the same question they have, the site hides the answers because the visitor's user agent is not a search engine.
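
To make the mechanism concrete, here is a minimal sketch in Python of what such a filter might look like on the server side. The token list, the function names and the placeholder message are all assumptions made for illustration; this is not the code of any particular site.

```python
# Substrings assumed to identify search engine crawlers (illustrative list only).
CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp")


def is_search_engine(user_agent: str) -> bool:
    """Return True if the User-Agent header merely *claims* to be a crawler."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def render_answers(user_agent: str, answers: list[str]) -> list[str]:
    """Show the real answers to crawlers, a placeholder to everyone else."""
    if is_search_engine(user_agent):
        return answers
    # Hypothetical placeholder text shown to unregistered visitors.
    return ["Register to see this answer"] * len(answers)
```

The whole decision rests on a header that the client sends, which is why this kind of check is so fragile.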

This form of security is easily circumvented by changing your user agent to Googlebot or another crawler. It is very easy, even for dummies, with a Firefox extension such as User Agent Switcher!
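
The user agent is just a request header chosen by the client, so no browser extension is strictly needed. As a small sketch, the Python standard library lets you set it to anything; the URL below is a made-up placeholder, and the request is only built here, not sent.

```python
import urllib.request

# User agent string announced by Google's crawler.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url: str, user_agent: str = GOOGLEBOT_UA) -> urllib.request.Request:
    """Build (but do not send) an HTTP request carrying the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Hypothetical URL, used only to show that the header is under the client's control.
req = build_request("http://example.com/question/123")
print(req.get_header("User-agent"))
```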

If you are too lazy to look up the exact user agent names of the search engines, you can feed this extension this ready-to-use XML:

For example, the web site SQL Server Central uses this false security.

For real security, you need to filter on both the User Agent AND the IP address.
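
A rough sketch of that combined check, in Python, could look like the following. The network range is a documentation placeholder (TEST-NET-1), not a real crawler range, and the function name is an assumption; in practice you would load the ranges published by each search engine.

```python
import ipaddress

# Placeholder range (TEST-NET-1, reserved for documentation), NOT a real crawler range.
CRAWLER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_verified_crawler(user_agent: str, remote_ip: str, networks=CRAWLER_NETWORKS) -> bool:
    """Grant crawler access only if the User-Agent AND the source IP both match."""
    if "googlebot" not in user_agent.lower():
        return False
    addr = ipaddress.ip_address(remote_ip)
    # The header can be forged by the client; the connection's source IP is much harder to fake.
    return any(addr in net for net in networks)
```

With this check, a visitor who only changes the user agent in the browser is still treated as a normal visitor, because the request does not come from one of the allowed addresses.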