Hey everyone, so recently I noticed that Google had indexed my whole development server! To combat this, I immediately put a robots file in the root of my Apache server so that Google would stop indexing everything.
After a month, Google was still listing pages from my development server but was no longer updating them (so the robots file was working, but I wanted Google to forget the data). I realised at this point that I needed to put some security in place so that even if someone accidentally came across the server, they couldn't access it without my permission. So I set up an .htaccess file that requires you to log in before you can access anything on the server. This has secured my development pages from unwanted viewing, and now Google is finally getting the hint and has started removing results from its search engine!
So I thought I would blog about the various methods you could use to stop unwanted visitors (or just stop Google thinking your server is an actual website).
A robots file is a simple text file called “robots.txt” which you place in the root directory of your website. This file is read by all respectable search engines and is used (most often) to tell them what they can and cannot index on your server.
You can use this method to tell search engines NOT to index your development server, for example:
User-agent: *
Disallow: /
The above example tells every well-behaved agent not to access anything on your server. It is only a request, though: it does not physically block anyone.
You can also exclude specific directories and/or files, for example:
User-agent: *
Disallow: /checkout/
Disallow: /account.php
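If you want to check that rules like the ones above do what you expect before a crawler reads them, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name build_parser and the sample paths are my own, purely for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The two rule sets from this post, one directive per line.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

BLOCK_SOME = """\
User-agent: *
Disallow: /checkout/
Disallow: /account.php
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    everything = build_parser(BLOCK_ALL)
    some = build_parser(BLOCK_SOME)
    # Sample paths are made up; swap in pages from your own server.
    for path in ("/index.html", "/checkout/cart", "/account.php"):
        print(path,
              "block-all allows:", everything.can_fetch("Googlebot", path),
              "| block-some allows:", some.can_fetch("Googlebot", path))
```

With the block-all rules every path should come back as not allowed, while the second rule set should only refuse the checkout directory and account.php.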
As I have said though, if Google has already indexed your website then it may keep the existing results even though it has now been told not to index them, so perhaps the .htaccess approach is for you…
No-Index Meta Tags
It is possible to exclude pages from search engines by placing a meta tag in the HTML of the page. This is useful for dynamic sites where you don't want to keep updating a robots.txt file for every page you wish to exclude.
To exclude a page from all search engines, place the code below inside your <head> tag:
<meta name="robots" content="noindex" />
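On a dynamic site it is easy to forget the tag on some template, so here is a rough sketch of how you could check a page's HTML for it using Python's built-in html.parser. The class name RobotsMetaFinder and the function has_noindex are names I made up for this example; it only looks at the robots meta tag and nothing else.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here.
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        for part in attributes.get("content", "").split(","):
            if part.strip():
                self.directives.append(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex" /></head></html>'
    print(has_noindex(page))
```

You could feed it the HTML of each page you expect to be excluded and flag any page where it returns False.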
Password protect the server using .htaccess
This is the best solution to protect your Development Server from search engines and unwelcome visitors!
You have to create two files in the root of your web server, one called .htaccess and another called .htpasswd. This is surprisingly difficult on Windows machines, as Windows says they're invalid names!
Here is an app that will help you create these files: coming soon.
In the .htaccess file put the following code, replacing “Deanos Development Server” with a string that represents your server, and setting AuthUserFile to the full path of your .htpasswd file.
AuthType Basic
AuthName "Deanos Development Server"
AuthUserFile "C:\Program Files (x86)\EasyPHP-22.214.171.124\www\.htpasswd"
require valid-user
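If Windows refuses to let you name a file .htaccess, one workaround is to let a small script create it for you. This is only a sketch: the function name write_htaccess is my own, and the realm and paths in the demo are placeholders, so pass in your own web root and the real location of your .htpasswd file.

```python
import tempfile
from pathlib import Path


def write_htaccess(web_root: str, realm: str, htpasswd_path: str) -> Path:
    """Write a Basic-auth .htaccess into web_root (hypothetical helper)."""
    if '"' in realm or '"' in htpasswd_path:
        raise ValueError("double quotes would break the quoted directives")
    lines = [
        "AuthType Basic",
        f'AuthName "{realm}"',
        f'AuthUserFile "{htpasswd_path}"',
        "Require valid-user",
    ]
    target = Path(web_root) / ".htaccess"
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    # Demo in a throwaway directory; the .htpasswd path is a placeholder.
    with tempfile.TemporaryDirectory() as demo_root:
        created = write_htaccess(demo_root, "My Development Server",
                                 "C:/dev/www/.htpasswd")
        print(created.read_text(encoding="utf-8"))
```

Python has no problem with file names that start with a dot, so this sidesteps the naming complaint entirely.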
Once this is in place head over to: http://www.htaccesstools.com/htpasswd-generator/
Input a desired Username and Password and click generate.
Copy the string returned and save that into the .htpasswd file.
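If you would rather not type your password into a website, you can build the line for the .htpasswd file yourself. The sketch below uses the {SHA} format (the Base64-encoded SHA-1 digest of the password), which Apache accepts but which is unsalted and considered weak. I only use it here because it needs nothing beyond Python's standard library; the htpasswd tool that ships with Apache can produce stronger hashes. The function name htpasswd_sha_line is made up for this example.

```python
import base64
import hashlib


def htpasswd_sha_line(username: str, password: str) -> str:
    """Build one 'user:{SHA}...' line for an .htpasswd file.

    Assumption: the weak, unsalted {SHA} scheme is acceptable for a
    private development box. Prefer a stronger scheme where you can.
    """
    if not username or ":" in username:
        raise ValueError("username must be non-empty and must not contain ':'")
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return f"{username}:{{SHA}}{encoded}"


if __name__ == "__main__":
    # 'dean' and 'password' are placeholder credentials for the demo.
    print(htpasswd_sha_line("dean", "password"))
```

Save the printed line into the .htpasswd file, one line per user.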
Congratulations! Your server is now blocking unwanted guests, and soon Google may give up and get the hint too!