Search engines index millions of web sites to generate the search results they return for key words. They do this using spiders.
Most search engines have their own spider that crawls around the web looking for web pages. Spiders are also known as robots because they are simply small programs that run automatically, looking for web pages and recursively following the text links embedded in them to find more pages to index.
Most robots look for a robots.txt file in the top-level directory of your web site, also known as the root, where your home page is located on the web server.
The robots.txt file is a simple text file created in a basic text editor, like Notepad. It allows you to control what the spider is allowed to access and what it is not allowed to access or index.
The format of the basic robots.txt file is pretty simple:
User-Agent: [Spider Name]
Disallow: [File Name]
For example, to allow ALL robots complete access to your web site, your robots.txt file will look like this:
User-agent: *
Disallow:
The asterisk is a wildcard character that represents ALL robots. Leaving the Disallow line blank tells the robots that nothing on the site is disallowed.
The next example bars all robots from the cgi-bin directory (where your scripts are typically located), the images directory, and the portfolio directory:
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /portfolio/
Note: You should use a separate Disallow line for each directory or individual file.
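One way to double-check rules like these before uploading them is Python's standard urllib.robotparser module. The sketch below is only an illustration: the page paths being tested are made-up examples, and real spiders may treat edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, one string per line of robots.txt.
rules = [
    "User-agent: *",
    "Disallow: /cgi-bin/",
    "Disallow: /images/",
    "Disallow: /portfolio/",
]

parser = RobotFileParser()
parser.parse(rules)

# Hypothetical paths, chosen only to exercise each rule.
test_paths = [
    "/index.htm",
    "/cgi-bin/form.cgi",
    "/images/logo.gif",
    "/portfolio/wedding01.htm",
]

for path in test_paths:
    verdict = "allowed" if parser.can_fetch("*", path) else "blocked"
    print(path, "->", verdict)
```

Only the home page path should come back as allowed; everything under the three listed directories should be reported as blocked.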
In this example, you may wonder why you would want to disallow a robot from indexing your portfolio directory.
If you are a photographer and you have thumbnail images on a portfolio page that link to enlargement pages launched in a pop-up window, you may not want those pop-up pages indexed. These are called dead-end or orphaned pages because only the enlarged image appears on the page with no contact info or menu links back to the main site. If the visitor entered your site on one of these pages, they would have nowhere to go and no way to contact you.
For a live example, check out www.AnJPhotography.com and look at the wedding portfolio. When you click on an image, it opens in a new window. The page in the new window is a dead-end page. A robots.txt file can keep search engines from indexing these dead-end pages so you don't leave site visitors stranded.
This example keeps googlebot (the Google spider) from getting at the private.htm file:
User-agent: googlebot
Disallow: /private.htm
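A rule aimed at a single spider can be checked the same way. This sketch again uses Python's urllib.robotparser; build_parser is a hypothetical helper and otherbot is a made-up robot name. Because this parser matches each rule as a prefix of the URL path, the sketch also shows why the path in a Disallow line should begin with a slash.

```python
from urllib.robotparser import RobotFileParser

def build_parser(lines):
    """Hypothetical helper: parse robots.txt lines held in a list."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser

with_slash = build_parser(["User-agent: googlebot", "Disallow: /private.htm"])
without_slash = build_parser(["User-agent: googlebot", "Disallow: private.htm"])

# googlebot is named in the file, so the rule applies to it.
googlebot_blocked = not with_slash.can_fetch("googlebot", "/private.htm")

# otherbot (a made-up name) is not mentioned, so nothing restricts it.
otherbot_blocked = not with_slash.can_fetch("otherbot", "/private.htm")

# Without the leading slash, the rule is not a prefix of "/private.htm".
slashless_blocks = not without_slash.can_fetch("googlebot", "/private.htm")

print(googlebot_blocked, otherbot_blocked, slashless_blocks)
```

In this parser, only the first check reports a block. The rule written without a leading slash silently fails to protect the file.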
When you create your robots.txt file, it is extremely important that you use a basic text editor (like Notepad) and NOT a word processing application like Microsoft Word. Applications like Microsoft Word can insert hidden characters that may make your robots.txt file unreadable.
After you post your robots.txt file to the web server, you can validate it to make sure it is properly formatted. There are several free validators on the web. Here is one: http://www.searchengineworld.com/cgi-bin/robotcheck.cgi
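To give a rough idea of what a validator looks for, here is a minimal Python sketch. It is not how any particular online validator works; find_problems and KNOWN_FIELDS are hypothetical names, and the check only recognizes the two fields covered in this article, so other fields that some spiders accept would be flagged too.

```python
# Only the two fields discussed in this article (an assumption of this sketch).
KNOWN_FIELDS = {"user-agent", "disallow"}

def find_problems(text):
    """Return a list of (line_number, message) pairs for suspicious lines."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Hidden characters from a word processor usually show up as non-ASCII.
        if not line.isascii():
            problems.append((number, "non-ASCII character, possibly from a word processor"))
            continue
        # Drop anything after '#', which starts a comment in robots.txt.
        stripped = line.split("#", 1)[0].strip()
        if not stripped:
            continue  # blank line or comment only
        field, separator, _value = stripped.partition(":")
        if not separator:
            problems.append((number, "missing ':' separator"))
        elif field.strip().lower() not in KNOWN_FIELDS:
            problems.append((number, "unrecognized field: " + field.strip()))
    return problems

good = "User-agent: *\nDisallow: /cgi-bin/\n"
# A hidden byte-order mark, a misspelled field, and a missing colon.
bad = "\ufeffUser-agent: *\nDisalow: /cgi-bin/\nDisallow /images/\n"

print(find_problems(good))
for number, message in find_problems(bad):
    print(number, message)
```

The clean file produces an empty list, while the damaged one is flagged on each of its three lines.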
There are several advantages and some disadvantages to having a robots.txt file in your root directory. Under the Standard for Robots Exclusion, robots are expected to read the robots.txt file before indexing anything else on your web site, so it is the default entry point for robots if the file is present. The major search engines follow the standard, and this is the primary reason the file should be there. Beyond that, it can help with your search engine rankings when used correctly, and it can keep dead pages on your web site from being indexed.
The primary disadvantage is that the robots.txt file can be viewed by anyone, including nefarious individuals, so you never want to use it to try to hide sensitive pages or directories on your web site (such as those holding passwords or private information).
For more information about the robots.txt file and a complete list of robots, visit the following web site: http://www.robotstxt.org/wc/robots.html
Sandra Waggett is the founder of MSW Interactive Designs LLC (MSW-ID) and the principal designer of its major products and websites. MSW-ID provides custom website design, hosting, ecommerce and online marketing solutions to nearly 400 small business clients nationwide. MSW-ID helps small business professionals achieve an effective Internet presence.
Prior to founding MSW Interactive Designs LLC, she spent nearly 5 years working as a Senior Engineer for BAE Systems on the Lockheed Martin Mission Systems Team in Colorado Springs, CO.
While with BAE, she was the training lead for the proposal phase of the Integrated Space Command and Control (ISC2) program. In this role, she authored the 10-year training plan for the proposal and developed web-based training prototypes for presentation to the Government decision makers. Sandy earned her Master of Arts degree from the University of CO, Colorado Springs, in Curriculum and Instruction, Corporate Track. Her specialties include web design, interface design, instructional design, and computer-based training development.
Black Hat SEO
A fairly common black hat search engine optimization tactic is to build multiple websites on a general theme. The sites are then cross-linked to other sites in the same network, and also include one-way links to the primary site with varying anchor text. The whole aim is to give one or more sites a huge boost in search engine results pages (SERPs), and to let them benefit from additional traffic flowing from the various network sites.
People using such methods generally create the websites with automated tools and fill them with content scraped from other people's websites. Most of the sites have no purpose other than to drive traffic to the primary site.
White Hat SEO
First it is important to understand a little about linking structure. I am not going to go into excessive details.
It is widely understood that internal linking on any website can account for as much as 50% of the PageRank attributed to any single page within a site. How your pages are linked together, for which terms, and whether links are reciprocated all play an important role in the calculation.
Suppose I told you that there are hundreds of websites on the internet, themselves ranking very highly for multiple terms, that would be willing to create a niche portal within their pages, highly optimized for your website, niche and keywords. It is something you would probably be willing to pay for.
We are not talking about a simple directory site. We are talking about high quality content pages that pass PageRank on to your site, plus a central hub, similar to a home page, that benefits from all the content pages linking to it and that in turn also points directly to your website.
Of course:-
This is all white hat. You will never be penalised for using this tactic by the search engines, and it is permanent! Your traffic hubs will be a permanent fixture. Some of these hubs will disappear, but many more will appear to replace them.
Is this something you would pay for?
You can get this highly powerful promotion of your website for free!
Simply write and submit articles to article directories.
Every day I see questions on multiple marketing forums along the lines of:-
Does article marketing really work?
I submitted an article 2 weeks ago and my search engine results have stayed the same. Why?
When I submit an article, how long until I will see traffic to my website?
Describing exactly how this all works in words is very difficult, but let's look at a very simple math formula.
1x1x1x1x1=1
It is not very impressive, is it?
You have to remember however that an individual article you publish gains incoming links in a number of ways.
So we may be looking at more like...
1.3x1.3x1.3x1.3x1.3=3.71
Some of the numbers, however, are going to be bigger or smaller, depending on the authority of the page linking to the article, the number of links from that page, and so on.
A real formula might well use addition rather than multiplication for many of its terms.
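The arithmetic above can be reproduced in a few lines of Python. This is only a toy illustration of compounding: the factor of 1.3 per link is the made-up figure from the example, combined_effect is a hypothetical name, and none of this is a real ranking formula.

```python
import math

def combined_effect(factors):
    """Multiply the per-link factors together, as in the simple formula."""
    return math.prod(factors)

# Five links that each add nothing.
no_boost = combined_effect([1, 1, 1, 1, 1])

# Five links that each multiply the value by 1.3.
small_boost = combined_effect([1.3] * 5)

print(no_boost)               # 1
print(round(small_boost, 2))  # 3.71
```

Even a modest gain per link compounds quickly once several links are chained together.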
What is important, however, is that each individual article you publish gains PageRank, and so does your author profile.
Let's take some examples.
These are the current top 5 article authors listed at Ezine Articles:-
Lance Winslow 2029 Articles
Jeff Herring 340 Articles
Tim Gorman 306 Articles
John Mussi 303 Articles
Dennis Siluk 286 Articles
Now do a search for any of those author names in Google.
For each of these authors, the results include their Ezine Articles profile within the top 3 positions.
This isn't true for every author. The profile of Willie Crawford, a well-known internet marketer (and popular for good reason), only appears at the bottom of the first page, but he has hundreds of links pointing to his popular websites, and has a baseball player competing for ranking.
Profile pages concentrate and magnify the linking benefit of every article you publish, thus the links from a profile page carry a lot of weight.
Some author bio pages allow a lot of customization. Most allow you to have some text (which can be keyword targeted), along with website links. A few even allow you to set anchor text for every link in your profile.
So, to answer the questions I see every day on various marketing forums:
Short term, it can be a fast route to having a website spidered by search engines.
Medium term, you will gain some exposure within your niche as other sites and ezines publish your content. Many of them don't write about your topic every day.
Long term, it is really up to you. The more quality articles you write, the larger your hubs will become. Large article hubs pick up traffic from a larger variety of search engine queries, and they also make your author bio more prominent, thus magnifying the value of external links placed there.
Andy Beard has worked in Sales, Marketing and Localization for the last 15 years, primarily in the computer games industry. He publishes his articles with the services of Article Marketer.