Duplicate Content - Mysite.com/ vs. Mysite.com/index.html

Saturday, September 20th, 2008

As I wrote in a previous post, duplicate content on your own website can come in the form of “www.mysite.com/” vs. “www.mysite.com/index.html.” The search engines see this same page as two different ones, but with identical content. As I also mentioned, most search engines are smart enough to figure out that these two pages are the same one, but still, they do share .

What to do? That’s easy too. Just open up your .htaccess again and type in the following code:

RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.mysite.com/ [R=301,L]

You can do this with other pages that have the same problem as well.
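
For instance, here is a sketch of the same rule adapted to a site whose default document is index.php. The file name and the www.mysite.com host are assumptions for illustration; swap in whatever your own site uses.

```apache
RewriteEngine On
# Assumption: the duplicate page is /index.php (hypothetical example).
# Match only when the browser literally asked for /index.php, so internal
# lookups of the directory index are left alone and cannot loop.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/
# Send a permanent redirect to the bare directory URL.
RewriteRule ^index\.php$ http://www.mysite.com/ [R=301,L]
```

Note that a request with a query string, such as /index.php?page=2, would not match this condition, because the pattern expects a space and HTTP/ right after the file name.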

How to Set Up a Custom 404 File Not Found Page

Wednesday, September 17th, 2008

Ok, this is a pretty simple thing to do and it has some important benefits.

Have you ever visited a website or a web page only to find that annoying “Not Found” error? If so, what did you do? You probably got ticked off, hit the back button and visited another website. Can you imagine someone coming across a “Not Found” error page on your website? Well, if you don’t have a “Not Found” page set up on your website, that might just be happening.

Here is what you need to do to fix this problem and keep your visitors on your website.

The first thing is to create a web page with some sort of message on it. Something like, “Whoops, looks like the page you are looking for isn’t here. Please click this link to visit our home page or our search page…” You get the idea. You can save the page as “404.” or something similar and upload it to the root of your web server.

Oh, I forgot to mention this. In order to do what I am suggesting here, you need to be running an Apache web server and your web host has to allow changes to your .htaccess file. I am sure there are other ways to create a Not Found error page and get it up and running, but I am only talking about one way here.

Now, open up your .htaccess and place this code into it somewhere. I like to place it right on top:

ErrorDocument 404 /404.

I am using . extensions for this stuff just because of habit and preference. You can use .html or whatever you wish.
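
To make that concrete, here is a sketch that assumes the error page was saved as 404.html in the web root. The file name is only an example; ErrorDocument is the Apache directive that maps a status code to a page.

```apache
# Assumption: the custom page lives at /404.html in the document root.
# Use a local path that starts with a slash. If you give a full URL
# (http://...), Apache redirects the visitor there and the browser no
# longer receives the original 404 status.
ErrorDocument 404 /404.html
```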

Well, that’s basically it. You can now save your .htaccess and upload it to the server and go see if it worked. Try typing in some page that you know isn’t there. If it works, please read my previous post about “How To Check Your Web Page HTTP Headers & Response Codes” for some important information.

Good luck.

Sudden Google Ranking Drop - Proxy Hijack

Tuesday, September 16th, 2008

Do you remember my article from yesterday about the sudden drop in Google search ranking for my friend’s website? Well, I just can’t stop thinking about it.

From what I have been reading, it seems as though my conclusion may be correct. At least I am hoping it is. If I ever conclude anything semi-concrete while thinking about Google, it’s a good day for me.

Ok, I found this very helpful and thorough website that pretty much described the exact problem my friend is having. It’s titled “Google Proxy Hijacking” and tells the whole story.

Here is what struck me as I think about this some more.

- My friend’s website has been live since 2004.
- The site seemed to be in the Google sandbox for the entire 4 years.
- For his most competitive keywords, he was ranking past page 20 on Google.
- About two months ago, he made some changes to the homepage copy as well as an HTML overhaul.
- About a month after that, the site ranked number 3 for his most competitive keywords.
- The site ranked on page 1 of Google for about a month.
- The site now sits at page 25 for its most competitive keywords.

Here is my theory. I think the website has been proxy hijacked for a number of years. This is what caused the poor rankings for such a long time. When the copy and HTML changes were made about 2 months ago, Google visited the site and found it unique. Google ranked the site well, due to this new unique content. Over the following month, Google noticed the proxy website was now a duplicate of my friend’s website once again and dropped the website’s ranking.

Does that make sense? From what I read on the website I linked to, it does.

Here are the similarities between what we are experiencing and what the author wrote on the other website:

- My friend’s website has never been banned.
- We did a quoted Google search for supposedly unique content on my friend’s website and a proxy website showed in the results.
- The proxy URL looked like this: proxysite.com/cgi-bin/pxy/nph-pxy.pl/000010A/http/www.friendssite.com/
- The proxy site was an exact duplicate of my friend’s website.

Now, I am not sure if this is what caused my friend’s ranking to drop, but all the factors are there. The keywords we are talking about are very competitive, but the fact that his site showed so well in the results for a month shows me that the potential is there.

I would appreciate your thoughts on this.

Avoiding Duplicate Content On Your Own Website

Monday, September 15th, 2008

Today has been an interesting day. We have been taking a look at our websites and searching for duplicate content using Copyscape. After today’s findings, we might just go with Copyscape’s premium service.

Now, let me just tell you that duplicate content is everywhere. Actually, someone has probably written this sentence a million times. What we were searching for today was blatant and far-reaching copying. We found a few instances where one of our homepages and its general idea had been taken for someone else’s use, as well as many instances of interior pages being taken. Needless to say, we made screen copies of these cases and sent them to our attorney’s office. These cases are serious and can’t be ignored.

I would like to talk about two things you can do to deal with a more subtle form of duplicate content: the kind that lives on your own website.

The first form of duplicate content on your own website is www vs. non-www. If you go to your browser and type in “www.mysite.com” and then type in “mysite.com,” you may see the same page appear. In the search engine’s eyes, these are two copies of the same page. How do you fix this? It’s easy. Just open up your .htaccess and type in the following code:

RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.mysite\.com
RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=permanent,L]

When someone types in “mysite.com” to visit your website, they will automatically be forwarded to “www.mysite.com.” The search engines will be forwarded as well.
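
If you would rather not hard-code the domain, a variant along these lines may work. It is a sketch that assumes every hostname pointing at the site should gain a www. prefix, which is not true for setups that use other subdomains.

```apache
RewriteEngine On
# Skip requests that carry no Host header at all.
RewriteCond %{HTTP_HOST} .
# Act only when the host does not already start with "www." (case-insensitive).
RewriteCond %{HTTP_HOST} !^www\. [NC]
# Prepend www. to whatever host was requested and keep the path.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
```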

Another form of duplicate content on your own website is “www.mysite.com/” vs. “www.mysite.com/index.html.” The search engines see this same page as two different ones. What to do? That’s easy too. Just open up your .htaccess again and type in the following code:

RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.mysite.com/ [R=301,L]

When someone either types in “www.mysite.com/index.html” or follows a link like that to your website, they will automatically be forwarded to “www.mysite.com.”
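
Put together, the two fixes might sit in a single .htaccess like the sketch below. The ordering is an assumption: handling index.html first means a request for mysite.com/index.html reaches www.mysite.com/ in one redirect instead of two.

```apache
# RewriteEngine only needs to be switched on once per file.
RewriteEngine On

# 1. /index.html -> / (only when index.html was literally requested)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.mysite.com/ [R=301,L]

# 2. Any other host name -> www.mysite.com, keeping the requested path
RewriteCond %{HTTP_HOST} !^www\.mysite\.com [NC]
RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]
```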

Now, here is the disclaimer. I used this on my server setup and it worked. Please check with your own hosting company to see if something similar will work for your website too.
