Archive for the 'Search Engine Optimization' Category
July 31st, 2010 by Joe Majewski
In recent years, the meta tag and link relationship library has exploded. It’s not uncommon to have over a dozen different meta tags on a given page. As a web designer, it’s important to keep yourself up-to-date with these changes or you can quickly fall behind in the SEO (Search Engine Optimization) race.
Googlebot can’t figure out everything for itself, so it appreciates it when you, as a web designer, point it in the right direction by giving it various information through the use of these meta tags and link relations.
One particular tag that I find to be important is the canonical tag, which is still relatively new. It lets you tell Googlebot which URL to treat as the preferred one when a page has many different “views” through the use of URL variables. This is difficult to put into words, but an example should make it clear.
Imagine a page titled rankings.php, which lists the top players in some online game. Perhaps, by default, 50 members are shown per page. Now imagine if there were 125 members in this “game”. Then, rankings.php would have 3 different pages to view, most likely referenced by a $_GET variable passed through the URL. Thus, here are all the different ways that this page can be accessed:
- rankings.php?page=1 would list the first 50 members
- rankings.php?page=2 would list members 51 through 100
- rankings.php?page=3 would list members 101 through 125
- rankings.php would assume that page=1, showing the same content as rankings.php?page=1
There is already a duplicate content issue, as simply viewing the default page shows the same content as you’d see if you added the URL variable ?page=1. Googlebot hates duplicate content.
Now imagine if you also had a sort feature, where you could sort users alphabetically by their username or by their ranking, both with ascending and descending options. I couldn’t even begin to list all of the possible combinations anymore, especially if there were more than just three pages of users, but I’ll do it anyways:
- rankings.php?page=1&sort=username&order=ascending
- rankings.php?page=2&sort=ranking&order=descending
- rankings.php?sort=username
- rankings.php?sort=username&order=ascending
Assuming that the default page ordering was ascending, then examples one, three and four are ALL duplicate content pages.
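To make the duplication concrete, here is a minimal Python sketch that reduces each URL to the view it actually produces. The defaults for page and order come from the example above; the default sort column ("ranking") and the function name effective_view are assumptions made purely for illustration.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumed defaults: page and order follow the example above,
# the default sort column is a guess made for this sketch.
DEFAULTS = {"page": "1", "sort": "ranking", "order": "ascending"}


def effective_view(url):
    """Return the view a URL produces once missing variables fall back to defaults."""
    query = dict(parse_qsl(urlsplit(url).query))
    view = {**DEFAULTS, **query}
    # Sort the items so that the order of the URL variables does not matter.
    return tuple(sorted(view.items()))


urls = [
    "rankings.php?page=1&sort=username&order=ascending",
    "rankings.php?page=2&sort=ranking&order=descending",
    "rankings.php?sort=username",
    "rankings.php?sort=username&order=ascending",
]

# Group the URLs by the view they produce; any group with more than one URL is duplicate content.
groups = {}
for url in urls:
    groups.setdefault(effective_view(url), []).append(url)

for view, members in groups.items():
    print(len(members), members)
```

Under these assumptions, the first, third and fourth URLs land in the same group, while the second URL stands alone.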
Naturally, Googlebot would crawl the site, following all paths of links that it can find, and you would quickly find that hundreds of pages are being indexed just for the rankings page, many of them showing duplicate content. Although not proven to be fact, many believe that this has a negative impact on your PageRank.
If you aren’t familiar with PageRank, or simply don’t know a whole lot about how it works, I highly recommend reading an article that I wrote about Google PageRank.
Here’s where the canonical tag comes into play. By simply placing a single link tag in the head of your source code, you can tell Google to not bother with all of the hundreds of combinations of URL variables that can be created to dynamically generate the rankings list. It looks a little bit like this:
<link rel="canonical" href="http://example.com/rankings.php" />
By simply placing that tag in your source code, you tell Googlebot to concern itself only with that URL for rankings.php. This means that all of the page=x and order=ascending variations won’t have any effect on your site’s indexing or PageRank.
This may not have been the greatest example to use, as many people would disallow Googlebot from indexing a rankings page at all, and even if they did want the rankings page to be indexed, they could add rel="nofollow" to the sort links. Although not foolproof, there are other ways around the issue in some cases.
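If your pages are generated by a script, the tag can be built from whatever URL was requested. The following is a minimal Python sketch of that idea, assuming (as in the example above) that the canonical URL is simply the requested URL with all of its URL variables removed. The function name canonical_link is hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(request_url):
    """Build a canonical link tag by dropping the query string and fragment."""
    parts = urlsplit(request_url)
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # Escape the URL so that it is safe to place inside an HTML attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(canonical, quote=True))


print(canonical_link("http://example.com/rankings.php?page=2&sort=ranking&order=descending"))
```

Every variation of rankings.php then emits the same tag, no matter which URL variables were used to reach it.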
Nonetheless, my intention was to inform you about the “canonical” tag, and not to provide you with the perfect example case for when to use it. If you have any questions about the use of this tag, ask away in the comment section below.
I really, really, appreciate all comments that I receive. To post a comment, you don’t need to register to this site, or leave your name, or even provide your email address; it’s easy as pie and it only takes a few seconds of your time. Also, if you have your own blog, I will comment you back to show my gratitude.
June 12th, 2010 by Joe Majewski
As a webmaster, you should be constantly checking Google to find out if your pages are being properly indexed. If you do a simple Google query for “site:www.joemajewski.com”, the search results will contain all of the pages in the specified domain that are currently indexed. Thus, the site operator can come in very handy for the slaves of SEO.
It takes a lot of time and patience, and maybe even a bit of luck, to get your website indexed exactly how you like it. Having extra pages show up in Google’s search results can cause your website to lose relevance. Therefore, you want to ensure that every single page that Google lists (upon performing the site command with your domain name) is relevant to Internet users in some way or another.
Every month or so I will query Google for a list of pages indexed on my website and go through them, one by one. From start to finish, I will slowly compile a list of pages until I have completed the round. In the end, I can use the list of pages to discover what modifications I need to make to my website to better optimize things for search engine users.
Remember, it may take several months for the changes you make to your website to be completely crawled by Google and translated into better search results.
This is important! Realize that blocking crawler access to a page using the robots.txt file will NOT prevent those pages from being listed on Google’s search results. Placing a disallow line in your robots file will only tell Googlebot not to crawl said page; it will not stop Google from indexing that page. In fact, Googlebot will still transfer PageRank to a blocked page, so you are damaging your website if your attempt to remove an indexed page results in just blocking it from being crawled.
So how do you remove an indexed page from Google? Simple. Place the following meta tag in the head of the page’s source code, and the next time Google crawls your page it will be dropped from the index.
<meta name="robots" content="noindex,follow" />
The “noindex” parameter does exactly what you’d think it does: Google will no longer place that page in its search results. The “follow” parameter tells Google to still use this page for PageRank purposes. Google will still take a look at all the links on the page and have the crawler continue on its path, rather than stopping dead in its tracks. Thus, it’s usually a good idea to keep the follow value in the meta tag as well, unless you truly don’t want Google to have anything to do with that page, which is rarely the case.
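When going through a list of indexed pages, it can help to check which robots directives a page actually carries. Here is a minimal Python sketch using the standard library's HTML parser. The class and function names are hypothetical, and it only reads the generic "robots" meta tag, not crawler-specific ones.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated, e.g. "noindex,follow".
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def robots_directives(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="robots" content="noindex,follow" /></head></html>'
print(sorted(robots_directives(page)))
```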
I should also mention that this meta tag will be completely ignored if you continue to have the disallow command in your robots.txt file. If you blocked Google’s access to that page through your robots file, it will not be able to read the meta tags you set in place, so it’s a good idea to double-check just in case.
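That double-check can be scripted. The sketch below uses Python's standard urllib.robotparser module to test whether a given robots file blocks Googlebot from a URL, which would mean the meta tag never gets read. The robots.txt contents and the URLs are made up for illustration, and the function name is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example robots file (an assumption for this sketch, not taken from a real site).
robots_txt = """\
User-agent: *
Disallow: /private/
"""

for url in ("http://example.com/private/page.html", "http://example.com/rankings.php"):
    if is_crawlable(robots_txt, url):
        print(url, "can be crawled, so a noindex meta tag will be seen")
    else:
        print(url, "is blocked, so a noindex meta tag will be ignored")
```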
April 17th, 2010 by Joe Majewski
Technorati is one of the Internet’s largest search engines for blogs, with over 100 million distinct blogs indexed as of this writing. As a webmaster, you should find yourself regularly promoting your site in a variety of ways.
For example, some of the first few things that I do upon launching a new website include getting indexed with Google and Yahoo, signing up for tools such as Google Analytics and Google Webmaster Tools, and posting in forums with my domain name linked in my signature.
While producing unique content that people find useful is the most important thing that you can do as a webmaster, you must also work equally hard outside of your password-protected Admin panel. Neglecting to keep up with the newest Internet trends will reflect poorly upon your website in the long run. HTML standards are always being modified, and in order to stay in the race it would be to your benefit to understand how the web works.
This entire process can simply be referred to as search engine optimization (SEO); I mention this concept a lot on my blog, as I find it to be the most important set of standards and practices that can and will bring good fortune to a dedicated webmaster.
In my own words, search engine optimization is the effort put forth to enhance your source code and provide web crawlers with the required data to optimize your website’s reach in order to achieve the highest possible ranking in the search engine results pages (SERPs) for any given keyword or phrase.
Google uses its trademarked Googlebot to crawl the web. Although Googlebot is a brilliant piece of work, it is still a bot, and it still follows step-by-step instructions much like every other program written in the past century.
When first learning how to make a website, the idea of sharing information with Googlebot will probably not come to mind for most individuals. Instead, you write some HTML, view the page in a web browser, and make a judgment based on how nice it looks. This is certainly a crude interpretation, but my point is that new web developers tend to think about how their visitors view the site, without realizing that a web crawler’s view of the site may be completely different.
With that said, I have already written a slew of articles on search engine optimization. If you are curious, I strongly urge you to read on, as you may find yourself making a few minor adjustments to your website’s layout that cause your traffic to double. Wouldn’t that be nice?
Today I’m submitting this website to Technorati, one of the world’s largest search engines for blogs. In order to prove that I’m the owner of this blog, I need to verify with the following code: 3VNDRN8GG855
The deed is done. It’s out of my hands now, and into the hands of the Technorati staff. In due time, I may or may not be indexed within their pages. Only time will tell. There are endless things you can do to promote your own blog, and today I took the Technorati approach.
April 3rd, 2010 by Joe Majewski
If your WordPress blog doesn’t screen new comments with a CAPTCHA test, it won’t be long before you begin receiving a steady stream of spam each and every day.
Akismet is a plug-in that comes standard with your WordPress installation, and it does a great job of identifying your spam comments. It’s safe to allow your spam inbox to grow to enormous sizes, as it doesn’t take up very much database space, and you never know if you’ll ever decide to manually search through them for legitimate comments.
Personally, I try to delete new spam as soon as I receive it. There have been occasions where Akismet has wrongly flagged a comment of mine as spam, and it’s important to me that all legitimate comments posted get displayed.
Make sure your WordPress settings force you to personally accept comments before allowing them to appear on your website. I would also recommend that you allow WordPress to automatically accept comments made from IP addresses with previously accepted comments. It’s generally safe to assume that a single IP won’t spam you after posting a legitimate comment. Be aware, however, that many spam posts are designed to look “real”, so don’t let anything fool you.
Spam comments are harmful to your blog because of the links that they attempt to post and the keywords that they throw on your pages. If enough spam comments get permitted, Google may suspect your website of being spam itself, and this could result in your domain name being sandboxed, or a permanent PageRank of 0 for all pages, making it very difficult to receive traffic.
Spam comments oftentimes contain many hyperlinks to other websites, which causes your PageRank to be stolen and passed on to the domains being linked to. This could be harmful for your blog in the long run, and it doesn’t appear professional to your visitors that are reading spam comments below your real content.
By whatever means necessary, keep spam comments off of your blog. You can let Akismet do most of the work, but realize the devastation that spam can cause. Be safe.
March 28th, 2010 by Joe Majewski
What are backlinks, why are they important, and how can I get more? These are just a few of the frequently asked questions about search engine optimization. Unfortunately, most professional webmasters neglect to share their secrets simply because the information is invaluable. Yes, invaluable. Metaphorically speaking, backlinks are Internet gold.
Also known as incoming links, inbound links, inlinks, and inward links, backlinks are the key to success when trying to increase traffic to your website. To put it simply, a backlink is a hyperlink from an external website that points to yours. The more pages on the Internet that link to your website, the better. It may seem like I’m stating the obvious, but there’s more to it than the simple fact that more links to your site means more people clicking those links. There is a lot more value in a single backlink than you might think.
Take a look at my article on Google PageRank if you aren’t already familiar with its role in determining search results. In short, PageRank (PR) is a number that gets assigned to each page on the Internet to represent its importance. When Google calculates this value, it finds all incoming links and takes a “share” of the PR from those pages and gives it to the page linked to.
Take this, for example:
[Diagram: page A links to pages B and C; pages B and C each link back to page A.]
This is a small-scale depiction of how PageRank is obtained. As you can see, page A contains links to and from pages B and C, while pages B and C simply link back to A. The first thing that I’d like to point out is that page A is the big winner, as it has links coming from two sources. Although page A does lose some of its PR when linking to the other pages B and C, this is outweighed by the links coming right back.
Let’s hypothetically assume that each page on the Internet begins with a single point of PageRank, and that 85% of its PR is given away to pages that get linked to. I didn’t just pull those numbers out of thin air; they are the accepted values used by most webmasters for calculating PR. Proceeding, we can calculate the PageRank of page A by using this formula:
PageRank(A) = .15 + .85 (PR from backlinks)
Looking back at the diagram, we can see that pages B and C each contain only a single link which points back to A. Using the formula above, we can see that .85 PR is given to A from both of the other pages, as the entire 85% is transferred over. Thus, page A will have a PageRank of 1.85 (.15 + .85 from B + .85 from C). Pages B and C both result in a PR of .575 (.15 + .425 from A). The reason that page A gives .425 instead of .85 is because page A has two outbound links, meaning that the PR transferred over has to be shared between all pages being linked to.
Page A: 1.85
Page B: .575
Page C: .575
PageRank cannot be calculated in a single iteration, however. Clearly, page A is the victor, but if those same computations were executed again, the numbers would be slightly different. The second time around, page A would be equal to .15 + .85 (.575 + .575), or 1.1275. After these calculations are performed dozens of times, the values become more and more accurate.
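The repeated calculation is easy to try out for yourself. Here is a minimal Python sketch of the three-page example, assuming that every page starts at 1 and that each pass computes all of the new values from the previous pass's values, which is how the numbers above were worked out. The function names are hypothetical, and this is a simplified illustration rather than Google's actual implementation.

```python
DAMPING = 0.85

# Outbound links from the example: A links to B and C, while B and C each link back to A.
LINKS = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}


def pagerank_step(ranks, links=LINKS, damping=DAMPING):
    """Run one pass, computing every new value from the previous pass's values."""
    new_ranks = {}
    for page in links:
        # Each linking page splits its PR evenly between its outbound links.
        incoming = sum(
            ranks[source] / len(targets)
            for source, targets in links.items()
            if page in targets
        )
        new_ranks[page] = (1 - damping) + damping * incoming
    return new_ranks


def pagerank(links=LINKS, iterations=100, damping=DAMPING):
    """Start every page at 1 and repeat the pass a fixed number of times."""
    ranks = {page: 1.0 for page in links}
    for _ in range(iterations):
        ranks = pagerank_step(ranks, links, damping)
    return ranks


first = pagerank_step({"A": 1.0, "B": 1.0, "C": 1.0})
second = pagerank_step(first)
print("first pass:", first)
print("second pass, page A:", second["A"])
print("after many passes:", pagerank())
```

The first pass reproduces the 1.85 and .575 figures, and the second pass gives 1.1275 for page A. After enough passes the values stop changing noticeably, settling at roughly 1.46 for page A and 0.77 each for pages B and C.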
What I’m trying to say is this: backlinks are important. If someone links to this very blog post, my entire website would benefit as a result, because my entire website is filled with links that circulate throughout all of my pages (kind of like the way all websites operate). The PR that this page would gain as a result of being given a backlink would spread quickly throughout the rest of the site upon performing a dozen or so iterations of PR calculation. The home page, for example, is linked to at the top of every page, so it will naturally have the highest PageRank.
There still remains one unanswered question: how do you get more backlinks? For the most part, providing useful and quality content to your readers is the best way to attempt to generate backlinks. You could also put a link to your site in your signature on a forum that you actively post on.
If you are having issues with gaining quality backlinks, then perhaps you could trim some of the outbound links from your website. Linking to other websites will lower your PageRank, as seen in the example above. That said, many people, including myself, believe that it is not a good idea to remove all outbound links from your website. Google’s web crawler doesn’t like running into dead ends while searching the Internet, and if you remove all outbound links, then it will certainly be stuck within your domain and you may be penalized.
Understanding the importance of quality backlinks is essential if you intend on optimizing your website for search engines. If you are relatively new to SEO, then I highly recommend that you read more information on the topic. You may surprise yourself with how effective it can be to make slight changes to the overall design of your website.