Building Multi-Million $ Web Sites from Scratch (Part 5 of …)

Code Architecture

When building a large web site from scratch, it’s critical that you have the right code architecture. A weak code architecture can torpedo your entire plan, or can leave you in a position where you can’t scale your site. Given the type of content we have been talking about in this series, you may need tens of thousands of web pages, or even hundreds of thousands.

You want to be able to implement this in a way where each page is unique and different to avoid duplicate content problems, yet the cost to maintain these pages is low. You also want to make sure you implement a code structure that is very clean from a search engine perspective, so let’s deal with this second issue first.

Clean Code

You want the unique content of your page to show up immediately below the BODY tag. If your BODY tag is at line 30, you want a DIV tag for the main section of your content at line 31, and then your H1 tag at line 32. By unique content, we mean the stuff that shows up only on that page, beginning with an H1 tag that labels the page. Unique content does not include standard navigation or menus, any JavaScript or Flash, standard footers, and so on.

Then you must use absolute positioning in your CSS file. The CSS for this will look something like this:

#main {
    /* Place the content block visually, independent of its order in the HTML source */
    position: absolute;
    left: 100px;
    top: 100px;
    padding-left: 10px;
    padding-top: 10px;
}

where “main” is the id of the DIV tag that holds the content this rule applies to.

The key then is to organize all the sections of your page using DIV tags, and to give each of these DIV tags an absolute positioning definition in the layout of your web page. Once you have done this, you can easily move the main body of your unique content right under your BODY tag. You can read more about how to use CSS for SEO in this article.
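To make the source-order idea concrete, here is a minimal sketch of a page builder that always emits the unique content first. It is only an illustration, not our production code: the function name build_page, the stylesheet name site.css, and the ids “nav” and “footer” are assumptions made up for this example (only “main” comes from the CSS above).

```python
from html import escape


def build_page(title, body_html, nav_html, footer_html):
    """Return an HTML page whose unique content comes first in the source.

    The ids "nav" and "footer" are hypothetical; each one would need its own
    absolute positioning rule in the stylesheet, just like "main".
    """
    return "\n".join([
        "<html>",
        "<head>",
        f"<title>{escape(title)}</title>",
        '<link rel="stylesheet" href="site.css">',  # hypothetical file name
        "</head>",
        "<body>",
        '<div id="main">',                 # unique content right after BODY
        f"<h1>{escape(title)}</h1>",
        body_html,
        "</div>",
        f'<div id="nav">{nav_html}</div>',        # shared navigation comes later
        f'<div id="footer">{footer_html}</div>',  # shared footer comes last
        "</body>",
        "</html>",
    ])


page = build_page(
    "Blue Widgets",
    "<p>Unique text about blue widgets.</p>",
    '<a href="/">Home</a>',
    "Standard footer",
)
print(page)
```

The navigation can still appear at the top of the page for the visitor, because the CSS decides where each DIV is drawn; the HTML source order only decides what comes first in the file.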

A last note on this. Some technical people may raise concerns that absolute positioning statements are not interpreted the same way by all browsers, and that there is a risk that it will not look right in some browsers. There is some truth to this concern, but our testing shows that IE, Firefox, Mozilla, and Netscape all have no problem with it.

And we have been told by senior Google people that this type of code architecture will probably lead to higher rankings for your site. So it seems to me that if you double your site traffic, but 1% of your visitors have trouble with the layout in their browser, you are still way ahead of the game. Get over it and take the doubled traffic!

Scalable Code

So the next big issue is having highly scalable code. Assuming that you have assembled a ton of content, you need to get this content into a database of some sort. We typically use MySQL, but any database you are comfortable with is fine, provided that your hosting company can set it up for you, or you can get it set up yourself on your web server. Of course, it’s critical that the database be filled with unique and interesting content.

Next you need to think about your code as something that dynamically generates and renders static pages. We are talking about a web application here. You can, of course, render the pages offline, and then push them live. Alternatively, you can do this on the fly as the requests from users come into the web server.

This is the most important part of the architecture! You want the maintenance of your site to be focused on developing content and updating the database. The better you get at this, the easier it is to grow massive, content rich, web sites.
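Here is a rough sketch of the offline rendering approach, written in Python rather than the Perl we use, and using an in-memory SQLite database as a stand-in for MySQL so that it runs on its own. The table name pages, its columns, the template, and the function names are all hypothetical; a real site would have a much richer schema and layout.

```python
import sqlite3
import tempfile
from html import escape
from pathlib import Path
from string import Template

# Hypothetical template; the unique content sits right after the BODY tag.
PAGE = Template(
    "<html><head><title>$title</title></head>\n"
    "<body>\n"
    '<div id="main">\n'
    "<h1>$title</h1>\n"
    "<p>$body</p>\n"
    "</div>\n"
    "</body></html>\n"
)


def render_row(slug, title, body):
    """Render one database row to a (filename, html) pair."""
    html = PAGE.substitute(title=escape(title), body=escape(body))
    return f"{slug}.html", html


def render_site(conn, out_dir):
    """Write one static page per row of the hypothetical `pages` table."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    query = "SELECT slug, title, body FROM pages ORDER BY slug"
    for slug, title, body in conn.execute(query):
        name, html = render_row(slug, title, body)
        (out_dir / name).write_text(html, encoding="utf-8")
        written.append(name)
    return written


# sqlite3 in memory stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (slug TEXT PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [
        ("blue-widgets", "Blue Widgets", "Unique text about blue widgets."),
        ("red-widgets", "Red Widgets", "Unique text about red & crimson widgets."),
    ],
)
out_dir = tempfile.mkdtemp()
written = render_site(conn, out_dir)
print(written)
```

Adding a page then means adding a row and running render_site again; no code changes are needed. To render on the fly instead, the same render_row function could be called from whatever handles the incoming request.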

Another key element is to pick a programming language that you are good at (or can get good at) that makes these sorts of manipulations easy (well, easier). Regardless of what you pick, you are going to be using the language for things that it may not have been intended for. We use Perl here, but there are certainly other valid choices.

It’s hard to provide more details, because those additional details are dependent on the unique content you have pulled together and the desired presentation. The concept we have outlined above is not easy. We invested many man-years in developing a flexible, yet industrial strength, code base that could meet our needs.

But once you have it, it becomes an engine that can be used for launching large sites as quickly as you can manage to assemble unique data sets. Unfortunately, building these large databases of content is not easy either. But we never said it would be easy. But here is what it is – it’s reliable and it works. That’s a good thing to know if you are interested in making millions of dollars.

Next up

  1. How to get links
  2. How to monitor results, and what to do about it

Already Published Articles in the Series

  1. Picking a Market and Content Strategy
  2. Using PPC to Enhance your Organic Traffic Strategy
  3. Site Hierarchy and Keyword Selection
  4. Content Development


Comments

  1. Are you saying that search engines give the most weight to the first text they encounter in the body section? While this makes sense, you may, even with CSS, end up doing some pretty strange things to put the content right behind the body tag. Most people put the social web links (del.icio.us and others) at the top of their page where they are easily seen. This would make links with the URL and the title of the content the first thing that the search engine sees, followed by the content. From a usability standpoint it makes sense to do this. But what I think you are saying is that from an SEO standpoint you would be better off putting those links at the bottom of the page. Correct?

    Is there any magic to calling your content division “main”? I notice that on your Stone Temple site your content division is called “content”…

  2. stonecold says:

    You can have the del.icio.us and digg tags appear, from a user perspective, right at the top of the page, but still have the key text content of the page show up immediately after the BODY tag. The key here is to use CSS absolute positioning to determine where content elements appear visually, regardless of the order in which they appear in the HTML source file.

    You can read more about this here: http://www.stonetemple.com/articles/css-and-seo.shtml
