Dealing with Stolen Content

One of the wonderful things about the web is that most of the world’s information is accessible online. Better still, a large portion of the world has access to all of that information. Search engines play a huge role in making it easy to sift through that information and find the stuff you are looking for. Problems arise, however, when less scrupulous people want to publish content and decide that the best way to get it is to steal it. Unfortunately, the web makes stealing your content quite easy, and enforcing your rights somewhat difficult.

Assessing the Consequences

One of the first things you need to do is assess the level of damage to you. Getting your stolen content removed from someone else’s web site requires a fair amount of work, and you should only pursue it if you are likely to be impacted at some level.

In general, the search engines are pretty good at recognizing the original author of a piece of content. However, the search engines can make mistakes. For example, if you just launched a new blog that has little visibility on the web, and your article is stolen by the New York Times, the search engines may well treat the better-known site as the original source. That said, this is pretty unlikely to occur. And, if a prominent site steals your content, it is usually quite easy to address, as we will explain below.

The first thing I would do when worrying about content theft is take several different unique strings from the content and search on each one within double quotes. For example, if your content included the phrase: “The slow gray fox tripped over the startled dog”, you can search on “slow gray fox tripped over” (including the quotes) in Google and Bing. If your article comes up first, that is a good sign that the search engines know that you are the authoritative source for that content.

Try this with several phrases to make sure that you are OK. One key tip – avoid picking phrases that include punctuation, such as commas, hyphens, and quote characters. These seem to work less well for these types of searches. Once your testing is done, if you show up first for everything you need to consider whether you are suffering any damages.
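If you have a lot of articles to check, picking the test phrases can be scripted. Here is a minimal sketch of that idea. The function name `pickQuotedQueries` and its parameters are made up for illustration, and it is stricter than the tip above: it skips any phrase containing a character that is not a letter, digit, or space, which also rules out periods and apostrophes.

```javascript
// Hypothetical helper: pick short phrases from an article to use as
// exact-match (double-quoted) search queries.
function pickQuotedQueries(text, wordsPerPhrase = 5, maxQueries = 3) {
  // Assumption: treat anything other than letters, digits, and whitespace
  // as punctuation, and skip phrases that contain it.
  const hasPunctuation = /[^\p{L}\p{N}\s]/u;
  const words = text.split(/\s+/).filter(Boolean);
  const queries = [];

  // Walk the text in non-overlapping windows of wordsPerPhrase words.
  for (
    let i = 0;
    i + wordsPerPhrase <= words.length && queries.length < maxQueries;
    i += wordsPerPhrase
  ) {
    const phrase = words.slice(i, i + wordsPerPhrase).join(" ");
    if (hasPunctuation.test(phrase)) continue;
    // Wrap the phrase in double quotes so the search is an exact match.
    queries.push(`"${phrase}"`);
  }
  return queries;
}
```

Paste each returned string, quotes included, into Google and Bing and see whether your own page comes up first.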

Another component to consider is whether or not the stolen article contains links back to your site. If it does, the search engines are pretty good at unravelling this type of theft, and knowing that you are the original author. Chances are that you passed the quoted strings test above if this is the case.

Making Content Harder to Steal

There are some things that you can do to make your content harder to effectively steal, or to lower the consequences of the damage if it is stolen. Here are four ways to do that:

  1. Use relative links for images, i.e., something that looks like “/images/yourimage.gif”, instead of “http://www.yourdomain.com/images/yourimage.gif”. The reason for doing this is that it will force the thief to copy all of the images in your content over to their web servers, or to modify the links to absolute links.
  2. For the same reasons, use relative links when referring to your CSS files, or any JavaScript you have on the site. Note that if you use third party tools such as Google Analytics, you will need to use absolute links to refer to those. Just make sure you use relative links for any JavaScript you have developed for the site and which is hosted on your web server.
  3. Use absolute links when linking to other pages on your site, i.e., “http://www.yourdomain.com/page1.html” instead of “/page1.html”. The reason for doing this is that it ensures that you get links back to your site, unless the thief goes through all the content and modifies those links to make them relative links.
  4. For fun, you can also create a custom piece of JavaScript that recognizes what domain it is on, and if it runs and finds it is not on your web servers, it publishes a big bold image that says “STOLEN CONTENT” on the stolen pages.

The general idea here is to make your content more work to steal than someone else’s content. Few publishers will take all the steps outlined above, and therefore those other people will represent easier targets for thieves than you.
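The domain-checking script from the last item in the list can be sketched roughly as follows. This is only an illustration under some assumptions: the host names in `HOME_HOSTS` are placeholders for your own, the function names are invented, and it writes a text banner rather than the image described above to keep the example short.

```javascript
// Hypothetical: the host names you actually publish on.
const HOME_HOSTS = ["www.yourdomain.com", "yourdomain.com"];

// Returns true when the page is being served from a host that is not yours.
function isOffHomeDomain(hostname, homeHosts = HOME_HOSTS) {
  return !homeHosts.includes(hostname.toLowerCase());
}

// Adds a large notice to the top of the page when it runs somewhere else.
// Returns true if the notice was added.
function flagIfStolen(doc, hostname) {
  if (!isOffHomeDomain(hostname)) return false;
  const banner = doc.createElement("div");
  banner.textContent = "STOLEN CONTENT";
  banner.style.cssText =
    "font: bold 48px sans-serif; color: red; text-align: center;";
  doc.body.prepend(banner);
  return true;
}

// Only run automatically where a DOM exists (i.e., in a browser).
if (typeof window !== "undefined" && typeof document !== "undefined") {
  flagIfStolen(document, window.location.hostname);
}
```

Keep this in a script file that you reference with a relative link, as recommended above, and add any staging or local host names you use to the list so you do not flag your own test pages. A thief who notices the script can simply remove it, so treat it as a deterrent rather than protection.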

Taking Action

Of course, there are times when it is worth taking action. I recommend a three step process when doing this:

  1. Contact the site owner. Use whatever means they provide for doing so, tell them where the offending content is, and tell them they need to take it down, or you will take action. Even though you are angry, there is no need to be nasty about it. Focus on your goal, which is to get it taken down. However, do be very clear that you intend to pursue this further.
  2. If that does not work, the next step is to contact the hosting company for the web site. You can often get this information from their WhoIs records at the registrar, but if it is not there, try using a third party service such as Who is Hosting This. The reason for contacting the hosting company is that they can be held liable for the content theft if you have notified them and they do not act on it. They may be more motivated to avoid the liability than your thief.
  3. The third step is to file a DMCA request with the search engines. Here is the Google DMCA form and the Bing DMCA instructions. The beauty of this is that the search engines also have an obligation to respond. Do not do this lightly! Do it only if you are in fact the original author. If you used a contractor to write that article, do some due diligence to make sure that they did give you original content.

This three step process should address most issues. Since it will take a lot of time and effort, do make sure you evaluate whether or not it is worth it. If there are no real damages to you, then it probably is not worth taking action, unless someone is copying your whole site, or otherwise extensively stealing from you.

Open for Questions!

One of the big challenges with having a blog while running multiple businesses is coming up with article topics. So today, I am going to formally open this blog up to requested topics. No matter how basic or complicated the question might be, as long as it is about online marketing, I will consider taking it on and writing about it. Want to know about meta tags? I will write about it. Want to know my thoughts on the future of search, or the impact of a big announcement? I will write about that too.

I can’t promise that I will write about every suggested topic, but I will publish responses to suggested topics 1 to 2 times per week. So bring it on, and let me know what you would like me to address, and I will do my best to accommodate!

Heading Tags, Keywords & Enticing Content

I received a request on Twitter (I am @stonetemple) from @nsandlin for an article on how to balance on page content between enticing content for users and keyword rich content for search engines. It’s a great question, and I am going to run with that, but expand it a bit to include my philosophy on how search engines evaluate on page tags. My short answer on how to set this balance is to default to users first, but as you might expect, there are some subtleties to this!

How search engines look at a page

I believe we need to look past artificial perceptions about how search engines look at a given page. For example, whether you use an <h1> tag or a <h2>, <h3>, or whatever on a page does not matter. What does matter is the relative nature of what you do. For example, if you have lots of basic text on a page, then headers such as these will stand out:

<p><strong>Your Section Header Here</strong></p>

This will stand out just as much as an <h1> tag on that same page. So in my view, you don’t need to use heading tags at all. In contrast, imagine if the <h1> was evaluated as the defining element of your page, then it might be a good idea to put all of the content on your page inside an <h1>, wouldn’t it? Of course that does not make sense.

The better way to think about it is that the HTML markup elements you use on your page are a way to communicate to users what your page is about, and as long as you use basic HTML markup the search engines will see those signals and weight the words/phrases in them accordingly.

If you have multiple types of heading tags, or use my <strong> tag type heading above with other more traditional heading tags, then the search engines will consider the relative weight of these. These tags will cause extra weight to be allocated to the words embedded in them.

Ultimately, think of it this way: your page has a score, which we can call a certain amount of equity points, that is defined by its link profile and other external signals. These equity points can be allocated to the content of a page, and your use of heading tags of various types, and other HTML tags, helps the search engine see how you weight the various aspects of the page. Note that I am making this up and I have no proof as to exactly how the search engines look at this, but I believe that the basic concept I am outlining is 100% correct. The bottom line is that how you use heading tags does not add equity points to the page; it simply allocates the points you already have. I.e., it’s all relative, so spend your points wisely!
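To make the “it’s all relative” point concrete, here is a toy calculation. It is purely illustrative: the function `allocateEquity`, the point total, and the weights are all invented for this example, and nothing here reflects how any search engine actually scores a page.

```javascript
// Toy model of the "equity points" idea: a fixed budget is split across
// page elements in proportion to their relative emphasis.
// All numbers and names are hypothetical.
function allocateEquity(totalPoints, elements) {
  const totalWeight = elements.reduce((sum, el) => sum + el.weight, 0);
  return elements.map((el) => ({
    text: el.text,
    // Each element gets its share of the fixed total, never more.
    points: totalWeight === 0 ? 0 : (totalPoints * el.weight) / totalWeight,
  }));
}

// One emphasized heading plus body text.
const oneHeading = allocateEquity(100, [
  { text: "Olympus cameras", weight: 3 },
  { text: "body copy", weight: 1 },
]);

// Adding a second, equally emphasized heading does not raise the total;
// it only dilutes what the first heading receives.
const twoHeadings = allocateEquity(100, [
  { text: "Olympus cameras", weight: 3 },
  { text: "Camera accessories", weight: 3 },
  { text: "body copy", weight: 1 },
]);
```

In the first case the heading receives 75 of the 100 points. In the second, each heading receives about 43, and the page total is still 100.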

Balance of Keyword Rich and End User Enticing Headings

There are many scenarios where this does not need to be a choice. The great thing about keyword tools is that they provide real insight into the user mindset for many common web interactions. For example, if you use the Adwords Keyword Tool, and you search on digital cameras using exact match mode, and then you sort on search volume, you will see that “Olympus” comes out on top with 368,000 searches per month, with “canon cameras” second at 246,000.

This type of data tells you something about how people search when they are looking for a digital camera – there is a strong tendency for people to search on brand names first, rather than entering a generic search query like “digital cameras” (which still comes in at a healthy 165,000 searches per month). This is an example of the type of insight you can get from these tools. This also maps into what users will look for when they look at your web page.
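If you export keyword data and want to repeat this kind of analysis, the sort-and-compare step is easy to script. The sketch below uses only the three figures quoted above. The `BRANDS` list and the function names are assumptions added for illustration, not part of any keyword tool.

```javascript
// Figures quoted in the article (searches per month, exact match).
const keywordRows = [
  { phrase: "digital cameras", monthlySearches: 165000 },
  { phrase: "canon cameras", monthlySearches: 246000 },
  { phrase: "Olympus", monthlySearches: 368000 },
];

// Hypothetical brand list, used only to label each phrase.
const BRANDS = ["olympus", "canon"];

// Sort a copy of the rows from highest to lowest search volume.
function rankByVolume(rows) {
  return [...rows].sort((a, b) => b.monthlySearches - a.monthlySearches);
}

// Mark each phrase as brand-led if it contains a known brand name.
function labelBrand(rows, brands = BRANDS) {
  return rows.map((row) => ({
    ...row,
    isBrand: brands.some((brand) => row.phrase.toLowerCase().includes(brand)),
  }));
}

const ranked = labelBrand(rankByVolume(keywordRows));
```

With these three rows, the two brand-led phrases sort ahead of the generic one, which is the pattern described above.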

Clearly if they search on a phrase like “Olympus camera” they will respond well to that phrase being present on the page. But what if they come from another website via a link, or as direct traffic? It turns out that the keyword mindset may still work out well for you for that type of visitor.

My examples so far keyed on the search for a digital camera, but what if you are writing a news article of some sort? That can be a different story, particularly if your site is a news or blog site with lots of subscribers. The person’s mind may not be on a product. Instead, they may just be browsing looking for things that interest them. What will catch their attention may be something like: “16 ways that green tea can save your life”.

The distinction is that in the digital camera scenario we were dealing with people who are shoppers. They are looking to acquire something, either now, or in the near future. In the news scenario we are dealing with browsers. Their mindset is quite different. They don’t know what they are looking for yet, so the hook to get them engaged needs to be different. In deciding how to approach things on your site, the first step is to decide what type of user is going to be on the page in question. Are they browsers or shoppers?

For shopping oriented pages the triggers for the user are easily discovered using a keyword tool. For a news oriented page, the orientation is probably different. You need to think beyond the keyword tool to hook the user. On some sites you can have both types of pages, and you need to treat them differently. That’s perfectly OK. Notice that, the way I have outlined this, I am not trying to use the blog as the direct means for bringing people in for conversion purposes. Of course, you can use blogs differently, in which case you would need to adjust. However, my preference is to use blogs to attract users and links and to build relationships with the community, and to have other pages on the site that are keyword focused.

One other factor that I want to mention is the impact on potential linkers. Links, and social media sites, play a large role in driving rankings for your site. Whatever you do, don’t do things that alienate that audience. They will be your judge and jury. Give them a great experience, and they will reward you with links or positive social media mentions.

Summary

My philosophy is to lean in the direction of the user the great majority of the time. As noted, keep in mind that keyword tools often provide valuable insight into the web population’s way of thinking about things. In other scenarios, such as the news or blog scenario, the mindset may be quite different, and you need to treat those pages differently. Do the search engines recognize all of these signals perfectly? No, they don’t, but they are definitely trying to get there.