Reviewing My Website With Google's Webmaster Guidelines Part 6 - Schema.org

50th day of the epic 100 posts in 100 days challenge

Web pages can contain a lot of information. A lot of that information can be useful. But how do the search engines know what's what? Who's the author of the article (if there are many names in the text)? When was it written? Is it about an event? Is it describing a film or a book?

Humans can tell this information from the context, but search engines find it much more difficult.

There's a lot of information on the page, but it is unstructured. It has structure in terms of layout, but nothing that says what the content actually means. This is what people are referring to when they talk about the semantic web.

The semantic web is a movement to get webmasters and website owners to add hints to their web pages that tell the search engines what all the stuff actually means.

That's the reason behind Schema.org. It has defined markup to go in the underlying web page code that tells the search engines what's what.

It has definitions that cover creative work, events, organizations, people, places, products and more.

Now, usually, this is something I add on my clients' sites but something I have neglected on this site. So let's go ahead and add some markup to make this a search engine friendly website.

I followed the instructions on http://www.schema.org/docs/gs.html and started adding the markup to my page templates.

For this post I concentrate on Thing > CreativeWork > Article > BlogPosting. The markup you use will depend on the type of content and the information you want to convey on your web page. To find the data type you need, think of a word or phrase that describes your content, then go to the schema.org website and use the search field to find an appropriate type.

This involved adding markup in the same way that we described yesterday for the Open Graph protocol for Facebook. This time, though, the markup goes in the <body> of the page rather than the <head>, so it is closer to the information it is describing.
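
To make the difference concrete, here is a minimal sketch of where each kind of markup sits. The page title and content are made up for illustration, and og:title is just one example of an Open Graph property:

```html
<!DOCTYPE html>
<html>
<head>
  <title>Example post</title>
  <!-- Open Graph: metadata lives in the head, away from the visible content -->
  <meta property="og:title" content="Example post">
</head>
<body>
  <!-- Schema.org microdata: attributes are added to the visible content itself -->
  <div itemscope itemtype="http://schema.org/BlogPosting">
    <h1 itemprop="name">Example post</h1>
  </div>
</body>
</html>
```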

You first define the area of HTML that you are telling the search engines about like this:

<div class="articleBody" itemscope itemtype="http://schema.org/BlogPosting">
    .... 
</div>

Then you tell it about the individual items, like this:

<div class="articleBody" itemscope itemtype="http://schema.org/BlogPosting">
       <h1 class="title" itemprop="name">[ [*pagetitle] ]</h1>
       <meta itemprop="author"  content="[ [*publishedby:userinfo=`fullname`] ]"> 
       <meta itemprop="about"  content="[ [+article_tags] ]">  
       <meta itemprop="description"  content="[ [*introtext] ]">
       <div class="journalEntry" itemprop="articleBody">
          [ [*content] ]
       </div>
       ..................
</div>

The tags in square brackets are replaced with the appropriate values by your content management system (in this case MODX). Your web developer should be able to set this up for you.
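
Once the CMS has filled in the placeholders, the HTML sent to the browser might look something like the sketch below. Every value here (title, author, tags, description, body text) is hypothetical and only there to show the shape of the output:

```html
<!-- Hypothetical rendered output: all values are invented for illustration -->
<div class="articleBody" itemscope itemtype="http://schema.org/BlogPosting">
  <h1 class="title" itemprop="name">Example post title</h1>
  <meta itemprop="author" content="Example Author">
  <meta itemprop="about" content="example tag, another tag">
  <meta itemprop="description" content="A one-sentence summary of the example post.">
  <div class="journalEntry" itemprop="articleBody">
    <p>The body of the post goes here.</p>
  </div>
</div>
```

The meta elements are invisible to visitors, so they suit values that are not already displayed on the page. Where the text is already visible, as with the title, the itemprop attribute can go straight on the element that shows it.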

Next we can test whether the markup is correct by going to Webmaster Tools > Your Site > Other resources > Structured Data Testing Tool.

So that code is now added to about 60 of my existing articles, and any subsequent articles.

Everything seems to check out OK except for a few errors:

Error: Page contains property "comment" which is not part of the schema.
Error: Page contains property "comment" which is not part of the schema.
Error: Missing required field "dtstart".
Error: Missing required field "name"

These errors relate to the way I marked up the comments template for the blog. I removed the comment property and added name, but the missing dtstart field was still reported as an error, which was a bit of a mystery... hmmm...

After a bit of to-ing and fro-ing I eventually got it to validate without any errors. The mistake I made was trying to wrap the comments with the item type:

itemtype="http://schema.org/UserComments"

This is wrong because UserComments comes under events: Thing > Event > UserInteraction > UserComments. That is presumably why the testing tool was asking for an event-style field like dtstart. In my defence, I found UserComments through an expected type link on the Thing > CreativeWork > Article > BlogPosting page. What I needed was http://schema.org/Comment
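
As a rough sketch, a single comment wrapped with the corrected item type might look like this. The class name, the choice of the author and text properties (both of which Comment inherits from CreativeWork) and the values are assumptions for illustration, not a copy of my actual comments template:

```html
<!-- A sketch only: class name, chosen properties and values are hypothetical -->
<div class="comment" itemscope itemtype="http://schema.org/Comment">
  <!-- author and text are properties that Comment inherits from CreativeWork -->
  <span itemprop="author">Example Commenter</span>
  <p itemprop="text">Thanks, this helped me mark up my own blog.</p>
</div>
```

Because Comment is a kind of CreativeWork rather than a kind of Event, there is no start date for a testing tool to go looking for.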

OK, so all my existing blog posts and new blog posts will be marked up nicely with this microdata format that the search engines approve of. Let's see if I get any search engine brownie points.

You can see the underlying code for this page (or any page on the website) by hitting F12 (in IE, Chrome or Firefox) or right-clicking the page and selecting "view page source".

What web design did I do today?

I did this:

7 hours 37 minutes Client websites: updating blogs, ironing out some caching problems affecting menu highlighting, checking search engine rankings, javascripting fun, registering sites with Google and Bing, and communicating with clients
24 minutes Admin: email and social media

Total: 8 hours 1 minute

Nothing out of the ordinary to report, just another day in the home office.

Exercise: Half an hour trimming hedges, 1 hour yoga.

Tomorrow: A day off cycling in the Yorkshire Dales.

Author: Mike Nuttall

Mike has been web designing, programming and building web applications in Leeds for many years. He founded Onsitenow in 2009 and has been helping clients turn business ideas into on-line reality ever since. Mike can be followed on Twitter and has a profile on Google+.