Providing Context Clues with Structured Data – Moving Beyond Org Schema

Aug 24th, 2020

While structured data implementation is still far from universal, we are noticing that more brands are managing the bare minimum: org and website schema. This is welcome, but there is a long way to go before these sites are doing all they can.

At this point, I’ll admit, anyone who has taken a look at our own site will be within their rights to cry ‘physician, heal thyself’. To offer a defence of sorts, our development time is spent on clients’ sites rather than our own. With this mea culpa out of the way, let me explain what I mean by moving beyond org schema.

We’ve covered schema several times, from making your content more machine-readable to building entities for EAT purposes, but I’ll cover some of that ground again to build a bit of a foundation.

What is schema markup?

Schema.org came into being in 2011 as the result of a collaboration between Yahoo, Bing and Google. The site seeks to unify the language webmasters use to provide metadata on pages, so that it can be easily read by search engine spiders and parsers. Schema markup is how we refer to the code that provides this metadata.

If structured data is (to extend the metaphor) the scaffolding that allows for better understanding of information, then markup is the individual scaffolding poles. With hundreds of possible markup types, the aim is to create a machine-readable internet or, in the words of the creator of the World Wide Web, Tim Berners-Lee, a semantic web:

I have a dream for the Web [in which computers] become capable of analysing all the data on the Web – the content, links, and transactions between people and computers. A “Semantic Web”, which makes this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The “intelligent agents” people have touted for ages will finally materialize.

How this relates to search

If we take the Berners-Lee quote, the intelligent agents he spoke of are best represented by the various crawlers and ranking algorithms used by search engines. While not intelligent in a way most people would recognise, they certainly cross the threshold of ‘smart’. These agents have grown increasingly complex over the last thirty years and are now at the stage where they are (as I’ve discussed elsewhere) either already making calls on rankings all by themselves or soon will be.

The construction of the semantic web, therefore, is of increasingly immediate importance. Information retrieval and entity extraction are core parts of any SERP, from the rich results that feature alongside the blue links to the judgements made about the quality and trustworthiness of individual pieces of content.

Why we’re not doing enough

For the most part, the way we’re approaching schema is in blocks, and I see this almost constantly as I navigate the web (because I’m the type that absolutely will check out your site’s fundamentals before buying a t-shirt or packet of screws). While the percentage of sites lacking schema entirely has dropped considerably over the last three or four years, the ones that do have it tend to have a few basic types and little else.

These types tend to be:

Organization
WebSite
CreativeWork

This is what I’m referring to as blocks – they are communicating meta-information (information about the information, or data about data) about the pages, but they lack any real connectivity to the emerging semantic web as a whole. Richard Wallis, in his talk at the 2019 Benchmark Search and Digital Marketing Conference (back in the before times), used a jigsaw metaphor to communicate the building of connectivity for pages – and I have since, ruthlessly and without conscience, stolen it. It’s a great way to think about it, and the talk itself is well worth watching.

These individual ‘@types’ are jigsaw pieces with no joining sides: they offer a description of your data, but no contextualisation of it. This is true even for one of the other more common types, the ‘Product’ type. While this type is fairly widely adopted in eCommerce, it is often implemented with too few properties or external mentions and, again, this isolates the structured data from its broader spot in the semantic web.
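As a rough sketch of what giving the pieces ‘joining sides’ can look like, the Python below builds a single JSON-LD graph in which the nodes point at one another by ‘@id’ rather than sitting in separate blocks. The domain, names and ‘@id’ values are all hypothetical, and the small helper that lists unreferenced nodes is my own illustration rather than part of any library or of schema.org itself.

```python
from __future__ import annotations

import json

# Hypothetical site and identifiers, used purely for illustration.
SITE = "https://www.example.com"
ORG_ID = f"{SITE}/#organization"
WEBSITE_ID = f"{SITE}/#website"
ARTICLE_ID = f"{SITE}/blog/example-post/#article"


def build_connected_graph() -> dict:
    """Return a JSON-LD document whose nodes reference one another by @id."""
    organization = {
        "@type": "Organization",
        "@id": ORG_ID,
        "name": "Example Ltd",
        "url": SITE,
    }
    website = {
        "@type": "WebSite",
        "@id": WEBSITE_ID,
        "name": "Example",
        "url": SITE,
        # A joining side: the site points at the organisation node.
        "publisher": {"@id": ORG_ID},
    }
    article = {
        "@type": "Article",
        "@id": ARTICLE_ID,
        "headline": "An example post",
        # Two more joining sides: the article belongs to the site
        # and is published by the organisation.
        "isPartOf": {"@id": WEBSITE_ID},
        "publisher": {"@id": ORG_ID},
    }
    return {
        "@context": "https://schema.org",
        "@graph": [organization, website, article],
    }


def unreferenced_nodes(doc: dict) -> list[str]:
    """List the @ids of nodes that no other node in the graph points to."""
    nodes = doc["@graph"]
    referenced = set()
    for node in nodes:
        for value in node.values():
            # Only {"@id": ...} references count as a link to another node.
            if isinstance(value, dict) and "@id" in value:
                referenced.add(value["@id"])
    return [node["@id"] for node in nodes if node["@id"] not in referenced]


if __name__ == "__main__":
    print(json.dumps(build_connected_graph(), indent=2))
```

Here only the article is left without an inbound reference, which is what you would expect from the page-level node. If the organisation or website node showed up in that list, it would be one of the isolated blocks described above.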

Why is this important in the age of EAT?

While not every industry needs to be concerned with EAT, those that do could do worse than making structured data a priority in the coming months. This is especially true of start-ups and brands that are chasing larger competitors. While website authority is determined in large part by inbound linking (quality of links equalling distance from select seed sites), there are other measures of authority that smaller brands can appeal to, and this is where entity detection comes in.

Using structured data, you can build expert entities and express your authority in other ways while your link profile is built naturally over time. This entails using properties like ‘employee’ to define specific experts your brand employs, and linking them to ‘Person’ @types that detail their relevant industry memberships and qualifications.
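To make this a little more concrete, here is a minimal sketch in Python that builds an ‘Organization’ with one expert attached through ‘employee’, and describes that ‘Person’ with ‘memberOf’ and ‘hasCredential’. Every name, URL and credential below is a placeholder I have made up for illustration, and the helper that wraps the output in a script tag is simply one assumed way of getting it onto a page.

```python
import json


def build_expert_markup() -> dict:
    """Build Organization markup with one employee described as a Person."""
    expert = {
        "@type": "Person",
        "name": "Jane Doe",  # hypothetical expert
        "jobTitle": "Head of Technical SEO",
        # Industry membership (placeholder organisation).
        "memberOf": {
            "@type": "Organization",
            "name": "Example Industry Association",
        },
        # Qualification (placeholder credential).
        "hasCredential": {
            "@type": "EducationalOccupationalCredential",
            "name": "Example Professional Certificate",
        },
        # External profile that helps tie the person to a known entity.
        "sameAs": ["https://www.example.com/team/jane-doe"],
    }
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": "Example Ltd",  # hypothetical brand
        "url": "https://www.example.com",
        "employee": [expert],
    }


def to_script_tag(data: dict) -> str:
    """Serialise the markup as a JSON-LD script element for a page's HTML."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    print(to_script_tag(build_expert_markup()))
```

Which properties are worth including will depend on the industry; the point of the sketch is only that the expert becomes a described entity linked to the brand, rather than a name in a byline.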

By building your brand’s expert entities and contextualising them within the overall industry ontology, you can demonstrate your authority to the various ‘intelligent agents’ that are trying to parse and rank the terabytes of information that are added to the web every day.

What we can do to improve

The short answer is to give your structured data more connective sections. You can do this by ensuring that you’re using as many properties as are relevant to each of your @types, but also by using properties such as ‘mentions’, which allows you to pick out points of reference in the content and join them to the entity in question.

Take this brief markup of some of the entities referred to in my previous article on this site:

As a property of the ‘Article’ type, this small section of contextual metadata would not only connect the data to the overall knowledge graph for each of these individual entities, but also disambiguate words such as ‘leads’ and ‘search’ to place them in the greater industry ontology.
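For anyone who wants to try this, the sketch below shows one way an ‘Article’ might carry a ‘mentions’ list, written as Python that produces the JSON-LD. It is an illustration only: the headline is a placeholder, and the two entities and their Wikipedia ‘sameAs’ links are my own assumed picks for disambiguating ‘leads’ and ‘search’.

```python
import json


def build_article_with_mentions() -> dict:
    """Build Article markup whose mentions point at external entity pages."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": "An example article headline",  # placeholder
        "mentions": [
            {
                "@type": "Thing",
                # Disambiguates 'leads' as sales leads, not the verb.
                "name": "Lead generation",
                "sameAs": "https://en.wikipedia.org/wiki/Lead_generation",
            },
            {
                "@type": "Thing",
                # Disambiguates 'search' as the marketing discipline.
                "name": "Search engine optimization",
                "sameAs": "https://en.wikipedia.org/wiki/Search_engine_optimization",
            },
        ],
    }


def mentioned_entities(article: dict) -> list:
    """Return (name, sameAs) pairs for every entity the article mentions."""
    return [(m["name"], m["sameAs"]) for m in article.get("mentions", [])]


if __name__ == "__main__":
    article = build_article_with_mentions()
    print(json.dumps(article, indent=2))
    for name, url in mentioned_entities(article):
        print(f"{name} -> {url}")
```

Each ‘sameAs’ link is what gives the piece a joining side: it ties an otherwise ambiguous word in the copy to a specific, externally described entity.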

Overall, we come back again to the jigsaw metaphor (thank you, Richard Wallis): the more pieces of the semantic web that your information can interlock with, the better the various algorithms can understand and judge your content. The way the industry approaches schema at the moment is both literally and figuratively disjointed, but things are progressing. In the meantime, it’s a way for you to potentially steal a march on your competition, as ML algorithms are likely to require far more structured data than those with set parameters.

It’s not just about having schema; it’s about connecting schema.
