Historically, the SEO community was unsure how Google could understand the content of a page, and therefore its context (and thus rank it effectively in organic search), when that content is rendered by the client rather than the server. Google cleared this up by acknowledging a step between crawling and indexing, in which it attempts to render the content of the site.
To understand the scale of crawling and rendering the web, keep in mind that today’s web has around 130 trillion documents. Google’s crawler is made up of thousands of machines running constantly, but even so its processing power is finite, meaning that:
Google will crawl a page, fetch the server-side rendered content, and run initial indexing on that document. Rendering JS powered websites, where content is loaded by the client, requires a lot of resources, so it is deferred until those resources are available, perhaps several days later.
Google will then perform a second wave of indexing on the client-side rendered content. If your site relies heavily on client-side JS for rendering, this is where indexing can go wrong: with Googlebot indexing in two waves, some details can be missed.
If the website is a single page progressive web app (PWA), all unique URLs will share a base template of resources, which are then filled in with content via AJAX or fetch requests.
John Mueller mentioned the need to ask questions in this situation, such as “Did the initial server-side render of the webpage have correct canonical tags?” If canonical tags are rendered only by the client, Google will miss them completely, because the second wave doesn’t check for the canonical tag at all.
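As a rough way to ask that question of your own pages, the sketch below looks for a canonical tag in the raw HTML returned by the server, before any JS has run. It only uses Python's standard library; the class and function names and the example.com URLs are illustrative assumptions, not part of any Google tooling.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects href values of <link rel="canonical"> tags found in raw HTML."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, or be missing entirely
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(raw_html: str) -> list:
    """Return canonical URLs present in the HTML as delivered (no JS executed)."""
    finder = CanonicalFinder()
    finder.feed(raw_html)
    return finder.canonicals


# Hypothetical responses: one with the tag in the server HTML, one JS-only shell.
server_html = (
    '<html><head><link rel="canonical" href="https://example.com/shoes">'
    '</head><body><div id="app"></div></body></html>'
)
js_only_html = (
    '<html><head><script src="/app.js"></script></head>'
    '<body><div id="app"></div></body></html>'
)

print(find_canonicals(server_html))
print(find_canonicals(js_only_html))
```

An empty result for a page that is supposed to have a canonical tag suggests the tag is being injected by the client, which is the case described above.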
The important takeaway from this example is that this is a real issue that affects indexing, and a key consideration for how search engines understand the content of a webpage.
Alongside client-side and server-side rendering is hybrid rendering, whereby pre-rendered HTML is sent from the server for display and, once the user interacts with the page, JS content is added on top. In this situation, the search engine will only render and index the pre-rendered HTML content.
How can indexing issues for JS powered websites be overcome?
Recognising this as a legitimate issue for your JS powered website is the first step. Once recognised, Google’s recommendation involves adding a new step to your server infrastructure to act as a dynamic renderer. This will effectively serve client-side rendered content to users, and send a pre-rendered version of the content to search engines. More information on how this might be achieved can be found here.
Google was quick to state, however, that this is not a requirement for JS powered sites to be indexed.
While dynamic rendering sounds like, and likely is, a lot of work, a JS powered website may want to implement it if the site is particularly large, changes rapidly and requires quick indexing, since such sites will not want to wait a few days for the second wave of rendering and indexing. Google also pointed out the need to weigh the time and effort of this implementation against the potential benefits.
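The core decision a dynamic renderer makes can be sketched in a few lines. This is a minimal illustration, assuming crawlers are identified by their user-agent string; the token list is a non-exhaustive assumption, and `prerender` is a hypothetical callable standing in for whatever produces the static HTML (for example a headless browser service).

```python
import re

# Assumed, non-exhaustive list of crawler tokens; extend to suit your traffic.
BOT_PATTERN = re.compile(
    r"googlebot|bingbot|duckduckbot|baiduspider|yandex", re.IGNORECASE
)


def is_search_bot(user_agent):
    """Return True if the user-agent string looks like a known crawler."""
    return bool(BOT_PATTERN.search(user_agent or ""))


def choose_response(user_agent, client_shell, prerender):
    """Serve pre-rendered HTML to crawlers and the JS app shell to users.

    `prerender` is a hypothetical zero-argument callable returning static HTML.
    """
    if is_search_bot(user_agent):
        return prerender()
    return client_shell


googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36"

shell = '<div id="app"></div><script src="/app.js"></script>'
print(choose_response(googlebot_ua, shell, lambda: "<h1>Pre-rendered page</h1>"))
print(choose_response(browser_ua, shell, lambda: "<h1>Pre-rendered page</h1>"))
```

Both versions should carry the same content; the only difference is where the rendering work happens.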
How can I see the page in the same way Google renders it?
Use Google Search Console for this. Navigate to Fetch as Google, click on the path, and you’ll see the downloaded HTTP response.
To check how the content is actually rendered (visually), use Google’s Mobile Friendly Test. This will show you what was created after rendering with the mobile Googlebot. If the page is not rendering as expected, you can navigate to ‘page loading issues’ within the Mobile Friendly Test, and this will show all resources that Googlebot was blocked from loading.
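One common reason for a blocked resource is a robots.txt rule. As a local approximation only, the sketch below uses Python's standard `urllib.robotparser` to list which resource URLs a given robots.txt would disallow for Googlebot. The robots.txt content and URLs are made-up examples, and Python's parser does not necessarily match every rule the way Google does, so treat the result as a hint rather than a verdict.

```python
from urllib.robotparser import RobotFileParser


def blocked_resources(robots_txt, resource_urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in resource_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that blocks the directory holding the JS bundle.
robots_txt = """\
User-agent: *
Disallow: /static/js/
"""

resources = [
    "https://example.com/static/js/app.js",
    "https://example.com/static/css/site.css",
]

print(blocked_resources(robots_txt, resources))
```

If the JS bundle that builds the page shows up in this list, the rendered result is likely to be missing content.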
Slow and inefficient pages will render inconsistently. In general, if the page works in the Mobile Friendly Test, it’ll work for search indexing too.
In order to check how the desktop version of your webpage is rendered, use Google’s Rich Results Test.
Notes on indexing images and Click to Load patterns
The presenters mentioned that lazy loaded images will only sometimes be indexable, depending on how they are implemented. If you want to ensure that Googlebot can crawl and index lazy loaded images, JSON-LD structured data should be employed to refer to the images. Images referred to solely by CSS will not be indexed.
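A minimal sketch of what that structured data might look like is shown below. It builds a JSON-LD script tag listing a page's images using the schema.org `image` property; the choice of the `WebPage` type, the function name and the URLs are assumptions for illustration, and the right type depends on what the page actually is.

```python
import json


def image_json_ld(page_url, image_urls):
    """Build a JSON-LD <script> tag that lists the images used on a page."""
    data = {
        "@context": "https://schema.org",
        "@type": "WebPage",  # assumed type; pick the one matching your page
        "url": page_url,
        "image": list(image_urls),
    }
    return '<script type="application/ld+json">' + json.dumps(data) + "</script>"


tag = image_json_ld(
    "https://example.com/gallery",
    ["https://example.com/img/a.jpg", "https://example.com/img/b.jpg"],
)
print(tag)
```

The tag belongs in the server-rendered HTML, so the image URLs are visible even when the images themselves are lazy loaded.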
Finally, ‘click to load’ patterns, such as infinite scroll and tabs that load more content, will not be seen by Googlebot by default when crawling and rendering. This content should be preloaded or, alternatively, served from separate URLs.
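The separate URLs option can be pictured as splitting one endlessly scrolling list into crawlable pages. The sketch below is one possible way to do that, assuming a `?page=N` query parameter as the URL scheme; the function name and the scheme are illustrative choices, not something Google prescribes.

```python
from math import ceil


def paginate(items, page_size, base_url):
    """Map each page URL to the slice of items that page should display."""
    total_pages = max(1, ceil(len(items) / page_size))
    pages = {}
    for page in range(1, total_pages + 1):
        start = (page - 1) * page_size
        pages[f"{base_url}?page={page}"] = items[start:start + page_size]
    return pages


articles = ["post-1", "post-2", "post-3", "post-4", "post-5"]
for url, chunk in paginate(articles, 2, "https://example.com/blog").items():
    print(url, chunk)
```

Each URL can then be linked and rendered on its own, while users still get the infinite scroll experience on top.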