Search engines love it when you make their jobs easier by requiring less time and bandwidth to understand your site. When it comes down to it, you can save yourself a lot of headaches by talking your developers out of using a JS framework when the application doesn’t require one. A recent study found that Bing and Yahoo are not able to render JS frameworks on their own like Google is, and even Google doesn’t handle them as well as server-side frameworks. But warnings aside, if you end up losing this battle and must ultimately go down the JS route, here are some tips to help you make the most of a JS framework.
Crawling JS sites:
Even though Google can “understand your JS,” you still need to follow traditional on-page SEO practices to ensure the best results. A lot of frameworks load page content and meta tags, such as titles and descriptions, through JS calls. In the source code, that looks like this:
If you look at the source code, you are just going to see the same JS call on every page, which means your traditional crawler is going to see it, too (try using "inspect element" instead of "view source" to find out what happens after JS executes). So how do you audit your site for weak, missing, and duplicate metadata? Enter Screaming Frog 6.2. For those of you who aren’t familiar with Screaming Frog, it’s one of the most widely used SEO crawlers in the industry. For those of you who are familiar with Screaming Frog, did you know you can now configure it to render client-side JS? To my knowledge it’s the only mainstream crawler to do so (I’m looking at you, DeepCrawl). This will allow you to see what the page looks like AFTER JS is executed, and ultimately what the search engine and the user are going to see. This is what you will need to audit.
To do this, simply fire up the latest version of Screaming Frog, then select:
You can leave the timeout at 5 seconds in most cases.
Use HTTP header directives instead of meta tags in the <head>
Shout out to Mike King for this one. A lot of people don’t realize they can use HTTP header canonicals and X-Robots-Tag directives to control indexation, as opposed to the traditional <link rel="canonical" href="http://www.index-me.com"> and <meta name="robots" content="noindex">. The advantages are twofold: accuracy and crawl budget optimization. Because the HTTP headers are the first thing a search crawler reads, the crawler sees these directives without having to download AND render your wall of scripts. Side note: this can be a great practice for any site that is struggling with crawl budget waste.
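To make the header approach concrete, here is a minimal sketch of the two response headers involved. The function name `indexation_headers` is made up for illustration; how you actually attach headers depends on your server or framework, which this example does not assume.

```python
def indexation_headers(canonical_url=None, noindex=False):
    """Build HTTP response headers that carry canonical and robots directives.

    Hypothetical helper for illustration; wire the result into whatever
    your server or framework uses to set response headers.
    """
    headers = {}
    if canonical_url:
        # Header equivalent of <link rel="canonical" href="...">
        headers["Link"] = f'<{canonical_url}>; rel="canonical"'
    if noindex:
        # Header equivalent of <meta name="robots" content="noindex">
        headers["X-Robots-Tag"] = "noindex"
    return headers


if __name__ == "__main__":
    for name, value in indexation_headers("http://www.index-me.com/").items():
        print(f"{name}: {value}")
```

Because both directives travel with the response itself, a crawler can read them before it parses or renders any of the page's scripts.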
Don’t block scripts
This one sounds like a no-brainer, but it often gets overlooked. If you are using a lot of scripts to render your page, you have to let Google access them. As recently as 2014, it was a best practice to block Google from crawling your resources, as search engines traditionally just crawled the DOM. The easiest way to check for this is by using the Fetch as Google tool in Google Search Console. Enter your URL and Google will show you what your page looks like to them vs. the user.
Both images should look pretty similar, but they don’t need to be exactly the same. You will want to look out for any important items such as headings, body copy and links that aren’t showing up properly. If you scroll to the bottom of the results page, you will see a list of resources that Google is not allowed to access. If you do any paid media advertising, you might see a bunch of conversion tracking scripts here from ad servers like DoubleClick, Conversant, or Marin. If you see any scripts that are on your domain or CDN, you should check your robots.txt file and get these opened up to Google.
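If you want a quick offline check before opening Search Console, a rough sketch like the one below can flag script URLs that a robots.txt file disallows. It uses Python's standard `urllib.robotparser`; the sample rules and URLs are invented for illustration, and the standard-library parser may not interpret wildcard patterns the way Google does, so treat the result as a first pass rather than the final word.

```python
from urllib.robotparser import RobotFileParser


def blocked_resources(robots_txt, resource_urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in resource_urls if not parser.can_fetch(user_agent, url)]


# Invented sample data: a robots.txt that blocks the script directory.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /assets/js/
"""

SAMPLE_URLS = [
    "http://www.index-me.com/assets/js/app.js",
    "http://www.index-me.com/assets/css/site.css",
]

if __name__ == "__main__":
    for url in blocked_resources(SAMPLE_ROBOTS, SAMPLE_URLS):
        print("Blocked:", url)
```

Any script on your own domain or CDN that shows up in this list is a candidate for opening up in robots.txt.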
Does EVERYTHING have to be in JS?
This is never a fun conversation with your development team: “So you know that JS framework that you guys love so much because it’s super fast? It’s killing our organic traffic. Can we make some elements HTML?” Protip: Bring snacks and/or beer to this meeting. In an ideal world, you would like the following items to be included in the DOM, in order of priority:
1. Title Tag
2. Meta Description
4. Body Copy
6. Canonical (if you can’t get it done in the http header)
7. Meta Robots (if you can’t get it done in the http header)
The first two, title tag and meta description, should be pretty easy; they aren’t going to slow the site down since they are just 2 lines of code, and they will help with your indexation and ranking. The rest of the list might be more of a stretch, but if it is possible to serve the content in the DOM and use JS to manipulate/place it in the proper spot in the page, then that will be a huge win for your SEO efforts.
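One way to verify what actually made it into the raw HTML (what "view source" shows, before any JS runs) is to parse the server response and look for the first two items on the list. The sketch below uses Python's built-in `html.parser`; the class and function names are hypothetical, and it only checks the title tag and meta description.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collect the <title> text and meta description from raw HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def missing_from_source(html):
    """List which of the two basic tags are absent or empty in the raw HTML."""
    audit = HeadAudit()
    audit.feed(html)
    missing = []
    if not audit.title.strip():
        missing.append("title")
    if not (audit.description or "").strip():
        missing.append("meta description")
    return missing


# Invented example of a bare JS app shell with nothing in the <head>.
JS_SHELL = '<html><head><script src="/app.js"></script></head><body><div id="app"></div></body></html>'

if __name__ == "__main__":
    print(missing_from_source(JS_SHELL))
```

Run it against the unrendered response for a handful of templates; anything it reports as missing is only reaching crawlers that execute your JS.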
Pre-render your site and serve Google an HTML cache
With this option, you can use a pre-render solution such as prerender.io to serve Google a cached HTML version of the JS page the users are seeing. I know what you are thinking: this sounds a lot like cloaking. Relax, it’s Google approved.
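The general idea can be sketched as a user-agent check that decides which version of the page to return. This is an illustrative assumption about how such a setup might look, not how prerender.io is implemented; the crawler token list and both function names are hypothetical.

```python
# Assumed, deliberately short list of crawler user-agent tokens (lowercase).
CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp")


def wants_prerendered(user_agent):
    """Return True when the user-agent string looks like a known search crawler."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def choose_response(user_agent, prerendered_html, app_shell_html):
    """Serve cached HTML to crawlers and the normal JS app shell to everyone else."""
    return prerendered_html if wants_prerendered(user_agent) else app_shell_html


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(wants_prerendered(bot))
```

The cached HTML should be a snapshot of the same page users get after JS runs; that equivalence is what keeps this from being cloaking.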
I’m not going to go any further into pre-rendering, but if you're interested you can learn more from the Pre-render master, Richard Baxter.