Launching an SEO Program in Real Time, Part 3: How to Focus Your Technical SEO Resources for Maximum Value

In the Third Post in an Occasional Series, Ryan Johnson Shares Ricoh USA's Strategy for Revamping Its SEO

February 4, 2018


This guest post was contributed by Ryan Johnson, who is the SEO Program Manager at Ricoh USA, Inc.

In previous LinkedIn Marketing Solutions Blog entries, I described Ricoh USA’s plan to build an SEO strategy from the ground up and our approach to optimizing content for search. Now, as we move into the third quarter of this strategy, I’d like to focus on the technical issues any tech team should address to make the biggest impact on site rankings.

While keyword research and optimization are still very important factors in SEO rankings, there are technical SEO elements that must be in place to improve the chances of success. Most of us are operating with a limited tech budget; focusing on these three areas will provide the most noticeable results:

1. Site Speed

This is the most important factor in a well-optimized site. Over the past year, Ricoh has focused on optimizing the site for non-branded, competitive keywords. However, even with well-optimized content and a healthy backlink profile, it is still a challenge for Ricoh USA to break onto page one of search results.

Organic traffic in 2017 climbed steadily, rebounding from a 2016 crash that preceded the launch of our SEO program. However, this increased traffic has not translated into the predicted boost in non-branded keyword rankings and share of voice.

When looking at where the site is still falling short, site speed emerges as the primary culprit. Our site currently ranks in the bottom nine percent of all websites tested according to Pingdom’s site speed test.

In 2010, Google announced that site speed would be a factor in search rankings (and this has since been extended to mobile as well). Website speed is just one of roughly 200 factors used in search rankings, but it can have an outsized effect.

One reason is user experience. In Google’s eyes, a slow website does not provide the best result because it makes for a bad customer experience. A slow site leads to more bounces, with 53% of mobile users abandoning sites that take longer than three seconds to load, which can result in decreased conversions.

Slow page speed also affects rankings and indexing because search engines can crawl fewer pages using a website’s allotted crawl budget. For Ricoh, this can negatively affect the number of pages that search engines can read and index on our site. So, as we create newly optimized pages here at Ricoh, it takes longer for a search engine to index, understand and rank all of them because our pages are slow and overloaded with code.

As site speed improves, we should see an increase in “pages crawled per day” and a decrease in “time spent downloading a page” in Google Search Console.

2. Crawl Management

As mentioned above, a website only gets a certain allotment of crawl from search engine spiders, meaning Google will only crawl a certain number of pages on your site per visit.

There appears to be a strong correlation between the number of backlinks to a website and the amount of crawl that site receives each month. Therefore, it is important to manage the limited amount of crawl available, especially for sites with fewer backlinks.

It is important to designate high-value pages and take action to ensure that search engines visit them often. These may be key product pages, pages that convert well or pages that are updated frequently.

We were missing several critical crawl management strategies, which we have started to correct:

Updating the robots.txt file. This file tells search engines which pages you do and do not want crawled and indexed on your website. Last year, ricoh-usa.com’s robots.txt file contained two significant problems.

The first problem was in the lines that say
User-agent: Googlebot-Image
Disallow: /


This was telling Google not to crawl and index our site’s images. It doesn’t make any sense for a company to exclude images from search. All this does is remove one more potential avenue for people to discover your site.

The other issue was in the lines that say
User-agent: *
Allow: /
Disallow: /App_Config*


This sequence says that we want all crawlers to crawl our site except for pages in the App_Config folder. This leaves way too much of our site open to crawl, potentially wasting crawl allotment.
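
A corrected skeleton might look something like the sketch below (this is illustrative, not our actual production file). The Googlebot-Image block is removed entirely, and the blanket Allow: / is dropped, since allowing everything is already the default behavior in robots.txt:

User-agent: *
Disallow: /App_Config*

What actually conserves crawl is adding further Disallow lines for low-value sections, which is covered below under blocking unnecessary pages.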

Improving internal linking. A well-maintained site structure makes content easily discoverable by search bots without wasting crawl budget. We were missing a lot of opportunities to link between pages, which meant it was harder for search engines to discover important content.

Internal link anchor text is used by crawlers to determine the topic of target pages, so they can help improve keyword rankings. Internal linking also helps Google’s search crawlers understand the layout of a site and the page relationships within it. An internal linking strategy can be used to establish a hierarchy of category pages, main pages, and subpages.

We have started adding more internal linking to product pages and thought leadership content, potentially boosting crawl of high-value pages and increasing understanding of our site.
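
As a simple illustration (the URL and anchor text here are hypothetical, not actual Ricoh pages), a descriptive internal link passes keyword context to the target page:

<a href="/office-printers/multifunction">Compare multifunction office printers</a>

Anchor text like “click here” or “learn more” on the same link would tell crawlers nothing about what the target page covers.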

Blocking unnecessary pages from crawl. The “disallow” section of robots.txt should be used to remove unimportant pages and site elements from crawl and index. We really don’t need search engines reading our privacy policy, terms of use or blog category pages. Disallowing some elements is helping us avoid wasting crawl on unimportant content.
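
As a rough sketch (the paths are illustrative rather than our actual directives), the kinds of rules being added under the existing User-agent: * group look like this:

Disallow: /privacy-policy
Disallow: /terms-of-use
Disallow: /blog/category/

Each line removes a low-value section from crawl so that more of the allotment lands on product pages and thought leadership content.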

Cleaning up the sitemap. XML sitemaps help web crawlers by making content easier to find. We are embracing the importance of keeping our sitemap up-to-date and removing clutter like unnecessary redirects, non-canonical pages and blocked pages.

Among other issues, Ricoh’s sitemap had miswritten URLs that were creating the appearance of duplicate content on the site. Cleaning up the sitemap and adding clear hierarchical relationships between pages will improve our site indexing and contribute to a better understanding of our site by search engines moving forward.
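
For reference, each sitemap entry should list the single, canonical version of a URL; a well-formed entry looks roughly like this (the URL and date are placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.ricoh-usa.com/example-product-page</loc>
    <lastmod>2018-01-15</lastmod>
  </url>
</urlset>

Entries that redirect, point to non-canonical variants or are blocked in robots.txt only send crawlers to pages we do not want indexed anyway, so they come out.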

3. Canonical URLs

It is important to make sure each page is recorded and indexed as a single page rather than as multiple pages. We realized that many of our pages were being seen as multiple duplicate pages due to faulty self-referencing canonical URLs.

Search engines index URLs, not pages. Therefore, if a web page has multiple different URLs leading to it, a search crawler may see those as multiple duplicate pages.

These multiple URL versions can be caused by errors in link construction, URL variations caused by page filtering, or a crawler picking up both http:// and https:// versions of a URL. For example, the URL equitrac-express/_/R-S-EQ5EENU5-5-PS1 was seen as having two duplicates: Equitrac-Express/_/R-S-EQ5EENU5-5-PS1 (where someone probably wrote this link somewhere on the site using capital letters) and nuance-sup-reg-sup-equitrac-express-sup-reg-sup/_/R-S-EQ5EENU5-5-PS1 (where the “registered trademark” symbol was being pulled into the URL).

A properly functioning canonical URL would tell search engines that no matter what path someone took to get to this page, there is only one true URL for this page.
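
As a generic illustration (using a placeholder domain rather than an actual Ricoh URL), if the one true URL for a page is https://www.example.com/products/widget, then every variant of that page should carry the same canonical tag in its <head>:

<link rel="canonical" href="https://www.example.com/products/widget" />

With a tag like this working properly, the capitalized and trademark-symbol variants described above would all consolidate into the single canonical URL.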

Run any site through a website spider like Screaming Frog (free for up to 500 pages; unlimited crawling costs $199/year) to see all of a site’s canonicals and the pages that are reading as duplicates.

How Do We Know If This Is Working?

There are several signals to watch moving forward to determine our success.

1. Average pages crawled. We should start to see an increase in our average pages crawled in Google Search Console, which is a good indicator of site speed and readability.

2. Site speed test. There are many site speed tests available, such as YSlow, Google PageSpeed Insights and the Pingdom Site Speed Test. Site speed should be checked every quarter or so.

3. Pages indexed. There are two ways to verify this. One is through Search Console, which provides insight into the number of pages indexed by Google. As site crawl improves, this should increase.

Another method to check indexing is to enter site: followed by the site address into Google search (in this case, site:https://ricoh-usa.com). This will show all of the pages Google has indexed for a particular website.

I once ran this test for a non-Ricoh client and found that they had only one page on their site that was indexing, so they had spent the last decade or so running an essentially invisible website.

4. Internal link metrics. Internal link reports show us where our internal links point across the site. If we want to accentuate certain pages, we can use this data to change the internal linking structure.

5. Duplicate content. If a crawler picks up duplicate meta content (page titles and meta descriptions), this can be a sign that canonical URLs are not working correctly. We have been able to diagnose many issues with our sitemap and canonical URLs by digging into duplicate meta content warnings in a Screaming Frog crawl.

Content Can Only Do So Much

A site that hits all of the right keywords and has a healthy backlink profile will still only enjoy moderate success if these technical issues are not managed.

A search engine can only show the pages that it can crawl and understand, and the primary goal of SEO is to make a search engine’s job easy. Adding these technical elements to Ricoh’s SEO strategy will pay far-reaching dividends in terms of content indexing and organic traffic, and should be a priority in any brand’s search plan.

For more insight into how marketers are making digital tools work for them, subscribe today to the LinkedIn Marketing Blog.
