The Marketing Agency Helping the Little Guys

Over the last few years, the number of digital marketing agencies has continued to increase, … WebCroppers specializes in local digital marketing.

SearchCap: Google link warning, in-store sales & more SEM

Below is what happened in search today, as reported on Search Engine Land and from other places across the web.


Evidence of the Surprising State of JavaScript Indexing

Posted by willcritchlow

Back when I started in this industry, it was standard advice to tell our clients that the search engines couldn’t execute JavaScript (JS), and anything that relied on JS would be effectively invisible and never appear in the index. Over the years, that has changed gradually, from early work-arounds (such as the horrible escaped fragment approach my colleague Rob wrote about back in 2010) to the actual execution of JS in the indexing pipeline that we see today, at least at Google.

In this article, I want to explore some things we’ve seen about JS indexing behavior in the wild and in controlled tests and share some tentative conclusions I’ve drawn about how it must be working.

A brief introduction to JS indexing

At its most basic, the idea behind JavaScript-enabled indexing is to get closer to the search engine seeing the page as the user sees it. Most users browse with JavaScript enabled, and many sites either fail without it or are severely limited. While traditional indexing considers just the raw HTML source received from the server, users typically see a page rendered based on the DOM (Document Object Model) which can be modified by JavaScript running in their web browser. JS-enabled indexing considers all content in the rendered DOM, not just that which appears in the raw HTML.
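
To make the raw-HTML-versus-rendered-DOM distinction concrete, here's a minimal sketch (the element ID and the text are invented for illustration):

```javascript
// The server's raw HTML contains only an empty container:
//   <div id="description"></div>
// Traditional indexing sees nothing there. JS-enabled indexing sees the
// text below, because it exists in the rendered DOM after this runs.
document.addEventListener('DOMContentLoaded', function () {
  document.getElementById('description').textContent =
    'Hand-made widgets, shipped worldwide.';
});
```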

There are some complexities even in this basic definition (answers in brackets as I understand them; a code sketch illustrating these cases follows the list):

  • What about JavaScript that requests additional content from the server? (This will generally be included, subject to timeout limits)
  • What about JavaScript that executes some time after the page loads? (This will generally only be indexed up to some time limit, possibly in the region of 5 seconds)
  • What about JavaScript that executes on some user interaction such as scrolling or clicking? (This will generally not be included)
  • What about JavaScript in external files rather than in-line? (This will generally be included, as long as those external files are not blocked from the robot — though see the caveat in experiments below)
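
Here's a hypothetical page script illustrating those cases (all endpoints, IDs, and timings are made up; the roughly-5-second figure comes from the experiments discussed below):

```javascript
// Case 1: content fetched from the server after load. Generally indexed,
// subject to timeout limits.
fetch('/api/more-content')
  .then(function (res) { return res.text(); })
  .then(function (html) {
    document.getElementById('extra').innerHTML = html;
  });

// Case 2: content added some time after load. Indexed only if it appears
// before the execution cutoff (possibly around 5 seconds).
setTimeout(function () {
  document.getElementById('late').textContent = 'Likely indexed';
}, 3000);
setTimeout(function () {
  document.getElementById('too-late').textContent = 'Likely NOT indexed';
}, 30000);

// Case 3: content revealed only on user interaction. Generally not
// indexed, because the crawler doesn't click or scroll.
document.getElementById('reveal').addEventListener('click', function () {
  document.getElementById('hidden').textContent = 'Invisible to the index';
});

// Case 4: all of the above applies equally to code in external files,
// as long as those files aren't blocked from the robot (see the caveat
// in the experiments below).
```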

For more on the technical details, I recommend my ex-colleague Justin’s writing on the subject.

A high-level overview of my view of JavaScript best practices

Despite the incredible work-arounds of the past (which always seemed like more effort than graceful degradation to me), the “right” answer has existed since at least 2012, with the introduction of PushState. Rob wrote about this one, too. Back then, however, it was pretty clunky and manual, and it required a concerted effort to ensure that the URL was updated in the user’s browser for each view that should be considered a “page,” that the server could return full HTML for those pages in response to new requests for each URL, and that the back button was handled correctly by your JavaScript.
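
To illustrate the pattern, here's a minimal sketch of that approach (the URL scheme is hypothetical, and loadAndRenderProduct is a stand-in for app-specific rendering code):

```javascript
// Hypothetical app-specific renderer, stubbed for the sketch.
function loadAndRenderProduct(id) {
  document.body.innerHTML = '<h1>Product ' + id + '</h1>';
}

// Each "view" gets its own real URL, and the back button keeps working.
function showProduct(id) {
  loadAndRenderProduct(id); // update the DOM for this view
  // Update the address bar so this view has its own crawlable URL.
  history.pushState({ productId: id }, '', '/products/' + id);
}

// Handle back/forward: re-render the view recorded in the history entry.
window.addEventListener('popstate', function (event) {
  if (event.state && event.state.productId) {
    loadAndRenderProduct(event.state.productId);
  }
});

// The server must also return full HTML for GET /products/<id>, so that a
// fresh request for any of these URLs gets content without running JS.
```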

Along the way, in my opinion, too many sites got distracted by a separate prerendering step. This is an approach that does the equivalent of running a headless browser to generate static HTML pages that include any changes made by JavaScript on page load, then serving those snapshots instead of the JS-reliant page in response to requests from bots. It typically treats bots differently, in a way that Google tolerates, as long as the snapshots do represent the user experience. In my opinion, this approach is a poor compromise that’s too susceptible to silent failures and falling out of date. We’ve seen a bunch of sites suffer traffic drops due to serving Googlebot broken experiences that were not immediately detected because no regular users saw the prerendered pages.
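
For illustration only, here's a sketch of roughly what a prerendering step does under the hood, using the Puppeteer headless-browser library (not any particular vendor's implementation):

```javascript
// Render a page in a headless browser, then save the post-JS HTML
// snapshot that would be served to bots instead of the live page.
const puppeteer = require('puppeteer');
const fs = require('fs');

async function snapshot(url, outFile) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Wait for network activity to settle so on-load JS has finished.
  await page.goto(url, { waitUntil: 'networkidle0' });
  fs.writeFileSync(outFile, await page.content()); // serialized rendered DOM
  await browser.close();
}

// The failure mode described above: if the page breaks in the headless
// browser, bots silently receive a broken snapshot that no regular user
// ever sees.
snapshot('https://example.com/category/widgets', 'widgets.html');
```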

These days, if you need or want JS-enhanced functionality, more of the top frameworks have the ability to work the way Rob described in 2012, which is now called isomorphic (roughly meaning “the same”).

Isomorphic JavaScript serves HTML that corresponds to the rendered DOM for each URL, and updates the URL for each “view” that should exist as a separate page as the content is updated via JS. With this implementation, there is actually no need to render the page to index basic content, as it’s served in response to any fresh request.
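
As a rough server-side sketch of what that looks like (Express and the renderer stub are stand-ins for illustration, not any specific framework's API):

```javascript
// Serve full HTML for every URL; the browser then hydrates that HTML and
// uses pushState for subsequent navigation.
const express = require('express');
const app = express();

// Stand-in for a framework's server-side renderer (something like
// ReactDOMServer.renderToString); stubbed here for the sketch.
function renderAppToHTML(props) {
  return '<!doctype html><html><body><h1>Product ' +
         props.productId + '</h1></body></html>';
}

app.get('/products/:id', function (req, res) {
  // The same rendering code that runs in the browser runs here, so the
  // response body already matches the rendered DOM; no bot-side JS needed.
  res.send(renderAppToHTML({ productId: req.params.id }));
});

app.listen(3000);
```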

I was fascinated by this piece of research published recently — you should go and read the whole study. In particular, you should watch this video (recommended in the post) in which the speaker — who is an Angular developer and evangelist — emphasizes the need for an isomorphic approach.

Resources for auditing JavaScript

If you work in SEO, you will increasingly find yourself called upon to figure out whether a particular implementation is correct (hopefully on a staging/development server before it’s deployed live, but who are we kidding? You’ll be doing this live, too).

To do that, here are some resources I’ve found useful:

Some surprising/interesting results

There are likely to be timeouts on JavaScript execution

I already linked above to the ScreamingFrog post that mentions experiments they have done to measure the timeout Google uses to determine when to stop executing JavaScript (they found a limit of around 5 seconds).

It may be more complicated than that, however. This segment of a thread is interesting. It’s from a Hacker News user who goes by the username KMag and who claims to have worked at Google on the JS execution part of the indexing pipeline from 2006–2010. It’s in relation to another user speculating that Google would not care about content loaded “async” (i.e. asynchronously — in other words, loaded as part of new HTTP requests that are triggered in the background while assets continue to download):

“Actually, we did care about this content. I’m not at liberty to explain the details, but we did execute setTimeouts up to some time limit.

If they’re smart, they actually make the exact timeout a function of a HMAC of the loaded source, to make it very difficult to experiment around, find the exact limits, and fool the indexing system. Back in 2010, it was still a fixed time limit.”

What that means is that although it was initially a fixed timeout, he’s speculating (or possibly sharing without directly doing so) that timeouts are now programmatically determined (presumably based on page importance and JavaScript reliance) and that they may be tied to the exact source code. The reference to “HMAC” (hash-based message authentication code) describes a keyed hash of the page source: deriving the timeout from it would make the limit vary from page to page and change whenever the source changes, making it very hard to probe for the exact cutoff.
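
To make the idea concrete, here's a minimal sketch of how such a source-derived timeout could work; the key, hash choice, and time window are all invented:

```javascript
const crypto = require('crypto');

// Derive a per-page JS execution timeout from an HMAC of the page source.
// Because the MAC changes whenever the source changes (and the key is
// secret), outsiders can't experimentally pin down the exact limit.
function jsTimeoutMs(pageSource, secretKey) {
  const mac = crypto.createHmac('sha256', secretKey)
    .update(pageSource)
    .digest();
  // Map the first 4 bytes of the MAC onto a hypothetical 2s-to-8s window.
  const fraction = mac.readUInt32BE(0) / 0xffffffff;
  return 2000 + Math.round(fraction * 6000);
}

console.log(jsTimeoutMs('<html>…</html>', 'not-a-real-key'));
```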

It matters how your JS is executed

I referenced this recent study earlier. In it, the author found:

Inline vs. External vs. Bundled JavaScript makes a huge difference for Googlebot

The charts at the end show the extent to which popular JavaScript frameworks perform differently depending on how they’re called, with a range of performance from passing every test to failing almost every test. For example, here’s the chart for Angular:

[Chart: Angular test results from the study]

It’s definitely worth reading the whole thing and reviewing the performance of the different frameworks. There’s more evidence of Google saving computing resources in some areas, as well as surprising results between different frameworks.

CRO tests are getting indexed

When we first started seeing JavaScript-based split-testing platforms designed for testing changes aimed at improving conversion rate (CRO = conversion rate optimization), their inline changes to individual pages were invisible to the search engines. As Google in particular has moved up the JavaScript competency ladder, from executing simple inline JS to more complex JS in external files, we are now seeing some CRO-platform-created changes being indexed. A simplified version of what’s happening (sketched in code after the list) is:

  • For users:
    • CRO platforms typically take a visitor to a page, check for the existence of a cookie, and if there isn’t one, randomly assign the visitor to group A or group B
    • Based on either the cookie value or the new assignment, the user is either served the page unchanged, or sees a version that is modified in their browser by JavaScript loaded from the CRO platform’s CDN (content delivery network)
    • A cookie is then set to make sure that the user sees the same version if they revisit that page later
  • For Googlebot:
    • The reliance on external JavaScript used to prevent both the bucketing and the inline changes from being indexed
    • With external JavaScript now being loaded, and with many of these inline changes being made using standard libraries (such as jQuery), Google is able to index the variant, and hence we see CRO experiments sometimes being indexed
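
Here's a simplified client-side sketch of that flow (the cookie name, the 50/50 split, and the DOM change are all hypothetical):

```javascript
function getCookie(name) {
  var match = document.cookie.match(new RegExp('(?:^|; )' + name + '=([^;]*)'));
  return match ? match[1] : null;
}

var bucket = getCookie('cro_bucket');
if (!bucket) {
  // New visitor: randomly assign group A or B and remember the choice.
  bucket = Math.random() < 0.5 ? 'A' : 'B';
  document.cookie = 'cro_bucket=' + bucket + '; path=/; max-age=2592000';
}

if (bucket === 'B') {
  // Variant B: modify the page in the browser. Because Googlebot now loads
  // and runs this kind of external JS, whichever variant it is bucketed
  // into can end up in the index.
  var headline = document.querySelector('h1');
  if (headline) {
    headline.textContent = 'Free shipping on all orders!';
  }
}
```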

I might have expected the platforms to block their JS with robots.txt, but at least the main platforms I’ve looked at don’t do that. With Google being sympathetic towards testing, however, this shouldn’t be a major problem — just something to be aware of as you build out your user-facing CRO tests. All the more reason for your UX and SEO teams to work closely together and communicate well.

Split tests show SEO improvements from removing a reliance on JS

Although we would like to do a lot more to test the actual real-world impact of relying on JavaScript, we do have some early results. At the end of last week I published a post outlining the uplift we saw from removing a site’s reliance on JS to display content and links on category pages.

[Chart: additional organic sessions from the ODN split test]

A simple test that removed the need for JavaScript on 50% of pages showed a >6% uplift in organic traffic — worth thousands of extra sessions a month. While we haven’t proven that JavaScript is always bad, nor understood the exact mechanism at work here, we have opened up a new avenue for exploration, and at least shown that it’s not a settled matter. To my mind, it highlights the importance of testing. It’s obviously our belief in the importance of SEO split-testing that led to us investing so much in the development of the ODN platform over the last 18 months or so.

Conclusion: How JavaScript indexing might work from a systems perspective

Based on all of the information we can piece together from the external behavior of the search results, public comments from Googlers, tests and experiments, and first principles, here’s how I think JavaScript indexing is working at Google at the moment. I think there is a separate queue for JS-enabled rendering, because the computational cost of trying to run JavaScript over the entire web is unnecessary given the lack of a need for it on many, many pages. In detail (sketched in code after the list), I think:

  • Googlebot crawls and caches HTML and core resources regularly
  • Heuristics (and probably machine learning) are used to prioritize JavaScript rendering for each page:
    • Some pages are indexed with no JS execution. There are many pages that can probably be easily identified as not needing rendering, and others which are such a low priority that it isn’t worth the computing resources.
    • Some pages get immediate rendering – or possibly immediate basic/regular indexing, along with high-priority rendering. This would enable the immediate indexation of pages in news results or other QDF results, but also allow pages that rely heavily on JS to get updated indexation when the rendering completes.
    • Many pages are rendered async in a separate process/queue from both crawling and regular indexing, thereby adding the page to the index for new words and phrases found only in the JS-rendered version when rendering completes, in addition to the words and phrases found in the unrendered version indexed initially.
  • In addition to adding pages to the index, the JS rendering:
    • May make modifications to the link graph
    • May add new URLs to the discovery/crawling queue for Googlebot
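
For illustration, here's that model in code form. This is purely speculative pseudocode; none of these function names or queues correspond to real Google systems:

```javascript
// Hypothetical stubs standing in for systems we can only guess at.
var renderQueue = [];
function indexRawHTML(page) { /* index words/links in the raw HTML */ }
function renderPriority(page) { return page.usesJS ? 'normal' : 'none'; }
function indexNewContent(page, dom) { /* words found only after JS runs */ }
function updateLinkGraph(page, dom) { /* links added or changed by JS */ }
function enqueueNewURLs(dom) { /* JS-discovered URLs back to crawling */ }

// Phase 1: every crawled page gets regular indexing; heuristics decide
// whether (and how urgently) it also deserves JS rendering.
function processCrawledPage(page) {
  indexRawHTML(page);
  var priority = renderPriority(page);
  if (priority === 'none') return; // many pages are never rendered
  renderQueue.push({ page: page, priority: priority });
}

// Phase 2: whenever the separate render queue gets to a page, the
// rendered DOM augments the index rather than replacing it.
function onRenderComplete(page, renderedDOM) {
  indexNewContent(page, renderedDOM);
  updateLinkGraph(page, renderedDOM);
  enqueueNewURLs(renderedDOM);
}
```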

The idea of JavaScript rendering as a distinct and separate part of the indexing pipeline is backed up by this quote from KMag, who I mentioned previously for his contributions to this HN thread (direct link) [emphasis mine]:

“I was working on the lightweight high-performance JavaScript interpretation system that sandboxed pretty much just a JS engine and a DOM implementation that we could run on every web page on the index. Most of my work was trying to improve the fidelity of the system. My code analyzed every web page in the index.

Towards the end of my time there, there was someone in Mountain View working on a heavier, higher-fidelity system that sandboxed much more of a browser, and they were trying to improve performance so they could use it on a higher percentage of the index.”

This was the situation in 2010. It seems likely that they have moved a long way towards the headless browser in all cases, but I’m skeptical about whether it would be worth their while to render every page they crawl with JavaScript given the expense of doing so and the fact that a large percentage of pages do not change substantially when you do.

My best guess is that they’re using a combination of trying to figure out the need for JavaScript execution on a given page, coupled with trust/authority metrics to decide whether (and with what priority) to render a page with JS.

Run a test, get publicity

I have a hypothesis that I would love to see someone test: That it’s possible to get a page indexed and ranking for a nonsense word contained in the served HTML, but not initially ranking for a different nonsense word added via JavaScript; then, to see the JS get indexed some period of time later and rank for both nonsense words. If you want to run that test, let me know the results — I’d be happy to publicize them.
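
If you do, the test page only needs two ingredients: one nonsense word in the served HTML and one injected by JS (both words below are invented):

```javascript
// Served HTML body:
//   <p>flibbertiglug</p>
//   <div id="js-word"></div>
//
// Script on the page:
document.getElementById('js-word').textContent = 'snorklewanger';

// Prediction: the page ranks for "flibbertiglug" as soon as it's indexed,
// and for "snorklewanger" only later, once the render queue reaches it.
```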


Albuquerque Transmission Repair Shop Tranco Retains 505IM As Marketing Arm

(MENAFN Editorial) Tranco Transmission Repair, a local provider in Albuquerque, announces a new partnership with 505 Internet Marketing, …

Search in Pics: Google drum set, a deep blue room & Google sneakers

In this week’s Search In Pictures, here are the latest images culled from the web, showing what people eat at the search engine companies, how they play, who they meet, where they speak, what toys they have and more. [Image captions: Google deep blue meeting room (source: Twitter); Google elephant march; …]


SearchCap: Search trademark issues & search pictures

Below is what happened in search today, as reported on Search Engine Land and from other places across the web.


Help Local Marketing Agency – Takes 10 Min

Hi Craigslist, We need 3 people that don't use their facebook advertising account to let us rent it out. We work on advertising campaigns for businesses …

Initial Interest Confusion rears its ugly head once more in trademark infringement case

Columnist Chris Silver Smith details recent trademark infringement cases in which site search results pages caused legal issues for retailer sites. E-commerce businesses should take notice!


Should SEOs Care About Internal Links? – Whiteboard Friday

Posted by randfish

Internal links are one of those essential SEO items you have to get right to avoid getting them really wrong. Rand shares 18 tips to help inform your strategy, going into detail about their attributes, internal vs. external links, ideal link structures, and much, much more in this edition of Whiteboard Friday.

[Whiteboard image: Should SEOs Care About Internal Links?]

Click on the whiteboard image above to open a high-resolution version in a new tab!

Video Transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re going to chat a little bit about internal links and internal link structures. Now, it is not the most exciting thing in the SEO world, but it’s something that you have to get right and getting it wrong can actually cause lots of problems.

Attributes of internal links

So let’s start by talking about some of the things that are true about internal links. Internal links, when I say that phrase, what I mean is a link that exists on a website, let’s say ABC.com here, that is linking to a page on the same website, so over here, linking to another page on ABC.com. We’ll do /A and /B. This is actually my shipping routes page. So you can see I’m linking from A to B with the anchor text “shipping routes.”

The idea of an internal link is really initially to drive visitors from one place to another, to show them where they need to go to navigate from one spot on your site to another spot. They’re different from external links only in that, in the HTML code, you’re pointing to the same fundamental root domain. In the initial early versions of the internet, that didn’t matter all that much, but for SEO it matters quite a bit, because external links are treated very differently from internal links. That is not to say, however, that internal links have no power or no ability to change rankings, to change crawling patterns, and to change how a search engine views your site. That’s what we need to chat about.

1. Anchor text is something that can be considered. The search engines have generally minimized its importance, but it’s certainly something that’s in there for internal links.

2. The location on the page actually matters quite a bit, just as it does with external links. With internal links it’s almost more so, in that navigation and footers specifically have attributes around internal links that can be problematic.

Problems arise essentially when Google in particular sees manipulation in the internal link structure, specifically things like stuffing anchor text into all of the internal links, trying to get this shipping routes page ranking by putting a little link down here in the footer of every single page and then pointing it over here to game and manipulate rankings. They hate that. In fact, there is an algorithmic penalty for that kind of stuff, and we can see it very directly.

We’ve actually run tests where we’ve observed that jamming these kinds of anchor-text-rich links into footers or into navigation causes a site to rank poorly, and removing them brings the rankings back. Google reverses that penalty pretty quickly too, which is nice. So if you are not ranking well and you’re like, “Oh no, Rand, I’ve been doing a lot of that,” maybe take it away. Your rankings might come right back. That’s great.

3. The link target matters obviously from one place to another.

4. The importance of the linking page: this is actually a big one with internal links. It is generally the case that if a page on your website has lots of external links pointing to it, it gains authority and has more ability to generate a little bit of ranking power and influence by linking to other pages (not nearly as much as external links pass, but a little bit). So if you have two very well-linked-to pages on your site, you should make sure to link out from those to pages on your site that a) need it and b) are actually useful for your users. That’s another signal we’ll talk about.

5. The relevance of the link, so pointing to my shipping routes page from a page about other types of shipping information, totally great. Pointing to it from my dog food page, well, it doesn’t make great sense. Unless I’m talking about shipping routes of dog food specifically, it seems like it’s lacking some of that context, and search engines can pick up on that as well.

6. The first link on the page. So this matters mostly in terms of the anchor text, just as it does for external links. Basically, if you are linking in a bunch of different places to this page from this one, Google will usually, at least in all of our experiments so far, count the first anchor text only. So if I have six different links to this and the first link says “Click here,” “Click here” is the anchor text that Google is going to apply, not “Click here” and “shipping routes” and “shipping.” Those subsequent links won’t matter as much.

7. Then the type of link matters too. Obviously, I would recommend that you keep it in the HTML link format rather than trying to do something fancy with JavaScript. Even though Google can technically follow those, it looks to us like they’re not treated with quite the same authority and ranking influence. Text is slightly, slightly better than images in our testing, although that testing is a few years old at this point. So maybe image links are treated exactly the same. Either way, do make sure you have that. If you’re doing image links, by the way, remember that the alt attribute of that image is what becomes the anchor text of that link.

Internal versus external links

A. External links usually give more authority and ranking ability.

That shouldn’t be surprising. An external link is like a vote from an independent, hopefully independent, hopefully editorially given website to your website saying, “This is a good place for you to go for this type of information.” On your own site, it’s like a vote for yourself, so engines don’t treat it the same.

B. Anchor text of internal links generally has less influence.

So, as we mentioned, me pointing to my page with the phrase that I want to rank for isn’t necessarily a bad thing, but I shouldn’t do it in a manipulative way. I shouldn’t do it in a way that’s going to look spammy or sketchy to visitors, because if visitors stop clicking around my site or engaging with it or they bounce more, I will definitely lose ranking influence much faster than if I simply make those links credible and usable and useful to visitors. Besides, the anchor text of internal links is not as powerful anyway.

C. A lack of internal links can seriously hamper a page’s ability to get crawled + ranked.

It is, however, the case that a lack of internal links, like an orphan page that has few or no internal links from the rest of its website, can really hamper a page’s ability to rank. Sometimes it will happen. External links will point to a page. You’ll see that page in your analytics or in a report about your links from Moz or Ahrefs or Majestic, and then you go, “Oh my gosh, I’m not linking to that page at all from anywhere else on my site.” That’s a bad idea. Don’t do that. That is definitely problematic.

D. It’s still the case, by the way, that, broadly speaking, pages with more links on them will send less link value per link.

So, essentially, you remember the original PageRank formula from Google. It said basically like, “Oh, well, if there are five links, send one-fifth of the PageRank power to each of those, and if there are four links, send one-fourth.” Obviously, one-fourth is bigger than one-fifth. So taking away that fifth link could mean that each of the four pages that you’ve linked to get a little bit more ranking authority and influence in the original PageRank algorithm.
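
As a toy illustration of that per-link split (invented numbers, original-PageRank-style reasoning only):

```javascript
// A page's outbound authority is divided evenly across its links.
function valuePerLink(pageAuthority, linkCount) {
  return pageAuthority / linkCount;
}

console.log(valuePerLink(1, 5)); // 0.20 per link with five links
console.log(valuePerLink(1, 4)); // 0.25 per link with four; pruning one
                                 // link gives each remaining page more
```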

Look, PageRank is old, very, very old at this point, but at least the theories behind it are not completely gone. So it is the case that if you have a page with tons and tons of links on it, that tends to send out less authority and influence than a page with few links on it, which is why it can definitely pay to do some spring cleaning on your website and clear out any rubbish pages or rubbish links, ones that visitors don’t want, that search engines don’t want, that you don’t care about. Clearing that up can actually have a positive influence. We’ve seen that on a number of websites where they’ve cleaned up their information architecture, whittled down their links to just the stuff that matters the most and the pages that matter the most, and then seen increased rankings across the board from all sorts of signals, positive signals, user engagement signals, link signals, context signals that help the engines rank them better.

E. Internal link flow (aka PR sculpting) is rarely effective, and usually has only mild effects… BUT a little of the right internal linking can go a long way.

Then finally, I do want to point out what was previously called — you probably have heard of it in the SEO world — PageRank sculpting. This was a practice that, I’d say from maybe 2002 or 2003 to about 2008 or 2009, had this life where there would be panel discussions about PageRank sculpting, all these examples of how to do it, and software that would crawl your site and show you the ideal PageRank sculpting system to use and which pages to link to and not.

When PageRank was the dominant algorithm inside of Google’s ranking system, yes, it was the case that PageRank sculpting could have some real effect. These days, that is dramatically reduced. It’s not entirely gone, because of some of these other principles that we’ve talked about: having lots of links on a page for no particularly good reason is generally bad and can have harmful effects, and having few carefully chosen ones has good effects. But most of the time, optimizing internal linking beyond a certain point is not very valuable, not a great value add.

But a little of what I’m calling the right internal linking, that’s what we’re going to talk about, can go a long way. For example, if you have those orphan pages or pages that are clearly the next step in a process or that users want and they cannot find them or engines can’t find them through the link structure, it’s bad. Fixing that can have a positive impact.

Ideal internal link structures

So ideally, in an internal linking structure, you want something like this. This is a very rough illustration. The homepage has maybe 100 links on it to internal pages. One hop away from that, you’ve got your 100 different pages of whatever it is, subcategories or category pages, places that can get folks deeper into your website. Then from there, each of those has maybe a maximum of 100 unique links, and they get you 2 hops away from the homepage, which takes you to 10,000 pages that do the same thing.

I. No page should be more than 3 link “hops” away from another (on most small-to-medium sites).

Now, the idea behind this is that basically in one, two, three hops, three links away from the homepage and three links away from any page on the site, I can get to up to a million pages. So when you talk about, “How many clicks do I have to get? How far away is this in terms of link distance from any other page on the site?” a great internal linking structure should be able to get you there in three or fewer link hops. If it’s a lot more, you might have an internal linking structure that’s really creating sort of these long pathways of forcing you to click before you can ever reach something, and that is not ideal, which is why it can make very good sense to build smart categories and subcategories to help people get in there.
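
The arithmetic behind that rule of thumb, assuming roughly 100 links per page at each level:

```javascript
var linksPerPage = 100;
var reachable =
  Math.pow(linksPerPage, 1) + // hop 1: 100 category pages
  Math.pow(linksPerPage, 2) + // hop 2: 10,000 subcategory pages
  Math.pow(linksPerPage, 3);  // hop 3: 1,000,000 pages
console.log(reachable); // 1010100: roughly a million pages in three hops
```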

I’ll give you the most basic example in the world, a traditional blog. In order to reach any post that was published two years ago, I’ve got to click Next, Next, Next, Next, Next, Next through all this pagination until I finally get there. Or if I’ve done a really good job with my categories and my subcategories, I can click on the category of that blog post and I can find it very quickly in a list of the last 50 blog posts in that particular category, great, or by author or by tag, however you’re doing your navigation.

II. Pages should contain links that visitors will find relevant and useful.

If no one ever clicks on a link, that is a bad signal for your site, and it is a bad signal for Google as well. And I don’t just mean literally no one: if very, very few people ever click it, and many of those who do click the back button because it wasn’t what they wanted, that’s also a bad sign.

III. Just as no two pages should be targeting the same keyword or searcher intent, likewise no two links should be using the same anchor text to point to different pages. Canonicalize!

For example, if over here I had a shipping routes link that pointed to this page and then another shipping routes link, same anchor text pointing to a separate page, page C, why am I doing that? Why am I creating competition between my own two pages? Why am I having two things that serve the same function or at least to visitors would appear to serve the same function and search engines too? I should canonicalize those. Canonicalize those links, canonicalize those pages. If a page is serving the same intent and keywords, keep it together.

IV. Limit use of rel=”nofollow” to UGC or specific untrusted external links. It won’t help your internal link flow efforts for SEO.

Rel=”nofollow” was sort of the classic way that people had been doing the PageRank sculpting we talked about earlier. I would strongly recommend against using it for that purpose. Google has said that they’ve put in some preventative measures so that rel=”nofollow” links sort of “leak” PageRank, as they call it, rather than preserving it for your other links. I wouldn’t stress too much about that, but I certainly wouldn’t use rel=”nofollow” for sculpting.

What I would do is if I’m trying to do internal link sculpting, I would just do careful curation of the links and pages that I’ve got. That is the best way to help your internal link flow. That’s things like…

V. Removing low-value, low-engagement content and creating internal links that people actually do want. That is going to give you the best results.

VI. Don’t orphan! Make sure pages that matter have links to (and from) them. Last, but not least, there should never be an orphan. There should never be a page with no links to it, and certainly there should never be a page that is well linked to that isn’t linking back out to portions of your site that are of interest or value to visitors and to Google.

So following these practices, I think you can do some awesome internal link analysis, internal link optimization and help your SEO efforts and the value visitors get from your site. We’ll see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com


Propel Marketing Wins Local Media Association's Best Digital Agency Award

Propel Marketing's CEO Peter Cannone accepts the award from Nancy Lane, President of LMA, for Best Digital Agency on behalf of Propel at Local …