Episode 62: New Lighthouse v6 metrics and unlinked citations
In this episode, you will hear Mark Williams-Cook going through the 3 new metrics that have been introduced in the latest version 6 of Lighthouse. He'll also be discussing the latest news and thoughts on how Google is handling unlinked mentions.
Show notes
New podcast from Google Webmasters called “Search Off the Record” - https://twitter.com/googlewmc/status/1263464185607880704
Unlinked citations and nofollow - https://www.seroundtable.com/unlinked-citations-google-rankings-29490.html
Lighthouse v6.0 - https://web.dev/lighthouse-whats-new-6.0/
Transcription
MC: Welcome to episode 62 of the Search with Candour podcast, recorded on Friday 22nd of May 2020. My name is Mark Williams-Cook and today we're going to be taking a deep dive into the new Lighthouse metrics in the Lighthouse version 6 update that's just been released, and we'll also have a quick discussion about unlinked citations and nofollow links.
Before we kick off, I thought it'd be really cool to mention that I saw yesterday that Google Webmasters has announced its own podcast, which will surely be better than this one. It's called Search Off the Record, and it's going to have people like John Mueller and Martin Splitt from Google on it. I listened to the two-to-two-and-a-half-minute ad they ran about Search Off the Record, and essentially they're saying they're going to be talking about what they're doing behind the scenes at Google, what they're working on day to day, and giving webmasters some insight into what's coming, maybe before it's been announced. It's not there as a podcast version of the documentation; they're going to be talking about other things. So it's called Search Off the Record, it looks like it's going to be really great, I'll definitely be listening to it, so check it out.
I saw an interesting discussion come up this last week about unlinked citations and I wanted to talk about it quickly, because it's a topic I've discussed before with other SEOs. It's around the idea, if we just think about links in general, of how Google is handling different types of link, and in this conversation I think it would be good to include the changes that were made to nofollow, which we've talked about in previous episodes. If you haven't heard of that before, or you missed those episodes: Google said - I believe it was the 1st of March - that they were going to make a change whereby they start treating the nofollow tag as a hint rather than a directive. The difference is that a directive is something a search engine will always obey, and a hint is something they will use in combination with other factors to decide whether they want to honor it or not. The general thinking was that Google is trying to navigate its way around an issue they've kind of created for themselves, which is that many editorial sites, such as newspapers, will just blanket nofollow every external link on their site. They do this for obvious risk mitigation reasons, because linking out to other sites carries perceived risk: does it look like a paid link, is it a trustworthy site, are we accidentally linking to a competitor and helping them? So as a fix-all solution, many of these editorial sites took the decision to nofollow every external link, and this saved them the time and hassle of having to manually go through and decide which links should be followed and which shouldn't, because it's not really important to them.
That did create a side issue for Google, which is that the PageRank algorithm - the algorithm they're using to look at how web pages link together - relies on the fact that good web pages link to, cite and mention other good web pages, so Google can take these hints. The original definition of nofollow was that the tag stops links from passing PageRank, essentially; it stops that importance getting across. So in this nofollow update, as we've covered before, they changed their position and said they're going to start treating nofollow as a hint - although, as it was reported, that hasn't actually happened yet, so we don't really know what the impact was. At the same time Google released some granular forms of nofollow, rel="sponsored" and rel="ugc" - UGC being user-generated content - which allow webmasters to specifically say: okay, this link is paid for, this link is generated by users, or this link is nofollow. The general consensus behind this is that, firstly, Google said you don't have to use those specific granular rel="sponsored" or rel="ugc" values, it's absolutely fine just to keep nofollow, which to me - and I think a few people agree with this - would indicate that maybe they are using this voluntary tagging system to help train their models to spot: okay, this whole group of links here was marked as sponsored, these ones are just marked as nofollow but they look really similar, therefore, in terms of how we treat them as a hint, we're going to fall on the side of the fence that they're probably sponsored. So it helps them, I think, detect whether they should be respecting that nofollow tag or not. So there's an interesting question now about whether, when these changes kick in and Google is treating nofollow as a hint, these links from editorial sites that are just blanket nofollowing do become followed, in a way, if you like. So there's that whole discussion about how Google may or may not be treating links with nofollow.
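To make those granular rel values a bit more concrete, here is a minimal sketch of tagging outbound links in the browser. The markOutboundLink helper is hypothetical, but the rel attribute behaviour shown is standard HTML/DOM:

```typescript
// Hypothetical helper: tag outbound links with the granular rel values Google
// introduced alongside nofollow (rel="sponsored" for paid links, rel="ugc" for
// user-generated content). Plain rel="nofollow" remains perfectly valid.
type LinkKind = 'sponsored' | 'ugc' | 'nofollow';

function markOutboundLink(anchor: HTMLAnchorElement, kind: LinkKind): void {
  // rel accepts a space-separated token list, so values can be combined,
  // e.g. "nofollow sponsored".
  const existing = anchor.rel ? anchor.rel.split(' ') : [];
  if (!existing.includes(kind)) {
    anchor.rel = [...existing, kind].join(' ').trim();
  }
}

// Usage sketch: mark every external link inside an article body as sponsored.
document
  .querySelectorAll<HTMLAnchorElement>('article a[href^="http"]')
  .forEach((a) => {
    if (new URL(a.href).hostname !== location.hostname) {
      markOutboundLink(a, 'sponsored');
    }
  });
```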
The discussion I picked up on unlinked citations is something, as I said, I've spoken about before and find really interesting, which is the question of whether, when someone mentions your brand or even your website URL and doesn't link to it, Google counts that or not. Interestingly enough, someone directly asked John Mueller this on Twitter the other day and he was kind enough to reply. His reply was: “The short version is, usually not, in my opinion, and the long version is somewhat hard to squeeze into tweet form, as you can imagine I’m not in the mood to write a long essay, so hopefully the short version is useful as a starting point.” I think it is useful as a starting point, so to take us back in time a bit: it was in 2013 that I first saw this discussed, and Google said they could use these unlinked citations for discovery - to discover new URLs, new web pages that they weren't aware of before - but they didn't use them for ranking.
I've previously discussed patents filed by Google that include both express and implied links. I'm going to read out part of a ranking patent filed by Google that mentions implied links, which says: “An implied link is a reference to a target resource, e.g. a citation to the target resource, which is included in a source resource but is not an express link to the target resource. Thus, a resource in the group can be the target of an implied link without the user being able to navigate to the resource by following the implied link.” There have been other mentions in patents along the lines of: a link can be an express link or an implied link; an express link exists where a resource explicitly refers to the site, and an implied link exists where there is some other relationship between a resource and the site. So the fact that these things are mentioned in Google patents - and the first patent was specifically about how to rank web pages, looking at links and what should be considered - makes it interesting that Google has been looking at these things.
The more recent speculation is that brand mentions and things like that can be used for entity identification. There has long been this push by Google to switch to, as they call it, the 'things not strings' model, which is a graph database where they're building up an understanding of what things are, what things are part of other things, and what the relationships between those things are, which then gives you a framework to answer more complicated questions. So if you type something like 'who is the founder of Warner Brothers', Google can normally just give you an instant answer, and a question like that actually requires a huge amount of understanding from the machine: that Warner Brothers is a company, a company is founded by things called founders, founders are types of people, and these are the people identified as founders and related to Warner Brothers. So it's really interesting to see how all of this meshes together.
I wanted to summarise it, really, which is that, in my opinion, Google is using these unlinked, implied citations in some way - for discovery and, like I said, maybe for entity identification - but I think it's fair to say that where you can get a link, it's always going to be a lot better. So if you're doing link reclamation, where you're getting alerts when your brand, your website or your products have been mentioned but not linked, that's great, it's brilliant that people are talking about you - and there's all kinds of research on how Google looks at how people talk about brands and entities online and whether it's positive or negative - but if you can still get a link from those mentions, I certainly wouldn't stop doing that.
Lighthouse version 6 has just been released. If you haven't used it before, where have you been? Lighthouse is an automated website auditing tool that's basically there to help developers find opportunities to diagnose and improve the user experience of their sites. You can access it, if you're in Chrome, by hitting F12, which opens the developer panel, and it should be called Lighthouse in the list of tabs - it used to be called Audit or Audits, but it's called Lighthouse now, so it's been renamed. This will allow you to run a set of audits on your site - performance, progressive web app checks, best practices, accessibility and SEO - and you can simulate mobile or run on desktop. I use this really regularly for looking at the performance of sites; it's a really nice tool to get some feedback on that. Just one note worth taking into consideration if you are going to use it: I always run it in an incognito window that doesn't have any other Chrome extensions running, because other Chrome extensions can interfere with the results you get, so I try to run it from a clean instance of the browser.
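If you prefer running audits outside the browser, here is a minimal sketch of the programmatic route, assuming the lighthouse and chrome-launcher npm packages are installed; launching a fresh headless Chrome gives you the same 'clean instance' benefit as an incognito window with no extensions:

```typescript
// Sketch: running Lighthouse programmatically from Node, assuming the
// 'lighthouse' and 'chrome-launcher' packages. A freshly launched headless
// Chrome avoids interference from installed extensions.
import lighthouse from 'lighthouse';
import * as chromeLauncher from 'chrome-launcher';

async function audit(url: string): Promise<void> {
  const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
  try {
    const result = await lighthouse(url, {
      port: chrome.port,
      output: 'json',
      onlyCategories: ['performance', 'seo'],
    });
    // lhr is the Lighthouse result object; category scores are 0-1.
    const perf = result?.lhr.categories.performance.score;
    console.log(`Performance score for ${url}: ${(perf ?? 0) * 100}`);
  } finally {
    await chrome.kill();
  }
}

audit('https://example.com').catch(console.error);
```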
It gives some really helpful metrics, and the two I use regularly with Lighthouse are TTI and FCP. TTI stands for ‘time to interactive’, and that measures the time from when the page starts loading to when its main sub-resources have loaded and it's capable of reliably responding to user input quickly. So essentially, when users can start using the page - that's TTI. ‘First contentful paint’ is FCP, and that's how long it takes for the site to start being rendered on the screen. I really like FCP as a metric because it's not just how long the web page took to load; it looks at when things start appearing on your screen, and that's what directly impacts the user's satisfaction in terms of site speed. So say your page took three seconds to load and it was completely blank for 2.9 of those seconds, and then in that last tenth of a second it all loaded; and you had another website that also took three seconds to load, but after half a second it started loading the main text, navigation, things like that. It still took three seconds for the full page to load, but people could see that content earlier, so you'll tend to find those users feel their experience on that site is much faster, have less frustration and are more satisfied. So FCP has always been a really interesting metric for me, as has how we can get pages loading as quickly as possible to try and keep people engaged with the site.
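As a small illustration of the FCP side of this, here is a sketch of reading first contentful paint in the browser via the standard Paint Timing API - the field-data counterpart of what Lighthouse measures in the lab:

```typescript
// Sketch: reading first-contentful-paint with the Paint Timing API.
const fcpObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    if (entry.name === 'first-contentful-paint') {
      console.log(`FCP: ${entry.startTime.toFixed(0)} ms`);
    }
  }
});
fcpObserver.observe({ type: 'paint', buffered: true });
```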
The big announcement for version 6 of Lighthouse is that they have introduced some really interesting new metrics - three new metrics coming in and three being replaced, so three in and three out. The ones going out are ‘first meaningful paint’ (not first contentful paint), ‘first CPU idle’ and ‘max potential FID’. I'm not going to talk about those because they're on the way out; we want to talk about the new metrics. The first new metric is ‘largest contentful paint’, LCP, and Lighthouse describe this as a measurement of perceived loading experience, so it's very similar to what we just talked about with first contentful paint, FCP. Largest contentful paint marks the point during page load when the primary or largest content has loaded and is visible to the user. LCP is an important complement to first contentful paint, which only captures the very beginning of the loading experience; LCP provides a signal to developers about how quickly a user is actually able to see the content of a page, and an LCP score below 2.5 seconds is considered good. One of the issues I think they were having with FCP is that it counts as soon as anything starts to be rendered on the page - the first pixel, if you like - which isn't necessarily always helpful to users, so this complementary metric, largest contentful paint, looks at when the main body of content has loaded. They're saying this gives a better signal of when the user is seeing things loaded and is going to be happy and have an improved experience, and they've set the bar at 2.5 seconds or under being good.
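For reference, here is a sketch of observing LCP in the field using the Largest Contentful Paint API; the last entry received before the page is hidden is generally treated as the final value:

```typescript
// Sketch: observing largest-contentful-paint in the browser.
let lcp = 0;
const lcpObserver = new PerformanceObserver((list) => {
  const entries = list.getEntries();
  // Each new candidate replaces the previous one; the last entry wins.
  lcp = entries[entries.length - 1].startTime;
});
lcpObserver.observe({ type: 'largest-contentful-paint', buffered: true });

document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') {
    console.log(`LCP: ${lcp.toFixed(0)} ms (good is under 2500 ms)`);
  }
});
```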
The second new metric they've introduced is called ‘cumulative layout shift’, CLS, which I had a guess at what it would be and I'm pretty happy to say it's a measurement of visual stability. CLS quantifies how much a page's content visually shifts around. A low CLS score is a signal to developers that their users aren't experiencing unexpected content shifts, and a CLS score below 0.1 is considered good. You've probably all experienced this, especially on advert-heavy websites, and I think it's actually sometimes used intentionally as a trick to get you to click on ads: you start loading a web page and it becomes interactive, so you can click on stuff, and then just as you go to click on something, the whole navigation shifts as a new element loads in. I've certainly seen this on those clickbait types of sites saying 'here are 10 pictures of what celebrities look like now', where they want you to click previous and next, and the last thing that normally loads is an advert very close to the previous/next links - so you look at the image, and half a second later you go to click next, but just before you do, a new advert loads, pushes the next/previous links down, and sure enough there's an advert there that you've accidentally clicked on. That can just be frustrating for users anyway, because people going through sites quickly may want to start clicking and interacting with things before the page has fully loaded. So cumulative layout shift, CLS, is now Lighthouse's way of measuring how much of the page shifts around as it's loading.
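Here is a sketch of how CLS can be approximated in the field with the Layout Instability API; each 'layout-shift' entry carries a score, and shifts that happen shortly after user input are excluded, mirroring how the metric is defined:

```typescript
// Sketch: approximating cumulative layout shift in the browser.
let cls = 0;
const clsObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    // LayoutShift entries aren't in the default TypeScript DOM types yet.
    const shift = entry as unknown as { value: number; hadRecentInput: boolean };
    if (!shift.hadRecentInput) {
      cls += shift.value;
    }
  }
});
clsObserver.observe({ type: 'layout-shift', buffered: true });

document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') {
    console.log(`CLS: ${cls.toFixed(3)} (good is under 0.1)`);
  }
});
```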
The third new metric is called total blocking time, TBT, and this quantifies load responsiveness, measuring the total amount of time when the main thread was blocked long enough to prevent input responsiveness. TBT measures that blocked time between first contentful paint, FCP, and time to interactive, TTI - one of the initial metrics I spoke about. It's a companion to TTI and it brings more nuance to quantifying main thread activity that blocks a user's ability to interact with your page. So this is essentially looking at the window from when content starts appearing on the page to when it becomes interactive, and adding up how long the main thread was blocked during that window. It's a really nice layer of nuance, as they say. Before, TTI was just from when we requested the web page to when it was interactive, and that could sometimes be artificially inflated if the first contentful paint took a long time to start. So it might be that the delay between first contentful paint and interactive is really small, but TTI is still really high because you waited three or four seconds before FCP even started. Total blocking time measures the blocking between those two points - between first contentful paint and time to interactive - so at a glance you can see whether you've got issues with the web page starting to render, or whether the block is actually in terms of interactivity. And there's another note saying that, additionally, total blocking time correlates well with the field metric first input delay, FID, which is a Core Web Vital.
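Lighthouse computes TBT in the lab between FCP and TTI; the browser-side sketch below only approximates the idea using the Long Tasks API, summing the "blocking" portion (anything over 50 ms) of each long task:

```typescript
// Sketch: approximating total blocking time with the Long Tasks API.
const BLOCKING_THRESHOLD_MS = 50;
let tbt = 0;

const tbtObserver = new PerformanceObserver((list) => {
  for (const task of list.getEntries()) {
    // Only the portion of a long task beyond 50 ms counts as "blocking".
    const blocking = task.duration - BLOCKING_THRESHOLD_MS;
    if (blocking > 0) {
      tbt += blocking;
    }
  }
});
// Note: buffered 'longtask' entries may not be available in every browser.
tbtObserver.observe({ type: 'longtask', buffered: true });

document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') {
    console.log(`Approximate TBT: ${tbt.toFixed(0)} ms`);
  }
});
```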
Along with this, Lighthouse have recalculated how they score performance. Each of these metrics has a weight associated with it, so you get a 0 to 100 performance score when you run these audits, and because they now have new metrics and have shifted some old metrics around, they have recalculated the weightings. I've put a link in the show notes at search.withcandour.co.uk to some analysis they did of how real websites were affected by these score changes - they actually published the data they collected - and the summary is that around 20% of sites see noticeably higher scores, around 30% have hardly any change, and around 50% see a decrease of at least five points. Along with this, really helpfully, they've published a scoring calculator which helps you see the difference between Lighthouse version 5 and Lighthouse version 6 scores. When you run an audit now with Lighthouse version 6, the report comes back with a link to the calculator with your audit results already populated, so if you've lost 10 points between version 5 and version 6, it will show you exactly where you're losing those points. And really interestingly, version 6 of Lighthouse comes with source location links where it can provide them, which means that some of the issues Lighthouse finds on a page can be traced back to a specific line of source code, with the report stating the exact file and line that's relevant. This is to make these issues really easy to explore in DevTools, and it takes us forward a couple of steps: rather than the historically more vague advice of 'you need to look at this category of thing to make your site faster', it will, where possible, point to the specific place in your source code where you can make improvements.
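To show how a weighted score like this comes together, here is a minimal sketch; the weights below are an approximation of the v6 weighting (paint and main-thread metrics carrying the most), and the official scoring calculator linked above is the authoritative reference:

```typescript
// Sketch: combining individual metric scores (each normalised to 0-1) into an
// overall 0-100 performance score. Weights are illustrative, not official.
const V6_WEIGHTS: Record<string, number> = {
  'first-contentful-paint': 0.15,
  'speed-index': 0.15,
  'largest-contentful-paint': 0.25,
  'interactive': 0.15,
  'total-blocking-time': 0.25,
  'cumulative-layout-shift': 0.05,
};

function overallScore(metricScores: Record<string, number>): number {
  let weighted = 0;
  let totalWeight = 0;
  for (const [metric, weight] of Object.entries(V6_WEIGHTS)) {
    weighted += (metricScores[metric] ?? 0) * weight;
    totalWeight += weight;
  }
  return Math.round((weighted / totalWeight) * 100);
}

// Example: a page strong on paint metrics but with heavy main-thread blocking.
console.log(
  overallScore({
    'first-contentful-paint': 0.95,
    'speed-index': 0.9,
    'largest-contentful-paint': 0.85,
    'interactive': 0.7,
    'total-blocking-time': 0.4,
    'cumulative-layout-shift': 0.9,
  })
);
```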
Some of you may have been using Lighthouse through a Chrome extension, because there is actually a Chrome extension for Lighthouse. Part of the announcement for version 6 is that they've said, unfortunately, the browser extension was too complicated to support, and the Lighthouse team felt that because the Chrome DevTools Lighthouse panel is a better experience - in that the report integrates with the other panels - they could reduce their engineering overhead by simplifying the Chrome extension. So instead of running Lighthouse locally, the browser extension now uses the PageSpeed Insights API, and they said they recognise this will not be a sufficient replacement for some of their users, because there are some key differences.
The key differences are that PageSpeed Insights is unable to audit non-public websites, because it runs via a remote server - whereas if you run your Lighthouse reports through Chrome DevTools, they run locally in that browser, so you can audit anything you can see in your browser, essentially. Also, PageSpeed Insights is not guaranteed to use the latest Lighthouse release, and PageSpeed Insights is a Google API, so using it constitutes accepting the Google API Terms of Service, which not everyone may be able to accept. So there are, it seems, some quite good reasons, if you are using the Chrome extension for Lighthouse, to switch to the built-in version in Chrome DevTools. Even if you're not a technical person or a developer, as I said it's really easy to access - you just hit F12, it's there under Lighthouse, you check which audits you want to run, hit run, and it works in pretty much the same way.
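For anyone who wants to script that PageSpeed Insights route themselves, here is a minimal sketch of calling the PSI v5 API directly, which is what the extension now uses under the hood; the 'YOUR_API_KEY' value is a placeholder:

```typescript
// Sketch: calling the PageSpeed Insights v5 API. An API key is optional for
// light usage; the key shown is a placeholder, not a real credential.
async function runPsi(url: string): Promise<void> {
  const endpoint = new URL(
    'https://www.googleapis.com/pagespeedonline/v5/runPagespeed'
  );
  endpoint.searchParams.set('url', url);
  endpoint.searchParams.set('strategy', 'mobile');
  // endpoint.searchParams.set('key', 'YOUR_API_KEY');

  const response = await fetch(endpoint.toString());
  const data = await response.json();

  // The Lighthouse result sits under lighthouseResult in the PSI response.
  const score = data.lighthouseResult?.categories?.performance?.score;
  console.log(`PSI performance score for ${url}: ${(score ?? 0) * 100}`);
}

runPsi('https://example.com').catch(console.error);
```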
There's actually a whole host of further information and changes in version 6 of Lighthouse; these were just the main ones I wanted to make everyone aware of, the ones I think will impact recommendations and the things we look at for SEO, and that will give us easier conversations with developers. It's important, if we're doing technical SEO, that we understand these new metrics, because they will become standard in terms of how people look at performance. But there are loads more updates, with things like continuous integration and a Lighthouse audit looking at unused JavaScript, and a whole bunch of other things, so if you're working with developers I would strongly encourage you to get them to go and read the full post, which we'll link to, as I said, at search.withcandour.co.uk. That's everything I've got time for in this episode. Thank you very much for listening; we'll be back in one week's time, of course, which will be Monday the 1st of June, so I hope you'll tune in.