Lesson 3 - What Is Indexation?

Transcript

Once Google has found a page, the next question is whether it should store that page and make it eligible to appear in the search results.

This process is called indexation and it’s a fundamental concept in SEO because a page that is not indexed cannot appear in the search results at all.

It doesn’t matter how well written it is, how many backlinks it has, or how useful it may be. If it’s not indexed, it’s going to be invisible to searchers.

Now, by the end of this video, you’re going to understand what indexation means, how to check whether a page has been indexed, why mobile-first indexing matters, and how files like robots.txt, as well as your meta tags, influence which pages Google decides to index.

After a page has been crawled, search engines then decide whether or not to index it. Now, indexing is the process where Google tries to understand what the page is about and whether it’s suitable to be shown in the search results.

During indexing, Google analyzes the content of your page and that includes the text, the images, the videos, the metadata, and the overall structure of your web page. This information is then stored in Google’s index, which you can think of as a huge library of pages that Google can choose from when somebody searches something.

Now, it’s important to understand that crawling does not guarantee indexing. A page can be discovered and still never make it into the index. And there are many different reasons why this can happen.

For example, the content might be very thin or duplicated elsewhere. The page could be blocked intentionally. The site might be trying to push too many pages at once. And in some cases, the page or even the website might not meet the so-called quality threshold that Google is looking for.

Now, there are a few different ways to check whether or not a page has been indexed in Google. And in this section of the video, we’re going to go through some of the most reliable ones. Now, the first one we’re going to look at is Google Search Console, which I’ve already spoken about a few times in this course.

So, what I decided to do is just have a quick detour and show you how to set this up for your website if you’ve not already got Google Search Console set up, so you can follow along with the next steps.

Now, the first thing you want to do is go to search.google.com/search-console/about. And when you go there, you’ll see a page that looks like this. You’ll want to click on this start now button. And when you do that, you’ll get taken through to a page that looks like this.

If you don’t have any properties inside of your Google Search Console right now, this is exactly what you’ll see. When you come in here, you want to click add property and then add property again.

And there’s a couple of different ways that you can add your website to Google Search Console, but I’m going to show you the easiest one, which is using the URL prefix.

So, what you want to do is go over to the website that you want to add and copy the URL of the homepage. Click into this box and then paste that into there, making sure that you keep your HTTPS at the beginning. Then click on continue and just give it a couple of moments and then it’s going to give you a few different ways for you to verify that you own this website.

Now, there are a few different ways that you can do this and you can choose whichever one suits you best. The way that I personally find easiest is to just download this HTML file by clicking here, letting that download locally onto your PC.

And then what you need to do is you need to upload that file to the root directory of your website. So you’re going to either need to give this to your developer or if you have access to an FTP client and you can get into your website’s files, you want to upload this to the root directory of your website.

So I’m going to do that right now. Okay, so now I’ve just gone into FileZilla and I’m looking at the root directory of my website. And you can see there’s already one of these verification files in here.

This is from when I actually verified it on my regular account, but I’m going to put another one in here now so I can show you. So I’ve got my downloaded file over here, and I’m just going to drag this over to here like this. And we’re just going to let that upload. There we go.

Now I’m going to come back to search console and I’m going to click on this verify button right here just to let search console know that it’s now on the server and it can go and take a look. And there we go. So now it says ownership verified verification method HTML and I can click on go to property and it’s going to take me through to search console.

Now, one thing you’ll notice here is that when I do this I’ve already got loads of data. That’s because search console has been set up for this website using another one of my Google accounts. But for you, if you check in here and it says that there’s no data and to check back in a couple of days, just know that that’s perfectly normal and to be expected.

There are a few different ways for you to check whether or not a page has been indexed in Google Search Console, but the easiest way is by using the URL inspection tool.

All you need to do is go over to the page that you want to check and copy its URL. Then go over to Search Console and find the bar at the top of the page that says “Inspect any URL in” followed by your website address.

Paste that URL into there and hit enter. And then it will just take a couple of moments and then it will give you back this report here which will tell you very clearly in this section here whether or not the page has been indexed.

So in my case, you can see it says right here page indexing. Page is indexed.

Another way to check whether or not a page has been indexed, and the quickest one, is by using the site: search operator in Google. You’ll want to type site: followed by the URL of the page you want to check into Google, and then see whether or not the page comes up.

If the page appears, it means that it’s been indexed. If it doesn’t appear, it’s most likely not been indexed. This is a much quicker and easier way than going through the rigmarole of setting up Search Console, but it will not give you back any information as to how to resolve the issue if it’s not indexed.
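If you want to run that site: check for a handful of pages, here is a minimal Python sketch that just builds the Google search link for you to open in a browser. The function name site_query_url and the example.com address are made up for illustration, and the script doesn't contact Google at all.

```python
from urllib.parse import quote_plus


def site_query_url(page_url: str) -> str:
    """Build a Google search link for the query `site:<page_url>`.

    Hypothetical helper: it only formats the link; it does not fetch
    or interpret any search results.
    """
    query = f"site:{page_url}"
    # quote_plus percent-encodes characters like ':' and '/' so the
    # whole query survives as a single URL parameter.
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    # example.com is a placeholder; swap in the page you want to check.
    print(site_query_url("https://example.com/blog/my-post"))
```

Open the printed link in your browser and see whether the page comes up, exactly as you would when typing the query by hand.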

Mobile-first indexing means that Google primarily uses the mobile version of your website when deciding how to index and rank your pages. This shift happened because most people are now browsing the web on their mobile devices rather than their desktop computers, which you’ll usually see reflected in your own data on Google Analytics.

Now, from an SEO standpoint, this means that the mobile version of your site is no longer secondary. It’s the main version that Google is looking at. So, if your mobile site contains less content than your desktop version, you could be missing out on some valuable ranking opportunities since Google is expecting the same content to be available on both versions of your website, even if it’s laid out a bit differently to suit the mobile format.
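To make that mobile versus desktop comparison a bit more concrete, here is a rough Python sketch that lists the words that appear in the desktop HTML but not in the mobile HTML. It assumes you have already saved the HTML of both versions yourself. The names TextExtractor and missing_on_mobile are invented for this example, and comparing word lists is only a crude approximation of how the two versions differ, not how Google evaluates them.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible words of a page, ignoring script and style blocks."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.lower().split())


def missing_on_mobile(desktop_html: str, mobile_html: str) -> set:
    """Return the words found in the desktop HTML but not in the mobile HTML."""
    desktop, mobile = TextExtractor(), TextExtractor()
    desktop.feed(desktop_html)
    mobile.feed(mobile_html)
    return set(desktop.words) - set(mobile.words)


if __name__ == "__main__":
    # Made-up snippets standing in for the two versions of one page.
    desktop_page = "<h1>Red shoes</h1><p>Free delivery on all orders</p>"
    mobile_page = "<h1>Red shoes</h1>"
    print(sorted(missing_on_mobile(desktop_page, mobile_page)))
```

If the function returns a long list of words for an important page, that's a hint the mobile version is carrying less content than the desktop one.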

A lot of indexing problems come from simple technical settings being wrong. Two of the biggest causes are mistakes with the robots.txt file and also misplaced meta tags. In this section, we’re going to start by looking at the robots.txt file, and then we’re going to talk about the noindex tag.

There is another very common culprit called the canonical tag, but we’re going to cover that in more depth in a later lesson. So, the robots.txt file also sits at the root of a website in the exact same place as we put the Google Search Console verification file earlier.

It tells search engine crawlers where they can and they can’t go. Now, when it’s set up correctly, it can be really useful for keeping the crawlers out of areas you don’t need them in. For example, admin sections or maybe test folders.

But the problem is that it’s very easy to get wrong, and one incorrect rule can block search engines from entire sections of your website or, in the worst-case scenario, block the whole website from being crawled altogether.
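To show how small the difference between a sensible rule and a disastrous one can be, here is a short Python sketch using the standard library's urllib.robotparser. The two robots.txt examples, the example.com URLs, and the helper name is_crawlable are all made up for illustration. Keep in mind that Python's parser is a simplified reading of the rules and may not match Google's own matching in every edge case, so treat this as a quick sanity check rather than the final word.

```python
from urllib.robotparser import RobotFileParser

# A sensible file: keep crawlers out of the admin and test areas only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /test/
"""

# One rule that is too broad: "Disallow: /" blocks the entire site.
BROKEN_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for label, rules in (("sensible", ROBOTS_TXT), ("broken", BROKEN_ROBOTS_TXT)):
        for url in ("https://example.com/blog/post", "https://example.com/admin/login"):
            print(label, url, is_crawlable(rules, url))
```

With the sensible file, only the admin URL is blocked. With the broken one, every URL on the site is blocked, which is the worst-case scenario described above.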

The other big source of issues is meta tags, especially the noindex tag I mentioned earlier. Now, meta tags as a whole give page-level instructions to search engines, and the noindex tag specifically tells Google and other search engines not to index a page at all.

Sometimes it’s added on purpose, like on thank-you pages after form submissions or on private content. But the real problem is when it gets added by accident, and one of the most common situations where this happens is when a developer is building a new website and leaves a noindex tag across the whole thing while it’s being worked on.

Then the new site goes live and the traffic drops and no one can work out why nothing is indexing. Another common situation is when a single new page is being built and the developer adds a noindex tag to stop it getting indexed while it’s still being developed. Then when that page goes live and the noindex tag doesn’t get removed, it doesn’t get indexed and nobody understands why.

So if you’re noticing that pages aren’t getting indexed in search results, it’s always worth checking your robots.txt file and then checking to see whether there are noindex tags that have been added to the pages that aren’t getting indexed.
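In a page's HTML, the tag usually looks like <meta name="robots" content="noindex"> inside the head section. As a rough way to check for it, here is a minimal Python sketch that scans a page's HTML for that tag. It assumes you have already saved or copied the page source yourself; the names RobotsMetaParser and has_noindex are invented for this example, and it only looks at meta tags in the HTML, nothing else.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots"> style tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        # "robots" applies to all crawlers; "googlebot" targets Google only.
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    # Made-up page source standing in for a page that was left on noindex.
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(has_noindex(page))
```

If this comes back True for a page you actually want in the search results, that leftover tag is the first thing to remove.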

So, in summary, you now understand that indexation means your pages are added to a search engine’s index and can appear in search results. You also know how to check if pages are being indexed by Google. And you know a few things that you can check if your pages are not getting indexed.

In the next video, I’m going to teach you about sitemaps and how they can help search engines discover your content more effectively.

Meet Dan M. Jones