Advanced SEO Glossary
Bow-to-Stern Latency – Noun phrase. The amount of time that elapses between the last time a search engine caches the root URL of a site and the time it caches the deepest reachable page on that site.
Cache-to-Ranking Latency – Noun phrase. The length of time that elapses from a page’s contents being reported in a search engine cache image until the page contents are found for specific queries.
Clustered Results – Noun phrase. You see this most often with Ask, but it happens in Google quite a bit and I think Yahoo! also does it sometimes. You’ll see 2 pages from the same site, the 2nd one indented. Now, Ask likes to put little folders in the margin to show how smart they are about clustering search results from multiple sites under a single topic. But did you know that Google clusters sites and hides them from you? If you change your Google Preferences to show more than ten listings per page, you’ll see the clustered listings. That is why so many data center tools show you different rankings from what you think you’re seeing.
Collapsed Results or Collapsed Listings – Noun phrases. Usually what I call Clustered Results when I cannot think of the word “cluster” (which is more often than not). Technically, these expressions should really only refer to the hidden clusters described above.
Crawl-to-Cache Time – Noun phrase. The amount of time that elapses from when a search engine fetches a page from a Web site until the page’s contents appear in the search engine’s cache report for the page. Abbreviated as CCT.
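As a rough illustration, CCT is just the difference between two timestamps. The sketch below assumes you take the fetch time from your own server logs and the cache time from the date shown in the search engine's cache report. The function name and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def crawl_to_cache_time(fetched_at: datetime, cached_at: datetime) -> timedelta:
    """Return the time elapsed between a crawler fetch and the cache report date."""
    if cached_at < fetched_at:
        # A cache date earlier than the fetch belongs to an older crawl.
        raise ValueError("cache timestamp precedes the fetch")
    return cached_at - fetched_at


# Hypothetical timestamps: fetch seen in the server log, date shown in the cache report.
fetched = datetime(2007, 6, 1, 8, 30)
cached = datetime(2007, 6, 4, 20, 30)
cct = crawl_to_cache_time(fetched, cached)
print(cct)  # 3 days, 12:00:00
```

The same subtraction works for the other latencies in this glossary once you decide which two events you are timing.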
Crawl-to-Passed-Value Time or Crawl-to-Passed-Value Latency – Noun phrases. The amount of time that elapses from when a search engine fetches a page until the links on the page pass value (PageRank or anchor text) to their destinations.
Crawl-to-Ranking Time – Noun phrase. The amount of time that elapses from when a search engine fetches a page from a Web site until the page is returned in the top ten results for a designated query.
FCP – Acronym. Frequently Cached Page. A page that is crawled and cached by a search engine on a very frequent basis, usually every two weeks or less.
Filter – Noun. A process whereby a Web document is evaluated and either flagged as “spam”, “potential spam”, “adult-oriented content”, “illegal content”, or something else. Each search engine employs multiple filters. Some filters were designed into the algorithms from the start. Some filters have been added as afterthoughts as search engines have had to react to manipulative or otherwise previously undetected inappropriate content.
Filthy Linking Rich Principle – Noun phrase. The more links a document has accrued, the more links the document will accrue. Stated another way, the more visible a document is in search engine results, the more likely the document is to accrue links, and hence the more visible the document becomes in search engine results.
First Visibility Principle – Noun phrase. The first document to cross the Idiot threshold (q.v.) becomes the first authority on the topic.
Fuzzy point – Noun phrase. The approximate state of knowledge where the information about a document’s indexing status and information about the number of queries to which the document is relevant are approximately equal.
Host – Noun. A much-used term in academic search engineering literature to distinguish between “Web document collections” on a systemic level. A host is not necessarily the same as a site. Hosts are generally defined to be either entire domains (example.com) or sub-domains (sub1.example.com). A domain to which one or more sub-domains belong would be treated as multiple individual hosts, distinct from one another. A host is easier to identify than a Web site, which may be only a part of a host’s content.
Idiot Threshold – Noun phrase. A figurative status at which a document has accrued enough meaningless links through word-of-mouth that the document assumes the status of being an authority on a topic.
Index – Noun. The database(s) against which queries are resolved. All of the major search engines maintain multiple indexes. Each is a separate, distinct database, either physically (kept in separate files) or virtually (logically segmented portions of a master database). The expression database is probably inappropriate for describing what the search engines maintain. When you see me refer to Main Index, think of that as the “static Web page index”. Other indexes may include Image Indexes, News Indexes, and Blog Indexes. I have some ideas on how these various indexes are built, but I don’t expect to share them on this blog.
Index – Verb. The process of adding information about Web content to a search engine’s database about the Web. The indexing process may entail considerable effort depending upon the complexity and applicability of the document.
Indexer – Noun. A type of program that search engines use to update their databases with information about retrieved and parsed Web documents. You rarely see even knowledgeable SEO forum moderators and admins speak of indexers and parsers, perhaps out of a misguided concern that they will confuse people who are new to search engine optimization. Unfortunately, those new people visit the forums to learn about SEO, so teaching them the wrong terminology does them a great disservice.
Influencer – Noun. A Web site or individual whose content is deemed to be influential in adjusting search result (q.v.) rankings, usually either through the creation of new content or the placement of links to other documents. Some blogs (q.v.) can be powerful influencers.
Internal Links or Internal Linkage – Noun phrase. These are the links within your own site that point to other pages in your site. Search engines may use a different, host-level definition for internal links. It is possible that all the major search engines now distinguish between host-internal and host-external links. See host for more information.
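Here is a minimal sketch of what a host-level test for internal links might look like. It assumes that "same host" means the hostnames match exactly, so example.com and sub1.example.com count as different hosts, as in the Host entry. The function name is hypothetical, and nothing here describes how any search engine actually does it.

```python
from urllib.parse import urljoin, urlsplit


def is_host_internal(page_url: str, href: str) -> bool:
    """Return True when a link on page_url points to the same host."""
    # Resolve relative links such as "/about" against the linking page.
    target = urljoin(page_url, href)
    # urlsplit(...).hostname is lowercased, so the comparison ignores case.
    return urlsplit(page_url).hostname == urlsplit(target).hostname


print(is_host_internal("http://example.com/a", "/b"))                        # True
print(is_host_internal("http://example.com/a", "http://sub1.example.com/"))  # False
```

A site-level definition would need extra knowledge about which hosts or directories belong to the same site, which is exactly why the host is the easier unit to work with.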
Internal PageRank – Noun phrase. This is the actual static value that Google computes and adds to dynamic (run-time, query-time) relevance scores to determine search results rankings. Matt Cutts distinguished between Internal PageRank and Toolbar PageRank on his blog. He also confirmed that he was talking about Internal PageRank where I cited him in my PageRank: Where it helps, where it doesn’t help, and other facts post at Spider-Food in July 2006. Most SEO forum moderators and admins appear to be speaking about Internal PageRank when they discuss PageRank at all, except where they qualify their remarks to address the Toolbar PR value (that nearly all moderators and admins now tell people to ignore). The Toolbar PR value is a proxy value and it is only published 3-4 times a year, making it a virtually worthless indicator of quality or value.
Link mass – Noun phrase. The combination of all connected links that lead to any given page in a hypertext document collection. Absolute link mass cannot be measured. Relative link mass can be approximately measured.
Link pathway – Noun phrase. Two or more pages connected as in a chain (a “path”) by hypertext links.
Link pathway segment – Noun phrase. A segment or portion of a larger link pathway at least 1 document long and at least 2 documents shorter than the link pathway (the beginning and terminating documents in the link pathway cannot be in the link pathway segment).
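To make the length rule concrete, the sketch below lists every segment of a pathway. It assumes a pathway is represented as an ordered list of page names and that a segment is a contiguous run of pages between the two end documents. The function name and page labels are hypothetical.

```python
from typing import List


def pathway_segments(pathway: List[str]) -> List[List[str]]:
    """List every contiguous run of pages that excludes both end documents."""
    # Dropping the first and last documents leaves at most len(pathway) - 2 pages,
    # which is why a segment is always at least 2 documents shorter than the pathway.
    interior = pathway[1:-1]
    return [
        interior[start:end]
        for start in range(len(interior))
        for end in range(start + 1, len(interior) + 1)
    ]


print(pathway_segments(["A", "B", "C", "D"]))  # [['B'], ['B', 'C'], ['C']]
print(pathway_segments(["A", "B"]))            # []
```

A two-page pathway has no segments at all, because there is nothing left once the beginning and terminating documents are excluded.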
Link Trap – Noun phrase. Similar to a link bait page, a link trap is usually built by cheaters in reciprocal linking schemes where the outbound links are designed not to pass value.
MCP – Acronym. Moderately Cached Page. A page that is crawled and cached by a search engine on an occasional basis, usually every two to six weeks.
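The FCP and MCP definitions boil down to two thresholds on the interval between cache dates. The sketch below assumes you have already worked out the typical number of days between caches, and it treats an interval of exactly two weeks as FCP. That boundary choice and the function name are mine. Intervals longer than six weeks are returned as None because neither definition covers them.

```python
from typing import Optional


def classify_cache_frequency(days_between_caches: float) -> Optional[str]:
    """Label a page FCP or MCP from its typical interval between cache dates."""
    if days_between_caches <= 14:   # every two weeks or less
        return "FCP"
    if days_between_caches <= 42:   # every two to six weeks
        return "MCP"
    return None                     # outside both definitions


print(classify_cache_frequency(7))   # FCP
print(classify_cache_frequency(30))  # MCP
```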
Opacity – Noun. A metric or measure of a range of search listings (q.v.) for a query which are not obviously optimized to be included in the search results. A perfectly opaque search result (q.v.) has an Opacity value of 1.0, reflecting the fact that none of the search listings (q.v.) are obviously optimized for placement in the result.
Page Zone, or Zone – Noun phrase, noun. An arbitrarily designated visible portion of a Web page. Page zones are used for advertising and link placement.
PageRank Trap – Noun phrase. A specialized form of link bait, a PageRank trap is a page whose outbound links only point to other pages on the same domain or site. Usually an article or forum discussion thread that attracts links from other sites.
Parser – Noun. A type of program used by search engines to break down your HTML pages into components for indexing. The parser strips your indexable content and passes it to one or more indexers. Many SEO forum moderators and admins who should know better continue to speak of “spiders” doing the parsing and indexing. Spiders basically retrieve files and place them into (search engine internal) queuing areas for the parsers to munch on.
Partially Indexed Listings – Noun phrase. See URL Listings below.
Preservation, or Preservation Principle – Noun, noun phrase. The belief that a Web site can retain all or most of its PageRank by “hoarding” or “sculpting” PageRank. The Preservation Principle is an SEO myth.
Quality Links – Noun phrase. A nonsense expression with no real value or purpose other than to act as a catchall for the types of links people think are better than “those other links”. Googlers use “quality links” as a subtle way of telling people to stop getting cheap spammy links. Many SEO forum moderators and admins use “quality links” in a somewhat broader but similar fashion, if only because they don’t know exactly what criteria make links good for any particular search engine but they recognize that people who are asking about linkage have a problem. Nearly everyone else seems to use the expression to refer to their (usually non-performing) backlinks. I wrote about high quality links at SEOmoz (in a post designed to rank for “high quality links” on the basis of content — but the lesson passed over everyone’s head, except for Aaron Pratt who saw what I was doing right away).
Saturation – Noun. 1) The extent to which a Web site’s pages are included in a search engine index. 2) The extent to which a Web site’s pages appear in a given query’s search result. 3) The extent to which a link profile is distributed across a Web site’s pages.
SERP – Acronym for Search Engine Results Page. Everyone seems to know this acronym by now. I have always hated it even though I now reluctantly use it. SRP (search results page) would be better, since it’s all inclusive. You can have a DRP (Directory Results Page) which some people might argue should be called a DSRP (Directory Search Results Page). I still get click throughs from Yahoo! and DMOZ directory page listings (or a DLP, Directory Listings Page).
Sitelinks – Noun. Google invented this term, which is better than my classic “little clustered links under the main listing”. Sitelinks are those “little clustered links under the main listing” that deep link into the site by category or topic. Many people wonder how these Sitelinks appear. Googlers always say, “That’s algorithmically determined and we have no control over them” — meaning, “We wrote special commands into our software to create those things and we’re not going to tell you what criteria are used to decide which sites get them.” My best guess is that sites that have more than 1,000 pages of content, clear content categorization in their non-breadcrumb internal links, and lots of deep links from other domains are good candidates for Sitelinks. Other criteria are probably taken into consideration. Sitelinks are only shown for the top listing in a popular query result.
Sitemap – Noun. A page on your Web site that links to all the other pages, or at least to all the important section top-level pages. Google has usurped this expression for their “Google Sitemaps” feature (now incorporated into the XML Sitemaps standard supported by several major search engines), where you can upload a file listing all of your pages for their crawlers. I have noticed that Googlers are now speaking of HTML Sitemaps to distinguish those Web site pages from the XML Sitemaps. I think it would be best if everyone adopted the convention of saying “XML Sitemap” or “HTML Sitemap” so we are all on the same page.
Transparency – Noun. A metric or measure of a range of search listings (q.v.) for a query which are obviously optimized to be included in the search results. A perfectly transparent search result (q.v.) has a Transparency value of 1.0, reflecting the fact that all of its listings are obviously optimized for inclusion in the search result.
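One way to put numbers on Transparency and Opacity is to treat each as a simple fraction of the listings you reviewed. The definitions only fix the end points (1.0 when all, or none, of the listings are obviously optimized), so the fractions in between are an assumption of this sketch. Judging whether a listing is "obviously optimized" remains a manual call, and the function names are hypothetical.

```python
from typing import List


def transparency(obviously_optimized: List[bool]) -> float:
    """Fraction of listings judged to be obviously optimized (assumed formula)."""
    if not obviously_optimized:
        raise ValueError("need at least one listing")
    return sum(obviously_optimized) / len(obviously_optimized)


def opacity(obviously_optimized: List[bool]) -> float:
    """Fraction of listings judged not to be obviously optimized (assumed formula)."""
    if not obviously_optimized:
        raise ValueError("need at least one listing")
    not_optimized = len(obviously_optimized) - sum(obviously_optimized)
    return not_optimized / len(obviously_optimized)


# Hypothetical review of a top-ten result: 4 listings look obviously optimized.
flags = [True] * 4 + [False] * 6
print(transparency(flags))  # 0.4
print(opacity(flags))       # 0.6
```

Under this reading the two values always add up to 1.0 for the same range of listings, so you only need to report one of them.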
Trust – Noun. Currently the latest SEO buzzword. Generally speaking, the SEO community picks up on a concept about six months to two years after it’s been worked through by the search engineers. Hardcore spammers (the ultimate “Black Hat” SEOs) are usually pretty good at detecting trends before everyone else. Trust has now officially been done to death. It is incorporated into every algorithm (including Windows Live even though we all agree that Microsoft still has a way to go) and the search engines are already looking at other issues. Trust is being placed in the hands of the Webmasters, but most Webmasters don’t seem to want the responsibility.
Update – Noun. From the SEO side, an update is any noticeable change to the way a search engine behaves. From the search engines’ side, an update is any intended change in a search engine’s makeup or data. Matt Cutts offers an incomplete explanation of a Google update in his December 2006 Explaining Algorithm Updates and Data Refreshes post. He wrote a similar post in September 2005 with What’s An Update?. I don’t expect Matt to confirm every algorithmic change. That would pretty much defeat the purpose of many of them. Yahoo! and Windows Live occasionally issue “weather reports”. Matt has informally issued some on Google’s behalf.
Uncertainty Principle of SEO – Noun phrase. The two states of a Web document (its indexing status and its relevance to a given set of queries) cannot be determined at the same time. The more queries to which a page is known to be relevant, the less information about the page’s indexing status can be determined. The more information about a page’s indexing status that is known, the fewer queries to which the page is known to be relevant.
Universal Search – Noun phrase. The practice by major search engines like Ask, Google, Live, and Yahoo! of melding results from several search databases to provide the user with a more diverse selection of search listings (usually combining video, news, blog, Web, book, and other search tools).
Universal Search Injection – Noun phrase. The practice by Universal Search-capable services of augmenting search results (q.v.) with additional, supplemental listings not normally included in the standard 1..10 listings. Universal Search Injection listings usually have more complex structures, provide more information, and are more transient in nature than normal search listings (q.v.). Universal Search Injections may include links to several documents.
URL Listings or URL-only Listings – Noun phrase. These are the site listings that appear in Google with nothing more than a URL. Matt Cutts explained that they are uncrawled links that Google knows something about from inbound linkage. Google will (or used to) occasionally pull a description from the Open Directory Project for uncrawled links, but you often see them without any description at all. Uncrawled links are not shown in Google’s SafeSearch mode. Matt also discussed them here.
Validate – Verb. Every time I use this word people reach for their W3C manuals. When I speak of search engines validating Web sites, I don’t mean they are looking to see if the HTML code meets some arbitrary standard. I mean they pass each URL through a process whereby they establish, according to their own criteria, that the site is “not spam”. Many spam sites appear to validate. The search engines are not perfect. Nonetheless, many spam sites don’t last long because they don’t validate or their validation is revoked. Maybe I could have used a better expression, but I can’t think of one.