
Balance stopwords to your advantage

August 27th, 2008

Create better content – reduce stopwords

The terms ‘stopword’ and ‘stopwords’ are used by Google and others to describe words that add little or no semantic content to a piece of text.

Generally these include adverbs, prepositions and conjunctions, though this is not always the case. A few common examples are ‘and’, ‘what’, ‘where’, ‘is’, ‘to’, ‘why’ and ‘if’.
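As a minimal sketch of the idea, the Python snippet below separates the stopwords from the remaining words in a sentence. The `STOPWORDS` set and the `split_stopwords` helper are illustrative assumptions built only from the examples in this post; real lists are far longer and differ from system to system.

```python
import re

# Illustrative set taken from the examples in this post, not any engine's real list.
STOPWORDS = {"and", "what", "where", "is", "to", "why", "if"}

def split_stopwords(text):
    """Return (content_words, stopwords_found) for a piece of text."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    content = [w for w in words if w not in STOPWORDS]
    stops = [w for w in words if w in STOPWORDS]
    return content, stops

content, stops = split_stopwords("What is SEO and why is it useful to bloggers?")
print(content)
print(stops)
```

With such a short list, a word like ‘it’ is still counted as content, which shows how much the outcome depends on the list a given system uses.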

Some retrieval systems have far more extensive lists. The University of Neuchatel publishes extensive stopword lists in many languages. For English, look at the second table, in the column entitled ‘Stopword List’. There is also a table of the most frequently used English words, which is worth a look.

Natural language text or data is generally stored in text search and retrieval systems such as search engines and document storage systems.

Some systems, however, replace stopwords with tokens or markers in the stored text to save on storage and speed up searches. When the retrieved text is returned as the result of a search, the full natural language version of the text is displayed.
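The snippet below is a rough sketch of that round trip, not a description of how any particular system does it. The marker format (`~0`, `~1`, ...) and the `encode` and `decode` helpers are assumptions made purely for illustration.

```python
# Illustrative list from the examples in this post; the marker scheme is hypothetical.
STOPWORDS = ["and", "what", "where", "is", "to", "why", "if"]
TO_MARKER = {word: f"~{i}" for i, word in enumerate(STOPWORDS)}
FROM_MARKER = {marker: word for word, marker in TO_MARKER.items()}

def encode(text):
    """Swap each stopword for its marker before the text is stored."""
    return " ".join(TO_MARKER.get(w, w) for w in text.split(" "))

def decode(stored):
    """Restore the natural language text when a result is displayed."""
    return " ".join(FROM_MARKER.get(t, t) for t in stored.split(" "))

stored = encode("seo is easy to learn and to apply")
print(stored)
print(decode(stored))
```

This toy version only matches lowercase words separated by single spaces and would misread text that already contains a marker such as `~0`. It demonstrates the round trip only; the markers here are too long to save any meaningful space.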

Google indexes stopwords.

What is regarded as a stopword varies from system to system, depending on each system’s sophistication and needs, and this has a direct bearing on how we optimise text.

We all use keywords and phrases in the text we write for our blogs and websites.

The main reason we incorporate keywords and phrases relevant to our niche market is to increase the number of times our web page is found, from an SEO perspective. The caveat is that our textual content has to be regarded as sufficiently relevant to the search to be included in the SERPs.

If we regard the textual content as our web site’s or blog’s real estate, how can we further increase its value?

Without doubt, we need to write in a natural flowing way for the benefit of our readers.

However, by ensuring that we use language and terminology that is niche specific, we can reduce the number of words that are traditionally regarded as stopwords and increase our relevant and indexed content.

The result will be more concise posts or pages that are truly subject driven.
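One rough way to check a draft is to measure what share of its words are stopwords. The sketch below does this in Python; the `STOPWORDS` set, the `stopword_ratio` helper and both sample sentences are assumptions for illustration, and the number it produces is not a metric any search engine is known to use.

```python
import re

# Illustrative set taken from the examples in this post; swap in a fuller list as needed.
STOPWORDS = {"and", "what", "where", "is", "to", "why", "if"}

def stopword_ratio(text):
    """Return the fraction of words in text that appear in STOPWORDS."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0  # avoid dividing by zero on empty text
    return sum(w in STOPWORDS for w in words) / len(words)

draft = "If what you want is to learn why SEO is hard"
revised = "Learn SEO fundamentals: keyword research and link building"

print(round(stopword_ratio(draft), 2))
print(round(stopword_ratio(revised), 2))
```

A lower ratio only says the text carries fewer words from the list; it says nothing about whether the text still reads naturally, so treat it as a prompt for review rather than a target.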

It is also important to consider how people search on the web. Search engines allow researchers to specifically include stopwords by enclosing a search term in quotation marks or by using the + sign. Look at Google’s page entitled Use of common words.
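To make the behaviour concrete, here is a hedged Python sketch of how an engine that drops stopwords might treat a query while still honouring quoted phrases and the + prefix. The `effective_terms` function and the `STOPWORDS` set are hypothetical; this is not Google’s actual query processing.

```python
import re

# Illustrative set taken from the examples in this post.
STOPWORDS = {"and", "what", "where", "is", "to", "why", "if"}

def effective_terms(query):
    """Return the terms a stopword-dropping engine might keep for a query."""
    kept = []
    # Each match is either a "quoted phrase" or a single whitespace-free token.
    for phrase, token in re.findall(r'"([^"]+)"|(\S+)', query):
        if phrase:
            kept.append(phrase.lower())      # quoted: keep the phrase intact
        elif token.startswith("+") and len(token) > 1:
            kept.append(token[1:].lower())   # +word: explicitly included
        elif token.lower() not in STOPWORDS:
            kept.append(token.lower())       # ordinary content word
    return kept

print(effective_terms('where is seo'))
print(effective_terms('+where is seo'))
print(effective_terms('"to be or not to be" quote'))
```

The point for writers is that a stopword can still matter when the searcher asks for it explicitly, which is one reason not to strip stopwords out of titles and key phrases.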

So we also have to think about where stopwords have a valid place. Here are two examples, though there are more:

  • Page titles and headings
    • These should be in natural English and include stopwords. They are the first thing the reader will look at when scanning a post or web page. They are also a likely place to include a key phrase or sentence incorporating keywords.
  • Anchor text
    • Link text should include stopwords, both to make sense to the reader and because it should match the destination from the SEO perspective. (This will be the subject of another post.)

If a post or page is focusing on a single keyword or phrase, we need to be wary of crowding it.

None of this means we have to spam keywords or phrases, or that we write in a stilted and rigid way.

It means we have to carefully consider how we achieve a good balance between textual optimization from the SEO perspective and general readability.

Part of our series on SEO Terminology
