Google displays incorrect dates from news sites

I first noticed that Google displays dates in its search results after reading a blog post (in Swedish) by Simon Sundén. He also described how Google sometimes misinterprets the date on which an article or blog post was published. For example, this article was published on Newsmill in February 2009, but Google thinks it was published in December 1999 (see screen shot at Sundén’s blog) because the headline contains the date 18 Dec, 1999.

But there may be more to this story. Today I found that Google was displaying search results with the date 27 May, 2010 on articles that were in some cases several years old. Here are a few examples from Swedish dailies online.

– Dagens Nyheter, 29 Oct, 2003 – “Aftonbladet driver populismens journalistik”

– Aftonbladet, 8 Feb, 2007 – “Här är Bloggsverige!”

– Aftonbladet, 7 Oct, 2008 – “Välstajlad profilbild avslöjar dig”

– Aftonbladet, 3 Dec, 2008 – “Moderaterna ense efter krismöte”

– Ålandstidningen, 9 Dec, 2009 – “Zandra lämnar Xit – blir nöjesreporter på Aftonbladet”

– Expressen, 10 Dec, 2009 – “Moderaterna backar i ny mätning”

But Google thinks all these articles were published yesterday, 27 May, 2010. A few screen shots below:

ab-wendela-hans

dn-ab

alandstidningen

The immediate effect is that search results that aren’t very relevant to you may end up ranked extremely high in Google. The Aftonbladet article about my blog survey “Bloggsverige” now ranks #4 in Google on a search for Bloggsverige, whereas I know it has not previously shown up among the top results.

It is also quite possible, as Simon Sundén concludes, to game the system by fooling Google into thinking your blog post or article was published more recently than it actually was.

I still haven’t quite sorted out exactly why Google misinterprets the dates of the articles listed above, but one thing is clear: all these articles have a more recent date somewhere in the code, and probably all of them contain 28 May or 27 May 2010. Once I or someone else figures this out, I will update this post. I would also like to know whether this flaw mostly benefits major news sites like the ones listed above.

Update: James Royal-Lawson and I discussed this matter briefly on Twitter this evening and James posted his thoughts a few minutes ago. His conclusion is that Google takes the first date it finds, or at least the first date it finds reliable, and uses it to determine when the article was published. Since many online dailies have a number of different dates for different parts of each page, Google misinterprets the publication date. And if I look at, for example, the article in Dagens Nyheter above, from way back in 2003, that is exactly the case. The date 28 May, 2010 comes a few hundred lines of code before the actual publication date.
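To make James’s theory concrete, here is a minimal Python sketch of a “first date wins” heuristic. This is purely my own illustration and not Google’s actual code: the sample markup, the date pattern and the function names (first_date_in_source, all_dates_in_source) are made up for this example, and the pattern only handles dates written like “28 May, 2010”.

```python
import re
from datetime import date

# Hypothetical pattern: matches dates written like "28 May, 2010" or "29 Oct 2003".
MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}
DATE_RE = re.compile(
    r"\b(\d{1,2}) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*,? (\d{4})\b"
)


def all_dates_in_source(html):
    """Return every date found in the page source, in document order."""
    return [date(int(year), MONTHS[month], int(day))
            for day, month, year in DATE_RE.findall(html)]


def first_date_in_source(html):
    """Naive heuristic: trust the first date that appears in the source."""
    dates = all_dates_in_source(html)
    return dates[0] if dates else None


# Made-up page: today's date sits in the site header, far above the byline.
SAMPLE_PAGE = """
<div class="site-header">Today: 28 May, 2010</div>
<h1>Aftonbladet driver populismens journalistik</h1>
<span class="byline">Published 29 Oct, 2003</span>
"""

if __name__ == "__main__":
    print(first_date_in_source(SAMPLE_PAGE))  # the header date, not the byline date
    print(all_dates_in_source(SAMPLE_PAGE))
```

With a page laid out like this, the heuristic picks the header date instead of the byline date, which matches what I see for the Dagens Nyheter article.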

Update 2: Some more info here from Michael Gray.

How to lose 5,000 inlinks per month and alienate bloggers

If you have been blogging for a few years, you may have noticed a change in the way other bloggers react to your content. Back when I started, in 2004 and 2005, all you had to do was write a witty comment on some news story and five other bloggers would link to your post, possibly adding a few views of their own. Nowadays, you can spend weeks on research for a specific blog post and “all you get back” is a number of retweets, and maybe one or two blog links. I’m not saying it’s a bad thing, especially for me: I have an old blog with thousands of old links to it, so I already have a high page rank. And it’s great that Twitter and Facebook have made it incredibly easy for your thoughts to travel across the web. But it was easier back then to get link love and, since a link is a “vote” on your content in Google, thereby to build a good page rank for your blog.

So I will try to reward really good blog posts more often with a link back from my blog, not “only a retweet”.

“Good content deserves more than a retweet.”

Here is a good example that ties in well with the link love theme of this post. Simon Sundén, a great Swedish SEO expert, wrote a story the other day about how one Norwegian daily voluntarily turned down 5,000 natural inlinks per month. In short, Dagbladet.no used to use Twingly to show which blog posts link back to a given article. Since this concept is a win-win for both the paper and the blogger, many Scandinavian news sites have introduced Twingly. The news site gets lots of links and some traffic, while the blogger gets traffic back and some recognition.

What Sundén noticed was that Dagbladet.no stopped using Twingly some time late in 2009 and, as you can see from the red columns in the graph below, the number of inlinks per month then dropped drastically from 5,000-6,000 to a measly 1,000. Meanwhile, competing daily VG.no kept Twingly and has enjoyed a steady level of links from bloggers (see blue columns below).

In the long run, VG.no will probably become a stronger site than Dagbladet.no from a search perspective, because bloggers are more likely to link to an article on VG.no than to a similar one on Dagbladet.no.

twingly-inlinks-vg-vs-dagbl

Image credit: Simon Sundén.