
Full-up Google choking on web spam?

Webmasters have been seething at Google since it introduced its 'Big Daddy' update in January, the biggest revision to the way its search engine operates for years.

Alarm usually accompanies changes to Google's algorithms, as the new rankings can cause websites to be demoted, or disappear entirely. But four months on from the introduction of "Big Daddy," it's clear that the problem is more serious than with any previous revision - and it's getting worse.

Webmasters now report sites not being crawled for weeks, with Google SERPs (search engine results pages) returning old pages, and failing to return results for phrases that used to yield useful ones.

"Some sites have lost 99 per cent of their indexed pages," reports one member of the Webmaster World forum. "Many cache dates go back to 2004 January." Others report long-extinct pages showing up as "Supplemental Results."

This thread is typical of the problems.

With junk web pages so cheap and easy to create, Google is engaged in an arms race with search engine optimizers. Each innovation designed to bring clarity to the web, such as tagging, is rapidly exploited by spammers or site owners wishing to harvest some classified advertising revenue.

Recently, we featured a software tool called Blog Mass Installer that can create 100 Blogger weblogs in 24 minutes. A subterranean industry of sites providing "private label articles," or PLAs, exists to flesh out "content" for these freshly minted sites. And as a result, legitimate sites are often caught in the crossfire.

But the new algorithms may not be solely to blame. Google's chief executive Eric Schmidt has hinted at another reason for the recent chaos. In Google's earnings conference call last month, Schmidt was frank about the extent of the problem.

"Those machines are full," he said. "We have a huge machine crisis."

And there's at least some anecdotal evidence to support the theory that hardware limitations are to blame.

"The issue I have now is Googlebot is SLAMMING my sites since last week, but none of it makes it into the index. If it's old pages being re-indexed or new pages for the first page, they don't show up," writes one webmaster.

The confusion has several consequences which we've rarely seen discussed outside web circles.

Giving Google the benefit of the doubt, and assuming the changes are intentional, one webmaster writes: "In which case Google's index, and hence effectively 'the Web as most people know it' is set to become a whole lot smaller in the coming weeks."

It's barely more than a year since Yahoo! and Google were engaged in a willy-waving exercise to claim who had the largest index. (See My spam-filled search index is bigger than yours!)

Now size, it seems, doesn't matter.

There's also the intriguing question raised by search engines that are unable to distinguish between nefarious sites and legitimate SEO (search engine optimization) techniques. The search engines can't, we now know, blacklist a range of well-established techniques without causing chaos. In future, will the search engines need to code for backward bug compatibility?

And lingering in the background is the question of whether the explosion of junk content - estimates put robot-generated spam at anywhere between one-fifth and one-third of the Google index - can be tamed.

"At this rate," writes one poster on the Google Sitemaps Usenet group, "in a year the SERPS will be nothing but Amazon affiliates, Ebay auctions, and Wiki clones. Those sites don't seem to be affected one bit by supplemental hell, 301s, and now deindexing."

With $8 billion in the bank, Google is better resourced and more focussed than anyone - but it's still struggling. Financial analysts noted that its R&D expenditure now matches that of a wireline telco.

Only a cynic would suggest that poor SERPs drive desperate businesses to the search engines' own classified ad departments - so if you want to play, you have to pay. Banish that unworthy thought at once.

(Thanks to Isham Research's Phil Payne for the tip).® [theregister]
