Microsoft Bing, Yandex Create New Search Protocol

Microsoft Bing and Russian search engine Yandex on Monday announced a new protocol designed to speed up how quickly website changes show up in search results.

Called IndexNow, the protocol uses an API to allow websites to easily notify search engines whenever content is created, updated, or deleted. Once the search engines are notified of updates, they can quickly crawl and reflect the website changes in their index and search results.
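As a rough illustration of what such a notification could look like, the sketch below builds a single-URL ping as a plain HTTP GET address. The endpoint, the `url` and `key` query parameter names, and the helper `build_ping_url` are assumptions for illustration and are not stated in the announcement as reported here; the IndexNow documentation is the place to confirm the actual request format.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the IndexNow documentation before use.
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_ping_url(changed_url: str, key: str, endpoint: str = INDEXNOW_ENDPOINT) -> str:
    """Return a GET address that reports one changed URL.

    `key` stands for a site-ownership token (assumed parameter name).
    The function only builds the address; it does not send a request.
    """
    query = urlencode({"url": changed_url, "key": key})
    return f"{endpoint}?{query}"


if __name__ == "__main__":
    # Hypothetical page and key, used only to show the shape of the request.
    print(build_ping_url("https://example.com/new-page", "abc123"))
```

Keeping the function free of network calls means a site could log or queue the address first and send it with whatever HTTP client it already uses.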

“Ensuring timely information is available for searchers is critical,” Microsoft explained in its Bing Blog.

“Yet historically,” it continued, “one of the biggest pain points for website owners has been to have search engines quickly discover and consider their latest website changes. It can take days or even weeks for new URLs to be discovered and indexed in search engines, resulting in loss of potential traffic, customers, and even sales.”

Microsoft maintained that IndexNow is an initiative for a more efficient and more open internet.

It explained that by telling search engines when a URL has changed, website owners provide a clear signal that helps search engines prioritize their crawl of those URLs. That limits the need for exploratory crawls to test whether the content has changed.
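To send that kind of signal, a site first has to know which of its own URLs changed. One possible way to work that out, sketched here under stated assumptions, is to compare stored content hashes with the current page bodies. The function name `diff_site` and the hash-based approach are illustrative choices, not something described by Microsoft or Yandex.

```python
import hashlib


def diff_site(previous: dict[str, str], current: dict[str, bytes]) -> dict[str, list[str]]:
    """Classify URLs as created, updated, or deleted.

    previous: URL -> SHA-256 hex digest recorded at the last check.
    current:  URL -> page body as it exists now.
    """
    current_hashes = {url: hashlib.sha256(body).hexdigest() for url, body in current.items()}
    created = sorted(u for u in current_hashes if u not in previous)
    updated = sorted(
        u for u in current_hashes if u in previous and previous[u] != current_hashes[u]
    )
    deleted = sorted(u for u in previous if u not in current_hashes)
    return {"created": created, "updated": updated, "deleted": deleted}


if __name__ == "__main__":
    # Hypothetical site snapshot, for illustration only.
    before = {"/a": hashlib.sha256(b"one").hexdigest()}
    now = {"/a": b"one, edited", "/b": b"brand new"}
    print(diff_site(before, now))
```

Only the URLs in the three returned lists would need to be reported; unchanged pages produce no notification at all.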

Additionally, the protocol opens up search: a website that notifies one participating search engine notifies every search engine that has adopted IndexNow.

Lack of Standards

“IndexNow is a good idea because it simplifies the process of getting new content indexed,” observed Greg Sterling, vice president of market insights at Uberall, a maker of location marketing solutions based in Berlin.

“It also ensures that new content will be indexed quickly — or immediately,” he told TechNewsWorld.

Currently, there’s no standard for updating search engines, explained Jon Alexander, vice president of product management at Akamai Technologies, a content delivery network service provider in Cambridge, Mass.

“Thousands of different crawlers are trying to monitor changes to websites across the internet,” he explained to TechNewsWorld.

“Because Akamai serves so many of those sites, we’re seeing that firsthand,” he continued. “It’s a massive job that drives enormous load on websites and consumes tremendous amounts of power that creates additional environmental impact.”

“We prefer to see an open standard that allows everyone to update search engines the same way,” he added.

Wasted Visits

Search engines have been crawling the internet for information for many years, but this appears to be the first time a major initiative has been launched to make the process more efficient.

“I can’t speak to the motivations of Microsoft and Yandex in creating this, but it’s something that seems overdue,” Sterling said.

Alexander explained that for some websites, crawlers make up half the traffic on the site, and that share is growing all the time.

“This could’ve been addressed at any point over the last 20 years,” he said. “We’ve finally reached a critical juncture at which the scale and inefficiencies are forcing a better solution.”

Not only are crawlers consuming bandwidth at websites, but they are wasting it as well.

Cloudflare bloggers Abhi Das and Alex Krivit noted in a company blog post that after studying how often bots revisit pages that haven’t changed, they concluded that 53 percent of crawler traffic is wasted on those kinds of visits.
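A figure like that can be thought of as the share of crawler visits that found nothing new. The sketch below shows one simple way to compute such a share from a visit log. It is not Cloudflare's methodology, which the article does not describe; the log format and the name `wasted_fraction` are assumptions made for illustration.

```python
def wasted_fraction(visits: list[tuple[str, str]]) -> float:
    """Return the share of visits that re-fetched unchanged content.

    visits: (url, content_hash) pairs in chronological order. A visit counts
    as wasted when the same URL was seen before with the same hash.
    """
    last_seen: dict[str, str] = {}
    wasted = 0
    for url, content_hash in visits:
        if last_seen.get(url) == content_hash:
            wasted += 1
        last_seen[url] = content_hash
    return wasted / len(visits) if visits else 0.0


if __name__ == "__main__":
    # Hypothetical log: the second and fifth visits find unchanged pages.
    log = [("/a", "h1"), ("/a", "h1"), ("/b", "h2"), ("/a", "h3"), ("/b", "h2")]
    print(wasted_fraction(log))
```

In this made-up log, two of the five visits are repeats of unchanged pages, so the wasted share is 2/5.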

Crawler Hints

Cloudflare is a web performance and security company in San Francisco. It has a program called Crawler Hints to keep search engines up to date on changes at its customers’ websites.

Search engines use a complex network of bots to crawl the ever-changing content on the internet so people can find relevant, timely content, the company explained in a news release. Today, approximately 45 percent of internet traffic comes from web crawlers and bots.

To help improve the efficiency of crawlers on the web, it noted, Cloudflare launched Crawler Hints — an easy way to signal to bot developers when content has been changed or added to a site, so they can make more efficient choices about what to crawl.

What’s more, it continued, website owners will be able to improve site performance by reducing unnecessary bot traffic and providing timely content, which ultimately helps improve search rankings.

Now, Cloudflare is using the IndexNow standard to bring Crawler Hints to major search engines, it added.

“A fast, reliable website and timely search results are at the heart of any growing business, whether it’s a startup or Fortune 500 company,” Cloudflare CEO Matthew Prince said in the news release.

“Since the beginning, we’ve worked to help our customers to give them the speed, reliability, and security they need to do business,” he continued. “Today, we’re taking that one step further by working with Microsoft and other major search engines to help any website owner reduce inefficiencies while also delivering their users reliable, relevant, and timely online experiences.”

Business Benefits

Online businesses should benefit from IndexNow, Sterling noted, because product inventory changes and pricing information can be quickly communicated and indexed by search engines.

“Retailers will be able to more quickly alert search engines to new products, prices, and descriptions now that they’re telling the engines about updates rather than waiting on scraping,” Alexander added. “That means making more current information available to potential customers.”

Websites should see changes for the better from the new protocol, too. “All websites that participate should benefit, but particularly websites that have time-sensitive content that regularly updates or changes, such as sites with events and job listings,” Sterling said.

“It also gives publishers more control over what gets indexed than in the past,” he noted.

Smaller websites could also rake in benefits from the protocol because their changes will be registered faster. “I find with my smaller sites, I can be waiting weeks for Google to come along and check a sitemap for changes. Indexing new pages can take over a month,” Colin McDermott, founder of SearchCandy, a search marketing and blogger relations company in Manchester, U.K., wrote on Reddit.

Smaller search engines can also reap rewards from IndexNow because crawling is expensive and resource intensive. “Rather than taking a brute force approach and scanning every piece of text on every site, engines are being alerted to what’s new,” Alexander explained. “It’s a markedly faster, more efficient, and effective process to surfacing fresh, relevant content.”

Google Not Interested

The biggest search engine of them all, however, won’t be benefiting from IndexNow. Google has decided not to participate in the initiative.

“It’s interesting that Google has declined to participate,” Sterling said. “The company may be taking a wait-and-see approach.”

“It may also believe that participating would put Bing and potentially other engines on a more equal footing and would diminish its proprietary advantage over rivals,” he added.

“My assumption is the only reason Google didn’t get involved is they are too invested in their own indexing API, which is coming up to 3 years old now (yet still only designed to work for job + streaming sites),” McDermott wrote.

Google did not respond to our request to comment on this story.

John P. Mello Jr.

John P. Mello Jr. has been an ECT News Network reporter since 2003. His areas of focus include cybersecurity, IT issues, privacy, e-commerce, social media, artificial intelligence, big data and consumer electronics. He has written and edited for numerous publications, including the Boston Business Journal, the Boston Phoenix, Megapixel.Net and Government Security News.
