
EXCLUSIVE INTERVIEW

eBay’s Paul Strong on Building the Data Center of the Future

In this age when an Internet eon is roughly the equivalent of a calendar year and any given company titan can flicker out of existence in a flash, few companies, especially those solely Internet-based, are seen as enduring and solid and fated for perpetuity. Among these rarities, the world’s largest online auction company, eBay, certainly holds its own. Even so, eBay too can be gobbled up by the times (or, as novelist Stephen King would describe it, the Langoliers) if it stands still even for a minute.

Survival for eBay, like many companies, depends more on advances in science than it does on marketing trends of the day or P&Ls (profit and loss statements) that speak of profits past. In the land of now, the science of tomorrow is the only sure bet to make it in a nebulous future where only God knows what profits are to be had. More so than ever before, it is the scientists and the technologists who must creatively step forward to build the game pieces before the game itself is even defined and long before the players can begin to design a game plan.

Paul Strong, distinguished research scientist at eBay, is just such a future builder. He is to be the plenary speaker at CMG’08 in December, where his speech, “The Shape of Infrastructure to Come,” will no doubt rivet listeners to their seats. It is, after all, the most pressing and harrowing topic of the day. All that is to come will spring from the bowels of the evolving state of infrastructure.

For those who may have been locked in a cave the last decade — perhaps awaiting the apocalypse or something equally engaging — eBay is the world’s largest virtual economy, and as such is one of the most powerful voices in real world commerce, whether there is an e- attached or not.

While Strong will discuss eBay’s infrastructure, how it has evolved, the future of data center infrastructure and the challenges ahead in his upcoming speech in December, TechNewsWorld has a sneak peek for you in an exclusive interview with this formidable titan of tech.

TechNewsWorld: What does eBay’s current infrastructure look like and how has it evolved?

Paul Strong:

eBay’s infrastructure has evolved over many years, from the three or four machines that Pierre Omidyar built from parts in his living room one weekend in 1995, to something like 15,000 servers spread across four geographical locations that currently comprise eBay.com. This evolution has been driven by the demands of the business, i.e., the need for phenomenal scale, near-continuous availability and almost exponential growth over many years.

Whilst, in general, eBay uses off-the-shelf servers, operating systems, storage and networking equipment, almost all of its application components and many of its management tools have had to be homegrown. After all, no one had built a massive auction platform before. The eBay software platform started life many years ago as a relatively simple three-tier application, which was deployed in two physical tiers. It has since evolved into a set of massive, distributed applications that support some 84 million active users trading (US)$2,040 worth of goods on the site every second and with in excess of 125 million items available at any given time.

The infrastructure can be broken down into three main areas: database/persistence, the auction platform and the search platform. As the names imply, the auction platform supports listing, viewing, selling items and so forth, and the search platform enables eBay’s data to be rapidly searched, supporting the auction platform and other activities. These platform services are hosted in six or seven main data centers, with most application components being run in parallel across multiple sites.
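For a sense of scale, the figures Strong quotes can be turned into rough daily and yearly totals. The short Python sketch below does only that arithmetic. It assumes the $2,040-per-second rate holds constant around the clock, which is a simplification, and the variable names are illustrative rather than anything eBay uses.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the interview.
# Assumption: the per-second trading rate is constant all day, every day.
DOLLARS_PER_SECOND = 2_040   # "(US)$2,040 worth of goods ... every second"
ACTIVE_USERS = 84_000_000    # "some 84 million active users"
SERVERS = 15_000             # "something like 15,000 servers"

SECONDS_PER_DAY = 24 * 60 * 60

dollars_per_day = DOLLARS_PER_SECOND * SECONDS_PER_DAY
dollars_per_year = dollars_per_day * 365
users_per_server = ACTIVE_USERS / SERVERS

print(f"traded per day:  ${dollars_per_day:,}")
print(f"traded per year: ${dollars_per_year:,}")
print(f"active users per server: {users_per_server:,.0f}")
```

Under that assumption the quoted rate works out to roughly $176 million a day, or about $64 billion a year, with around 5,600 active users for every server.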

TNW: Tell us about the future data center infrastructure and the challenges that lie ahead.

Strong:

There are a number of trends which will define future infrastructure, ranging from high-level changes in the way people do business, to the applications and services that infrastructure supports, to fundamental changes in the technologies that will comprise infrastructure.

The trends towards ever greater disaggregation will continue. Trends such as SOA (service-oriented architecture) offer the opportunity for greater flexibility, business agility and return on investment, through modularity and reuse.

Cloud computing is again a natural evolution of a set of long-term trends, very specifically toward outsourcing non-core, non-differentiating technology and/or business process. Clearly the initial thrust is to remove the need for businesses to own physical IT infrastructure.

Green IT, building more energy efficient infrastructure components, and being able to power and cool them efficiently will not only remain important but will almost certainly grow in importance.

TNW: Will future data centers at eBay or elsewhere be affected by The Grid, meaning the network described as the “parallel Internet” set to go live this summer as scientists at CERN in Switzerland activate it alongside the Large Hadron Collider? It is said that grid technology will eventually render the desktop obsolete as people flock to widely available cloud computing.

Strong:

Yes and no. The difficulty with this question is that it assumes that The Grid is something new and that it is discontinuous with what we have today and where infrastructure is going in the future. This is not the case.

Grid computing is a natural evolution of the longstanding trend toward greater and greater disaggregation of workload/services and the distribution of that workload across a networked set of resources to achieve scaling and resilience. What started on the mainframe as monolithic applications evolved into client/server, then three-tier architectures, then Web services and, finally, for the time being, SOA. These application paradigms have driven us to think of the platform less in terms of discrete servers and other resources connected by networks, and more in terms of a fabric of resources. When you integrate that platform and think of it as a whole, you have a Grid.

The networked resources (servers, storage, switches, etc.) are turned into a general-purpose Grid system through the use of software, which enables you to do more work, faster and potentially more efficiently, than you can just by managing a bunch of discrete servers.

eBay is an example of this. All of our applications essentially scale out across hundreds or thousands of servers. In essence, our infrastructure is already a form of Grid. Most of the applications we run are transaction-oriented in nature, but the platform is a network-distributed set of resources that we coordinate using various hardware and software elements. So Grids will not render the Web obsolete. They already exist, and they sometimes underpin the Web, or at least some of the services available on it.
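The idea of an application that scales out across many servers can be illustrated with a minimal Python sketch. It is not a description of eBay's actual routing, which the interview does not detail. It assumes a simple hash-based scheme, and the server names and the `pick_server` helper are hypothetical.

```python
import hashlib
from collections import Counter


def pick_server(key: str, servers: list[str]) -> str:
    """Map a request key to one server in the pool (hypothetical helper).

    A stable hash means the same key always lands on the same server,
    while different keys spread roughly evenly across the pool.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Illustrative pool of eight application servers (names are made up).
servers = [f"app-{n:03d}" for n in range(8)]

# Route 10,000 made-up user keys and count how many each server receives.
load = Counter(pick_server(f"user-{i}", servers) for i in range(10_000))

for name in servers:
    print(name, load[name])
```

Adding capacity in this sketch means adding names to the pool. Plain modulo hashing reshuffles most keys when the pool changes, which is why larger systems often use consistent hashing or a directory service instead.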

TNW: Are there other technologies on the horizon that current or future infrastructures will be susceptible to, or influenced by, that you are aware of now?

Strong:

I think that the existing trends in virtualization, the move to more scale-out architecture, multi-core and multi-threaded chips and so forth, will continue.

Other new things on the horizon include 10 Gigabit Ethernet, and with it the potential for a converged fabric via Fibre Channel over Ethernet. No one in their right mind really likes having lots of different cabling infrastructures in their data centers, so a converged fabric seems very desirable.

However, a number of candidates have come, and in some cases gone, yet none has so far been successful in gaining critical mass. It would be nice if one did. Of course, the real challenges are in making such a fabric easy to manage, trustworthy (i.e., types of traffic can be logically separated to guarantee their performance or security), cost-effective and non-disruptive. If this is cheap and simple, it will take off.
