Ingres and VectorWise Claim Database Speedup

To hear those in the know tell it, database management systems leave a lot to be desired: They require too much hardware, and they make poor use of it. Some advocate a shift in the type of hardware used—away from the CPU toward graphics-processing units [see "Data Monster," September 2009]. But open-source database firm Ingres, of Redwood City, Calif., and start-up VectorWise, of Amsterdam, see the answer in a better use of the CPU.

Their prototype software has shown more than a 10-fold improvement in performance, and an 80-fold improvement on some tasks.

To get such an improvement, database luminary Peter Boncz and others at the Dutch national math and computer science research institute, Centrum Wiskunde & Informatica (CWI), took a close look at how modern CPUs work and used what they found to build a database system from scratch. They formed VectorWise in 2008 and joined forces with Ingres and Intel this year.

Database systems today are written "for the machine of 20 years ago," says Bill Maimone, Ingres's chief technology officer. They can't easily take advantage of a modern processor's ability to perform a single instruction on a large set of data, and they're at the mercy of the relatively slow movement of data on and off the CPU.

To solve these problems, CWI computer scientists came up with versions of database operations that work on sets of 100 to 1000 values, or vectors, instead of on one database value at a time. As a result, some operations that take tens or hundreds of CPU clock cycles in other databases take just a handful in the VectorWise system.
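The vector-at-a-time idea can be illustrated with a minimal sketch. It is not VectorWise's code: the function names, the batch size of 1000, and the salary data are assumptions chosen for illustration, and Python shows only the control flow, not the CPU-level gains. The point is that each operation is invoked once per vector rather than once per value, so the per-call overhead is spread over many values.

```python
from typing import Iterator, List, Tuple

# Assumed batch size, inside the 100-to-1000 range the article cites.
VECTOR_SIZE = 1000


def scan_vectors(column: List[int], vector_size: int = VECTOR_SIZE) -> Iterator[List[int]]:
    """Yield a column in fixed-size chunks ("vectors")."""
    for start in range(0, len(column), vector_size):
        yield column[start:start + vector_size]


def add_constant(vector: List[int], constant: int) -> List[int]:
    """One call processes a whole vector instead of a single value."""
    return [value + constant for value in vector]


def total_after_raise(column: List[int], raise_amount: int,
                      vector_size: int = VECTOR_SIZE) -> Tuple[int, int]:
    """Add a raise to every salary and sum the result, one vector at a time.

    Returns the total and the number of operator calls made.
    """
    total = 0
    calls = 0
    for vector in scan_vectors(column, vector_size):
        total += sum(add_constant(vector, raise_amount))
        calls += 1
    return total, calls


# Hypothetical data: 2500 salary values.
salaries = list(range(2500))
total, calls = total_after_raise(salaries, 10)
```

With 2500 values and a vector size of 1000, the operator runs three times instead of 2500 times. In a compiled engine, the tight inner loop over each vector is also the kind of code a compiler can map onto a processor's single-instruction, multiple-data hardware.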

The scientists also constructed the system so that all the work is done on data in the CPU's cache, where the processor cores can quickly get at it, instead of in main memory, from which a fetch can take hundreds of clock cycles. This required them to compress the data in some parts of the cache and come up with fast decompression algorithms so that the process of fetching data didn't bog down.
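The article does not name the compression algorithms used. As a stand-in, the sketch below uses frame-of-reference encoding, a common lightweight scheme, to show why decompression can be cheap: each value is stored as a small offset from a shared base, and decoding is one shift, one mask, and one addition per value. All function names and the sample salaries are assumptions for illustration.

```python
from typing import List, Tuple


def for_encode(values: List[int]) -> Tuple[int, int, int]:
    """Frame-of-reference encode a non-empty list of integers.

    Returns (base, bit_width, packed), where packed is a single integer
    holding every offset from the base in bit_width bits.
    """
    if not values:
        raise ValueError("values must be non-empty")
    base = min(values)
    offsets = [value - base for value in values]
    bit_width = max(offsets).bit_length()
    packed = 0
    for position, offset in enumerate(offsets):
        packed |= offset << (position * bit_width)
    return base, bit_width, packed


def for_decode(base: int, bit_width: int, packed: int, count: int) -> List[int]:
    """Recover the original values: shift, mask, add the base."""
    mask = (1 << bit_width) - 1
    return [base + ((packed >> (position * bit_width)) & mask)
            for position in range(count)]


# Hypothetical salary values that cluster near one another.
salaries = [50000, 50250, 51000, 50010]
base, bit_width, packed = for_encode(salaries)
restored = for_decode(base, bit_width, packed, len(salaries))
```

Here the largest offset is 1000, so each value needs 10 bits instead of a full machine word. Values that cluster tightly compress well under this scheme, while a single outlier widens every slot, which is one reason production systems use more elaborate variants.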

Both tasks were helped by VectorWise's use of a database storage scheme called column-store. Data is sent from storage to the CPU as strings of values from the same attribute domain—for instance, a list consisting only of salaries rather than a record containing employee names, salaries, and other data, explains Daniel Abadi, an assistant professor of computer science at Yale University. Column-store makes it easier to perform vector calculations because all the needed values are stored contiguously. Column-store data is also easier to compress, he notes, because it has more inherent order to it.
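The difference between the two layouts can be sketched in a few lines. The employee records, column names, and helper functions below are hypothetical, and run-length encoding is used only as one simple example of a compression scheme that benefits from the order found within a column.

```python
from itertools import groupby
from typing import Any, Dict, List, Sequence, Tuple

# Row-store layout: each record keeps all of its attributes together.
# These employee records are made up for illustration.
rows = [
    ("Ada", 52000, "eng"),
    ("Bram", 48000, "eng"),
    ("Cleo", 61000, "ops"),
    ("Dirk", 55000, "ops"),
]


def to_columns(records: Sequence[Tuple[Any, ...]],
               names: Sequence[str]) -> Dict[str, List[Any]]:
    """Column-store layout: one contiguous list per attribute."""
    return {name: [record[index] for record in records]
            for index, name in enumerate(names)}


def run_length_encode(column: List[Any]) -> List[Tuple[Any, int]]:
    """Collapse each run of repeated values into a (value, count) pair."""
    return [(value, len(list(group))) for value, group in groupby(column)]


columns = to_columns(rows, ["name", "salary", "dept"])

# Summing salaries reads one list and never touches names or departments.
total_salary = sum(columns["salary"])

# Values from a single attribute tend to repeat, so they compress well.
dept_runs = run_length_encode(columns["dept"])
```

In the column layout the salary list can be handed directly to a vector operation, and the department column shrinks from four entries to two runs.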

Abadi calls VectorWise "a company to watch," but its software won't be for everyone. One limitation, he points out, is that the new system is designed to run on a single machine with a database of less than 10 terabytes. That would rule out most databases used by big retail firms.


Firms rebuild database functions from scratch
