This evening, Hewlett-Packard (HPQ) held a webcast, moderated by Morgan Stanley analyst Katy Huberty, to discuss its big data software product, Vertica, featuring a slide show and discussion by HP’s head of that unit, Colin Mahony.
There were no new announcements in the presentation, which was an introduction to technical aspects of the program, and a Q&A comparing the product to offerings from International Business Machines (IBM), SAP AG (SAP), Teradata (TDC), EMC (EMC), and Oracle (ORCL).
Mahony explained the main features of Vertica, which was founded in 2005 by database pioneer Michael Stonebraker and acquired by HP in 2011.
Vertica is used to query large amounts of data and analyze patterns, which puts it at the intersection of the big data and analytics crazes.
The main feature is a “columnar” data structure that Mahony said is “much, much faster” than other ways of retrieving data for analysis. Where a query acts against multiple data points, the columnar format can be much more rapid than a traditional row-oriented relational database approach, Mahony explained:
If you ask to see every male within 100 miles of San Francisco between the ages of 30 and 60, we know that you’re requesting an analytic workload that requires three variables: age, address, and gender. By storing all those together on disk, we can get it fast and return it to the user. By adding more columns, and not taxing the system as a row store, where every query has to go through every row, the results can be much faster.
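To make Mahony's example concrete, here is a toy sketch in Python of the same query run against a row layout and a column layout. Everything in it is my own illustration: the sample people, the field names (including a precomputed `miles_from_sf` standing in for "address"), and the crude "values read" counter are assumptions, and none of it reflects how Vertica is actually implemented.

```python
# Toy illustration only: data, field names, and the cost counter are
# hypothetical. This is not Vertica's storage engine.

ROWS = [
    {"name": "Al", "gender": "M", "age": 45, "miles_from_sf": 12, "email": "al@example.com"},
    {"name": "Bea", "gender": "F", "age": 38, "miles_from_sf": 5, "email": "bea@example.com"},
    {"name": "Cy", "gender": "M", "age": 29, "miles_from_sf": 40, "email": "cy@example.com"},
    {"name": "Dov", "gender": "M", "age": 52, "miles_from_sf": 250, "email": "dov@example.com"},
    {"name": "Eli", "gender": "M", "age": 60, "miles_from_sf": 99, "email": "eli@example.com"},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    return {key: [row[key] for row in rows] for key in rows[0]}


def matches(gender, age, miles):
    """Male, aged 30 to 60 inclusive, within 100 miles of San Francisco."""
    return gender == "M" and 30 <= age <= 60 and miles <= 100


def query_row_store(rows):
    """Scan whole rows: every field of every row is loaded, needed or not."""
    values_read = 0
    hits = []
    for i, row in enumerate(rows):
        values_read += len(row)
        if matches(row["gender"], row["age"], row["miles_from_sf"]):
            hits.append(i)
    return hits, values_read


def query_column_store(columns):
    """Scan only the three columns the query needs."""
    needed = ("gender", "age", "miles_from_sf")
    values_read = sum(len(columns[name]) for name in needed)
    hits = [
        i
        for i, triple in enumerate(zip(*(columns[name] for name in needed)))
        if matches(*triple)
    ]
    return hits, values_read


COLUMNS = to_columns(ROWS)
row_hits, row_cost = query_row_store(ROWS)
col_hits, col_cost = query_column_store(COLUMNS)

print("row store:   ", row_hits, "values read:", row_cost)
print("column store:", col_hits, "values read:", col_cost)
```

Both layouts return the same people, but the column version only touches the three fields the query asks about, and the gap widens as the table gains columns the query never uses.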
Mahony says there’s a big push by HP to sell Vertica through the “freemium” licensing model: you download a free copy at Vertica.com, try it out against a data set of up to one terabyte, and then, if you want to get more serious, pay for a license.
Mahony said HP is not trying to replace “data warehouses” that companies have spent years building. “Our approach is never to say get rid of the data warehouse, that’s not a good strategy,” said Mahony. “Instead, we say surround the warehouse, and go after analytic use cases, the killer queries.” Nor is Vertica meant to be used for transactions, where online transaction processing (OLTP) systems such as Oracle’s database still serve their intended purpose.
Mahony was asked about competition from a number of products, including SAP’s “HANA” in-memory database, EMC’s “Greenplum” software, IBM’s “Netezza,” and Teradata and Oracle’s dedicated analytics machines.
HANA is not really a competing product, he said. HP and SAP are “great partners,” and HANA is more about in-memory storage of data. “I applaud many of their core design principles,” which include a columnar format, he said. But it will take a while for the in-memory database to have the aggregate storage capacity for the many terabytes of data that Vertica is designed to sift through. Vertica itself integrates with the clustered file system storage technology known as Hadoop.
The Teradata and Oracle hardware is proprietary, versus the HP servers’ “open standards,” he said, which means more flexibility.
“Unlike Netezza or Teradata, what people in the industry call proprietary refrigerators, with Vertica and HP, you get the experience of an appliance but also know that you’re not locked in. You can swap out hard drives, for example, you can scale it out to your needs.”
I would note that HP seems to be on something of a campaign to have hosted presentations of its technology these days. The company will hold a briefing a week from today with HP’s head of its networking division, Bethany Mayer, regarding software-defined networks, hosted by ISI Group analyst Brian Marshall. You can catch the webcast of that conference here.
Here are a few choice slides from today’s presentation: