Thoughts from Inside the Box : November 2007



"What Netezza is doing is... going a step further: score the data as it is streamed into the appliance and before it even hits the database... However, it is not just the performance gain that is significant. This initiative means that developers are embedding analytic software into the Netezza Data Warehouse Appliance so that it becomes, in effect, an application appliance."

Philip Howard, Director of Research, Technology, Bloor Research - from his 5th October posting, "The Netezza Developer Network"

Pardon the title's riff on the late-1970s Elvis Costello hit song What's So Funny 'Bout Peace, Love and Understanding, but a recent mini-dustup got me to thinking about providing a bit of insight into why Netezza's approach to "Streaming Analytic™ Appliances" is different from others' entries in the market. It seems the recasting of Netezza's mission in terms of streaming analytics rather than the more-limiting data warehouse appliances, along with the launch of the Netezza Developer Network (NDN), has caused something of a hullabaloo among some of our competitors (refer to recent stories from Teradata/SAS, Greenplum, IBM/SPSS and industry analyst, Curt Monash).

And well it should. While some would seem to dismiss Netezza's positioning on the topic as 'nothing more than UDFs', and argue that what matters is supporting them effectively, we must beg to differ (and to differentiate). In short, we feel the Netezza approach to Streaming Analytics opens the door to dramatically change the way data warehouse systems are viewed, used and even deployed.


The positioning of some of our larger, (recently) publicly-traded competitors may suggest that they see themselves not just as expert in the domain of data warehouse systems, but also as experts in the ways of CRM, advanced scoring and analytics, etc. They seem to have bolted on homegrown software packages as extensions of their data warehouse offerings in the market. That may well be the case, but we don't really see how it's possible for one vendor to "corner the market" on innovation - a view that we think is borne out in recent announcements of closer UDF-based partnerships. Still others, more from the new-entrant category, claim that the only thing required is simply to support basic UDF functionality as an extension to the database. We think both ends of that argument are incorrect.


Instead, we at Netezza think it best to "stick to our knitting". Our aim is to provide the high-performance infrastructure along with a technical and community foundation to enable others much more expert than we are to drive the algorithmic and application-level innovation by their ability to exploit the performance of our streaming analytic appliance. To again provide a riff on something from the late-70s and early-80s BASF ad (a campaign that has recently been rekindled in that company's marketing), our vision could be summarized as, "We don't make your advanced applications; we make your advanced applications 'run like rockets'**."


** "Raw SPU functions are called like any other SQL function... ...and they run like rockets by exploiting the Netezza architecture."

Justin Lindsey, Chief Technology Officer, Netezza, speaking at the 2007 International Netezza User Conference, 26th September, 2007

What Netezza Provides

What Netezza provides in this mix is an extremely high-performance system, particularly well suited to storage-intensive operations (like data warehousing) and, in particular, to operations that (like data warehousing & BI) can benefit from a data streaming architecture in which unnecessary data is filtered out as rapidly as it is read from the storage elements - allowing for greater processing efficiencies. We've written extensively about this before (see Spotlighting FPGAs, parts one, two and three) and won't repeat the arguments here.

Another key lever that Netezza provides by way of the NPS® appliance is the fact that our intelligent storage elements, known as Snippet Processing Units (SPUs), are really each compute nodes. They are capable of running compiled C or Java code, with the added task-by-task "customizability" of an FPGA that can further accelerate performance, operating in an MPP compute grid but with the simplicity of our appliance approach.


Consider this: if those 100s of SPUs in an NPS appliance could be used to run C code to execute SQL query processing tasks, why couldn't they equally be tasked to perform tasks that go well above and beyond those enabled (encumbered?) by the set-based, structured-data logic of SQL? Where others may use UDF or even UDA functionality in the data warehouse systems to collect up and standardize use of SQL functionality across users, the streaming analytics enabled by Netezza allows users to "draw outside the lines" of SQL.


Another thing Netezza provides for NDN members is a set of some basic building blocks - functions and an algorithmic work area that form the foundation for more advanced work to be produced. In so doing, some of these appear to be greatly in common with the standard fare of "traditional" SQL functional extensions: record-level functions or User-Defined Functions (UDFs) and aggregate-level functions or User-Defined Aggregates (UDAs) are part of the foundation. But some of the other parts go far beyond those definitions allowing for developers to implement functions retaining a sense of state or to cascade multiple complex algorithmic processes to build even more powerful solutions, all making use of the streaming nature of the NPS analytic appliance to push performance even further.
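To make the distinction between those building blocks concrete, here is a minimal sketch in plain Python. It is not NDN code and does not use any Netezza API: the `Txn` record, the function names and the two-partition split are all hypothetical, chosen only to show how a record-level function, a mergeable aggregate and a function that retains state between records differ.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class Txn:
    """Hypothetical record layout used only for this illustration."""
    customer: str
    amount: float


def is_large(txn: Txn, threshold: float = 100.0) -> bool:
    """UDF-style: one record in, one value out, no memory of other records."""
    return txn.amount > threshold


class SumAmount:
    """UDA-style: accumulate per partition, then merge the partial results."""

    def __init__(self) -> None:
        self.total = 0.0

    def accumulate(self, txn: Txn) -> None:
        self.total += txn.amount

    def merge(self, other: "SumAmount") -> None:
        self.total += other.total


def running_totals(stream: Iterable[Txn]) -> Iterator[Tuple[str, float]]:
    """Stateful streaming function: remembers per-customer totals across records."""
    totals: Dict[str, float] = {}
    for txn in stream:
        totals[txn.customer] = totals.get(txn.customer, 0.0) + txn.amount
        yield txn.customer, totals[txn.customer]


def parallel_sum(partitions: List[List[Txn]]) -> float:
    """Simulate several nodes: each aggregates its own slice, then results merge."""
    combined = SumAmount()
    for part in partitions:
        partial = SumAmount()
        for txn in part:
            partial.accumulate(txn)
        combined.merge(partial)
    return combined.total


rows = [Txn("a", 50.0), Txn("b", 150.0), Txn("a", 75.0)]
print([is_large(r) for r in rows])             # record-level results
print(parallel_sum([rows[:2], rows[2:]]))      # merged aggregate
print(list(running_totals(rows)))              # state carried across records
```

The merge step is what lets an aggregate run on many nodes at once, while the running total shows the kind of retained state that a plain record-level function cannot express.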

And finally, what Netezza provides is a simple development appliance platform on which NDN members can develop and verify their algorithms, including the performance impacts of operating in parallel. Affectionately known by the decidedly non-marketing name, SPUBox, the platform is a fully-functional version of the NPS appliance, with four Snippet Processing Units, a host processor and network connectivity. Weighing in at a little over 40 pounds (18 kilos), one might call it a "0th generation luggable analytic appliance" but one that only consumes about as much energy as two 75W light bulbs. We granted more than ten of them to new NDN members at our September global user conference in Boston, some with special decal "wraps" to stand out above the ordinary compute platforms you may be used to.


http://i111.photobucket.com/albums/n148/nzfrisco/NUC2007/Wednesday/IMGP2085.jpg

It Takes a Village...

I think the word potential is important when applied to streaming analytics, because what we're doing is opening the door to the potential of the data warehouse to be used in a very different way. Those extended uses are being made possible by Netezza and our community of users, developers and partners that is being fostered and growing, virtually with each passing day.

What we are seeking to unleash is a new level of performance and innovation in the use of storage-intensive analytical computing, but the important bit is that Netezza is not looking to do this alone. In fact, we do not picture ourselves as having cornered the market on analytical algorithm writers. Instead what we launched with the NDN is intended to evolve as a cooperative and competitive global web of experts who will build on their own and one another's innovations. Here, the term, "coopetition" seems trite; I'd prefer to think of the NDN as an opportunity for innovative "mashups" at the building-block, advanced algorithmic and applications levels.


Foundational elements are used together to enable basic value-add functions to be built. Those are mixed and matched, typically but not always with standard SQL fare, to enable more complex algorithms to be realized. And, in turn, the algorithms enable very high-performance applications and new uses of the NPS appliance to be realized. In some cases, one entity may do most or all of the above work. In many others, we are already seeing cooperation among members to use and reuse modules developed elsewhere to extend the capabilities.

Like what has been accomplished by artists with a humble plastic child's toy such as the Lego, these capabilities can be mixed and built-upon to create innovations we may not even be able to imagine today.


The opportunity is helped along by the network effects of an open community and members (by last count, in excess of 50 spread around the world) spanning entities from university professors and graduate students to BI applications providers to end-customers of the Netezza Performance Server® appliance - and everything in between. This is where the true industry expertise lies. This is also the source of innovation for what can be possible with the opening up of Netezza's architecture to more than "just" data warehouse and BI.


Where will all of this lead? To advanced text, image, bioinformatics or video processing? Perhaps. Into the domain of the 'what if' Monte Carlo or Genetic algorithm simulations for risk analysis and predictive resource optimization? That's another possibility. But we're confident that people are going to use the NPS appliance in new and innovative ways as a result of Streaming Analytics and the NDN - and in ways which may well help shape the features and functionality of the appliance in releases to come.

What's So Special?

What's so special about all that? Well with these foundational building blocks, imagine being able to develop customer- or threat-scoring algorithms that could be accomplished in as little as one pass through record data in a data warehouse instead of multiple passes required to denormalize or pivot data, or worse still, large extracts of the data from the warehouse to an off-board computing complex in order to perform the denormalization and scoring tasks. What if this single-pass technique yielded a 10X speedup in processing? What if it could be more than 100X - perhaps even allowing a task that formerly was accomplished in over 10 hours to be done in less than 20 minutes? Might that change the way that particular analytical task was used? Might that change someone's business? We think it could. More importantly, so do many of our customers, partners and prospects.
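As a rough illustration of the single-pass idea, the sketch below scores customers two ways: by first pivoting normalized records into one wide row per customer and then scoring, and by folding each record into a running score as it streams past. Everything in it is an assumption made for the example (the attribute names, the weights, the toy linear score); it is plain Python, not Netezza code. The last helper just works through the speedup arithmetic from the paragraph above.

```python
from collections import defaultdict

# Hypothetical weights for a toy linear score; not taken from any real model.
WEIGHTS = {"purchases": 2.0, "returns": -3.0, "visits": 0.5}

# Normalized "long" records: (customer, attribute, value).
RECORDS = [
    ("c1", "purchases", 4.0),
    ("c2", "visits", 10.0),
    ("c1", "returns", 1.0),
    ("c2", "purchases", 1.0),
    ("c1", "visits", 6.0),
]


def score_multi_pass(records):
    """Pass 1 pivots to one wide row per customer; pass 2 scores each row."""
    wide = defaultdict(dict)
    for customer, attr, value in records:
        wide[customer][attr] = value
    return {
        customer: sum(WEIGHTS[attr] * value for attr, value in row.items())
        for customer, row in wide.items()
    }


def score_single_pass(records):
    """One pass: add each record's contribution to a running score."""
    scores = defaultdict(float)
    for customer, attr, value in records:
        scores[customer] += WEIGHTS[attr] * value
    return dict(scores)


def minutes_after_speedup(hours, speedup):
    """Elapsed minutes for a job of `hours` once it runs `speedup` times faster."""
    return hours * 60.0 / speedup


print(score_multi_pass(RECORDS))
print(score_single_pass(RECORDS))
for factor in (10, 30, 100):
    print(factor, minutes_after_speedup(10, factor))
```

The two scorers agree here only because the toy score is additive, which is exactly the property that lets it be folded in one record at a time. On the arithmetic: a 10-hour job drops to 60 minutes at 10X, needs roughly 30X to reach 20 minutes, and would finish in about 6 minutes at 100X.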


To date, the Netezza Developer Network has dozens of active partners participating in the program globally, with more than 100 further applications to join the program still pending [note: if you're thinking of your own really exciting "on stream" application ideas, you can apply online at http://www.netezza.com/ndn]. We think that, through the combined innovation and expertise of this group, the NDN has the potential to take the NPS analytic appliance to new levels of performance and new applications domains that will continue to include, but may go far beyond, the standard Data Warehouse Appliance of our roots.


by Ellen Rubin - Netezza, Vice President of Marketing

"The best vision is insight." - Malcolm S. Forbes, former publisher of Forbes magazine (1919-1990)
Leadership conferences are generally a mixed bag. They tend to take on weighty topics and raise interesting questions, but have a tough time providing any real insights or doing more than re-hashing mass-market ideas. The Forbes Leadership Networks Forum in Chicago this past week seemed at first glance like it might fall victim to this tendency. The title of the event was "America the Innovator: The New Rules of Global Market Growth," and the marketing claimed that it would help attendees learn about a staggering range of subjects, including but not limited to:

  • Global disruption by fast-growing economies like India and China
  • America’s role in this global economy
  • Innovation for Corporate America
  • Using analytics for competitive advantage
  • Social networks as the new holy grail

I was exhausted just reading the brochure.

Happily, the conference included some terrific speakers who managed to provide hours of real insight and entertainment, and to stimulate lots of discussion among attendees. Steve Forbes, President & CEO of Forbes, former candidate for U.S. President, and now co-chair of the Rudy Giuliani campaign, kicked off the day and had a lot to say about America in the global economy. He stated his positions and biases upfront, including the need to lower taxes to make the U.S. more competitive for corporations, and to open immigration and get better at "letting in non-terrorists." Whether or not these opinions resonated with everyone in the room (doubtful), Steve was very clear on his bottom line: the only way for America to succeed is for its companies and institutions to become even more innovative and stay on the cutting edge.

To explain how, Professor Clayton Christensen - world expert on disruptive innovation - took the stage and wowed the audience for the next couple of hours. Professor Christensen has written several famous books on the subject, including The Innovator’s Dilemma and The Innovator's Solution. At last year’s Netezza User Conference, he was a keynote speaker and dazzled the crowd with his vision, brilliance and dry wit. I won’t restate some of his widely-known ideas and insights, but I wrote down a short quote that resonated strongly for Netezza: "A disruptive technology is one that simplifies a complex problem."

Professor Christensen also shared some fascinating and contrarian views about Apple and the Harvard Business School (where he is a professor, but not afraid to say some unpopular things, it appears).

On Apple: they may be on top of the world right now, with over 100 million iPods sold and a great stock price, but they’re being disrupted by non-proprietary, standards-based, inexpensive mobile phones. Apple has won so far in the early stages of market disruption through its integrated, proprietary approach, with iTunes and all its sleek and beautiful products. Over time, however, Christensen predicts that the mobile phone players will carry the day and to compete, Apple will need to embed itself inside (à la "Intel Inside") and let the other devices pull content from iTunes. He feels the real opportunity is in the personalization of content from iTunes, not the devices themselves, although Steve Jobs hardly seems likely to agree.

On HBS: Professor Christensen said that a common problem that limits corporate innovation is that companies optimize their performance based on Wall Street driven statistics, such as gross margin percentage, that turn out not to be the ones that matter for their competitive survival. In the case of gross margin, this drives companies to build a broad range of product lines of high-margin niche products and to rule out innovative new products and technologies that don’t meet the hurdle rate. In the case of HBS, the "wrong statistic" is the high starting salaries of HBS graduates. Although this metric gets HBS ranked as the top business school on many lists, it is in fact making the graduates too expensive for most potential hiring companies. As a result, the hiring companies are forming their own corporate "universities" that allow them to hire cheaper talent and train them for specific corporate skills and knowledge. This has led to dramatic reductions in the number of applicants and recruiting companies at HBS. As an HBS alum, I had to chuckle at the thought of Professor Christensen presenting these ideas and meeting with stony silence in some oak-paneled room.

Next up was Professor Tom Davenport, world expert and author on the subject of "Competing on Analytics" - something very near and dear to us at Netezza. Professor Davenport lectured at Netezza University, our continuing education program, and his ideas have become a mantra for us as well as many other vendors and corporations in the world of analytics, including SAS, Accenture and others. Professor Davenport’s main thesis is that companies that use analytics for strategic competitive advantage outperform those that don’t. A common tendency in Corporate America is to, by Malcolm Gladwell - Davenport’s wry comment: "As with overeating and other American habits, we don’t need any encouragement on this.") Instead, we need to look at examples like Amazon, Best Buy, Capital One, Google, Wal-Mart and others, where competing on analytics is the corporate strategy with commitment from the CEO down. (It was great to note that many of the companies Professor Davenport profiled as the case studies for competing on analytics are already Netezza customers!)

Academic theories are always interesting to hear, especially when presented by someone as dynamic and fun as Professor Davenport. But what made the case were the customer examples shared by him and his panel, which included Netezza customer, Rob Holland, SVP of U.S. Retail Measurement at ACNielsen (a service of The Nielsen Company), and Carol J. McCall, VP of Research & Development at Humana. Some highlights:

  • Best Buy segments their stores based on customer profiles, such as "Jill, the soccer mom," and sells targeted merchandise for the specific segments. The segmented stores earn twice as much as the non-segmented ones.
  • Humana uses predictive models for high-deductible health-care products to offer different pricing and options based on what customers need. They also use models to predict individual health, which enables them to build better relationships with customers who are at-risk and help them change their lifestyle behaviors to improve their health. These uses of analytics have returned more than $600 million to Humana and helped over 400,000 people!
  • ACNielsen analyzes data from tens of thousands of retail locations and grocery scans from over 100,000 grocery families. Based on this analysis, the company can break down the data to specific clusters of stores and tailor programs to help their retail and CPG customers be more competitive.

I won’t even try to cover the other sessions or content, but it was definitely a packed schedule. The day ended with baseball; specifically, Moneyball. Wearing a Red Sox cap in celebration of the recent World Championship win, Davenport tied the day together with the story of how Billy Beane, general manager of the Oakland A’s, exploited an arbitrage opportunity by analyzing baseball statistics to find the real metrics that predicted success (batting average turns out not to matter much, while on-base percentage matters a lot). This let Beane pick the undervalued players and compete effectively against much-wealthier teams. (At the risk of being repetitive, Billy Beane was the keynote speaker at Netezza’s first user conference; we definitely have hit the trifecta with great speakers on innovation and analytics!)
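For readers who don't follow baseball, a small sketch shows why the two metrics can disagree. The formulas are the standard definitions of batting average and on-base percentage; the two players and their season lines are made up for illustration and do not describe anyone Beane actually signed.

```python
def batting_average(hits, at_bats):
    """AVG = H / AB."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


# Hypothetical season lines, invented for this example.
PLAYERS = {
    "free swinger": dict(hits=150, walks=20, hit_by_pitch=0, at_bats=500, sac_flies=0),
    "patient hitter": dict(hits=117, walks=90, hit_by_pitch=5, at_bats=450, sac_flies=5),
}

for name, line in PLAYERS.items():
    avg = batting_average(line["hits"], line["at_bats"])
    obp = on_base_percentage(**line)
    print(f"{name}: AVG {avg:.3f}, OBP {obp:.3f}")
```

The free swinger wins on batting average, while the patient hitter reaches base far more often. That gap between the headline statistic and the one that matters is the kind of mispricing the Moneyball story is about.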

In a sense, you could boil the whole day down to one major point: Make analytics a key aspect of your corporate strategy and leverage data to determine the critical metrics for your business - and don’t delay. As Davenport recently told a crowd at the SPSS user conference, "There's not much time to spare because somebody's going to become your analytics competitor." Or better yet, in the words of Bill James, the sabermetrics genius who inspired Billy Beane:

"There will always be people who are ahead of the curve, and people who are behind the curve. But knowledge moves the curve."

Ellen Rubin


by Ellen Rubin - Netezza, Vice President of Marketing

"Hang on - It's starting again / Hang on - There's no shelter from the wind / Hang on - Like a fire from the sky / Winds of change are blowin' by"
- closing chorus from "Winds of Change", by The Jefferson Starship and Grace Slick, 1982
Witches, Elvis impersonators, bikers and the odd clown rush by, while outside, strong winds are blowing and the sky is dark and stormy.

Yes, it's Halloween at The Data Warehousing Institute's Orlando conference, and as 600 attendees try to celebrate without their families (actually, no one seemed too upset), Hurricane Noel is blowing in.
In fact, for the data warehouse community the storm has already hit. The appliance revolution has taken place and the impact is causing some extreme after-shocks. The talk around the halls was about appliances, and there were two half-day courses dedicated just to this topic. One was led by Richard Winter and Rick Burns of WinterCorp, providing an overview on appliances as well as a comparison of the different architectures and products on the market.

There’s certainly a need for this kind of information. Since Netezza launched the appliance category, built a community with well over 100 large customers and became publicly-listed, pretty much every major vendor in the industry has launched its version of an appliance, and at the TDWI show, several new players were clamoring for attention. It's pretty confusing for people who are just beginning to consider the appliance approach. Unlike the attendees at Netezza user conferences and events, the typical TDWI attendee is probably thinking something like this: "Boy, I’ve been hearing a lot about appliances lately, there seems to be a lot of news about them from TDWI and in the press, and Gartner says they're becoming mainstream. I better find out more about what they really can do and whether I need to think about this for my organization."

I guess when you've been evangelizing about the appliance concept for more than five years it's good to be reminded that in many ways, this is still a new frontier.

Back at the WinterCorp course, Richard talked about the market trends that have created a need for appliances, and made the point that "what you really want is to answer any question on any level of your data at any time." I couldn't agree more. Actually, he told a joke that really made the point: A doctor, a lawyer and a statistician went deer hunting. The doctor shot at a deer and was two feet too high and to the right. The lawyer shot and was two feet too low and to the left. The statistician said, "No need for me to shoot - according to the statistics, I've already hit the deer!" In case you missed the punchline, Richard added, "Sometimes, highly aggregated data does not get you the right answer for many business questions." Again, couldn't agree more: all the data, all the time is what appliances are about.
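Richard's joke can be put in a few lines of Python. The shot offsets come straight from the story; the half-foot hit radius is my own assumption, added only so that "hit" has a definition.

```python
from statistics import mean

# Shot offsets from the deer in feet, (horizontal, vertical), as in the joke.
SHOTS = [(2.0, 2.0), (-2.0, -2.0)]
HIT_RADIUS_FEET = 0.5  # assumed target size, not part of the story


def aggregate_offset(shots):
    """The statistician's view: the average position of all shots."""
    return (mean([x for x, _ in shots]), mean([y for _, y in shots]))


def count_hits(shots, radius=HIT_RADIUS_FEET):
    """The detailed view: how many individual shots landed within the radius."""
    return sum(1 for x, y in shots if (x * x + y * y) ** 0.5 <= radius)


print(aggregate_offset(SHOTS))  # the average lands dead centre
print(count_hits(SHOTS))        # yet no single shot was a hit
```

The aggregate says the deer was hit and the detail says it wasn't, which is why being able to query the detailed data, and not just the roll-ups, matters.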

I also had a chat with Wayne Eckerson, a leader at TDWI and expert on predictive analytics. I was describing how, through our Netezza Developer Network, Netezza is opening up our appliance to developers all over the world who are doing new and cutting-edge analytics "on stream," leveraging Netezza’s streaming architecture. Wayne pointed out that there has really been an evolution over time from purpose-specific desktop systems just for the analysts in an organization who do the heavy quant work, to now, embedding some of that functionality in the data warehouse, and eventually, being able to combine it with the more traditional reporting and analysis work done by BI users. The new frontier in appliances is all about this broader role of analytics - including more groups of users, types of data and analytic algorithms - that can be done "inside the appliance," and as usual, Netezza customers and partners are at the forefront of the revolution.

Ellen Rubin
