This Question is Answered

10 Replies Last post: Jan 28, 2010 4:43 AM by melina386  
jpongin   4 posts since
Dec 16, 2009
Currently Being Moderated

Dec 16, 2009 2:10 PM

Netezza Documentation

Hi,

 

I tried to post this in the "Prospective Users" list, but it didn't give me that option.

 

We're on MySQL, and we'd like to port over to Netezza.  I have several questions:

 

1)  Is Netezza strictly designed for only DW type applications?  Can it also handle hi rates of DB transactions?  Does it handle batch write operations?

2)  Can you point me Netezza's doc for supported data and table manipulation/queries commands and functions?  We need this reference to properly port over our queries.

 

Thank you.

David Birmingham   151 posts since
Sep 24, 2007
Currently Being Moderated
1. Dec 16, 2009 2:23 PM in response to: jpongin
Re: Netezza Documentation

Netezza is a not a transactional system. Intake for a Netezza machine is best served in bulk, at a gigabyte of data at-a-time or more. Several-records-in and several-records-out are a transactional model, where Netezza deals in bulk, millions and billions-at-a-time.

 

When you say you want to port your SQL statements, this is probably not completely true. You want to port your functionality to Netezza, but generally speaking by the time a system has

so run-its-course and is in need of performance boosts, several iterations of performance tuning have already been applied. And these are entirely artificial when moving over to Netezza.

 

Here's an example: If you take the standard cursor-based stored procedure, let's say you have a cursor pulling from a table and passing it through four business rules before putting it back. In one

transactional system using 3400 members of the cursor, it took 30 seconds per operation, translating into hours of time. In Netezza, we would take these rules and "stand them on their side", passing

all 3400 records through one rule, then the next and so forth. Each operation takes mere seconds, effectively deriving the same answer in less than a minute.

 

So in the end, you will have to think about what in your existing system is being processed "in bulk" and this is what you would push over to Netezza. Some of it, your transactional stuff, may remain

behind and require you to pump data between the two. That's okay, because that's how most reporting system work.

Shawn Fox   384 posts since
Aug 15, 2006
Currently Being Moderated
3. Dec 18, 2009 11:41 PM in response to: jpongin
Re: Netezza Documentation

You can't do anything 1 row at a time on Netezza.  There are no indexes to allow you to get good performance.  At best you might get 10 queries per second.  Netezza is designed to process massive amounts of data in bulk.  You can process millions of rows per second, but your application has to be written to process sets of data at a time, not 1 row at a time.

 

Typically you would bulk load a big set of data into one or more staging tables, process the data using SQL in bulk operations, then load the results into your target tables.  It is actually far easier to write an application in this way than it is to do row at a time processing, but it is a very different mindset.  Netezza handles all the performance issues for you because every SQL statement always runs in parallel across 10s to 100s of cpus and disks.

David Birmingham   151 posts since
Sep 24, 2007
Currently Being Moderated
4. Dec 19, 2009 12:50 AM in response to: jpongin
Re: Netezza Documentation

The Netezza docs won't help you determine this.

 

What you really need to do is engage with the sales staff and get a POC underway.

David Birmingham   151 posts since
Sep 24, 2007
Currently Being Moderated
6. Dec 19, 2009 1:28 AM in response to: jpongin
Re: Netezza Documentation

Netezza can handle quite a bit more load than a relational database, But it handles the load in bulk

 

 

Keep in mind that relational databases are designed to get data "in", they are not designed to get data out in quantity, such as reporting and batch processing. They are geared for small-size, high-frequency activity, where data warehousing is a discipline focusing on large-scale, low-frequency activity.

 

It is a data warehouse appliance, and circumscribes decades of best-practices in data warehousing. Not transactional processing. At all.

 

So yes, single granular reads and writes will have a hard time.

 

Think of it this way. Netezza can intake 1 record in 1 second. Netezza can intake millions of records in 1 second. How do we want to spend this 1 second? 900 seconds for 900 rows, or 900 seconds for 900 million rows?

 

It is "purpose built" toward this end. An SMP/RDBMS is general purpose, and cannot hope to compete with a purpose-built machine for this activity.

David Birmingham   151 posts since
Sep 24, 2007
Currently Being Moderated
7. Dec 19, 2009 1:36 AM in response to: jpongin
Re: Netezza Documentation

A product that does this? Teradata claims to do this, but does it so poorly that people actually port their active transactional stuff to any other platform but Teradata. I know of one group that spent a year porting their transactions to it, only to spend the next year undoing what they had done.

 

I've had Oracle consultants claim that Oracle can do this, but to match Netezza's scale on the bulk side you're talking about a multiple of many more millions of dollars and careful tuning, engineering and long-term maintenance of the machine to achieve and retain this kind of performance. Netezza does not have this kind of uptake or maintenance.

 

Short answer - reporting'bulk and transactional are like two dogs. One is a shepherd that does large-scale wrangling while one is a retriever with a singleton purpose. Put them both in the same cage, and the reporting system will starve the transactional system for resources. Your users will not stop complaining. Two dogs fighting in a cage. It is simply better engineering to have them in separate cages. It is why data warehousing was born.

 

You would also want to do a POC - no matter what the technology.

David Birmingham   151 posts since
Sep 24, 2007
Currently Being Moderated
9. Dec 19, 2009 9:13 AM in response to: jpongin
Re: Netezza Documentation

Not in the slightest

melina386   1 posts since
Jan 21, 2010
Currently Being Moderated
10. Jan 28, 2010 4:44 AM in response to: David Birmingham
Re: Netezza Documentation

Thanx for sharing

----------------------

http://www.e-datapro.net

More Like This

  • Retrieving data ...