Skip navigation
Currently Being Moderated

Managed NzReclaim vs Groom

Posted by David Birmingham on Feb 18, 2011 11:11:58 AM

So a number of months ago I was in a chat with several colleagues on their use of nz_reclaim on their Mustang systems. If you are a TwinFin user, this really doesn't apply (after all, on the TwinFIn under the covers, the nzreclaim just calls the groom function anyhow).

 

As discussed at the conference and on many forum posts here, the nz_reclaim is one of those options of last resort. People say no, we use a CTAS or other mechanism. The problem with the CTAS, they lament, is that this resets all the rowids in the table and makes it a candidate for a full backup.Groom of course, mitigates all this.

 

But back to Mustang space, because most of us entered the frontier on a Mustang,

 

In their particular environments, which I am seeing is more the case in many places, they run an operational reclaim on their largest tables. Other places just kick off a reclaim on the weekend and let it scan through all the tables. This works great at first, then starts to present problems as we add databases and tables. Then we need something a bit more efficient.

 

If we apply a simple set of queries on the table before engaging the nz_reclaim, even on the largest tables, we can defer the nz_reclaim operation to execute when it's necessary, based on applied thresholds. I mentioned this at the sessions in the past years at the Enzee conferences, but afterward people would ask - how do you find the percentages you are talking about, or to discover those thresholds?

 

So the underpinning mechanics are fairly simple. Each time we perform a delete operation, this is actually a soft-delete. The record is not removed from the used-space of the table, and if you do a lot of operational deletes, you could end up with a lot of unusable dead space owned by the table and unavailable for other use. The update performs a full-record delete-and-insert. So in the end, the problem is the same - we need to clean up the dead space. The TwinFin uses the groom, but the Mustangs use the nz-reclaim, which works a lot like a disk-defragmentation. It is a very expensive operation (in duration especially) and locks the table from access.

 

I will depend upon your own acumen for deriving a list of tables to submit to the function below. It performs the check on the table and gives you a thumbs-up or down based on the provided threshold. I have given a default here of 20 percent, which basically means that if you compare the total records in the table to the total records deleted, if the total records deleted exceeds the threshold, it is declared a candidate. Now that you have a list of candidates, execute nzreclaim on them.

 

The best part of this is that even if you have a very large table that has not exceeded the threshold (say by the weekend run it has only 10 percent dead space) you can forego reclaiming it until the threshold is exceeded. I know of several sites that have operational deletes and updates that use these kinds of thresholds, and they don't perform reclaims nearly as often as those sites that don't use thresholds, or are using an all-or-nothing schedule.

 

In order to "see" the deleted records, we need to access the deletexid field, a hidden column on each Netezza row. If this value is set to "0" it means that the record is normal. If the value is non-zero, it means that the record has been deleted and will be affected by a reclaim. Of course, normal queries ignore the records that have a non-zero deletexid.

 

The function is called nz_reclaim_candidate(), and it accepts the name of a database, a table and an optional parameter for a percentage threshold

 

Note here that TARGET_XFR contain the name of a database on the Netezza machine. All the other assets are manufactured for the demonstration

 

To start out, I'll make a table and add some records to it, forgive the clunkiness of these singleton inserts, as I am only doing this so you can see what I am putting into the table.

 

nzsql -a -d $TARGET_XFR <<!

 

create table zx1 (int1 integer) ;

insert into zx1 select 1;

insert into zx1 select 1;

insert into zx1 select 1;

insert into zx1 select 1;

insert into zx1 select 1;

insert into zx1 select 2;

insert into zx1 select 2;

insert into zx1 select 2;

insert into zx1 select 3;

insert into zx1 select 3;

insert into zx1 select 3;

insert into zx1 select 3;

 

---- now delete some records - this will be 25 percent of the table

delete  from zx1 where int1 = 2;

 

--now lets try to see the records we deleted - this query normally won't show them

select deletexid from zx1;

 

 

--- now lets set up the switch to show the deleted records, active only for this session

set show_deleted_records=true;

 

-- and now we can see the deleted records. It is now simple enough to compare total row count to deleted count, and get a percentage

 

select deletexid from zx1;

 

 

!

 

 

 

# Now let's declare the bash function

 

#==================================

nz_reclaim_candidate()

#==================================

{

exec >&2

DBNAME=$1

TABNAME=$2

MAX_PCT=$3

if [ "x${MAX_PCT}" = "x" ] ; then MAX_PCT=20 ; fi

MDEL=$(nzsql -q -d $DBNAME -A -t -c "set show_deleted_records=true;select count(*) from $TABNAME where deletexid <> 0;" )

echo "delete count=$MDEL"

if [ $MDEL -eq 0 ] ; then return 0 ; fi

MCNT=$(nzsql -q -d $DBNAME -A -t -c "select count(*) from $TABNAME ;" )

echo "table count=$MCNT"

if [ $MCNT -lt $MDEL ] ; then return 1 ; fi

MPCT=$(nzsql -q -d $DBNAME -A -t -c "select (${MDEL}*100)::bigint / ${MCNT}::bigint  ; " )

echo "percentage deleted=$MPCT "

if [ $MPCT -ge $MAX_PCT ] ; then return 1 ; fi

 

return 0

 

}

 

 

 

#### Here you would have a list of tables – I can show you how to grab a list of tables from the database if you want ########

 

###  So lets go through the table list and it will find the table's percentage based on the counts

 

 

#=================================================================================

TABNAMES=zx1

 

for TABNAME in $TABNAMES

do

mtab=$(nz_reclaim_candidate $TARGET_XFR $TABNAME 20 )

mret=$?

if [ $mret -eq 0 ]

then

echo "Not a Candidate"

else

echo "NzReclaim Candidate"

fi

done

 

 

##### drop the test case ####################

 

nzsql -a -d $TARGET_XFR <<!

drop table zx1;

!

 

 

Hopefully this can provide some value to those of you who have mature Mustang environments and need some relief from the "global" nzreclaim approach.

 

Happy hunting!

Comments (0)