Wednesday, January 18, 2012

Incremental Load using Qlikview Components

Qlikview Components (Qvc)  is a script library that simplifies and improves the quality of Qlikview scripting. You can read an introduction to Qvc here.

Today I'll show you how to do Incremental Load (IL) using Qvc. Incremental Load means extracting only the latest changes from a database table and merging those changes with a master QVD file. Writing your own IL script can be rather tedious and frequently involves copy/paste operations followed by forgetful edit errors. Qvc can make it much easier.

Incremental Load is not necessary for every table. We typically use IL for large source tables. IL can reduce the elapsed reload time and impact on the database system by loading only new or updated rows.

To utilize Incremental Load a source table must have both of the following attributes:
  • A unique identifier -- a Primary Key -- for each row. 
  • A "Modification"  column that identifies when a row was added or updated. The column type may be a Date, Datetime or ascending Revision number.
The classic IL logic is this:
  1. Determine the "Last reload Time". The most robust technique is to extract the max value for the "Modification" column from the Master QVD.
  2. Select rows from the database table where "Modification" is greater than "Last Reload Time".
  3. Add and update rows in the Master QVD, based on primary key.
Here is the complete Incremental Load script using Qvc.

// Include Qvc code
$(Include=..\qvc_runtime\qvc.qvs)


// Calling parameters are QVTablename, UpdateColumn, PrimaryKey.
CALL Qvc.IncrementalSetup ('Rates', 'LastUpdate', 'RateId');  


// v.Tablename is set by IncrementalSetup
[$(Qvc.Loader.v.Tablename)]: 
// Whatever LOAD and SELECT goes here
SQL SELECT * FROM dbo.RateTable
// v.IncrementalExpression is set by IncrementalSetup
WHERE $(Qvc.Loader.v.IncrementalExpression) ;


// Update the QVD with the changed rows
CALL Qvc.IncrementalStore

That's it.

You'll also get useful log messages telling you what was done.

00002 1/17/2012 1:23:46 PM; QVD\Rates.qvd exists, rows=31
00003 1/17/2012 1:23:46 PM; Loading rows where LastUpdate >=''01/14/2011''
00004 1/17/2012 1:23:46 PM; Rates loaded, rows=1
00005 1/17/2012 1:23:46 PM; QVD\Rates.qvd updated, rows=31


Review the Qvc Documentation for explanation of parameters and configuration variables. The documentation also contains a working example.

6 comments:

Gumster said...

Haven't had time to try this out yet but I will!
Great post and thanks for the work with QV components, I will try and use it in my next project and see if it can help me to standardize development.

Usemeplz said...

Ohh, it's so hard for me to understand as I'm just beginner. But in general as I understand there's nothing difficult for profies?

quiquehm said...

I am trying to setup an incremental load to build a QVD, but my scenario is a bit different. I have multiple excel sheets being stored in a folder ( these are periodic extractions from the same source system, so same fields everytime ). There is not any datestamp column on these files, and the way I manually create a datestamp column is by using part of the file names in the Qview script. So every file stored I name it like : yyyymmdd-abc.xls ( being yyyymmdd the date when I get the extract and store it in the folder ).Then using a "for each a in" ..function I load these files in sequence adding a datestamp field like $(a) as DateStamp. Then all is stored in a QVD file. What I need to accomplish is everytime I do a data reload I need to avoid loading all the xls files again ( and building the whole QVD again )wasting time. I would need the script to only load the newer xls files stored in the folder. Any ideas on how to set it up ?
Thanks a lot

Anonymous said...

Rob- does it take care about deleted rows ??

Rob Wunderlich said...

Yes, the current version provides for handling deleted rows.
Rob

Anonymous said...

Hi, it's possible to store each qvd separately ?
I mean, I need to store each qvd by week..

is it possible ? if yes, how can I do ?