Monday, September 29, 2008

Loading Multiple Excel Sheets

Load from Excel is usually pretty straightforward, but sometimes you'll need to load multiple sheets and make some determinations at runtime. Details such as sheetnames may not be known at script creation time.

The QV statements "SQLTables" and "SQLColumns" may be used to discover information about the sheets and columns available in a workbook. Both of these statements require an ODBC connection. The ODBC connection may also be used to subsequently read the data, but I find using the LOAD biff more convenient.

First make a OLEDB connection to the workbook:
CONNECT TO [Provider=Microsoft.Jet.OLEDB.4.0;Data Source=workbook.xls;Extended Properties="Excel 8.0;"];

Specify the workbook name, relative to the current directory, in the "Data Source=" parameter. This example uses a "DSN-less" connection. It does not require you to predefine an ODBC datasource.


The SQLTables statement return a set of fields describing the tables in the currently connected ODBC datasource, in this case the workbook. A "Table" is an Excel Sheet.

tables:
SQLtables;

Now I've got a list of sheets in the QV "tables" table. The field name that contains the sheetname is "TABLE_NAME". I'll loop through the set of TABLE_NAME values and load each one using a standard biff LOAD.

FOR i = 0 to NoOfRows('tables')-1
LET sheetName = purgeChar(peek('TABLE_NAME', i, 'tables'), chr(39));
Sales:
LOAD *
FROM workbook.xls (biff, embedded labels, table is [$(sheetName)]);
NEXT


Sheetnames that contain blanks will be surrounded by single quotes. The purgeChar() function above removes any single quotes that may be present in the sheetname.

What if I only want to load those sheets names whose name begins with "Sales"? Wrap the LOAD statement in an IF statement to test the sheetname:


IF wildmatch('$(sheetName)', 'Sales*') THEN
LOAD .....
END IF


How about this case? I want to load any sheet that contains the three columns "Sales", "Year" and "Quarter":

columns:
SQLColumns; // Get list of columns
// Join list with columns of interest
RIGHT JOIN (columns) LOAD *;
LOAD * INLINE [

COLUMN_NAME
Quarter
Sales

Year
]
;

// Create a count of how many columns of interest each sheet has
selectSheets:
LOAD TABLE_NAME as SheetName, count(*) as count
RESIDENT columns
GROUP BY TABLE_NAME
;
// Keep only the SheetName that have all 3 columns
RIGHT JOIN
LOAD SheetName
RESIDENT selectSheets
WHERE count = 3


// Load the selected sheets
FOR i = 0 to NoOfRows('selectSheets')-1
LET sheetName = purgeChar(peek('SheetName', i, 'selectSheets'), chr(39));
LOAD....
NEXT


You may wonder if you could use the Excel Driver instead of the Jet provider like this:

CONNECT TO [Provider=MSDASQL;Driver={Microsoft Excel Driver (*.xls)};DBQ=workbook.xls];

The connection will complete and you can use this connection for SQL SELECTs. However, when SQLTables is called, the connection will enumerate tables/columns for all the *.xls files in the current directory.

This provider uses the parameter "DefaultDir=" (default is .) to control which directory is enumerated for SQLTables and SQLColumns calls. The DBQ parm plays no part. You may find this useful as an alternative to using a traditional "for each filelist..." loop to process multiple files.

Complete text of the examples presented here can be found in the QV Cookbook at:
http://robwunderlich.com/Download.html)


Wednesday, September 24, 2008

Using the Google Chart API inside Qlikview

I recently had a requirement to create a heat map of the US States and set about exploring if there was an easier way than creating a QV scatter plot mapped over an image.

I found the Google Chart API. With some help from the QlikCommunity forum, I got a fairly pleasing result.

The Google Chart API is provided free by Google. You pass in an http request with parameters that describe the data and desired layout, and a chart image is returned. I won't cover the details of the API, it's well documented at http://code.google.com/apis/chart/. Rather, I'll share my experience integrating it with QV.

Here's a screenshot of my results. The map will update as selections are made in the Sales chart. The generated map is not clickable. It's just a static image.





A working QVW of the above may be found in the Qlikview Cookbook at:
http://robwunderlich.com/Download.html

The map is a Straight Table with a single expression. The expression is the http://... string used to generate the map. The representation for the expression is set to "Image". Thanks to Tom on the forum for showing me this technique.

The OnAnySelect document event is used to trigger a "showMap()" macro that creates the variables needed for the http string. In a production application, you would want to be more selective and use field level events on the fields relevant to the chart.

In addition to chart layout parameters, the http string contains two parameters that describe the data.



  • chld= provides the list of states
  • chd= provides the data values for the states

The States and Values are associated by ordinal position in the respective lists. I could not find a way to keep the lists in sync by using QV expressions alone. The solution was to use the Sales chart as my "data source". The macro walks the rows of the table to build two variables -- the State codes and the Sales values. Here's a snippet of the macro code:


set obj = ActiveDocument.GetSheetObject("CH01")
' Collect the locations
locations = ""
for RowIter = 1 to obj.GetRowCount-1
locations = locations & obj.GetCell(RowIter,0).Text
next
ActiveDocument.GetVariable("vValues").setContent values, false


Another area where VBScript was useful was in encoding the data values as specified by the API. I chose the "simple encoding" method. The Sales values are translated to relative values within a range of single characters defined by the API. The doc http://code.google.com/apis/chart/#simple provided a javascript encoding example which I converted to VBS.

The encoding algorithm requires that the maximum value of the dataset be known to properly spread the individual values across the relative range. To determine the maxvalue in the macro, I use the QV evaluate() function to "callback" to the QV expression language.

maxValue = ActiveDocument.evaluate(
"max(aggr(sum(Sales), State))" )



Producing a chart with the Google API does have some downsides. The user must be connected to the internet and the chart will render slower than a native QV chart. It also does not provide for making selections in the chart and tooltip values like a QV chart does. But I found it to be a simple solution to my requirement. I hope that someday the QV product will provide regional maps as chart types.

Update October 3, 2008: Alistair on the QlikCommunity forum has posted an example of calling the Google Chart API without using macros, which I find to be the preferred method:

http://www.qlikcommunity.com/575/?tx_mmforum_pi1%5Baction%5D=list_post&tx_mmforum_pi1%5Btid%5D=3763&tx_mmforum_pi1%5Bfid%5D=9

The next update to the Qlikview Cookbook will include the "macro-less" technique.

Thursday, September 11, 2008

Gripe re Bug Reporting & Research

Love that Qlikview product. Love that Qliktech support staff. Hate that bug communication.

In my last post, I described a bug encountered on a QV Server upgrade. How many customers should be warned of that bug? Under what conditions would the bug be suffered by other customers? All customers upgrading to the target release? Those coming from a specific release? A specific configuration? Are you impacted by this bug?


Answer: you don't know. Qliktech probably does, but you -- the potentially impacted customer, do not. Unless we call or email our support rep, one at a time, and ask just the right question.

I've been nagging Qliktech for some time now about online bug reporting and more importantly -- a searchable bug database. I've been told that it's "coming" for some time now.

What I want is:
  1. To be able to submit and track a tech support ticket through a web interface.

  2. That authorized users in my organization can view the content and status of those tickets.

  3. To be able to search a moderated database of all tickets/bug/solutions/workarounds submitted by all customers.

Are these cutting edge requirements? No, I get these facilities now from all my other enterprise software vendors.

In my current organization, there are two part time QV Admins providing support, a dozen or so developers, a few managers who may want to be kept informed and sometimes a key user who is impacted by a problem.

The current Qliktech support is via phone or email conversation. That means the staff person working the problem has to continually forward the ongoing conversation to all the interested parties in the Enterprise.

A new problem? Has it already been reported and sufficiently documented by another customer? Is there a workaround? I don't know. So I'll take the time to create documentation and maybe a clear example for support. Only to find "we already know about that one". Or I'll make the quick inquiry and get the response "we need more detail". It's a guessing game. It's slow. It's inefficient.

I want to make it clear that I'm satisfied with the quality of the support representatives and the developers I've had contact with. My complaint is not about people, it's about tools. Having those tools would make both the customer and Qliktech's job easier.