27 Jan

ANSI_NULLS setting

Back in my Cyberquery days we used to search for NULL values with colname = NULL. I always thought it was strange but that's how it worked. Well, you can do the same thing with SQL Server by using the ANSI_NULLS setting.

SET ANSI_NULLS ON; -- This is the default setting.
with cte as
(
select null as col1
union
select 1
)
select * from cte where col1 = NULL;

Returns no rows.

However, the same query with ANSI_NULLS OFF returns a different result.

SET ANSI_NULLS OFF; -- Turns ANSI_NULLS off so the = operator will work.
with cte as
(
select null as col1
union
select 1
)
select * from cte where col1 = NULL;

Returns

col1
NULL

You’ll probably never run into it but if you see “= NULL”, and it works, this is what’s going on.
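The ANSI-standard way to test for NULL is IS NULL, which works regardless of the setting. A quick sketch using Python's sqlite3 (SQLite always follows ANSI NULL semantics, so there's no equivalent of ANSI_NULLS OFF) shows both behaviors:

```python
import sqlite3

# SQLite always follows ANSI NULL semantics: "= NULL" never matches
# anything, so the portable test is IS NULL.
con = sqlite3.connect(":memory:")
rows_eq = con.execute(
    "SELECT col1 FROM (SELECT NULL AS col1 UNION SELECT 1) WHERE col1 = NULL"
).fetchall()
rows_is = con.execute(
    "SELECT col1 FROM (SELECT NULL AS col1 UNION SELECT 1) WHERE col1 IS NULL"
).fetchall()
print(rows_eq)  # []
print(rows_is)  # [(None,)]
```

The = NULL query comes back empty no matter what, while IS NULL finds the row.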

Note: In a future version of SQL Server this setting will always be ON, and trying to turn it off will result in an error.

MSDN: http://msdn.microsoft.com/en-us/library/ms188048.aspx

20 Jan

< NULL

This is from my Cyberquery days and it involves a dead inventory report. The idea was to produce a report that tracked inventory that had not been sold in the last 365 days. The system tracked the last_sale date in the sku file so I’m sure it looked easy to do. Here’s the Cyberquery code,

WHERE
last_sale < todaysdate - 365

I wound up working on the report because they wanted to change the date to 270 days. So, I changed the code to 270, looked at it for a bit, and did a “wait a second”. What happens if we never sold the item?

Well, since a NULL value can't be less than anything, if the item never sold it would never show up on the report. The solution was actually pretty easy in this case because we also had a last_receipt field, so all we had to do was modify the code to look something like

WHERE
last_sale < todaysdate - 270
OR (last_sale = NULL and last_receipt < todaysdate - 270)

Note: it’s been a long time since I’ve written CQ code but according to a sample that I still have that is how it tests for NULL values.
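To make the trap concrete, here's a small sketch in Python's sqlite3. The table and column names are borrowed from the story and the data is made up; in standard SQL the NULL test is IS NULL rather than = NULL:

```python
import sqlite3
from datetime import date, timedelta

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sku (item TEXT, last_sale TEXT, last_receipt TEXT)")

old = (date.today() - timedelta(days=400)).isoformat()
recent = (date.today() - timedelta(days=10)).isoformat()
con.executemany("INSERT INTO sku VALUES (?, ?, ?)", [
    ("sold-long-ago", old, old),        # stale: should be on the report
    ("sold-recently", recent, recent),  # active: should not
    ("never-sold", None, old),          # the trap: NULL last_sale
])
cutoff = (date.today() - timedelta(days=270)).isoformat()

# The naive filter silently drops the never-sold item, because
# NULL < cutoff evaluates to NULL, which is not true.
naive = con.execute(
    "SELECT item FROM sku WHERE last_sale < ?", (cutoff,)).fetchall()

# The fixed filter falls back to last_receipt when last_sale is NULL.
fixed = con.execute(
    "SELECT item FROM sku WHERE last_sale < ? "
    "OR (last_sale IS NULL AND last_receipt < ?)",
    (cutoff, cutoff)).fetchall()
```

The naive query returns only the stale item; the fixed one also catches the item that never sold.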

Anyways, there are two morals to this story. The first moral is be aware of NULL values in your data. They bite!

The second moral? There isn’t one. The guy who wrote the report wound up a VP in the company.

11 Jan

My Reporting Structure At A Prior Company

For a recent job interview I put together a short 2-page summary of the things I'd done at Higgins. Mainly it's a data flow diagram showing where the data started and ended. I was involved in all of it (ok, not every single piece; I had an IT manager who liked command line programming) but for the most part I did all of the design, programming, testing and implementation for our reporting.

Reporting (PDF)

What follows is an explanation that I sent to a hiring manager.

The main business ERP, Falcon, was developed by a company called Activant (now Epicor). It was a lumber-based ERP (used by Higgins and Golden State) that was developed in and ran on Progress (the database and language are tied together in many ways). I don't have the total database size but it would have been several hundred gigabytes. This is where the majority of our data was stored. The Pella side of the business was much smaller and did most of its analysis in an Access database, which was merged with our data.

Data was pulled from the Falcon ERP in multiple ways and used in multiple systems as follows:

  • Ad-hoc Reporting using Cyberquery – This was standard reporting (Excel, PDF, command line, automated emails, etc.).
  • Cyberquery hold files – These were flat-file based data extracts that were very fast. We used these extensively.
  • Cyberquery data extracts – These were extracted CSV files that were eventually imported into our intranet (Horde application framework – LAMP based) for reporting, CRM (purchasing used SugarCRM), Sharepoint (KPI’s displayed there) and web-based reporting for our on-line customers.
  • We did occasionally use an ODBC connection to access this data directly from PHP but it was awkward so in general we stuck to exporting it into MySQL.

We moved the data around, for the most part, using UNIX/Linux commands (rsync, cat, mysqlimport). Cyberquery also allowed you to create macros which ran AIX commands.

This data wound up in two main interfaces (company websites and our intranet which ran on the Horde application framework). The following are all LAMP based except for some minor reporting done using SQL Server and ASP.

  • Our company websites (www.goldenstateflooring.com and www.higginshardwoods.com) where customers could see their statements, invoice history, payment history and view product information on our blowout sku’s. Everything on those sites was designed and built by me except the appearance.
  • One-Stop Reporting (see the attachment for more information). The concept behind One-Stop was to take multiple reports from our ERP (which often had very poor indexing) and combine them into a single interface (LAMP based) improving efficiency and speed (in some cases reducing a report from hours to seconds). It was focused on four specific areas: purchasing, sales, credit and customer data. For instance, One-Stop Sales allowed you to select from 15+ parameters (inside rep, outside rep, warehouse, sku) and run 6 different reports based around some of the same factors. So it was possible to see sales by rep by warehouse, sales by rep for a given sku, sales by inside rep for a sku, etc. Data was updated once/day.
  • There was also customized sales reporting for the outside sales team, including automated emails, that allowed a salesperson to view all of his sales, customer and credit information. Management could access it to see a salesperson’s performance.
  • We also did some work with exporting data to Sharepoint and SugarCRM. The latter was used by purchasing and we exported PO data to it and the former was for some simple KPI’s for individual warehouses.

This is a bit long but it does clarify the process we were using.

11 Jan

Creating a UNION statement in Cyberquery

I'm going to venture into something slightly risky here, only because I no longer have access to the Cyberquery documentation, and that is how to perform a UNION using Cyberquery.  The risk is that, well, unless it's been added, there is no such thing in Cyberquery.

In case you don't know, a UNION is used to combine the results of two queries.  For instance, say you had monthly historical sales data that you had broken off from the main table and now needed to combine the 2005 and 2006 data for some reason.  In MySQL the syntax for a union is the following:

<pre>
mysql> SELECT 1, 2, 3 UNION SELECT 1, 2, 3;
+---+---+---+
| 1 | 2 | 3 |
+---+---+---+
| 1 | 2 | 3 |
+---+---+---+
1 row in set (0.00 sec)

mysql> SELECT 1, 2, 3 UNION ALL SELECT 1, 2, 3;
+---+---+---+
| 1 | 2 | 3 |
+---+---+---+
| 1 | 2 | 3 |
| 1 | 2 | 3 |
+---+---+---+
2 rows in set (0.00 sec)
</pre>

The first example uses the default syntax (you can also add the keyword DISTINCT after the UNION keyword to get the same result).  The ALL keyword keeps every row from both queries, while omitting it, or using DISTINCT, returns only unique rows.  Also, to confuse the issue a tiny bit, Oracle has additional set operators like INTERSECT and MINUS, and there's even a MERGE statement, though that's for combining data into a table rather than combining query results.

So, back to Cyberquery and creating a UNION in that environment.  Here is how I would go about creating the same queries in Cyberquery.  This example is based on a UNIX environment.  If you're using it on a different platform the concepts will be the same but the commands will be different.

I would write 2 Cyberqueries:

var1 = 1;
var2 = 2;
var3 = 3;

List/hold="query1"
var1
var2
var3

var1 = 1;
var2 = 2;
var3 = 3;

List/hold="query2"
var1
var2
var3

This will create 2 files called query1.hf and query2.hf in, well, wherever your environment puts them.

To produce a UNION ALL from the 2 hold files.

cp query1.hf query1.bak
cat query2.hf >> query1.hf

This simply appends the data from query2 to query1 and any Cyberquery written against query1 would contain both results.

To produce a UNION DISTINCT.

cp query1.hf query1.bak
cat query2.hf >> query1.hf
sort query1.hf | uniq > query3.hf
mv query3.hf query1.hf

In this case you are appending the second file's data to the first, sorting the combined file so that duplicate lines sit next to each other (uniq only removes adjacent duplicates), writing the unique lines to a new file, and then overwriting the first file with it.  The sort does reorder the rows, but for a hold file that shouldn't matter.  Assuming the permissions are set correctly you shouldn't have any problems doing this.  Oh, and the cp command creates a backup in case you screw something up.
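The same idea can be sketched in Python, treating the hold files as plain line-oriented text (nothing Cyberquery-specific here, just file contents as strings):

```python
# Contents of the two hold files, as plain line-oriented text.
query1 = "1 2 3\n4 5 6\n"   # query1.hf
query2 = "1 2 3\n7 8 9\n"   # query2.hf

# UNION ALL: simple concatenation, duplicates and all.
union_all = (query1 + query2).splitlines()

# UNION DISTINCT: drop duplicate lines. The shell version needs a sort
# before uniq (uniq only collapses adjacent duplicates); here we just
# keep the first occurrence of each line instead.
seen, distinct = set(), []
for line in union_all:
    if line not in seen:
        seen.add(line)
        distinct.append(line)
```

Unlike the sort | uniq approach, this version preserves the original row order.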

And, in case you are wondering, a hold file in the Cyberquery environment is just a flat file paired with a definition file (.hfd).  The definition file provides variable type and positional information so that a Cyberquery can read it and know what the data at column 7 means.
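Conceptually, that pairing amounts to fixed-width fields plus a layout describing them. Here's a tiny sketch of the idea; the (name, start, width) layout below is hypothetical, not the real .hfd format:

```python
# Hypothetical layout: each field is (name, start column, width).
# This mimics what a .hfd tells Cyberquery about a hold file line.
layout = [("var1", 0, 4), ("var2", 4, 4), ("var3", 8, 4)]

line = "1   2   3   "  # one fixed-width record from the flat file

# Slice each field out of the line by position and strip the padding.
record = {name: line[start:start + width].strip()
          for name, start, width in layout}
```

Without the layout, "the data at column 7" is just a character; with it, the slice becomes a named, typed field.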

One last point.  Do not do this with hold files that have different layouts, because the .hfd file won't understand the positioning and you'll get something akin to garbage when you try to run the Cyberquery.  The line lengths, and the data in each line, need to be identical as far as positioning and type are concerned.  It will probably still let you write the query, based on the .hfd file, but the results will be, at best, interesting.

Hope this helps the few remaining Cyberquery users out there.

10 Jan

My Goal Is To Figure Out Who Built The Goal Table In Your Database

One of the first jobs that rolled my way in 2005 was to set up a method for tracking sales goals and the sales relative to them. They also wanted to present it to users in our Horde portal. I did some digging, found that the ERP had a small little application that tracked goals and sales, so I figured that all I would have to do is find the table where they stored the goals, extract the data, and we would be good.

So, I found the table and fired a basic query against it and here’s roughly what came back for a single sales rep.

Year Sales
2005  
2005 300000
2005 300000
2005 300000
2005 375000
2005 375000
2005 375000
2005 425000
2005 425000
2005 425000
2005 375000
2005 375000
2005 375000

Anyone who's done any of this kind of work can spot the problem in half a second: how do you know which of those records applies to which month? Well, the answer is that the second record was January, the third February, and so on and so forth. Yes, somewhere along the way in the development process, someone, you'd think, would have said "wut"?

In order to use it over on the database side, I had to get tricky and make sure the data export sorted correctly (something that shouldn't have mattered) and then put counters in the code so I knew what was what: the second record had to be January, and so on. I managed to do it and it wound up working well enough.
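The counter trick can be sketched in a few lines of Python. The numbers come from the sample output above; the mapping logic is my reconstruction, not the original code:

```python
# The export rows for one rep/year carry no month column, so row order
# is the only thing that maps a value to a month (second record =
# January, third = February, and so on).
rows = [None,                      # the unlabeled first record
        300000, 300000, 300000,    # Jan-Mar
        375000, 375000, 375000,    # Apr-Jun
        425000, 425000, 425000,    # Jul-Sep
        375000, 375000, 375000]    # Oct-Dec

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Skip the first record, then rely purely on position.
goals = dict(zip(months, rows[1:]))
```

Which is exactly why the export order "shouldn't have mattered" but absolutely did: shuffle the rows and every month's goal is wrong.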

Honestly, I have no idea why someone would build something like this except for one possible clue. In Progress there is an array data type. I really know nothing about it, except that it exists, and these values were in fact using that data type. In Cyberquery, in order to access January’s goal data you would write something like:

WHERE
slm_goal:year = 2005

LIST
slm_goal:sales[1]

Of course, you’d expect to write the WHERE clause as

WHERE
slm_date = 1.2005

But, I digress.

And there you had it, sales data for January 2005. Of course that's still almost impossible to work with for anything meaningful outside of the system. But Progress is a funky thing. As I understand it (and really, I don't), the programming language is tightly integrated with the database. It may be that somewhere in that mess someone saw this as an easy way to deal with some problem. It's also possible that this piece of work was never intended to go mainstream, as it was well-hidden and our main ERP support person didn't even know about it. Still, it was sitting there on our system.

And the final straw: that ERP wasn't something you bought off newegg.com, unless they're selling $500K+ products these days. It was a legitimate ERP used by a decent chunk of the industry it specialized in.

10 Jan

Cyberquery Report Writer

I thought I'd write about the report writer that almost no one has heard of: Cyberquery.  In fact, it's so low on the radar that searching on indeed.com yielded just 13 results (5 and 2 of which were duplicates), meaning that there were a total of 8 advertised jobs in the entire United States.  A quick Google search shows that no one has ever really written about it, including its manufacturer.  So, not only am I part of an "elite" group, writing this is close to an internet first.

Cyberquery is a tool developed by a company called Cyberscience.  Its basic function is as a report writer, and it does all the normal things like create HTML, XLS, TXT or PDF based reports.  Reports can be scheduled, the results emailed to specified users, and you can, somewhat, format them (this isn't a strong point of the product but it'll do 95% of what the average report needs).  It has 2 approaches to report design: a text based approach using the Cyberquery language, and a GUI based tool which, frankly, I almost never used.

A basic Cyberquery would look something like the following:

list
cust:cust_name
cust:address

This is the equivalent of

SELECT cust_name, address
FROM cust

You can also do summary reports using the sum command.

sum/xls
oe_line:ext_price

by
oe_line:sku

This would produce a summary report showing the total extended amounts of all invoices by SKU for all time, and produce it in Excel (the /xls switch tells it to do this).
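The SQL equivalent of that sum/by pair is a GROUP BY aggregate. A minimal sketch using Python's sqlite3, with hypothetical data and the table/column names borrowed from the example:

```python
import sqlite3

# Hypothetical oe_line rows: two invoice lines for one SKU, one for another.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE oe_line (sku TEXT, ext_price REAL)")
con.executemany("INSERT INTO oe_line VALUES (?, ?)",
                [("A100", 10.0), ("A100", 5.0), ("B200", 7.5)])

# sum oe_line:ext_price by oe_line:sku is the same idea as GROUP BY.
totals = con.execute(
    "SELECT sku, SUM(ext_price) FROM oe_line GROUP BY sku ORDER BY sku"
).fetchall()
# totals -> [('A100', 15.0), ('B200', 7.5)]
```
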

One of the interesting, and sometimes painful, things that Cyberquery does is auto-join tables based on the indexes in each table.  For instance, it will auto-join between an invoice header table and an invoice detail table using what it determines to be a logical choice.  You can sometimes change that join just by rearranging the order of the results (the list/sum clause).  It's kind of an interesting art but it works well enough.  You can also override this behavior and force a join between two tables with a DEFINE statement.  Interestingly, it used the old [+] syntax to specify a left or right join.

All of this, however, is pretty normal stuff.  In my opinion what made CQ powerful was its ability to generate hold files and run macro files.

A hold file is a flat file and it is very, very fast.  For instance, pulling sales data out of our system could be brutally, painfully slow.  I once had to get will call data out of the system and it took 2 hours for each run.  With a hold file, you would run it once to capture the historical data, then daily to capture that day's worth of transactions.  That really isn't new, in the sense that most data warehouses work this way, but the flat files really are faster than going directly at the database.

Macro files are exactly what they sound like: a series of commands executed by CQ.  With a macro file you could, literally, build a CSV file on disk, name it, rename it, and copy it somewhere.  They let you work at the command line.  Or you could build multiple hold files in sequence and combine them in a final report.  I once chained 10 hold files together to produce inventory data (I was combining sales and inventory data and needed to do each calculation separately in CQ before merging them back together).

The one real weakness I saw with the product is that I couldn't write SQL with it.  It has a tool to show you the SQL underlying a CQ, but it often produces nonsense.  You'll pretty much have to turn off your SQL mind when you use CQ but, to be honest, writing CQ becomes second nature after a few weeks.

Overall, I found the product decent and reasonably powerful to work with.  The biggest problem with the product is probably that almost no one is using it.  When you compare it to virtually any other reporting tool, especially from a career perspective, you'd be much better off using, well, any other tool.  But when you compare it to the other report writers out there, in the right hands, it can hold its own.