Tuesday, March 09, 2010

An Open Letter to the Disney Corporation

Dear Disney,

Thank you for the 57 mailings we received yesterday advertising your $500 per night resort rooms. Unfortunately, the various names on the mailings, presumably made up by one of my daughter's school friends as a prank (as they share a lot of their last names), do not live here. Additionally, we live in a single family home which could not comfortably support 57 people as a main residence.

I can only assume that you have a computer database with which you extract the addresses from. If it is a modern relational database that supports the SQL language, you may be able to leverage such functionality as the "group by" and "having" clauses to prevent obviously fraudulent database items, perhaps substituting a single "Seymour Butts or Current Resident" mailing.

Here is an example SQL statement to illustrate my point. My main assumption here is that it is unlikely for a single address to have more than 4 families living in it that are likely to vacation at Disney resorts, therefore you can safely cull out addresses that occur 5 or more times in the database with a statement such as:

select address,count(*) from address_tbl group by address having count(*) < 5

Please run that by your DBAs and get their take on it, and, more importantly, stop enabling mail fraud. It's a felony, after all.

Tuesday, March 02, 2010

Three table inner join vs. the staging table

or as we perl users say, "There's more than one way to do it."

I was faced with an odd situation at work a few days ago, and it led me to do a lot more raw SQL tinkering than I typically do. The situation takes a little setup. (I'll say up front that if you don't fancy yourself a geek, don't torture yourself with the rest of this post, but tune in next week for tales of back-to-back concerts at the Newport!)

My team performs application integration, and our preferred method of doing so is with web services that are modeled. "Modeled" in our parlance refers to being managed by a piece of software called webMethods Modeler. It provides pretty pictures of running services, complete with checkmarks as steps get completed, or big red Xs if they fail.

The services we provide mainly move documents around, doing any needed transformation on them. We take all the possible fields that a business process could want, and create a master document definition called a "canonical", whose definition resembles, but is not congruent with, the normal lay usage. We take inbound documents and map them to the canonical format, and subscribers to the documents get the canonical mapped to their preferred format. Because of this, a complete process definition has two models: one for the publisher, one for each subscriber.