Wednesday, November 10, 2010

C#/SQL: The Record Viewer: Quick Access To 3,000,000,000 Data Records

Build Viewer Control Panel
The products offered by Melissa Data are personal and business contact lists intended for use by marketing or search organizations. Melissa Data combines billions of partial records received from several sources into a smaller set of current and complete records. Marketers can use the high quality Melissa Data product to reach customers who are more likely to become actual purchasers and avoid wasteful effort and postage costs associated with attempting to contact previous or duplicate addresses. A search business can learn and provide the latest updates concerning the contact information for people or businesses.

There are far too many records in play for us to accurately realize: i) the effects of modifications to our development process; ii) changes in source and types of input; and, iii) software that converts the raw data to the finished output. One of my purposes at Melissa Data is to produce applications that make retrieval and viewing of records practical at any stage of their assimilation.

One application I wrote for those purposes was a C# Winform record viewer that interfaced with both XML and mySQL backends, the MD Build Viewer. The Build Viewer allows list, view, search, and jump access to every one of the billions of records at each step in their production, their heirarchy, and changes resulting from modifications in a preceding phase. The final version of the records are read from a mySQL database and the parent records are read from a very large flat text, proprietary format text file.

Handling that amount of data reflected on the Build Viewer requirements in the form of performance goals, such as: The UI loads access to the data (approx. 3 billion records) in less than 15 seconds so the user experiences minimal delay (On that count, consider that it takes far longer than 15 seconds for the comparatively simple Linux wc shell command to count the number of lines in a 3 billion line file); Switching between records, presenting parent records, and jumping to records should be rendered in either a standard or cascading sidebar form; Clicking on any record field will jump to the breakout of that record component; Right-clicking on a street address will pop up a Google map with the address shown in the map. Here are a few views:




The view above shows the treeview record list and the detail of the selected record.

The next shows the details of one of the individuals in that record. The yellow tree nodes are those that have been reviewed and the red ones have been read and marked via a right click menu selection for later review. The user can suppress all but the red nodes or store them to a serialized list that can be loaded in a later session. A mnemonic key sequence provides a means to jump to the next or previous red node or to a specified record number.

Here are views of the sidebar view that shows the contents of the parent records of the one selected in the treeview. The upper will replace the parent record present in the sidebar with the one just selected from the list or reached via the up or down arrow key.
The lower shot shows the cascade option where the records overlap as the user clicks or accesses them via the arrow keys.




And here is one that show a Google Map inset resulting from right-clicking on an address in the record field and a sidebar view showing the same.






Tuesday, April 6, 2010

C++: Support of Windows Group Policy for Attachmate

Around the time that I finished the project to rewrite the API set in Attachmate's Extra! Personal Client product I saw an increasing number of articles in the trade magazines concerning the subjects of management and distribution of applications. This was a time that straddled the shift from desktop applications to web-based. Enterprises were keen on improving their return on investment of their software dollars -- often described as reducing the enterprise-wide cost of ownership of a product.

I asked my development manager if I might devote a project's time to researching this area and perhaps develop something if I ran into a good idea. After receiving his OK I obtained and installed several products including Novell's LanDesk Manager and Citrix WinFrame. In the end I felt that Microsoft had, in its Microsoft Management Console, a good candidate that I could quickly fit with Attachmate's products.

At design time, based on the MMC capabilities, I saw my goals as:
1) prepare the Extra! product for ease of distribution on a corporate LAN or WAN.
2) prepare the Extra! product so its features list can be centrally provided or withheld from a user based on one or more of their Windows network group memberships.

First up, I wrote an Administrator file for Extra! (known as an ADM file) that mapped a user's permission to access Extra!'s menu items and configuration settings. I then added code to Extra!'s frame class that obeyed the MMC protocol configured by an MMC administrator when the new extra.ADM file was used. An MMC administrator could then assign a mapping of features that should be available, disabled, or hidden to a specific user, a group of users, or the entire enterprise from his console.

With that, a network administrator could prepare Windows roaming profiles for an individual, a group, or the enterprise that silently installed an Extra! product from a network server, that contained a preconfigured set of product session parameters, and a policy set that provides or prevents access to Extra! product features as appropriate for the user.

Apart from the development efforts using the guinea pigs in Development, I did three different demonstrations to Product Management, QA, and Customer Support in which I brought a clean Windows laptop into the conference room -- or borrowed one from somebody in the room -- and logged in to an account that was configured to use a roaming profile. I showed the start menu items that came from the profile and also that Extra! was not installed on the PC. Then I clicked on the start menu item for Extra! which resulted in a running Extra! session showing up on the desktop about 20 seconds later. When checked, everyone attending could see that several of the menu items were disabled -- corresponding to the policy settings I set earlier in the server room while impersonating an MMC administrator. I repeated the same demonstration using a different roaming user account to show that that other user had a different set of menu items enabled and disabled.

This work coincidentally was a significant contributor to Extra!'s ability to meet Windows 2000 Logo certification the next year.

Monday, April 5, 2010

SQL/Master Data: Information Integration at Attachmate

Back in the earliest days of Attachmate's history there were three employees and the one manufacturing/sales/software development/support department. As the company began to grow new departments were formed to handle specialized tasks such as Sales, Marketing, Shipping, Software Development, Testing, and Customer Support. As each department grew they separately designed a uniquely structured Advanced Revelations (AREV) relational database to handle information relevant to their specific mission in the company. For example, Sales kept data about transactions and the customer systems involved. At the same time, but on a different database, Customer Support logged data about customer systems, problems, and missing features that, if implemented into Attachmate's product would provide for compelling sales opportunities.

And as this growth was occurring Marketing was sending advertising mailers to customers for products Sales had recently sold to them and customers were frequently asked the same questions each time they communicated with a different department.  That is, information in every department's database was valuable for numerous reasons if it could be made available to the other divisions. I was given the opportunity to devise a way to organize and manage the different departments' databases so accurate, consistent, and up-to-date information would be available to everyone.

Going in, I expected that my goal was to decide the best technical design for an end-product data warehouse. However, it turns out that establishing nth level normalization is not the biggest challenge in this type of an endeavor. I realized early on that I was put in charge of this project because several months of weekly meetings attended by the nine department heads involved had failed to pound out a well-normalized corporate database. Each of the nine departments with an AREV database had a lot of stored data in their database, but each was constructed in a dialect that reflected the needs and understanding of the types of people who worked in the department. That is, while much of the data was exactly the same from one system to the next, there was a lack of what are known as "unique identifiers" shared by all of the databases to identify a field type. For example, a field in one department's database might cover multiple fields in another department's. Another common problem was where a field in one department's database might concern the same data, but be named differently, and some departments had developed a serious religion about their choice of field names.

The steps I chose to reach an agreement involved producing a candidate list of the fields that emerged from running historical queries against all of the databases using the each department's terminology. To avoid the head butting that had occurred before my entry to the project I submitted in series -- specifically not in parallel -- the candidate table and field list to each of the department managers involved. (If any of you managers involved are reading this, my strategy was to gain managers' acceptance in order from the least agreeable of you to the most. ;-) ). Each time I ran into need for a change when discussing the list with one manager I would polish the changes in and start over at the first manager, successively gaining approval until the job was done. Yes, at the start it looked like an O(n^2) endeavor, but happily the managers turned out to be agreeable and it was finished in O(n) time.

From that effort, together with the purchase of a newer web-based database engine, the goal to create one master corporate database was completed. It provided a lot of strategic information, made clear what tactics worked well from one project or product lifetime to the next, and allowed us to optimally manage our efforts in ways that certainly played a role in Attachmate's later success.