Thursday, November 4, 2010

TDWI St. Louis Chapter

For anyone who will be in the St. Louis area on November 12th, the St. Louis Chapter of The Data Warehousing Institute will be holding its quarterly meeting.  I expect this meeting will contain some great content from renowned speakers Neil Raden and Krish Krishnan.

For more information and to register, see the TDWI STL page.

Meetings are free and open to the public.  You need not be a TDWI member to attend.

Hope to see you there!

Tuesday, November 2, 2010

Information Portfolio Components

My model for the Information Portfolio includes four components:
  • People
  • Applications
  • Processes
  • Data



People represent the individuals or teams that use data to execute business processes.

Applications are either traditional end-user applications or integration solutions that move data through a process, between processes, or to/from people.

Processes are the business activities that leverage data to fulfill the operational objectives of the business.

Data is the cornerstone of the Information Portfolio.  It is the stuff that moves through a process, between people and applications, to act as fuel in the execution of business services.

The Information Portfolio is a knowledge base or collection of metadata that links together these four concepts in meaningful ways that transform bare data, through context and meaning, into information that can be used in the delivery of business services. [Wisdom Hierarchy]
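As a loose sketch of that idea (the entity names and link verbs below are hypothetical, invented purely for illustration), the portfolio can be modeled as a small collection of typed links among the four components:

```python
from dataclasses import dataclass, field

# The four components of the Information Portfolio.
@dataclass(frozen=True)
class Person: name: str

@dataclass(frozen=True)
class Application: name: str

@dataclass(frozen=True)
class Process: name: str

@dataclass(frozen=True)
class DataAsset: name: str

@dataclass
class InformationPortfolio:
    # Metadata stored as (subject, verb, object) triples linking components.
    links: list = field(default_factory=list)

    def link(self, subject, verb, obj):
        self.links.append((subject, verb, obj))

    def related_to(self, item):
        # Every link in which the item participates, in either position.
        return [(s, v, o) for s, v, o in self.links if s == item or o == item]

portfolio = InformationPortfolio()
claims = DataAsset("claims")
billing = Application("billing system")
analyst = Person("revenue analyst")
month_end = Process("month-end close")

portfolio.link(billing, "produces", claims)
portfolio.link(analyst, "uses", claims)
portfolio.link(claims, "fuels", month_end)

# Trace a data asset back to every person, application, and process touching it.
print(len(portfolio.related_to(claims)))  # 3
```

The value isn't in any single triple; it's in being able to traverse the links so that bare data picks up the context that turns it into information.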

More on each of these components and how they relate to each other in the Information Portfolio as the month continues...

    Day 1 - PragProWriMo / NaNoWriMo

    It's November again, which means that it's time to spend some focus time writing. After a little bit of a slow start with day one of writing, I'm at 1,292 words. The pace that I try to set is 2,000 words per day, which leaves a chance to take one day off each week and still hit the NaNoWriMo target for 50,000 words. The official goal for PragProWriMo is 2 pages per day, which would be well under a 2,000 word/day goal.

    I've been distracted and not really planning for November, unlike last year. Last year, I had an outline developed and some sketches down on paper before I started writing The Practical Data Warehouse. This year, I spent the evening watching TV, commented to my wife that I really didn't have any clue what to write... then took her advice and just started writing. This year's concoction:

    The Enterprise Information Portfolio
    A Model for the People, Applications, Processes, and Data of an Organization


    As November rolls on, I'll be posting snippets from the text. Good luck to all the other NaNoWriMo and PragProWriMo authors out there!!

    Monday, September 6, 2010

    What's your cathedral?

    There's a classic story about understanding the purpose of work: The Story of the Three Stone Cutters
    Once, there were three stone cutters working together on a job.  A stranger came upon the first stone cutter and asked, "What is it that you're doing?"

    "Cutting this stone into a perfect square block," answered the first stone cutter, continuing to focus with great care and precision.

    The stranger moved on, leaving the first stone cutter to his craft. He came upon the second stone cutter, who was also working diligently on his pile of stones. "What are you doing?" asked the stranger, interested to see how this stone cutter would respond.

    The second stone cutter stopped to look at the stranger and engage in the conversation. "I'm working here on this job to provide for my family. I have a loving wife and two wonderful children. I work hard here to make sure we have what we need and can take time to enjoy each other." The second stone cutter reached out a friendly hand. "I'm Alexander."

    The stranger introduced himself and shared a brief story with the stone cutter about his own family. They said good bye, and the stranger moved on to another area of the project.

    Around the outer edge of the site, the stranger saw a third stone cutter who was squatting behind a large carved stone, but staring toward the horizon. The stranger approached this stone cutter and asked, "Pardon me. May I ask what you're working on?"

    "I'm building a cathedral," responded the third stone cutter without breaking his gaze toward the horizon, and into the future. "It's going to be the new home for a parish that is renowned for its financial generosity and support of the surrounding community. There will be a soup kitchen on the main level, a community garden in the courtyard, and offices for individual and family therapists. In three years, they expect to be providing services to over two thousand people a day."

    The stranger stared into the distance, picturing the bustling crowds and smiling faces of the volunteers. "Thank you for sharing that vision," said the stranger.

    Every time that I've heard this story in the past, I've identified clearly with the third stone cutter - I need to know what kind of structure I'm building and why we're doing it.  I struggle to find motivation unless I really understand the mission.

    And, I've always felt sorry for the first two stone cutters.  I mean, the second one has a noble purpose and all.  The first one doesn't need to know his greater purpose to be a good worker.  The third one, though: he's the enlightened one!

    Then, I started thinking about one of the last team building / motivational leadership meetings I was at.  We spent a long time talking about how to help other co-workers in IT connect to the fact that we are a Catholic health care service provider.  Our mission and purpose is all about providing care for patients, with a deference for those who are on the edges of society or in the greatest need.  A great purpose for the organization.  As IT leaders, we spend quite a bit of time working to help our staff understand how their day to day work of supporting servers and applications helps someone to do their job in Finance; and how that helps someone do their job in Medical Records; and how that helps someone do their job in Credentialing; which helps us make sure that our physicians are qualified; which helps us make sure that patients are safe.  In the worst of cases, an IT co-worker can feel six or seven times removed from the purpose of the organization.  (I recently took a Gallup "strengths" survey and learned that "connectedness" is one of my strengths, so I guess I don't really struggle with this too much.)

    Still, I recently started to wonder if all of the leaders in that training were coming at things from the perspective that the third stone cutter is the only one who's really got it right, and that we're all supposed to strive to see the same metaphorical cathedral.

    Recently, I've found at least as much satisfaction at work focusing on the quality of the stones that I'm carving right now.  I've got a larger purpose in mind, though it isn't really the patient care we provide. As I contemplate how to push data management principles forward, my justification stops with "this will make our organization smarter."  Of course all kinds of great benefits will come from that, including improved patient care, fiscal responsibility, innovative service models, an end to world hunger, and peace for all.  Right now, though, it is satisfaction enough to have the purpose of making people better, smarter decision makers.

    So, if my cathedral isn't better health care for our patients, is it a stretch to think that the first stone cutter's cathedral is that one beautiful block and his pride in craftsmanship, and that the second stone cutter's cathedral is the quality of life he's providing for his family and for himself?

    What's your cathedral look like?  Is it right in front of your eyes, or off in the distance?

    Tuesday, August 31, 2010

    Spare Some Change?

    There's a classic joke about the difference between the IT person and the developer.  Here's my version:
    The boss comes into the IT guy's office and says "Hey, Joe, we've got a new line of business starting up and we really need to be able to do this new thing, X."  

    Joe simply says, "Sorry, Boss, that can't be done."  

    So, the boss goes down the hall to the development manager and says "Hey, Nancy, we've got a new line of business starting up and we really need to be able to do this new thing, X."  

    Nancy says, "Absolutely, Boss.  We can do anything.  It'll take us six months to plan, two years to develop, and another several months of user acceptance testing."  

    The boss goes back to his office and begins writing his resignation.

    The best quote I've ever heard about change isn't the old adage that "the only thing constant is change."  I'm a futurist at heart, so there's no deep insight for me in the constancy of change.  The best quote I've heard about change is that "change is great because every time something changes it means you're one step closer to getting it right."  That's a bit of paraphrasing and it assumes that we're primarily concerned with "good" change, of course.  The point stands: if you believe that change is a good thing, then it's natural to embrace it rather than fight it.

    Classic waterfall methodologies focus on defining specifications ahead of implementation so that the risk of change can be avoided.  That level of contractual thinking drives unnecessary conflict, especially in business intelligence projects.  One of the things we've learned through experience is that we can't know what we don't know.  Naturally, the needs being addressed by a business intelligence solution will change over time as more insight is delivered to decision makers.  If we already knew what the outcome was, then there wouldn't be a need for the project.  Business intelligence is about discovery.

    Two of the core values from the Agile Manifesto are "customer collaboration over contract negotiation" and "responding to change over following a plan."

    I've worked with a number of teams that grow increasingly frustrated over changes that end users make in business logic.  ETL developers sometimes get so fed up with change and being forced to rewrite code over and over that they feel they should stop writing any code until "the users decide what they want!"  Those developers haven't recognized that every time they change their code, they're enabling the users to understand what it is that they want.  They're facilitators, not just implementers.

    So, agile BI is at least as natural a fit as agile application development.  Probably even more so.  For BI developers to be agile, though, they have to embrace change.  They have to facilitate rather than resist change.

    Friday, August 27, 2010

    Your Job? My Job? Our Job!

    I've been trying to figure out agile data warehousing for several years now.  I'm a computer scientist by training and a programmer by hobby, so I've always kept my eye on trends in traditional software development.  What I tell myself, professionally, is that it helps me have alternative perspectives on BI solutions.  (It's really just 'cause I like programming, even if what I hack together isn't typically that elegant.)

    Several years ago, I was introduced to one of the founders of the St. Louis XP (Extreme Programming) group, Brian Button, and decided to sit down and have lunch with him.  I explained what kind of work we do to build data warehouses, and he listened very politely.  At the time, he was thinking mostly about test driven development and pair programming.  One of the things he asked me was "can you get your data modeler, ETL developer, and report developer all in a room together working on the same components all at once?"  It occurred to me, then, that separation of development responsibilities might be a serious impediment to agile BI development.

    As a former consultant, I've personally done a little bit of everything.  I can data model well enough to build something workable; I've spent a lot of time writing ETL both by hand and with GUI tools; I'm a raw SQL hacker for sure; and I can even create a reasonable report or two that some VP wouldn't just use as a coaster.  How often have I ever asked my staff to do that breadth of work, though?  In larger organizations, I usually see and hear about separation of teams: data modelers, ETL developers, reporting people.  They're separate teams.  That's always been under the guise of "developing technical expertise" within the team and driving consistency across projects.  (Important goals for sure.)

    However, when I look at successful agile software teams that I know about, that same level of separation isn't typically present.  A single developer might do part of the UI, some of the dependent service modules, and the persistence layer.  They're focused on delivering a particular external function of the application, not some internal component of it.  This goes back to the previous conversation about sashimi, too [1] [2].

    Of course there are some developers that are better at UI and others better at ORM, just like there are some BI folks better at data modeling and others better at data presentation.  To enable more agile development, though, requires developers who are more willing and able to cross over those traditional boundaries in the work that they do.  One of the leaders I work with today articulates this very well when he says that we just want "developers."  What this does for agile is that it minimizes context switching and the spin between different developers working on the interrelated but different pieces of the same component.  If an ETL developer decides she needs five extra fields on a table because she just learned that some previous assumption about the cardinality of a relationship was flawed, should that change require the synchronous work of:

    Step  Modeler                 ETL Developer         Report Developer
    1     Changes data model      wait/other work       wait/other work
    2     Deploys changes to DB   wait/other work       wait/other work
          (hand off)
    3     wait/other work         Import new metadata   wait/other work
    4     wait/other work         Update ETL jobs       wait/other work
    5     wait/other work         Unit test ETL         wait/other work
          (hand off)
    6     wait/other work         wait/other work       Run test queries
    7     wait/other work         wait/other work       Update semantic layer
    8     wait/other work         wait/other work       Update reports
    N     ...and loop through for every change

    There's a lot of opportunity for optimization there if one person is working on the tasks instead of several people. For about 66% of the team's time, they're working on something other than this one objective, and there's latency in the hand off between developers. (If you can create a parallel scheduling algorithm that gets the overall workload done faster, all things equal, than "one resource dedicated to completing all the steps in implementing each particular piece of functionality" and "helping each other out when there's nothing in the queue", please let me and every computer science professor on earth know.)
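A toy back-of-the-envelope model makes the point (the step count matches the table above, but the hand-off latency figure is an arbitrary assumption, not a measurement):

```python
# Toy model: eight steps of one work unit each, as in the table above.
# Each hand off adds waiting time while the next specialist context-switches
# back to this change. HANDOFF_LATENCY is an arbitrary assumed value.
STEPS = 8
HANDOFFS = 2          # modeler -> ETL developer, ETL developer -> reporting
HANDOFF_LATENCY = 3   # units spent queued behind the next person's other work

split_roles = STEPS + HANDOFFS * HANDOFF_LATENCY
one_developer = STEPS  # no queueing between roles

print(split_roles, one_developer)  # 14 vs 8 per change

# And loop through for every change: the gap grows linearly.
N = 10
print((split_roles - one_developer) * N)  # 60 units of pure hand-off cost
```

Whatever latency value you plug in, the hand-off cost is paid again on every loop through the table, which is exactly where the single dedicated developer wins.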

    I think that for some teams this will be a challenge to their skill set and require developers to grow beyond their existing comfort zone.  I'll argue that they'll be the better for it!  For some teams it might be more of a challenge to ego than to actual skills: "you're going to let anyone data model?!"

    The answer is "yes" and "we require an appropriate level of quality from everyone."  That's why agile teams are ones with pair programming, peer reviews, and an approach that not just accepts but welcomes change.

    For a team that isn't agile today, these things can't come piecemeal.  If you want to be agile, you have to go all-in.

    Thursday, August 26, 2010

    Growing a Tree versus Building a House

    When we say that we want to build "good" software, we tend to use terms that come from other engineering fields: foundation, framework, scaffolding, architecture.  One of the things that the agile software movement has shown us is that good solutions can come from evolutionary models as well as construction models.  The difference comes from the fact that code is far easier to manipulate than physical raw materials.

    When building a data warehouse, we often draw traditional, stacked-tier pictures of the data architecture: data warehouse tables, semantic layer, data marts, etc.  If we start our design discussions with an assumption that anything we "build on top of" has to be "solid" then we quickly drive the overall solution away from agility.  "Solid" conjures an image of a concrete foundation that has to be built to withstand floods and earthquakes.  If we find a crack in our foundation, it has to be patched so the things on top don't come crumbling down.

    If, instead, we try to imagine a conceptual architecture that has in mind goals of adaptability to purpose (rather than firmness) and loose coupling (rather than high contact), you can begin to imagine a higher level of agility.  Look at the picture from the start of this post (from webecoist).  The trees are being shaped and molded into a purpose-built structure.  If, part-way through the growth process, the structure needed to change to be another 6 inches higher or hold a second story of some kind, the necessary changes could be interwoven into the growth already complete.  If we were constructing a new art museum and decided, halfway through, that we wanted a library instead, we'd have to make some major changes or compromises to account for the fact that the foundation was only designed to hold the weight of portraits, not stacks of books.

    This conceptual discussion about growing something organically rather than building it from the ground up is directly related to the sashimi discussion from yesterday.  A legacy build approach says: model the data, build the ETL, build the semantic layer, build the reports.  There aren't any opportunities in that model to create meaningful yet consumable vertical slices.

    I hear some agile BI conversations that only go halfway toward the mind shift that I think is necessary.  These "think big, act small" solutions sound like a model where the only change is that you pour some of the concrete foundation at a time.  Building a house using this semi-agile approach:

    Iteration One:
    1. Pour foundation for the kitchen only.
    2. Build kitchen walls.
    3. Wire up kitchen outlets.
    4. Install kitchen plumbing.
    Iteration Two:
    1. Pour foundation for the family room only.
    2. Build family room walls.
    3. Realize you need to tear out a kitchen wall to open to family room.
    4. Reroute electricity in that wall.
    5. Rerun wiring to kitchen.
    6. Run new wiring to family room.
    In this approach to agile BI, you might well deliver value to customers more quickly than if you took a monolithic waterfall approach.  Since you aren't requiring yourself to plan everything up front, you run a high risk of having to do rework later, though.  In a physical construction mindset, rework is very expensive (rip out wall, rewire, etc).

    An organic build approach says plant a seed, watch it grow.  First, a sprout appears, with roots, a stem, and leaves.  The stem gets thicker, the roots grow deeper, and more leaves sprout.  Branches grow.  Flowers bud and fruit appears.  When requirements change, some pruning and grafting is required, but you don't have to tear down the tree and plant a new one from scratch or start a new tree on the side.  The tree will grow around power lines and rocks and other trees as needed.

    There's the mindset.  I don't think it's easy to shift from a constructionist perspective to an organic one.  Success in agile BI requires this change in thinking, though.  If you're still laying foundations and screwing sheetrock onto studs, your attempt at agile BI will not be optimal.


    Good luck with that.

    Wednesday, August 25, 2010

    Sashimi (An Agile BI Lesson for Floundering Teams)

    The most recent TDWI conference generated a lot of conversation around what Agile BI means and how agile principles and practices from traditional software development can and can't be applied to business intelligence projects.  I wasn't able to be at the TDWI conference and attend the presentations, but there's been a lot of chatter.
    I can't speak broadly from an industry perspective on agile BI, but I can speak from my own personal experiences.  The organization I work for has been undergoing a move over the past year to apply an existing agile methodology used in application development to data warehouse and business intelligence solutions.  It's an ongoing study that I believe has a lot of promise and many yet unknown challenges.  So far, there are three parts to this unfinished Agile BI story: sashimi, development culture, and developer roles.  Tonight's post is on sashimi.

    For those of you not familiar with the use of the term sashimi in this context, the gist is that sashimi is the art of slicing up a problem space into pieces that are at the same time independently valuable as well as quickly achievable.  In an app dev project, what this means is creating a so-called walking skeleton that exercises only as many pieces of the overall solution as necessary to deliver something that is actually usable by a user.  For example, if I'm building an application that's going to manage medical claim payments, maybe all the first slice does is retrieve one claim from the database and display it on the screen.  Then as work progresses toward the first 90-day release, more and more meat is built up on top of that skeleton, refactoring various pieces of the stack as necessary along the way.  Good sashimi results in ever increasing value to end users with only as little bulk on the skeleton as necessary to achieve that.

    What does good sashimi for a BI project look like?

    I think that it looks the same, but feels much harder to accomplish, especially when you have an enterprise-scale strategy for data warehousing and business intelligence.  Imagine that you need to deliver a new reporting dashboard for department managers to do predictive labor modeling.  The minimal vertical slice for that solution could include:
    • New tables in a staging area from a new source system, with
    • New ETL jobs to load data into...
    • New tables in an enterprise data warehouse, and
    • New tables in a brand new data mart, and
    • New objects in a semantic reporting tool (e.g. universe or model), and
    • (Finally) your new dashboard.
    That's a lot of layers to slice through.

    In traditional BI projects that I've been involved in, the project plan would call for building the solution mostly in the order shown above: bring the data in, understand the data, build a data mart, wrap it with a semantic layer, and deliver the dashboard.  Along the way, you'd probably have a subteam prototyping and testing out the dashboard UI and maybe someone doing some data profiling to speed data analysis along; but the back-end pieces of development, especially, are likely to happen in stacked order.

    Building a walking skeleton in software requires you to be able to refactor the bones along the way.  As the analogy goes, the first version of the walking skeleton might have just one leg and one toe that attaches directly to the spine and up to the head.  As the product evolves, the leg bone gets refactored into femur, patella, tibia, and fibula; more toes get added for stability; and a new set of hip bones is created.  All of those change to the base skeleton in order to add muscles, skin, and clothing.

    As we layer things in a traditional BI project, we often try to keep a more detailed big picture in mind up front.  I know the final product is going to have two legs, that bend at the knee, need to be able to support independent orbital motion, and maintain upright stability of a 200 pound body.  That all leads to five toes, several leg bones, and hips from the very beginning.  An agile approach would ensure that we can notice early on that the business doesn't really need a biped mammal, but a fish.  That traditional approach results in a lot of wasted assumptions and potentially wasted work.  The agile approach allows for the easy reuse of what can be kept from the skeleton (spine) and a refactoring of the other pieces (leg becomes fin, toe becomes tail).

    That's a lot of metaphor there, all to say that one of the requirements of agile development is the ability to picture work in those thin vertical slices of functionality that deliver as much value to users as possible with as little commitment under the covers as necessary.  That requires both a mindset and an architecture that will allow developers to quickly refactor components in the stack without having to deal with exorbitant dependencies.  In an enterprise BI environment where source systems are feeding many systems, data warehouses have lots of application and direct user dependencies, and semantic reporting tools are tightly coupled to database objects, this ability to refactor requires a flexible architecture with clear boundaries between components.  Examples that may be useful:
    • Nothing but the job that loads a table should ever reference it directly.  Always have a layer between physical database objects and the users or user applications, even if it's a layer of "select *" views.
    • Only one job (or an integrated set of jobs) should load a given table.  That job should have a versioned interface so that source systems don't all have to be enhanced when the target table changes.
    • Each independent application should have an independent interface into the data (read: data mart, views, etc)
    • Refactoring involves moving logic between layers of the solution stack: promote something from a data mart down to an enterprise data warehouse when an opportunity for reuse is identified; demote something from enterprise data warehouse to data mart when it's clearly application specific.  Make sure that however you build your solution, you can move things between layers easily.
    • Have each layer interface with only the next layer above/below it.  Don't allow the design to cross over abstraction boundaries (e.g. having a report directly access staging tables instead of pulling the data into the data warehouse and on up the chain to the report).
    • Build as little code as necessary to get something from one abstraction layer to the next, even if that means a simple "select *" view rather than building a full ETL job with surrogate key management, SCD Type-2 logic, and data cleansing rules.  But also make sure you've built an abstraction between the data warehouse and the report so that when you add all of those features to the data warehouse, you don't necessarily have to go update all of the reports that have been built.
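As a minimal sketch of the view-layer guideline above (the table and view names are made up, and SQLite stands in for a real warehouse just to keep the example self-contained), a thin layer of views between physical tables and reports lets the warehouse be refactored without touching anything downstream:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Physical warehouse table: only its load job references this directly.
    CREATE TABLE dw_claim (claim_id INTEGER, amount REAL);

    -- Thin abstraction layer: reports query the view, never the table.
    CREATE VIEW rpt_claim AS SELECT * FROM dw_claim;
""")
conn.execute("INSERT INTO dw_claim VALUES (1, 250.0)")

# Later, the physical table gets refactored (here, just renamed). Only the
# view definition changes; every report querying rpt_claim keeps working.
conn.executescript("""
    DROP VIEW rpt_claim;
    ALTER TABLE dw_claim RENAME TO dw_claim_v2;
    CREATE VIEW rpt_claim AS SELECT claim_id, amount FROM dw_claim_v2;
""")

rows = conn.execute("SELECT * FROM rpt_claim").fetchall()
print(rows)  # [(1, 250.0)]
```

The same shape applies whether the layer is a "select *" view, a versioned load-job interface, or a data mart: the consumer binds to the boundary, not to the physical object behind it.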
    Those are just a few thoughts on what might be one way of laying out an architecture that will allow your BI behavior to be agile.

    There are probably other good architectures to support this kind of agile sashimi for BI solutions.  Remember to focus on the goal of being able to deliver as much value as possible to end users with as little effort as possible, in every release.  That's what this agile lesson is about.  You have to change how you think to get there, though.  That will be the next post.

    Monday, August 23, 2010

    Business Keys

    I'm engaged in a project that's actively using key concepts from Dan Linstedt's Data Vault methodology.  There are lots of very powerful benefits that we seem to be realizing with this methodology, but this series of blog posts won't be particularly about use of the Data Vault.  Still, I felt it was appropriate to credit the Data Vault for helping provide a structure for our own discovery.

    This first post is about the struggle to identify keys for business entities.  We set forth some fundamental principles when we started out on our latest large scale project.  First and foremost, we would "throw nothing away".  What that's meant is that we want to design the foundation of our reporting database to be a reflection not just of one department's truth, but all of the truths that might exist across the enterprise.

    As a result, the design of every major entity has run into the same challenge: "what is the business key for this entity?"  Well, from System A, it's the abc code.  From System D it's the efg identifier.  But if someone put in the xyz ID that the government assigns, then you can use System D to get to this industry file that we get updated every 3 months and link that back to an MS Access database that Ms. So-and-so maintains.  Ack!  Clearly we can't just say an apple is an apple.  And clearly there's a data governance issue at play in this scenario also.

    In one case, some of these are legacy systems that simply are what they are and aren't worth investing additional time and energy into.

    Our data modeling challenge is to determine what the one business key would be for all the different instances of this one logical entity.  When confronted with the challenge of having no clear business key, the project wanted to "just add a source system code and be done with it."  I pushed hard against this approach for a couple of weeks, insisting that the team keep going back and working harder to dig up what could be a true business key.  Eventually, I realized that I was both working contrary to one of the original goals I'd set forth and becoming the primary roadblock to progress.


    Interesting side note:  One of the better tricks of good software design is to defer decisions to the last possible minute.  If you get away without writing some piece of code, then best to put it off until you have to write it.  There's obviously some nuance and art to understanding how to leverage that.  The Strategy Pattern is a good example, though.
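A minimal sketch of the Strategy Pattern in that spirit (the pricing rules here are invented purely for illustration): the deferred decision lives behind a function parameter, so the caller chooses at the last possible moment.

```python
# Which pricing rule applies is a deferred decision: invoice_fee doesn't
# know or care, it just delegates to whatever strategy it is handed.
def flat_rate(amount):
    return amount * 0.05

def tiered(amount):
    return amount * (0.03 if amount > 1000 else 0.06)

def invoice_fee(amount, pricing_strategy):
    return pricing_strategy(amount)

print(invoice_fee(500, flat_rate))  # 25.0
print(invoice_fee(2000, tiered))    # 60.0
```

Adding a third rule later means writing one new function, not reopening the code that computes fees.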


    What I realized I was doing was trying to put a huge and potentially very dynamic piece of logic out in front of our need to simply capture the information that was being created by source systems.  So, we instituted what felt like a completely counter-intuitive design standard:  every business key would include source system code as part of the compound key; and we would defer the need to consolidate and deduplicate instances of an entity until after all the underlying data had first been captured.

    Deduplicating is the immediate next process, but this allows us to be sure that we've captured all the raw information from the source system first, before throwing away the fact that there are different versions of the truth in different source systems.
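A minimal sketch of that capture-first standard (the source names, keys, and naive matching rule are all invented for illustration): every record is keyed by (source system, source key) so nothing is thrown away, and consolidation runs as a separate, later step.

```python
# Capture first: the compound key (source_system, source_key) retains
# every version of the truth from every source system.
raw = [
    {"source": "SystemA", "key": "abc-123", "name": "Acme Corp"},
    {"source": "SystemD", "key": "efg-9",   "name": "ACME Corporation"},
    {"source": "SystemA", "key": "abc-777", "name": "Widgets Inc"},
]

captured = {(r["source"], r["key"]): r for r in raw}
print(len(captured))  # 3 -- all source records retained

# Deduplicate afterward: a separate matching rule (here, a deliberately
# naive normalized-name comparison) maps compound keys to master entities.
def normalize(name):
    return name.lower().replace("corporation", "corp")

master = {}
for (source, key), rec in captured.items():
    master.setdefault(normalize(rec["name"]), []).append((source, key))

print(len(master))  # 2 master entities behind 3 source records
```

The matching logic can evolve (or be rerun with better rules) without ever having lost the underlying source-keyed records, which is exactly the deferred decision the design standard buys.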

    A very powerful lesson for us that felt very counter-intuitive; that we started considering for the wrong reasons; and finally decided to follow through on for all the right reasons!

    Thursday, May 27, 2010

    Reading List

    Last week, I got to hear John Ladley present on Master Data Management.  I'm excited to get his new book, Enterprise Information Management, which publishes tomorrow!  I followed up with an email conversation with him and have suddenly increased my reading list by about 10 new books he suggested I pick up and read -- and I thought my shelf was pretty well stocked!

    On Tuesday, I walked into my office to find a present for me on my desk.  It was Business Intelligence, a newly published book by local UMSL professor Rajiv Sabherwal.  I know Rajiv through my membership on the UMSL IS Board of Advisors, and it turns out that his neighbor works with me, too.

    Speaking of books, I carry around a printed out draft of my PragProWriMo book from last November.  It's a good reminder about the importance of communication.  I originally had the goal of revising a couple of chapters and submitting them in January, but I've adjusted that to be revise during PragProWriMo'11 and submit after that.

    Friday, April 30, 2010

    Evolution of Business Communication

    Steps in corporate culture transformation:
    • Couriering around printed documents
    • Faxing documents to each other
    • Emailing documents as attachments
    • Emailing links to documents on SharePoint
    • Emailing links to enterprise wiki articles
    • Collaborating in the article and being notified of changes via email
    • Collaborating in the article and seeing RSS updates in the enterprise portal

    Also somewhere in there is the crazy scenario where I get emailed a PDF that was clearly created by someone who scanned a printed copy of a PowerPoint presentation.  I suppose that worse yet would be a PDF created by scanning a printed copy of a wiki article.

    Saturday, April 24, 2010

    Complexity = Agile Simplicity

    I'm a major advocate of traditional simplicity principles like KISS, YAGNI, DRY, the Unix Philosophy, etc.  I'm also very interested in complexity and solving large complex problems.  It occurred to me during an all day "fix this process problem" meeting today that the ideas of complexity and simplicity aren't actually antonyms.  Sure, by definition, they are, but I think the paradox in that comparison is that perhaps complexity can be defined as simplicity that changes over time.

    My perception of complex problems is that given the examples of any one particular moment in time, the situation can be easily dissected, analyzed, documented, and understood.  Take a sample from the next moment in time and the same is true.  Try to combine that collection of understandings into a generalization that can be applied to other past and future points in time and the problem suddenly becomes complex.

    One of the observations we get from agile development is that solutions will be more correct, given the flexibility to adjust to customer needs versus following a previously defined historical specification.  Agility is the ability to adjust to change.

    Therefore, complex problems should be solvable by solutions that are simple and agile.  Solutions do not have to be complex.

    We run into challenges designing agile solutions, though.  Many traditional solution design tools often call for static process flow diagrams, swim lane control charts, concrete data models, class diagrams, etc.  I think that some of the behavioral object oriented patterns give us some clues on how to introduce agility into solutions, but understanding when to apply those (and how to apply them within some technologies) takes creativity and experience. 
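As one sketch of how a behavioral pattern introduces that agility (a hypothetical order process -- none of this comes from a real system), the Strategy pattern isolates the part of a process that is expected to change behind a swappable function, so the stable flow around it never needs to be rewritten:

```python
# Illustrative only: the stable process stays simple while the volatile
# piece (pricing rules) is a small, swappable strategy.
from typing import Callable

def process_order(order: dict, pricing: Callable[[dict], float]) -> float:
    """The stable part of the process: validate, price, total."""
    assert order["qty"] > 0
    return pricing(order) * order["qty"]

# The part that changes over time lives in small interchangeable pieces.
standard_price = lambda order: order["unit_price"]
holiday_price = lambda order: order["unit_price"] * 0.9

order = {"qty": 3, "unit_price": 10.0}
print(process_order(order, standard_price))  # 30.0
print(process_order(order, holiday_price))   # 27.0
```

The simple core never changes; only the strategy does -- which is exactly the "simplicity that changes over time" idea.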

    I think that introducing that same kind of solution agility into human processes is also very challenging.  Often, we want clearly defined instructions and flow charts telling individuals exactly what to do, with the only variation being a Mad Libs-style fill-in-the-blank.  Perhaps we need more "and then a miracle occurs" steps in our complex processes.  And perhaps that's both acceptable and desirable in some processes.

    Sunday, April 18, 2010

    Digging Holes

    The following parable is adapted from one that I was forwarded by a friend this week...  It seemed a good analogy for a poor IT Service Management philosophy.


    Two IT directors changed jobs and were working for the city public works department.  One would dig a hole and the other would follow behind her and fill  the hole in. They worked up one side of the street, then down the other, then  moved on to the next street, working furiously all day without rest, one director digging a hole, the other director filling it in again.

    An onlooker was amazed at their hard work, but couldn't understand what they were doing. So he asked the hole digger, "I'm impressed by the effort you two are putting in to your work, but I don't get it -- why do you dig a hole, only to have your partner follow behind and fill it up again?"

    The hole digger wiped her brow and sighed, "Well, I suppose it probably looks odd because we're normally a three-person team. But today the guy who plants the trees called in sick."

    Sunday, April 11, 2010

    Personal Best



    Today's run...  The Go! St. Louis Half Marathon
    My first official race!

    Distance: 13.1 miles
    Time: 2:34
    Place: 7,520th

    It was awesome!

    Saturday, March 20, 2010

    What's the gap?

    At the office, we're working on a strategy around how we maintain and publish various types of technical information and instructions.  For instance, there's been a big emphasis around transition of system maintenance and support responsibility from project/implementation teams to support teams.  What kind of documentation is required?  Where should that documentation be stored?  What format?  Who should create it?  etc.

    One of the big cultural battles has been a solution of MS Office documents and MS SharePoint versus MediaWiki.  It'll be obvious, but just to lay it out up front, I stand firmly on the MediaWiki side of this discussion. 

    On the Office/Sharepoint side, you have an argument that "everyone knows how to use MS Office applications; cut and paste of screenshots is easy; you don't have to know how to program the wiki."

    On the MediaWiki side, you have an argument that "SharePoint organization doesn't make any sense; searching across different sites is awkward; you always have to open a separate document in a separate application to see what you really want to see; and it just doesn't feel webby enough."

    This argument from the SharePoint side that you have to program the wiki is the one their leadership is most adamant about.  "Our analysts aren't programmers," they say.  We're using a WYSIWYG editor!  Is a little wikitext markup really programming?  I think the ability to pick up a little wikitext is just like the ability to learn how to use formulas in MS Excel... and if you can't write a simple SUM() or =A1+B1 in MS Excel, then you don't have any business being in an IS job... or really in any business support job, for that matter.

    Perhaps that's a strong statement, but I think that any IS person should feel comfortable picking up a little HTML or wikitext markup.  I often hold up my wife, an English major / office worker / writer, as an example of "if she can do it, then an IS person should be able to do it!"  But perhaps I'm looking at the wrong set of characteristics. 

    Maybe the gap is a generational/cultural one rather than an educational/cultural one.  There's probably a more articulate way to describe that, but what I'm getting at is that my wife is also comfortable with blogging, Facebook, and the online / social community in general.  I wonder what percentage of people in the SharePoint camp are regular contributors to blogs, Facebook, Twitter, or other social networks?

    What is the gap between someone who thinks the business world is a collection of MS Word documents and someone who thinks the world is a more directly accessible, public, and transparent collection of content?  And how do we get people across that gap?

    Monday, February 8, 2010

    Living on the Edge

    I've been working for a while on explaining the value and importance of Enterprise Information Management.  While I usually get some good ideas from Wikipedia, the article there today has let me down.  As referenced in the Wikipedia article, Gartner and Forrester have some valuable things to say on the topic.  But I believe that it can really be boiled down to a simple and practical explanation, given one assumption: in a business organization with optimal management of information, no individual business unit will appear to be fully optimized even though the overall organization is optimized.  Here's why:

    Imagine the information flow through the business units of an organization as a connected network of nodes.  Each edge in the graph represents some type of process (automated or manual) that moves information from one department to another.  The Information Supply Chain model describes how each of those edges has a cost associated with it.  Each edge, to be worthwhile, must also create some added value (either through reduced effort via automation or added meaning).  The cost and benefit sides of those edges, though, don't typically come from or contribute to the same departmental bottom line.  Typically, it is a matter of the originating business unit paying the cost of additional data collection or manipulation so that a receiving business unit can benefit.
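That cost/benefit asymmetry can be sketched as a tiny model (the departments and numbers are invented for illustration, not taken from any real analysis):

```python
# Illustrative sketch: each edge moves information between departments,
# with the cost borne by the sender and the value realized by the receiver.

edges = [
    # (from_dept, to_dept, cost_to_sender, value_to_receiver)
    ("Admitting", "Surgery", 3, 10),
    ("Admitting", "Billing", 2, 8),
    ("Surgery",   "Billing", 4, 9),
]

def department_net(dept):
    """A department's local view: value it receives minus cost it pays out."""
    paid = sum(c for f, t, c, v in edges if f == dept)
    gained = sum(v for f, t, c, v in edges if t == dept)
    return gained - paid

def enterprise_net():
    """The enterprise view: total value created across every edge."""
    return sum(v - c for f, t, c, v in edges)

# Admitting looks "sub-optimal" locally (it only pays), yet the whole
# organization comes out well ahead.
print(department_net("Admitting"))  # -5
print(enterprise_net())             # 18
```

This is the one-assumption argument in miniature: optimize each node's bottom line independently and Admitting would stop paying for those edges, destroying the enterprise's net gain.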


    In the VERY simple diagram above, the argument is obvious.  The Admitting department in a hospital has the purpose of collecting information from a patient that other departments will need in order to do their job effectively.  Admitting collects patient information (e.g. contact information, primary care physician, insurance information) once so that other departments can all benefit from it.  Surgery uses the information collected during Admitting to retrieve the patient's medical record and orders.  Billing uses the same patient information plus the additional information about what procedures were performed by Surgery to create invoices to payers (who will use similar information to try to avoid paying the bills).

    It would be inefficient if the Surgery department had to collect from you the information it needs to find your medical record and orders, then have your surgical procedure followed immediately by a visit from the billing department to collect the same information about you so that they could proceed with coding and billing processes.  This clearly does happen sometimes: occasionally with good cause, but often because of redundancies between systems, and sometimes because no one thinks to collect a piece of information up front.  For instance, Admitting may not have any reason to ask "do you have any allergies?" because that isn't necessary to complete their assignment of "log that the patient arrived and notify surgery."  So, Surgery has to ask the additional questions that are important to it: "Have you eaten?"  "Do you have any allergies?"  With some of those questions, significant time and safety risks can be avoided if they are asked as early in the encounter as possible.

    So, it seems to me that Enterprise Information Management is most importantly about the management of the edges of that graph and has a direct impact on the efficiency of the Information Supply Chain.  Those edges between the nodes of the diagram aren't merely straight lines that magically move information from one business process to another.  They are interfaces and systems and business processes that cost significant time, money, and risk to quality.

    Why can't we expect each business unit to simply do what will result in an optimal collection and movement of information?  Not because they're maliciously selfish about their time or resources, but because individual business units don't usually have the perspective to fully understand the downstream value of their own operations.  Enterprise Information Management is the work that lives between and outside of individual business units and drives overall optimization of the edges between them.  Business units can be counted on to optimize their own internal operations; Enterprise Information Management has to be planned and managed explicitly, above and beyond departmental objectives.

    Thursday, February 4, 2010

    How Not to Clean Data

    PREFACE: Even if you don't have some familiarity with the inside of a hard drive, you'll probably still be troubled by this story from my past.  To answer the obvious questions that you'll have after the story: Yes, I really do have a legitimate degree in Electrical Engineering.  No, don't worry, I've never been employed in a way that uses that degree in a significant way to design or build any products that you might own.


    I once had an old hard drive that started making a bit of a racket.  It didn't stop working right away, but I was concerned that there was something wrong with it.  I thought that maybe I could figure out what was wrong if I opened it up and poked around inside.  At that point I'd never seen the inside of a hard drive in real life.  So, I invested in my first set of star-point screwdrivers and carefully disassembled the case of the hard drive.  Even after the screws were out, the metal cover stuck a bit.  It seemed like there was some kind of seal holding it closed, so I used a flat-head screwdriver to pry it open.

    Wow.  Shiny.  Really clean.

    I plugged the hard drive back in, while both the computer case and the hard drive case were open, and booted up the computer.  Cool.  It spins!  I watched it spin up, and the computer boot.  Everything working great.  That little moving arm is really neat, too.  It bounces back and forth really quickly!  So, I ran some programs on the computer and started listening for the noises that I thought were signifying an imminent disaster.  The hard drive just sounded rough.  Something like ball bearings worn down or the spindle just getting sticky.  Logically, I got out my WD-40, with the little red straw to make sure I could target the center of the spindle.

    Squirt.... Squirt... drip, drip.

    Well, let's see if this works for a while.  Maybe that was enough to quiet the drive down.

    Things working fine.  Then the head did a seek and ran right through a drip of WD-40 and smeared across the platter.  Is that bad?  Then the actuator arm started thrashing back and forth, clicking hard against the center of the spindle and back against the outer wall of the case.  Clunk.  Clunk.  CLUNK.  Whirrrrr..rr...r....  Quiet.  Computer locked up.  Hard drive stopped.

    Uh oh...

    Maybe if I clean that WD-40 off of the platter it will work again?


    So, I got out my trusty Goo Gone and a soft rag to remove the extra drips of WD-40 that were now smeared across the top platter of the hard drive.  Rub, rub.  Wipe.  Rub, rub.  Polish.  That looks pretty good.  Let's spin it back up and see.  W....h...i.rrrrrrrrrrr.  OK, that sounds pretty.... CLUNK.  Clunk.  Clunk.  Clunk.  Unplug the computer.

    I worked on this for a couple of hours.  I used more Goo Gone.  I used alcohol -- both the rubbing kind to clean the platter and the drinking kind to calm my frustration.  In the end, I was able to get the drive spun up long enough to retrieve some files.  This was still an age when most of my working documents were on floppy disks, because I needed to carry those between different computers.  So, luckily, there was no important data lost.

    I've been thinking a lot lately about how "broken" business processes impact data quality and data integrity -- thinking about the ways we look at trying to keep the data inside those disks clean and running smoothly.  Sometimes we look at things from a perspective that is too distant, with a too limited understanding of the context of the processes that we're examining, and act too quickly and too inexpertly without taking time to understand the nuances of the systems and business processes involved.  We do things that we think will help (implement governance processes and quality screens) and end up sending the system into a tailspin.  Things do recover from that dive, but not without a major investment in time and energy.

    Wednesday, February 3, 2010

    New Magic Quadrant for DW DBMS

    Gartner released its new Magic Quadrant for Data Warehouse DBMS platforms.  I tend to think that Gartner does a reasonably good job with its Magic Quadrant results.



    Here's my personal summary.
    • Oracle holding on to 2nd place for enterprise data warehousing with its "reference configurations," Exadata product, and 11g upgrade as prominent strengths; downside being claims of higher DBA FTE cost than some other DW platforms.
    • Teradata remaining the clear leader and probably expanding their market presence with newer pricing model and platform options.
    • Microsoft lagging far behind the other big vendors while it tries to integrate the technology it acquired with DATAllegro in 2008.
    • My favorite aspiring open source vendor, InfoBright, sitting in the middle of the niche quadrant as a brand new entrant.

    Full disclosure: I don't own stock or have a personal stake in Gartner or any of these companies.  I've personally developed or managed large data warehouses on DB2, Teradata, and Oracle.

    Tuesday, January 26, 2010

    Simplifying != Lying


    The Unified Modeling Language defines a lot of different mechanisms or views for documenting the structure and behavior of a system.  As you move your way through them, they provide greater levels of specificity or higher levels of abstraction.  I worked on an effort several years ago that employed a set of UML-like standard views to describe a set of key applications.  In that effort, the set of views were labeled Contextual, Conceptual, Component, and Deployment.

    The Component and Deployment views came straight from the UML standard.  The Conceptual view followed somewhat from this Wikipedia definition, but extended beyond the entity-relationship domain to include processes in something of a data flow.  The Contextual view was used to describe the surrounding business processes, actors, and external systems that interacted with the application or system being described.

    It was a challenging exercise to work through the application systems that I was responsible for and document them appropriately, without overdoing the documentation.  Personally, I found the exercise very satisfying.  It helped me prove out my understanding of the systems I worked with.  Often, I'd start with a middle-layer view, like the Conceptual or Component view, then work my way either down or up, iterating back and forth a few times to make sure the depictions remained consistent, honest, and valuable.
    As a side note, that sort of iterative process of working non-linearly, both up and down between different levels of abstraction reminds me a lot of the YouTube videos I watched a while back when learning how to use an abacus.  If you aren't familiar with how to use an abacus for simple math, take a look.  If you're trained in traditional Western math, it'll be an eye-opener.

     When introducing the idea of these four views to subsequent teams, one of the biggest surprises has been the level of dishonesty that tends to appear in the higher levels of abstraction.  I've also found it hard to explain what it means to be dishonest about an abstract view of something. 

    For instance, in a Conceptual view, I've seen people draw a box that is intended to represent one conceptual system when the box actually comprises two or more entirely independent processes.  For example, by this definition, it would be dishonest to diagram a single box labeled Item Master that has arrows sending data to separate operational systems if there is, in fact, no system or business process that actually manages a conceptually unified source for the item information in those systems.  Not that there has to be some physical thing or even one single source of truth, but if the item information is built separately in those two systems with no regard for the other system, then it would be disingenuous to suggest there's even a conceptual idea of a single Item Master.  That's not to say that there should or shouldn't be -- but to document a conceptual view of that in the current state would be dishonest.


    I've seen this happen enough times that I believe there's some common confusion about what it means to build abstract models.  Simplifying how we look at something, using abstraction, should never result in a lie about how things actually work.

    It's a lot like scientific mathematics 101.  Remember the lessons about significant digits and the difference between precision and accuracy?  Precision defines the degree of specificity and detail to which the answer is given.  A component diagram is more specific about the actual construction and inner workings of a system than is a conceptual diagram.  In order for them to be correct, both must be highly accurate depictions, though.  Losing accuracy during abstraction will only get you in trouble.
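As a toy numeric illustration of that distinction (all values invented): abstraction may trade away precision, but it must never trade away accuracy.

```python
# Toy numbers only: precision is the level of detail; accuracy is
# closeness to the truth.
true_value = 10.02          # what the system actually does

component_view = 10.02      # detailed and correct (like a component diagram)
conceptual_view = 10.0      # less detailed, still correct (an honest abstraction)
dishonest_view = 12.3456    # very detailed-looking, but wrong (a lie)

def error(view):
    """Distance from reality: the thing abstraction must not inflate."""
    return abs(view - true_value)
```

The conceptual view drops digits, not truth; the dishonest view keeps its digits and loses the truth entirely.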

    Friday, January 22, 2010

    Slow Processing


    When I was in grade school and junior high school, I was considered to be pretty good at math.  I went to math competitions all across the state and won 1st place more often than I didn't.  In most of the competitions, time didn't matter to the score.  It was a timed test rather than a race.  In my head, though, I needed to be one of the first to finish.  Being right was fine.  Being right and first?  That was really winning!  As you'd expect, that got me in trouble every once in a while.  I'd get sloppy on a couple of questions and not win 1st place.  Going back over the tests, I'd kick myself for not simply rereading the question a second time through.

    My mother was the one who kept coaching me to slow down.  Read the question a couple of times.  Check your work.  I don't know if I've ever really learned the lesson to slow down and double check my work as much as I should.  If you've been reading many of my blog posts, I'm sure you've caught a plethora of typos.  Lucky for me, Firefox usually catches my misspellings.

    Same lesson applies to how we process data for data warehouses.  Consider a somewhat typical batch ETL process:
    • Process input files:
      • Surrogate key lookups
      • Field level transformations
      • Calculations
      • Mapping to target tables
    • Load files into staging tables.
    • Apply updates and inserts.
    Then, after the loads are complete, run a series of validation routines to ensure that all of the data that was received was actually applied to the target tables.  Total the input by some grouping; total the output by some grouping; and compare the results.  If things don't match, then send someone off to start digging through the data.

    Certainly, that load process was very efficient.  Doing all those block-level operations and applying large chunks of updates all at once, rather than doing one-record-at-a-time processing, gets the data loaded to the warehouse much more quickly.  Bulk operations, reducing some of the transaction management overhead and improving I/O throughput, definitely do that.  Don't forget, though, that in order to feel truly confident that the processes completed successfully, you have to read all of that data a second time to balance out the totals.

    A slower alternative is to process the data more sequentially and validate that a single row makes it to each intermediate staging area or directly into the target table.  Create a process that watches every row get transformed and loaded, and reports on exactly which records failed to load rather than merely telling you that the final insert/update somehow came out 10 rows short of what was expected.

    So, it seems to me the options are these:
    • "Fast" - load in bulk very quickly; do batch validation; if totals don't match, kick out for research and troubleshooting.
    • "Slow" - process each row more methodically and kick out exactly which rows fail any piece of the process.
    There are complexities to this choice, clearly, and they depend on things such as the level of complexity in the ETL, the size of the batches, the available windows for processing, system usability during processing, etc.  The most important "slow" lesson is to examine the situation you're in and make a rational decision about how to process data and validate that it is loaded correctly.  Don't make snap assumptions about the right way to ensure data integrity for any particular process.
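The trade-off might be sketched like this (the row shape and validation rule are invented for illustration; real ETL tools differ):

```python
# Hypothetical sketch of the two styles. A row with a missing amount
# stands in for any record that fails transformation or load.

rows = [
    {"id": 1, "amount": 100},
    {"id": 2, "amount": None},   # the bad row
    {"id": 3, "amount": 250},
]

def load_fast(batch):
    """Bulk style: load what survives, then balance totals afterwards."""
    loaded = [r for r in batch if r["amount"] is not None]
    balanced = len(loaded) == len(batch)
    return loaded, balanced      # a mismatch says *that* rows fell out, not *which*

def load_slow(batch):
    """Row-at-a-time style: watch each row and name exactly which ones failed."""
    loaded, rejects = [], []
    for r in batch:
        (loaded if r["amount"] is not None else rejects).append(r)
    return loaded, rejects
```

With the fast style, the totals simply fail to balance and someone goes digging; with the slow style, the reject list already names the exact failing row.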

    Wednesday, January 20, 2010

    BI for MBA students

    I've got the opportunity in 6 weeks to guest lecture for one evening to an MBA class about Business Intelligence systems.  The course is a general Information Systems class that the MBAs are required to take, so I'd like to make sure the conversation does an effective job engaging them and helping them understand the role systems and technology play in supporting decision making.  Any thoughts or advice on important things to make sure I cover in 2 hours?

    Post a comment or tweet @paulboal.

    Tuesday, January 19, 2010

    Stand Slow

    As the title of this blog shows, I'm a big R.E.M. fan.  They're great performers, and I've always loved their lyrics.  Thinking about what it means to me to slow down, the lyrics to "Stand" seemed appropriate.
    Stand in the place where you live
    Think about direction
    Wonder why you haven't before
    Now stand in the place where you work
    Think about the place where you live
    Wonder why you haven't before


    If you are confused check with the sun
    Carry a compass to help you along
    Your feet are going to be on the ground
    Your head is there to move you around


    If wishes were trees the trees would be falling
    Listen to reason
    Season is calling


    Stand in the place where you are
    Stand in the place where you are
    Your feet are going to be on the ground
    Your head is there to move you around, so stand.
    I want to highlight a few of the lyrics in the context of what it takes to do effective technical analysis and design activities.


    "Stand in the place where you live"
    Start the work at hand by examining it from an internal perspective.  Look at the data, process, or system from the context of someone who really knows it well.  Learn from them some of the intricacies of how it's put together and how it behaves.  These are the people who really live with the system day in and day out.

    "Stand in the place where you work.
    Think about the place where you live."
    Now look at the data, process, or system from the perspective of someone who is a user or consumer of it.  Get their perspective on how things look from outside the day-to-day life of supporting and maintaining the thing being analyzed.  The important thing is to get outside of your own skin and try to look at things from alternate perspectives.

    "If you are confused, check with the sun.
    Carry a compass to help you along."
    Remember that the title of this song (and every third word) is "stand."  We need a vision and guiding principles in order to know that, as we dig deeper and deeper into the weeds of a project, we don't lose sight of what the driving purpose is.  Remember, when you get down into the weeds, it may seem easier to follow the goat path that someone else already pushed down rather than following the sun (vision) and your compass (guiding principles).

    "Listen to reason; reason is calling."
    As Jim Harris (ocdqblog) reminded us at the beginning of the year with "Shut Your Mouth," there's high value to be found in learning to listen.  It can be difficult to know how long to listen before starting to draw conclusions and respond.  It's good advice to listen even more closely when things seem to sound ridiculous.  For example, I was once in a conversation that described how a payroll process worked.  I was flabbergasted by the amount of special logic that went into the business process.  It seemed ridiculous.  Then, I listened harder and further into the conversation (and asked some open-ended questions) to really understand.  The more I understood about what I thought was a ridiculously complex process, the more reasonable it turned out to be.

    "Your head is there to move you around."
    Key emphasis on head.  You should make decisions based on the thoughts in your head, not on gut feelings running around in circles.

    Thanks to Stipe and company for the song!

    Monday, January 18, 2010

    Slow and Agile

    I first read the Agile Manifesto nearly 8 years ago.  Anyone familiar with agile principles understands that one of the underlying goals is to produce better solutions, faster, with less wasted energy along the way.  Anyone familiar with introducing agile practices to an existing team knows that agile works best with strong individuals.  Read the manifesto with that in mind (their emphasis, not mine):
    Individuals and interactions over processes and tools
    Working software over comprehensive documentation
    Customer collaboration over contract negotiation
    Responding to change over following a plan
    I want to be clear - I am a strong supporter of the agile philosophy.  For those of you out there who, like me, would rather spend your time at work interacting with a computer than with a human being, I want to point out that agile relies heavily on human-to-human interaction and a high degree of trust in team members and stakeholders.  Key words: individuals, interaction, collaboration.


    This is where being slow becomes so important.  Clear communication requires both the sender and receiver of the communication to be very intentional.  "Need the system to do X."  "Got it!" is not an effective interaction.  Both parties have to slow down the pace of their interaction in order to be intentional about what they're communicating and make the practices effective.



    Speakers:
    • Studies show that we have to repeat ourselves in order for effective comprehension to take place.  Tell the listener what you're going to say; say what you're saying; tell the listener what you just said.
    • Look for comprehension, both verbal and non-verbal: head nodding, eye engagement, note taking, clarifying questions.
    • Provide opportunities for questions.  Some listeners may not be great at asking questions up front, so give them ample time.  Pause after significant points or topics that may be unclear.
    • Validate the receiver's understanding.  Ask for comprehension about specific points, not just a general "does everything I said make sense."
    • Listen and watch for when the receiver flinches, raises an eyebrow, takes extra notes, or tries to break into the conversation.  What you think is already clear is the most likely place for miscommunication with someone unfamiliar with the topic.


    Listeners:
    • Listen actively.
    • Confirm the assumptions that you have about the topic going into the conversation.  You and the speaker may be starting from different places.
    • Validate assumptions you have with what the speaker is telling you.
    • If something the speaker says doesn't sound right, that's a clear sign that there's some misunderstanding going on.  Either one of you is actually wrong in the information (in which case, facts need to be validated and clarified for both parties) or the communication is not working effectively.  In either case, ask questions and clearly state your assumptions in order to clarify.
    • Repeat what you're hearing back to the speaker in your own words.  Become the speaker and look for their honest confirmation that you've understood them.
    • State any conclusions that you've drawn from the conversation.  They're likely based on both the information in the conversation as well as other unintentional assumptions.
    • Confirm the action items, changes, or impacts of the conversation.  Describe what it is that you think this conversation means to the project; and ask the speaker that same question.
    Agile is pro-change (so to speak) in that it doesn't fear changing needs.  I believe that the message in that point of the manifesto is to encourage teams not to fear change and to be willing to build what is needed rather than what is written in outdated specification documents.  The key there is that the team is working to build what is needed.  Whether we're building from technical specs or from personal interaction, effective communication and collaboration are required.

    So, make sure you take communication slowly and intentionally.  Emphasize communication performance, not just responsiveness.[1]



    [1] To add a little more detail: I think the Wikipedia entry for "responsiveness" makes a great point about the difference between performance and responsiveness.  In that example, the point made is that for usability, it makes sense to put the mouse driver at a higher priority than background processing so that the user experience is more pleasant.  In an interview, though, the background activity of comprehension needs to have plenty of processing time.  The verbal back-and-forth of the conversation can take a decreased priority to ensure that good understanding is happening.

    Friday, January 15, 2010

    The Power of Slow

    I spent part of my work day today doing mid-year reviews with my staff.  For us, it's merely a mile marker in the year, not any large event.  I meet with each member of my staff on a weekly basis formally as well as in ad hoc drive-bys throughout the week.  It works well to maintain that consistent and continuous communication.


    During one of these 1-on-1 meetings today, one of my architects was voicing some frustration about the attitude a few project leads were taking with him.  They were pushing the team to just hurry up and build something.  Whatever they could do quickly that worked.  He was saying that he had a hard time describing to these project leads how a touch more patience, and even a different attitude about the definition of progress, would help them see the risk in their approach.

    I mentioned to him that I've been slowly reading Carl Honoré's The Power of Slow since last summer.  Only in that moment did I realize there was a way of expressing "slow" to addictively "fast" people.  Slow isn't about taking more time to get things done.  Slow is about the internal pace at which things are done.  From a software architecture perspective, slow doesn't mean spending months in design before anything is built; it means taking the time, as you write each class and method, to think through how best to write that particular class or method.  It isn't about wastefully over-engineering the solution to be something that will "be scalable into the future for unknown other uses" while the immediate problem is left unsolved.  It's about being methodical and intentional in each action we take.  That only creates the appearance of being slower; in fact, the process becomes more efficient, with less waste moving up and down the rate-of-progress curve.




    The "Slow" line represents a steady and consistent stream of work.
    You can think of the dips in the "Fast" line as any number of things:
    • Hurry up, then wait
    • Work fast, then go back to fix your mistakes
    To reach the 100% mark at the end of the chart, both approaches may get there in the same amount of time, but the "Fast" approach consumed a lot more effort, effort being represented by the length of the line itself.
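    As an illustrative sketch (not from the original post), the "effort is the length of the line" idea can be made concrete by sampling each progress curve as a series of (time, percent-complete) points and summing the straight-line segment lengths.  The curve shapes below are assumed purely for illustration: both start at 0% and end at 100% at the same time, but the "Fast" curve lurches ahead and falls back along the way.

```python
import math

def arc_length(points):
    """Approximate curve length by summing straight-line segment lengths."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )

# Hypothetical progress curves: percent complete sampled at each time step.
slow = [(t, 10.0 * t) for t in range(11)]   # steady, consistent progress

fast = []                                    # overshoot, then rework
for t in range(11):
    pct = 10.0 * t
    if 0 < t < 10:
        pct += 8.0 if t % 2 else -8.0        # lurch ahead, fall back
    fast.append((t, pct))

print(f"slow effort: {arc_length(slow):.1f}")  # → slow effort: 100.5
print(f"fast effort: {arc_length(fast):.1f}")  # → fast effort: 148.7
```

    Both curves reach 100% at time 10, but the oscillating "Fast" path traces roughly half again as much line, which is the extra effort the chart is describing.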


    One of the fears that "fast" people have, especially "fast" managers, is that "slow" implies developing ultra-sophisticated solutions, most of which will never actually be used; over-engineering; or architecture creep.  For some "slow" people it probably does, but the right kind of "slow" is merely a thoughtful, methodical implementation that gives everyone enough time to make sure they're doing the next step in the project correctly.  It may feel slow, but that's only because it's efficient.


    Ironically, we see this in the world around us all the time.
    • When starting your car from a stop on slick snow or ice, you have to accelerate slowly or the wheels will merely spin.  In both cases, you'll move forward some, but when accelerating too quickly, you'll burn a lot more fuel and put more wear on your tires.
    • In cartoons, we see the antagonist rushing around in a hurry from place to place, while the protagonist slowly and thoughtfully moves through the chaos and reaches the goal first.
    • In action/comedy movies, we see a brash martial artist execute a flurry of activity in a dramatic attack, only to be stopped by a few small simple motions of the hero.
    So many people incorrectly think that looking productive is the same as being productive.

    In my own writing, my wife points out to me that there is power in brevity.  Sentences should say all that they need to say and no more.  I tend to add too many unnecessary adjectives, flourishes, and sentence complexities.  More words often don't add more meaning.

    Tuesday, January 12, 2010

    The Role of Health Informatics

    An interesting question that's always puzzled me is the differentiation between the terms "informatics" and "information management."  In my admittedly limited experience, informatics is used primarily in scientific and medical fields, as in "health informatics."  Information management is a more general business term.  Why the difference?

    For one thing, I suppose, the etymology of "informatics" explains part of it.  The "-ics" ending means "the science of."  So "informatics" is the science of information rather than the management of information.

    The Wikipedia article defines "health informatics" as having these key aspects:
    • Architecture for electronic medical records and other health information systems
    • Clinical and other health related decision support systems
    • Standards for integration
    • Standards for terminology and vocabulary

    Food for thought - In the health care industry, does it make sense to co-locate business roles like master data management, data quality, information integration, and business intelligence with the traditional informatics departments; rather than building them within business management units or IT/IS units?  Does asking health informatics to work with other non-clinical business functions somehow risk or distract from patient care responsibilities?

    (As a side note for those across the pond, it turns out that "business informatics" is a European term, closely related to "information systems" study, but with some subjective differences pointed out in the Wikipedia article.  It is not a term commonly used in the U.S.)

    Friday, January 8, 2010

    Organizational Optimization


    I'm not an expert in business organization design or organizational optimization, though I have been an active decision-maker in a few organization redesigns and layoffs (the latter, I'm sad to say).  It seems to me that the rules of what a business looks like, organizationally, could follow some of the same basic guidelines that enterprise architects and software architects use for the design of a large system of applications.

    Clarity of Purpose
    A business unit must have a well defined purpose.  It should be clear to the rest of the organization what the function of the business unit is, so that other business units know when to engage with it, what information it has/creates/uses, and what purpose it serves.

    Well Defined Interface
    A business unit must make clear to the larger organization how to interact with it and what expectations the organization may have with regard to service completion time or delivery schedule.


    Value
    A business unit must be able to readily respond to questions about its value to the bottom-line performance and goals of the larger organization.  If its value is not sufficient to outweigh alternatives, then it is irresponsible to continue to use it.  The lower-cost alternatives should be used.

    Reliability
    A business unit should either be self-sufficient enough to achieve its purpose alone, have plentiful access to external services such that they are unlikely to be a limiting reagent, or have sufficient influence over the priorities of external services such that they will not hinder progress.  If the business unit is unable to make those conditions occur, then it will suffer from inefficiencies and risk not being able to achieve its goals.

    I think that's very consistent with the system/software engineering principles laid out in the Unix philosophy.
    • Do one thing and do it well.
    • Write programs that work well together.
    • Data dominates. If you have chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.
    • Fold knowledge into data.
    • Design for simplicity.  Add complexity only when you must.
    The time-value curve could also be used to help shape organizational design principles.
    • Organizational structure should minimize the amount of time between an event and the decision making and action that will result from that event (opportunity is lost in the interim).

    Tuesday, January 5, 2010

    Canned Solutions Canned

    Or: 
    Why we need more data professionals employed with health care providers rather than health care consultancies.



    I'm a member of a health care provider industry association that shares information across organizations about our experiences delivering business intelligence and data warehouse solutions.  Recently, one of the members of that organization posted a forum question about several packaged industry-specific data warehouse solutions that they'd had pitched to them.  Here was my response.  I thought it was worth sharing here as well.

    My personal experience (as a BI/DW consultant and professional for the past 10 years) is that prebuilt data warehouse solutions for any industry come in three flavors:
    • Logical-only models that come with an implementation "project."  The physical implementation will be built to suit your organization and usually uses language and terminology that your organization is comfortable with.  These are full-blown projects.  Though some ideas come in the box, very little runnable code is ever included inside the box.  That takes the time and money of professional services.
    • Canned data warehouses / data marts that come as a full-blown, predefined solution.  These might seem like a good idea because so much is already built for you.  In my experience, though, they return only a fraction of the business value they seem to promise.  The model in the canned solution doesn't typically match your own business model closely enough to be as valuable as it appears during the sales meetings.  Another challenge with these solutions is the integration of data into them.  They're either so high-level that they don't have a lot of value, or they're so different from your own organization that the mapping of data into their model is overly complicated.
    • Marketing hype - in some cases, the solutions really are just a marketing message.  In reality, the vendor is merely saying "We have some people from health care and we have some BI skills.  We've put them together on a team for you."


    That's not to say vendor solutions aren't a reasonable way to go.  When you talk with them, be very specific in your questions and very critical.  Ask to talk with other client references that are similar in size, systems, and market to you.  Ask for a proof of concept that implements some small portion of the solution free of charge.

    I hope I don't sound too harsh on vendor solutions in that response.  I don't believe that vendors intentionally oversell their solutions.  In some cases, they're simply unaware of what it takes to really build, deploy, and, most often, run, maintain, and enhance a data warehouse for several years after the initial implementation.  This was one of my personal realizations as I transitioned from consulting into a corporate job.  Other vendors may simply be putting the cart before the horse, using consulting opportunities to build up their solution offering.  No harm in that as long as they're upfront and honest about it.

    On December 30, 2009, HHS presented the much-anticipated details of the definition of meaningful use of electronic health records.  This definition provides more information about the level of functionality and use health care organizations must achieve by certain deadlines in order to receive certain types of government assistance and maintain the highest levels of Medicare reimbursement.  (I have not read the details.)  Soon thereafter, @theEHRguy tweeted a couple of great predictions:



    I think it would be better if we paid those of us who actually work for health care organizations $450/hr.  Unlikely.

    Still, I think he makes a very good point about consulting services, and I'd extend that to certain types of packaged vendor solutions by nature of the services they usually imply.  The high demand in the area of health care systems right now is going to be shortly followed by a high demand for BI/DW solutions in the same space.  In the rush, I think the industry risks implementing poorly designed packaged vendor solutions that will result in a lot of wasted time, money, and effort, and in lost opportunities for growth and optimization.  Time will tell, but I think it's a long road ahead.