Friday, August 27, 2010

Your Job? My Job? Our Job!

I've been trying to figure out agile data warehousing for several years now.  I'm a computer scientist by training and a programmer by hobby, so I've always kept my eye on trends in traditional software development.  What I tell myself, professionally, is that it helps me have alternative perspectives on BI solutions.  (It's really just 'cause I like programming, even if what I hack together isn't typically that elegant.)

Several years ago, I was introduced to one of the founders of the St. Louis XP (Extreme Programming) group, Brian Button, and decided to sit down and have lunch with him.  I explained what kind of work we do to build data warehouses, and he listened very politely.  At the time, he was thinking mostly about test driven development and pair programming.  One of the things he asked me was "can you get your data modeler, ETL developer, and report developer all in a room together working on the same components all at once?"  It occurred to me, then, that separation of development responsibilities might be a serious impediment to agile BI development.

As a former consultant, I've personally done a little bit of everything.  I can data model well enough to build something workable; I've spent a lot of time writing ETL both by hand and with GUI tools; I'm a raw SQL hacker for sure; and I can even create a reasonable report or two that some VP wouldn't just use as a coaster.  How often have I ever asked my staff to do that breadth of work, though?  In larger organizations, I usually see and hear about separate teams: data modelers, ETL developers, reporting people.  That separation has always been under the guise of "developing technical expertise" within the team and driving consistency across projects.  (Important goals for sure.)

However, when I look at successful agile software teams that I know about, that same level of separation isn't typically present.  A single developer might do part of the UI, some of the dependent service modules, and the persistence layer.  They're focused on delivering a particular function: not some internal component of the overall application, but an external function of the application.  This goes back to the previous conversation about sashimi, too [1] [2].

Of course there are some developers who are better at UI and others better at ORM, just like there are some BI folks better at data modeling and others better at data presentation.  Enabling more agile development, though, requires developers who are more willing and able to cross over those traditional boundaries in the work that they do.  One of the leaders I work with today articulates this very well when he says that we just want "developers."  What this does for agile is minimize context switching and the spin between different developers working on the interrelated but different pieces of the same component.  If an ETL developer decides she needs five extra fields on a table because she just learned that some previous assumption about the cardinality of a relationship was flawed, should that change require the synchronous work of:

Step  Modeler                ETL Developer        Report Developer
1     Changes data model     wait/other work      wait/other work
2     Deploys changes to DB  wait/other work      wait/other work
      (hand off)
3     wait/other work        Import new metadata  wait/other work
4     wait/other work        Update ETL jobs      wait/other work
5     wait/other work        Unit test ETL        wait/other work
      (hand off)
6     wait/other work        wait/other work      Run test queries
7     wait/other work        wait/other work      Update semantic layer
8     wait/other work        wait/other work      Update reports
N     And loop through for every change

There's a lot of opportunity for optimization there if one person is working on the tasks instead of several people. For about 66% of the team's time (16 of the 24 person-steps above), they're working on something other than this one objective, and there's latency in the hand off between developers. (If you can create a parallel scheduling algorithm that gets the overall workload done faster, all else being equal, than "one resource dedicated to completing all the steps in implementing each particular piece of functionality" and "helping each other out when there's nothing in the queue", please let me and every computer science professor on earth know.)
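
For anyone who wants to check my arithmetic, here's a rough sketch of where the two-thirds figure comes from.  It assumes every step in the table takes the same amount of time and that hand offs cost nothing, neither of which is true in real life (hand offs only make it worse).  The names STEPS, ROLES, idle_fraction, and idle_by_role are just ones I made up for the illustration.

```python
# Each entry is (task, role doing the work), taken from the table above.
# Assumption: every step takes one equal unit of time, hand offs are free.
STEPS = [
    ("Changes data model", "Modeler"),
    ("Deploys changes to DB", "Modeler"),
    ("Import new metadata", "ETL Developer"),
    ("Update ETL jobs", "ETL Developer"),
    ("Unit test ETL", "ETL Developer"),
    ("Run test queries", "Report Developer"),
    ("Update semantic layer", "Report Developer"),
    ("Update reports", "Report Developer"),
]
ROLES = ["Modeler", "ETL Developer", "Report Developer"]


def idle_fraction(steps, roles):
    """Share of all person-steps spent on 'wait/other work'."""
    total_slots = len(steps) * len(roles)
    busy_slots = len(steps)  # exactly one role is working in each step
    return (total_slots - busy_slots) / total_slots


def idle_by_role(steps, roles):
    """Share of the steps each role spends on 'wait/other work'."""
    return {
        role: sum(1 for _, owner in steps if owner != role) / len(steps)
        for role in roles
    }


if __name__ == "__main__":
    print(f"Team idle on this change: {idle_fraction(STEPS, ROLES):.1%}")
    for role, share in idle_by_role(STEPS, ROLES).items():
        print(f"  {role}: {share:.1%}")
```

Under those assumptions the modeler is off doing something else for six of the eight steps, and the ETL and report developers for five of the eight each, which works out to 16 of 24 person-steps across the team.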

I think that for some teams this will be a challenge to their skill set and require developers to grow beyond their existing comfort zone.  I'll argue that they'll be the better for it!  For some teams it might be more of a challenge to ego than to actual skills: "you're going to let anyone data model?!"

The answer is "yes" and "we require an appropriate level of quality from everyone."  That's why agile teams are ones with pair programming, peer reviews, and an approach that not just accepts but welcomes change.

For a team that isn't agile today, these things can't come piecemeal.  If you want to be agile, you have to go all-in.

1 comment:

  1. Paul,

    I wholeheartedly agree with you. The problem you describe is inherent in any reasonably complicated system.

    In all the projects I'm working in I try to be able to cover as many pieces as possible (and this includes technical and "business" areas). I may not be the best in all areas, but I will be one of the very few that at least understands all the pieces. It is surprising how much easier (and also faster) you can solve issues if you can figure out the appropriate piece first.

    However, I have not found too many people who share this interest; most often people just like to dabble in their area of expertise. (Maybe they are just uncomfortable outside of it.) This is good for me as an independent consultant, but I find it very strange that more people are not at least interested in broadening their knowledge.

    Thanks for a good article