Steve Hoberman (of various forms of data modeling fame) recently had an article on Information Management that poses the question: "what is the difference between a data modeler and a data scientist?" I think most people who have heard the term "data scientist" going around realize there are certainly differences, but the article does a nice job of spelling them out in a gap analysis.
This got me thinking, though, about how rarely people on BI projects play multiple roles. I think it's been one of the biggest challenges we've had in adopting an agile methodology. The data modeler becomes a bottleneck for developers. Analysts have to finish mapping documents before data modelers can finalize attribute names. If a user decides that a relationship really needs to be many-to-many instead of one-to-many, the change trickles down through the data modeler, the developers, reporting, and testing.
Why don't we put stronger emphasis on one person having the breadth of skills to play multiple roles on a given project? If a requirement changes, the number of system components impacted may not be any smaller, but the number of people who have to understand the nature of the change could be significantly reduced. I'm not saying that business analysis, data modeling, ETL development, and UI development aren't different and equally valuable skills to develop; but I am suggesting that a single person should be able to play more than one role on any given project. I think the result would usually be leaner and more efficient projects.
The challenges are twofold: leadership has to trust that team members can take on multiple roles and develop those skills, and team members need help breaking out of their shells and prioritizing the development of new skills.
Back to Steve Hoberman's discussion -- can we change it to "what's the difference between data modeling and data science?"