"COMPUTER POWER TO THE PEOPLE! DOWN WITH CYBERCRUD!" - Theodor Nelson
Sunday, October 02, 2005
Shared vs. Mobile Data
Alex Bunardzic writes about "Shared Data and Mobile Data" in response to his article on "Should Database Manage The Meaning?" He makes an important distinction in the debate. There are people who think the database should be the central point, and then there are those of us who think data should be mobile (not centralized). I think the folks who believe in centralized databases should read William Kent's book "Data and Reality" to see the problems with that view. It basically boils down to meaning. Different projects will view what might seem to be the same data differently. We might even use the same terms but with slightly different meanings. It's that slight difference that kills the shared database model. Not to mention that, over time, shared database models become unchangeable and unmaintainable because of the massive amount of coordination it takes to make a change (even with voluminous documentation, which is usually out of date anyway). Who wants to make a change and be held accountable for every system they break? It becomes a case of "let's just hack this field that only we use so we don't have to go pull teeth". Ouch. It's easier to just change your own code than to burden everyone else (who may not want to make ANY changes to their code at all... it's not in their budget). I've seen this scenario more times than I care to count.
Mr. Kent's book lays these problems out, and they basically boil down to human language being ambiguous. Even though our computer languages are strict, our meaning is still king, and that's the disconnect. It's the same reason that common business objects that cross enterprise divisions are problematic. Each division has its own meaning for the words it uses to label things. It's best for each system to have its own database and distinct boundaries where it exchanges information. Everyone keeps their own definitions (which make sense to their users) and makes conversions at the boundaries.
Eric Evans' excellent "Domain-Driven Design" offers a solution to the problem. In fact, he has an example at the beginning of Chapter 14 (Maintaining Model Integrity) that explains it. Different systems might be working on what seems to be the same model, but they are looking at it from different contexts. Each system should have its own "bounded context" for the model. Here's an excerpt from "Domain-Driven Design" that explains it clearly:
Explicitly define the context within which a model applies. Explicitly set boundaries in terms of team organization, usage within specific parts of the application, and physical manifestations such as code bases and database schemas. Keep the model strictly consistent within these bounds, but don't be distracted or confused by issues outside.
He goes on to stress keeping the boundaries distinct (don't allow them to blur) and being explicit in communication with outside systems (by using "context maps" for conversion to and from them).
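To make the idea of converting at the boundary concrete, here is a minimal sketch in Python. Everything in it is my own invention for illustration (the "sales" and "shipping" systems, the SalesCustomer and ShippingCustomer classes, and the sales_to_shipping function); none of it comes from Evans' book. Both systems use the word "customer", but each keeps its own definition, and the only place the two meet is one explicit translation function.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical "sales" context: a customer is an account holder we bill.
@dataclass(frozen=True)
class SalesCustomer:
    account_id: str
    full_name: str
    billing_address: str


# Hypothetical "shipping" context: a customer is whoever receives the box.
@dataclass(frozen=True)
class ShippingCustomer:
    recipient_name: str
    delivery_address: str


def sales_to_shipping(
    customer: SalesCustomer,
    delivery_address: Optional[str] = None,
) -> ShippingCustomer:
    """Translate at the boundary; neither model leaks into the other.

    Assumption for this sketch: when no delivery address is supplied,
    shipping falls back to the billing address. That rule lives here,
    at the boundary, not inside either system's model.
    """
    return ShippingCustomer(
        recipient_name=customer.full_name,
        delivery_address=delivery_address or customer.billing_address,
    )


buyer = SalesCustomer("A-1", "Ada Lovelace", "1 Main St")
recipient = sales_to_shipping(buyer, delivery_address="9 Dock Rd")
```

If sales later renames a field or changes what an account means, only this one function has to change. Shipping never notices.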
The bottom line is to keep your systems distinct and avoid the allure of sharing common domain/data models. Keep the boundaries distinct and exchange information with outside systems. I wouldn't even allow common exchange formats across systems (they run into exactly the same meaning problems). Define the exchange for each system (a different DTD for each one if you use XML; I'm not saying don't use standards, just don't define a common DTD for all system exchanges across the enterprise). Don't be lazy. Let your system keep its meaning and don't let it get watered down. Sharing allows meaning to be blurred (and misused), and this lets the bugs in. Data warehouses sound like a good idea, but really you're duplicating the meaning to pull everything together. An agent system that gathers the information from each of the systems at their boundaries would be a better answer, since the agent would be its own system. It's a harder path to follow, but it's the one that allows your systems to continue to grow and not stagnate. The minute you buckle one system together with another, the stagnation will begin and the cost curve will start to rise.
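Here is a rough sketch of what I mean by an agent, again in Python and again with made-up names (billing_export, store_export, RevenueFact and gather are all hypothetical stand-ins, not any real product's API). Each system exposes its data at its boundary in its own format, and the agent, being its own system with its own model, owns one converter per source instead of forcing everyone onto a common schema.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Dict, List, Tuple


# The agent's own model. No source system has to know it exists.
@dataclass(frozen=True)
class RevenueFact:
    source: str
    amount_cents: int


# Stand-ins for two systems' boundary exports (assumed formats).
def billing_export() -> List[Dict[str, str]]:
    # Billing thinks in dollar strings per invoice.
    return [{"invoice_total": "19.99"}, {"invoice_total": "5.01"}]


def store_export() -> List[Dict[str, int]]:
    # The store thinks in integer cents per sale.
    return [{"sale_cents": 1250}]


# One converter per boundary, owned by the agent.
def from_billing(record: Dict[str, str]) -> RevenueFact:
    cents = int(Decimal(record["invoice_total"]) * 100)
    return RevenueFact("billing", cents)


def from_store(record: Dict[str, int]) -> RevenueFact:
    return RevenueFact("store", record["sale_cents"])


SOURCES: List[Tuple[Callable[[], list], Callable[[dict], RevenueFact]]] = [
    (billing_export, from_billing),
    (store_export, from_store),
]


def gather() -> List[RevenueFact]:
    """Pull from every boundary and convert into the agent's model."""
    facts: List[RevenueFact] = []
    for export, convert in SOURCES:
        facts.extend(convert(record) for record in export())
    return facts


facts = gather()
total_cents = sum(fact.amount_cents for fact in facts)
```

Adding a third system means adding one export and one converter to the agent. Billing and the store stay exactly as they were.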