Tuesday, September 25, 2007

SOA and So Much Metadata

Everyone probably knows that metadata is "data about data". The concept has been around throughout the short history of information technology, and before it: the library catalogue is the archetypal embodiment of metadata. Data structures, and later databases, also needed description, and that description took more technical forms: type declarations, schemas and Entity-Relationship diagrams. XML deserves special mention in this one-paragraph history of metadata because it has revolutionised the way metadata is done. Library catalogues and data structures alike can now be described in XML.

Metadata and XML have a particularly important role in SOA. Metadata is the glue that holds services together. WSDL and SOAP, which are both based on XML, are used at runtime to describe services and to communicate with them. Runtime is not the only place where metadata matters in SOA, though. Metadata plays an increasing role at design time if you are going to get the most reusability from your SOA.
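To make the "glue" idea a little more concrete, here is a minimal sketch in Python that reads a service description and lists the operations it advertises. The WSDL fragment, the service name and the operation names are all made up for illustration, and the fragment is stripped down well below what a complete WSDL document would contain (no messages, bindings or ports). Only the WSDL 1.1 namespace is real.

```python
import xml.etree.ElementTree as ET

# A deliberately tiny, hypothetical WSDL 1.1 fragment (not a real service,
# and not a complete WSDL: messages, bindings and ports are left out).
WSDL = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             name="CustomerService"
             targetNamespace="http://example.org/customer">
  <portType name="CustomerPort">
    <operation name="GetCustomer"/>
    <operation name="UpdateCustomer"/>
  </portType>
</definitions>
"""

# Prefix-to-namespace map used in the search paths below.
NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}


def list_operations(wsdl_text):
    """Return {portType name: [operation names]} read from the WSDL text."""
    root = ET.fromstring(wsdl_text)
    result = {}
    for port_type in root.findall("wsdl:portType", NS):
        operations = [op.get("name")
                      for op in port_type.findall("wsdl:operation", NS)]
        result[port_type.get("name")] = operations
    return result


print(list_operations(WSDL))
```

The point is only that a consumer can learn what a service offers from its metadata, without looking at the implementation behind it.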

The amount of metadata required at design time is huge. To appreciate this, it is instructive to look at the Zachman architecture framework grid. This 30-cell grid is full of architectures that use metadata models. Some might argue that the material in the top rows of the Zachman grid (scope and business model) is pretty far removed from working systems and should not be bundled in with metadata more directly relevant to implemented systems. Some might not consider program code to be metadata either, but it is nevertheless an abstraction of the data submitted to a CPU for execution. The point is that there is an awful lot of design-time metadata in enterprise architecture. It might be XML, UML, or other diagrams or lists, but it should all be recorded, and it is all interrelated.

The challenge has always been to store this metadata in a way that allows it to be retrieved easily, makes its connections to other metadata explicit, and lets new projects discover it. Reusability was not discovered with SOA. Business process and data models have always been relevant across multiple projects, and it has always been important to be able to find metadata when the need arises or when maintenance work needs to be done on old applications.
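Those three requirements (easy retrieval, explicit connections, discoverability) can be sketched as a very small catalogue in Python. This is an illustration only, not a description of any real repository product: the class names, the artefact names and the locations are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Artefact:
    name: str
    kind: str        # e.g. "data model", "process model", "WSDL"
    location: str    # where the artefact lives (tool, path, wiki page)
    keywords: set = field(default_factory=set)
    related: set = field(default_factory=set)   # names of linked artefacts


class Catalogue:
    """A toy index over artefacts that may live in many different tools."""

    def __init__(self):
        self._items = {}

    def register(self, artefact):
        self._items[artefact.name] = artefact

    def link(self, a, b):
        """Record an explicit two-way relationship between two artefacts."""
        self._items[a].related.add(b)
        self._items[b].related.add(a)

    def search(self, keyword):
        """Return the names of artefacts tagged with the keyword, sorted."""
        kw = keyword.lower()
        return sorted(name for name, art in self._items.items()
                      if kw in art.keywords)

    def related_to(self, name):
        """Return the names of artefacts linked to the given one, sorted."""
        return sorted(self._items[name].related)


# Hypothetical entries; the locations are invented for the example.
catalogue = Catalogue()
catalogue.register(Artefact("Customer ER model", "data model",
                            "Oracle Designer", {"customer"}))
catalogue.register(Artefact("Customer lookup WSDL", "WSDL",
                            "CVS: services/customer.wsdl",
                            {"customer", "service"}))
catalogue.register(Artefact("Billing process", "process model",
                            "Visio: billing.vsd", {"billing"}))
catalogue.link("Customer ER model", "Customer lookup WSDL")

print(catalogue.search("customer"))
print(catalogue.related_to("Customer ER model"))
```

Note that the catalogue records where each artefact lives rather than holding the artefact itself, so it can sit on top of the tools that created the artefacts.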

Finding a repository for all this metadata is not easy. When artefacts are stored with the tools that create them, it is difficult to combine them into a cohesive store of information. Taking my organisation as an example, we use Telelogic System Architect for the higher-level enterprise architecture document and Rational Software Architect for some of the models used in our applications. We also have an issue register, a call logging system and diagrams done in Microsoft Visio. For the data warehouse we use Oracle Designer and Oracle Warehouse Builder. We store system information in CVS (software and document versioning), on our file server, on our intranet and in our Wiki. No wonder we sometimes have trouble finding something.

How can I be sure that a developer has looked up all the previous work before proceeding to develop a new service or interface? We could buy a metadata repository to hold all this information, but without good governance of the process of creating metadata it would be just one more data store in which to hide our artefacts. If it is clear what goes where, then multiple repositories are not such a problem, although obviously the fewer repositories you have, the easier it is.

In a sequel to this posting I will write more about registries and repositories, which are technical aids to solving the metadata flood. In this posting I hope I have established a need for dealing with metadata effectively.
