Having finally gotten around to watching Greg Young’s E-VAN presentation, I want to take a little interlude to talk about something that may or may not be interesting for many applications, but is typically a central requirement for ‘enterprise’ applications (I have no idea how to define ‘enterprise’ here…I leave that as an exercise for the reader for now). That ‘something’ is auditing.
Though it should go without saying, nothing I’m going to describe here is new. It could be viewed as a very unsophisticated description of the Event Sourcing pattern. I should also point out that I’m not advocating what I’m about to describe. At least, not necessarily. I don’t have nearly enough experience with messaging to claim that I know its ins and outs. I do have a lot of experience with auditing from a traditional SQL background, so that is what I’ll be contrasting it with. Consider this an experiment in thinking out loud about the topic.
Auditing Basics
In almost any enterprise application, auditing is a central requirement: who made what changes, when, and from where (‘why’ is usually important but almost impossible to track, so I’ll leave it aside). I’ll use an Account as the running example.
Suppose an Account is created. People usually want to know when it was created and from what process: sometimes from the UI of an application, sometimes from a background process such as a loader. After an Account is created, people usually want to know when and how it is modified, and by whom. Which properties were modified? Who modified them?
A very common and traditional way to track this audit information is with SQL triggers.
In my experience, this is typically done by creating shadow tables: tables with the same columns as the source table, plus additional columns for tracking the source of the changes. These shadow tables often reside in a separate database, but that isn’t a requirement.
Typically, separate triggers are created for inserts, updates, and deletes. When an insert occurs, for instance, columns such as RecordCreatedBy, RecordCreatedDate, and the like are populated. For this to happen smoothly, the application doing the insert usually needs to be modified to supply the identity of the user performing it. The same goes for updates, with columns like RecordLastModifiedBy, etc.
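As a rough sketch of what this might look like in T-SQL (the schema, table, and column names here are made up for illustration, and the sketch assumes dbo.Account itself carries the RecordCreatedBy/RecordCreatedDate columns filled in by the application):

```sql
-- Illustrative only: assumes an 'audit' schema exists and that dbo.Account
-- has audit columns the application populates on insert/update.
CREATE TABLE audit.Account_Shadow
(
    AccountId              INT           NOT NULL,
    AccountName            NVARCHAR(100) NULL,
    Balance                DECIMAL(18,2) NULL,
    AuditAction            CHAR(1)       NOT NULL, -- 'I' insert, 'U' update, 'D' delete
    RecordCreatedBy        NVARCHAR(50)  NULL,
    RecordCreatedDate      DATETIME      NULL,
    RecordLastModifiedBy   NVARCHAR(50)  NULL,
    RecordLastModifiedDate DATETIME      NULL
);
GO

-- Insert trigger: copy the new rows, including the user the application supplied,
-- into the shadow table.
CREATE TRIGGER trg_Account_Insert ON dbo.Account
AFTER INSERT
AS
BEGIN
    INSERT INTO audit.Account_Shadow
        (AccountId, AccountName, Balance, AuditAction,
         RecordCreatedBy, RecordCreatedDate)
    SELECT i.AccountId, i.AccountName, i.Balance, 'I',
           i.RecordCreatedBy, i.RecordCreatedDate
    FROM inserted AS i;
END;
```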
For inserts and updates, the trigger system is painless and transparent. It just works. Deletes are trickier: unless the application knows how to update the shadow tables directly, the trigger has no way of knowing who issued the delete, since all it has access to is the ‘deleted’ pseudo-table, which carries only the RecordLastModifiedBy value from the last insert or update. Additional work is required to supply this information.
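A delete trigger against the same illustrative schema makes the gap visible: the best the trigger can record on its own is whoever did the last insert or update, not whoever is doing the deleting.

```sql
-- Delete trigger: 'deleted' holds the row as it was, so RecordLastModifiedBy
-- names the last editor, not necessarily the person performing the delete.
CREATE TRIGGER trg_Account_Delete ON dbo.Account
AFTER DELETE
AS
BEGIN
    INSERT INTO audit.Account_Shadow
        (AccountId, AccountName, Balance, AuditAction,
         RecordLastModifiedBy, RecordLastModifiedDate)
    SELECT d.AccountId, d.AccountName, d.Balance, 'D',
           d.RecordLastModifiedBy, GETDATE()
    FROM deleted AS d;
END;
```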
Additionally, there is usually further processing that takes the audited data from the shadow tables and reshapes it to make it more amenable to reporting. I’ll leave that aside for now.
Regardless, this is a common setup, and it works fairly well. I’m not trying to convince anyone to use it or not use it, so I won’t go into more sophisticated descriptions of how it might work. In general, though, I will say that it works and is pretty scalable.
Auditing using Messages
If one accepts the entire architectural shift required to use messages, there is another way to do auditing.
Suppose you have a UI screen that allows you to create an Account or update an existing one. When the ‘Save’ or ‘Submit’ button on that screen is pressed, a message containing the data for that insert or update (or delete) is created and sent to the messaging infrastructure. The message will typically be XML, though that isn’t a requirement. Regardless, the message is sent.
With an architecture like this, one can imagine an auditing component that is registered to handle those messages, over and above the domain components that are also registered to handle them. You might still have shadow tables that persist those messages, and the actual infrastructure might look very similar to the trigger-based one.
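As a sketch (assuming the auditing component writes to a SQL store of its own; the table below is invented for illustration), it might simply append every message it sees, whole, alongside enough metadata to find and order it later:

```sql
-- Illustrative audit message store: the full message is persisted as-is.
CREATE TABLE audit.AccountMessage
(
    MessageId   BIGINT IDENTITY(1,1) PRIMARY KEY,
    AccountId   INT           NOT NULL,
    MessageType NVARCHAR(100) NOT NULL, -- e.g. 'CreateAccount', 'UpdateAccount'
    SubmittedBy NVARCHAR(50)  NOT NULL,
    SubmittedAt DATETIME      NOT NULL,
    MessageBody XML           NOT NULL  -- the message exactly as it arrived
);
```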
A central difference is that you get the entire ‘action’ within a message. Triggers cannot store the action, only the results of a change: the value of Column A went from this to that, but the context of the change is lost. An audit system that uses messages can store and persist the entire context of the change, because it is encapsulated in the message that carried it.
On the surface, this isn’t that big of a deal, until you consider the possibility of rolling back a change. That is possible with a trigger-based system, but it is very hard to do. If you have an audit repository that stores the entire history of an Account, from its creation through its changes to its potential deletion, you can reset an Account to any point in its history by replaying the messages involved. If I want to know the state of an Account at Time X, all I have to do is run through that Account’s messages from creation up to Time X.
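Against a message store like the one sketched earlier, ‘the state at Time X’ starts with a query like the following (illustrative names and values again); replaying the returned messages through your handlers is the part you still have to write yourself.

```sql
-- All of one Account's messages, from creation up to Time X, in order.
DECLARE @AccountId INT      = 42;           -- illustrative values
DECLARE @AsOf      DATETIME = '2010-06-01';

SELECT MessageType, SubmittedBy, SubmittedAt, MessageBody
FROM audit.AccountMessage
WHERE AccountId = @AccountId
  AND SubmittedAt <= @AsOf
ORDER BY SubmittedAt, MessageId;
```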
Obviously, this doesn’t happen by magic; you have to write code to accomplish it, but at least the infrastructure is there to enable it.
Wrapup
SQL-based auditing is pretty common and, when done correctly, works quite well. I don’t imagine it will disappear anytime soon. But if you design your architecture around messages, you may be able to build an even more robust auditing system.