Design Data First

One thing I do pretty consistently when adding something to a web application is design the entity objects first, along with any associated value objects, without really thinking (or caring) about how those entities will be persisted or retrieved.

For example, I recently had to design and build an audit log system for a project at work. Having never built an audit log before, I did some research and started to design the audit log record itself, which ended up containing:

  • A time when the event occurred
  • Information about what the event was
  • Who or what performed the event (a principal)
  • Where the event originated — web request (with IP address) or command line

Before even thinking about how this would be stored, I designed the actual log record object. Some things needed value objects: the principal, for instance, would probably need a type and some sort of description of the who or what. A principal could be anonymous for something like a password reset request, or it could be an application granted access via machine-to-machine credentials, or it could be an actual user (or at least an access token belonging to an actual user).
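To make that concrete, here is a rough sketch of what such a record and its value objects might look like, with no persistence in sight. It is written in Python purely for illustration, and every name in it (AuditLogRecord, Principal, Origin, the enum values, the event string) is a made-up assumption rather than the code from the actual project.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class PrincipalKind(Enum):
    ANONYMOUS = "anonymous"
    APPLICATION = "application"
    USER = "user"


@dataclass(frozen=True)
class Principal:
    """Value object: who or what performed the event (hypothetical shape)."""

    kind: PrincipalKind
    description: str

    # Named constructors, one per kind of principal described above.
    @classmethod
    def anonymous(cls) -> Principal:
        return cls(PrincipalKind.ANONYMOUS, "anonymous")

    @classmethod
    def application(cls, client_id: str) -> Principal:
        return cls(PrincipalKind.APPLICATION, client_id)

    @classmethod
    def user(cls, user_id: str) -> Principal:
        return cls(PrincipalKind.USER, user_id)


class OriginKind(Enum):
    WEB = "web"
    CLI = "cli"


@dataclass(frozen=True)
class Origin:
    """Value object: where the event came from (hypothetical shape)."""

    kind: OriginKind
    ip_address: Optional[str] = None  # only meaningful for web requests

    @classmethod
    def web(cls, ip_address: str) -> Origin:
        return cls(OriginKind.WEB, ip_address)

    @classmethod
    def cli(cls) -> Origin:
        return cls(OriginKind.CLI)


@dataclass(frozen=True)
class AuditLogRecord:
    """The entity itself: when, what, who, and where."""

    occurred_at: datetime
    event: str
    principal: Principal
    origin: Origin


# Example: an anonymous password reset request arriving over the web.
record = AuditLogRecord(
    occurred_at=datetime.now(timezone.utc),
    event="password_reset.requested",
    principal=Principal.anonymous(),
    origin=Origin.web("203.0.113.7"),
)
```

Nothing here knows about tables, columns, or queries. The objects are frozen, so a record cannot be altered after it is created, which seems like a reasonable default for an audit trail.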

That was already a lot to sort out without even getting into how it would be stored.

By designing the data first, I had time to sort through my own thoughts and research to come up with something coherent on which I could act. The database code itself, when I wrote it later, ended up with much less churn because I wasn’t designing the entity while trying to persist it.

This doesn’t just apply to database-related code either. Almost everything requires a data structure of some sort. Thinking through those structures before integrating them means much less churn overall, though named constructors can also help with that.