Database Refactorings on Large-Scale Projects
It was six weeks before the big deployment, and tensions were running high across the five feature teams on the project. Just as a team thought they were approaching the finish line, a slew of big and small database refactorings stalled progress. Tables were restructured. Columns were added or renamed.
One team’s tech lead sent out an email in frustration: “This kind of refactoring of our data model has got to stop. Why are we adding these columns at this point? What is the business requirement that is driving all of these changes? Why did we not anticipate the need for these columns sooner?”
One of the database architects responded: “These changes were not anticipated sooner because this is an agile, not a waterfall, project.”
The tech lead was right to be frustrated by some of the refactorings, and the database architect was partially correct that not all detailed requirements are known until close to the end of the project. The database architect was incorrect, however, about agile being the cause of the major changes to the data structures.
- Major structural changes to key entities in the database should not be necessary late in the project – even an agile project. The database architect does not need to know every detailed requirement in order to determine the structure of the data for key business entities given the nature and volume of the data and the technical platform. The data structures for a major business entity should be established before significant development begins using that entity. Some frameworks for scaling agile projects have the concept of an architectural team providing sufficient “architectural runway” for the feature development teams. The data structures are part of this runway.
- Minor database refactorings, such as adding new columns, are typically what emerge late in a project when sufficient attention has been paid to major structural concerns earlier. These are usually relatively easy for the teams to implement.
- Perform a cross-team assessment of the costs and benefits of refactorings before moving forward with them. Data architects do not always have insight into all of the implications of a database change across all of the layers of an application. If the schedule is tight, things like renaming columns or tables should be deferred. Yes, this is technical debt, but sometimes this is a tradeoff that must be accepted in order to meet business goals.
- Build time into the project schedule to do refactorings. Expect that these will happen. Also expect that business stakeholders, especially those that control project budgets, will consider this work unnecessary and wonder why the team couldn’t do all of the database design up front. But of course that’s why we are agile – because we know that as hard as we try, we can never learn all of the requirements up front and we expect some degree of change throughout the course of the project.
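
To make the difference between an easy late change and a disruptive one concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `customer` table, its columns, and the query are hypothetical, invented only for illustration, and the rename step assumes SQLite 3.25 or newer. It suggests why adding a column tends to leave existing application code working, while renaming a column breaks every query that still uses the old name.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (name) VALUES ('Ada')")

# A query that some feature team's code already depends on.
existing_query = "SELECT id, name FROM customer"

# Additive change: existing queries keep working unchanged.
conn.execute("ALTER TABLE customer ADD COLUMN loyalty_tier TEXT")
rows_after_add = conn.execute(existing_query).fetchall()

# Rename (needs SQLite 3.25+): every caller using the old name now fails.
conn.execute("ALTER TABLE customer RENAME COLUMN name TO full_name")
try:
    conn.execute(existing_query)
    rename_broke_query = False
except sqlite3.OperationalError:
    rename_broke_query = True

print(rows_after_add, rename_broke_query)
```

In a real application, the rename would also ripple through mapping layers, reports, and tests owned by other teams, which is the kind of cost a cross-team assessment is meant to surface.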