Identifying and Combatting Technical Debt

Technical debt is something you should always be conscious of in the software industry. Basically it is the result of doing a rushed or “unclean” job and leaving code or infrastructure in a state that makes it harder and slower to do your job the next time you are working on that application.

The external result of tech debt is that outside the development organization, other departments view development as being extremely slow to deliver. Projects either take a very long time, or seem to become stagnant and are in a constant state of rework.

We accrue technical debt constantly in our day-to-day software development process. It can be fairly difficult to avoid, but with some planning and awareness we can identify major causes of tech debt and work to alleviate them.

I’m currently on a project at work that is the poster child for technical debt. I am also the 3rd or 4th person to pick it up and work on it to try to make it better. For each of the sources of tech debt that I discuss I’ll give you a “real world” example of how this one project was affected.

So let’s look at some of the major causes of technical debt and how we can combat them…

Developer Discipline

Yep, we’ll start with the biggie. Developer discipline is probably the single biggest cause for technical debt. In my opinion it largely stems from experience, and to a lesser degree, care. Since this is a big one, let’s break it down into sub items:

DRY / SOLID / Clean Coding practices

Simply unclean code that is difficult to maintain, difficult to understand, and time consuming to fix. Largely caused by lack of experience.

Real world:

The project I’m on is littered with “dead code” which (at least I think) is not used anywhere. There are still ActiveRecord model files for DB tables that no longer exist. Entire files are commented out. Some folders and files have ...-test or ..._refactored appended to their names, and those are apparently the “correct” ones to run. I’ve also run into a few places where a function takes a parameter named record_id but actually needs the record itself, not the ID. At some point someone probably changed the parameter type without renaming the parameter.
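
The misnamed parameter is easy to picture in code. This is a minimal sketch in Python rather than the project's Ruby, and Record, archive_record_misnamed, and archive_record are hypothetical names invented for illustration, not code from the project.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for a database row."""
    id: int
    name: str
    archived: bool = False


def archive_record_misnamed(record_id):
    # Misleading: the name promises an integer ID, but the body needs the
    # whole record. Passing an int here raises AttributeError.
    record_id.archived = True
    return record_id


def archive_record(record: Record) -> Record:
    # Same behavior, but the name and type hint now match what the
    # function actually needs.
    record.archived = True
    return record
```

The rename costs a few minutes and saves the next reader from passing an ID and debugging the crash.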

Ways to combat:

  • Training – Learn common principles and patterns
  • Pair programming with a more experienced team member
  • Code or Pull Request reviews
  • “Boy Scout Rule” – leave it cleaner than you found it. Other devs should always be looking for and cleaning up bad code.

Junior devs working independently

Directly related to the above. If you have a junior level dev work solo on a project, they aren’t going to have the experience or knowledge of best practices for a given technology, and aren’t going to set up code to be extensible.

Real world:

This project was originally written by a pair-programming duo of interns, then given to a dev with around 2 years of experience to add more features. There are a lot of odd decisions. For example, there are 3 actual applications within the project, and one of them is a Ruby on Rails app. However, they didn’t use ActiveRecord DB migrations until halfway through the project. The very first migration in the project drops a table (which wouldn’t have existed yet), so running rake db:migrate on a fresh database doesn’t work. Anyone new inheriting this project (that would be me) would be very confused by this.
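
To see why that first migration breaks a fresh setup, here is a rough sketch using Python's built-in sqlite3 module instead of ActiveRecord. The reports table and the migrate helper are assumptions made up for the example. The point is only that an unguarded DROP fails on an empty database, while a guarded one does not.

```python
import sqlite3

# Hypothetical migrations, in the order a migration tool would apply them.
BROKEN_FIRST_MIGRATION = "DROP TABLE reports"
GUARDED_FIRST_MIGRATION = "DROP TABLE IF EXISTS reports"
CREATE_MIGRATION = "CREATE TABLE reports (id INTEGER PRIMARY KEY, title TEXT)"


def migrate(statements):
    """Apply each statement to a brand-new, empty in-memory database.

    Returns the table names that exist afterwards. Raises
    sqlite3.OperationalError if any statement fails.
    """
    conn = sqlite3.connect(":memory:")
    try:
        for sql in statements:
            conn.execute(sql)
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()
```

Calling migrate([BROKEN_FIRST_MIGRATION, CREATE_MIGRATION]) raises because the table does not exist yet, which is what a new developer hits on day one. The guarded version runs cleanly from an empty database.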

Ways to combat:

  • Pair programming with a more experienced team member
  • Truly “agile” teams should be self-organizing. Senior level devs should be jumping on the opportunity to pair with and mentor other devs.
  • Code or Pull Request reviews

Lack of Testing

When time becomes an issue and a deadline looms, everyone looks at testing as an easy thing to cut. “Don’t have time to write tests!” We’ve all heard it. Guess what; that time you “saved” by not writing tests? Yeah, you’ll be spending that time and more tracking down, fixing, verifying, and regression testing issues later. How do you even know your code works?

Real world:

I found 5 test files. 4 of them were the empty test stub auto-generated when having Rails make you a new db migration, so tested nothing. The 1 remaining file did have a total of 2 unit tests.

Ways to combat:

  • Write tests!
  • Practice TDD (test driven development – write the tests first)
  • Any time you need to fix a bug, start by writing a (failing) test that verifies the issue exists. Then fix it. Now you have a built-in regression test.
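
The bug-fix workflow in the last bullet might look something like this. It is a sketch in Python with the standard unittest module. The parse_quantity function and the imagined bug (blank input crashing instead of counting as zero) are hypothetical.

```python
import unittest


def parse_quantity(text):
    """Hypothetical function under repair: parse a quantity from user input.

    Before the fix this was just int(text), which raised ValueError on
    blank input.
    """
    text = text.strip()
    if not text:
        return 0
    return int(text)


class ParseQuantityRegressionTest(unittest.TestCase):
    def test_blank_input_counts_as_zero(self):
        # Written first, while the bug still existed, so it failed.
        # After the fix it passes and stays behind as a regression test.
        self.assertEqual(parse_quantity(""), 0)

    def test_normal_input_still_parses(self):
        self.assertEqual(parse_quantity("3"), 3)
```

Run it with python -m unittest. The order matters: seeing the test fail first is what proves it actually exercises the bug.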

Lack of automated testing

How many times have you checked out a project and run the tests, and they fail? Your “master” branch should always be in a state of working, with passing tests. However if your tests aren’t automated then devs likely won’t run them and just commit potentially broken code. A continuous integration or build process should also be running these tests. Yes, it takes a bit of time to set up, but knowing that your “master” branch is “green” is priceless.

Real world:

Quite simply no build process at all, and almost no runnable tests anyway.

Ways to combat:

  • Use a well known / understood test framework.
  • Make the act of writing tests easy (build testing infrastructure or code helpers)
  • Tests should be run on code commit (on GitHub push)
  • Have visibility into the state of the “master” branch. If the automated tests fail, someone (ideally everyone) should know.
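
A CI server is the real answer here, but until one exists a local Git hook can at least stop broken code from being pushed. This is a minimal sketch, assuming the project's tests run through Python's unittest discovery. TEST_COMMAND is an assumption and should be swapped for whatever your project uses.

```python
import subprocess
import sys

# Assumption: the project's tests run with unittest discovery.
TEST_COMMAND = [sys.executable, "-m", "unittest", "discover"]


def run_checks(command=TEST_COMMAND):
    """Run the test command and return its exit code (0 means success)."""
    completed = subprocess.run(command)
    return completed.returncode


def main():
    code = run_checks()
    if code != 0:
        print("Tests failed; push aborted.", file=sys.stderr)
    return code
```

To use it as a hook, save it as an executable .git/hooks/pre-push file and end the file with sys.exit(main()). Git aborts the push when the hook exits with a non-zero status. Hooks are local and easy to bypass, so treat this as a stopgap, not a replacement for a build server.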

Technology Decisions

Moving on to the next major topic. There are constantly decisions being made over what to build, how to build it, and with what technologies. These decisions can often cause churn later. Making a new project? Cool! What language? What DB? What CI server? What test framework?

Build vs Buy

Often devs want to reinvent the wheel. It can be fun and challenging to write your own DB abstraction layer; but should you? You could write your own user authentication system so that it’s super robust and fits your project owner’s every whim; but should you? Evaluate the costs of “build vs buy” for these subsystems. With the current state of open source software, you can likely get a project delivered faster, cheaper and easier by using open source libraries or purchasing libraries (like UI widgets or graphing packages). There is still a cost to using these libraries though: hidden bugs and potential security issues. Still, that cost is usually lower than the cost of writing something yourself.

Ways to combat:

  • Up front research on open source or available purchasable libraries.
  • Stop reinventing the wheel. Before you build something, see if it’s already been built.
  • If you do build your own, make it a reusable library that other projects can use.
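
As a small illustration of “see if it’s already been built”, password hashing is a classic thing not to invent yourself. The sketch below leans on primitives that ship with Python's standard library. The function names and the ITERATIONS value are assumptions for the example, and a real project would more likely adopt a full, maintained authentication library.

```python
import hashlib
import hmac
import os

# Assumption: pick an iteration count based on current guidance and your hardware.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Return (salt, digest) using the standard library's PBKDF2 implementation."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Everything security-sensitive here (the key derivation and the constant-time comparison) is someone else's well-reviewed code. The part you write yourself is a few lines of glue.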

Mysterious tech choices

Quite simply choose technologies that promote rapid development. The technology choices you make shouldn’t hinder your development team. In addition, the team should understand not only the reasoning behind using a package, but also how to use it.

Real world:

A couple examples here:

It was decided to write an application in Erlang. It needed to expose a REST API and talk to a DB. Erlang is pretty interesting in terms of crash resilience and process management, and it was thought that this would fit well into a microservice architecture. In the end it was discovered that there were very few good choices for web REST API and DB ORM libraries. Believe it or not the open source Erlang community isn’t all that big. The team ended up spending a very long time (several months) rolling their own layers. Meanwhile on a Ruby on Rails project, a couple of command line commands had that all set up in minutes vs months.

Another example: our devops team set up a new Jenkins v2 CI environment which supports Docker containers, and did a presentation on it. Afterwards, I asked how to get started converting my existing project to Jenkins 2 and Docker pipelines, and I was told to “read the jenkins and docker documentation” or “just google it”. So now every dev that wants to use this has to go do their own research (duplicated effort).

Ways to combat:

  • Do upfront research or a proof of concept for major technology decisions.
  • The output of a tech decision should be clear usage instruction, docs, or example code. Establish guidance and best-practices for your team for using that technology.
  • Understand the permanence of your decision. It’s likely going to cause massive churn if you decide to change tech stacks later.

Inconsistent use of technologies

With every project you start green-field, there is a desire to pick the coolest, latest, greatest, hottest new technology. However this causes a learning curve for everyone who has to transition to that project. The whole team already knows SQL statements, but I really want to use a document storage DB this time… The whole team knows Angular and we have 4 projects using it, but I really want to get React on my resume… These differing technologies across projects cause more ramp-up time when context switching between projects, and can even result in an inconsistent user experience (if one web app uses Angular and another uses React, there are bound to be differences in UX in terms of field validation, error notification, screen refresh, UI widget behavior…)

Real world:

I mentioned before that there are 3 applications in this project that I’m working on. 2 of those are web servers. One is Ruby on Rails, the other is Ruby Sinatra. Why in the world would the same project use 2 different web frameworks? It gets confusing to remember how to start each app because they have different commands to get them running.

Ways to combat:

  • Pick a well established tech and stick with it.
  • Don’t be tempted by “shiny new toys”.

Deployment Knowledge and Repeatability

Much like unit testing, proper automated deployment and continuous integration processes are often thrown out at the first whiff of a time crunch.

You should almost never have to ssh into a server. Your code deploy shouldn’t be an rcp command. Really, devs just shouldn’t have ssh access to production servers and databases anyway.

Setting up a new server should be easy and repeatable. If it isn’t, then setting up a new instance is going to be a time consuming headache. Need a new QA environment? Could take days!

Real world:

When asking how to set up a server for this project, I was emailed a .txt file that had (most of) the steps for setting up a new instance… apt-get commands, iptables commands, etc.

To “deploy” the code, you ran deploy.sh which was just a couple rcp commands to copy the code to the production server.

Ways to combat:

  • Use a CI server to build your deployable application (Jenkins, Travis CI, etc.)
  • Use Docker containers to have pre-configured environments.
  • Use Vagrant to build pre-configured virtual machines.
  • Use Ansible to script the setup of new servers.
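
Even before adopting one of those tools, the emailed .txt file of steps can become a script that lives in version control. Here is a minimal sketch in Python. The SETUP_STEPS entries are hypothetical placeholders, not the project's actual commands, and the function defaults to a dry run so nothing is executed by accident.

```python
import shlex
import subprocess

# Hypothetical steps standing in for the contents of the emailed .txt file.
SETUP_STEPS = [
    "apt-get update",
    "apt-get install -y nginx",
]


def provision(steps=SETUP_STEPS, dry_run=True):
    """Run each setup step in order, stopping at the first failure.

    With dry_run=True nothing is executed. The planned commands are
    returned so they can be reviewed first.
    """
    planned = []
    for step in steps:
        argv = shlex.split(step)
        planned.append(argv)
        if not dry_run:
            # check=True raises CalledProcessError if the command fails,
            # so a broken step cannot be silently skipped.
            subprocess.run(argv, check=True)
    return planned
```

This is nowhere near what Ansible or Docker give you, but it captures the key idea: the steps are written down once, reviewed like code, and run the same way every time.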

Documentation

Let’s face it, no one likes writing documentation. Even when there are docs for a project, there is often a problem of where they are located, and if they are even up to date. Out of date or incorrect documentation can be worse than no docs at all, leading a dev to waste time thinking about something incorrectly because the docs led them astray.

Real world:

Docs for a project shouldn’t be spread across emails, Slack conversations, README files, and tribal knowledge.

Some of this project’s ActiveRecord model files have the DB table schema in a comment block at the top of the file… Guess what! They aren’t correct! No one is going to come back and change the comment if they add/remove a column from the table. Why even put it there? If you want to know the DB table layout, look at the actual source of truth; the DB and the migration files.
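
Asking the database directly is usually a one-liner. As a rough sketch, here is how that looks with Python's built-in sqlite3 module. The reports table is made up for the example, and other databases expose the same information through their own catalog queries.

```python
import sqlite3


def table_columns(conn, table):
    """Return (column name, declared type) pairs read from the live database.

    The table name is interpolated into the statement, so only pass
    trusted names.
    """
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INTEGER PRIMARY KEY, title TEXT)")
# A later change that a hand-written schema comment would likely miss.
conn.execute("ALTER TABLE reports ADD COLUMN owner TEXT")

print(table_columns(conn, "reports"))
# [('id', 'INTEGER'), ('title', 'TEXT'), ('owner', 'TEXT')]
```

The output always reflects the table as it is now, which is exactly what a comment block at the top of a model file cannot promise.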

Ways to combat:

  • Put documentation in a predictable location. For example a root-folder README file. If there are external docs, add links to them.
  • Code should be self documenting, and work as expected – If it’s a Rails project, it should follow Rails standards. If you have to write a document to explain folder or DB table names, you probably just named them poorly.
  • Focus on what is important. What do I need to know to work on this project that I can’t find out elsewhere?
  • Keep any docs you do make up to date.

Project Planning

This one is for the non-programmers.

Changing Requirements

Technical debt can be caused by unclear requirements and changing requirements. Sometimes this is less “debt” and more just wasted time and effort. The more time we have to spend figuring out what a project feature should do, the less time we are actually working on implementing that feature. If requirements change mid-implementation, then the completed work might be thrown away, or need to be reworked.

Real world:

I’ll diverge from my commenting on this particular project I’m on to share a better story from a past employer.

We used to have 2 week sprints on the dev team, but at the corporate level there was a board meeting every week. The company itself didn’t have much of a mission plan for upcoming quarters or years, so the direction of the company tended to shift by the whims of the owner on any given week. All the time we would be half way through a sprint and someone would come back and tell us to drop everything we are working on because the company now needs “Super awesome feature X” instead. As a result, we were constantly working and completing code, but it felt like half of it would just get shelved. The view from outside the dev organization was that we were never delivering anything.

Ways to combat:

  • Plan ahead.
  • Project and Dev teams should work by a common timeline (scrum sprints, SAFe iterations, etc).

Deadlines

The real monster. If something has to be done by a certain “drop dead date” then it’s going to be shoddily assembled at the end of that timeline.

Not saying that deadlines can’t exist, but they need to be reasonable and actually take the scope of the work done correctly into account.

The example I like to make is, when you take your car in for repairs do you tell the mechanic “I MUST have my car back in exactly 2 hours no matter what!”? If you did, do you think that mechanic would do the best job they could? Well, no time to do it right, so just slap together a fix and ship it! Then when your car is back in the shop next month to be re-repaired, you’ll just be spending more time and money to hopefully do it right the second time.

Real world:

The last person to work on this project was given “no more than 40 hours” to make it work, because a sales person decided to basically sell what we didn’t have (a thing that was built shoddily by a couple of interns that sort of worked well enough for internal use, but was never intended for a customer to have). As a result things were left very unclean. There are basically 2 copies of everything in the code repository; one for the old internal deployment and one for what we shipped to the customer, because we also couldn’t risk breaking the old/existing internal build. No testing or deployment automation was done. Lots of dead code was left in place. Very few documentation artifacts exist. Even the “master” branch in GitHub isn’t actually the correct code because the dev was pulled off the project the second it worked and not even given time to merge the feature branch back to “master”. Now you just have to ask around to figure out which branch is the one actually deployed.

Ways to combat:

  • Dev team should be involved in project planning.
  • Factor in time to do the job correctly.
  • Staff the project with people who know how to do the job correctly.
  • Build in time for testing and automation up front.

Tech debt makes everyone’s job suck

I bet all these resolutions to tech debt sound like a lot of work. It is! Because tech debt is essentially that; not doing it right the first time to save time, and having to waste time later to rework it. Instead, you need to put in the effort up front!

For existing code bases, the team needs to feel empowered to clean up code (pay down debt) as they go. Don’t wait for permission or a project plan for refactoring. Just do it. Make it a separate “clean up” pull request so that it doesn’t get mixed in with other feature work (which could later get reverted, thereby undoing your code cleanup).

For new code bases, plan on doing it right the first time. It might take as long to set up the CI, build, and deployment strategy as it does to write the code, but it’s worth it in the long run!

Posted in Programming

CodingWithSpike is Jeff Valore. A professional software engineer, focused on JavaScript, Web Development, C# and the Microsoft stack. Jeff is currently a Software Engineer at Virtual Hold Technologies.


I am also a Pluralsight author. Check out my courses!
