3 Reasons Why You Need to Version Your SQL Code

Based on my experience working as a Business Intelligence Professional, I have a suspicion that many SQL Development teams do not keep track of changes to their code using a version control system. If this is you or your organization, you need to consider the following benefits of implementing version control. Here are three major reasons:

  1. Ability to Track Changes
  2. Ability to Rollback Changes
  3. Promotes Better Collaboration

Ability to Track Changes

If your organization doesn’t currently use version control on SQL projects, you probably have folders that look something like this:

. 
└── Loans/ 
	├── Portfolio v1.sql 
	├── Portfolio v2.sql
	├── Portfolio v2.1.sql
	├── Portfolio v3.sql
	├── Portfolio v4.sql
	└── Portfolio v5.sql

To some this might look like an organized folder, but in reality, it’s chaos. Sure the developer saved different versions of the file, but that doesn’t help us answer some critical questions such as: which version is currently used in production? Who was the last person to make a change to these files? What is different about these versions?

Version control gives us the ability to answer all these questions by tracking changes to query within a repository. Version controlled folders are much simpler. The above folder could be consolidated to look like this:

. 
├── .git 
└── Loans/ 
	└── Portfolio.sql

History is then kept within the git folder and all versions can tracked with information about what has changed and who changed it. Here we can see that our production version is Portfolio v4 with version 5 in development.

Ability to Rollback Changes

When I was a brand new Business Intelligence Analyst, our team didn’t have version control. Instead, we overwrote the scripts in our ETL tool. Some analysts kept copies of the queries on their own machines, often with the naming conventions discussed in the introduction. Unfortunately I was not as diligent as some of my coworkers.

On a few occasions, I remember making changes to some ETL scripts that started producing inaccurate results. Usually these were noticed after a couple of days. And because business intelligence analysts tend to jump around on various projects, I found it hard to remember all of the changes I had made. This resulted in a few hours of troubleshooting the query to find what change had broken it.

Our team eventually implemented version control on all of our SQL scripts. Doing so made this situation a non-issue. Rather than forgetting what changes I had made, I could look at the differences between versions and quickly see what lines of code were causing problems. I could also easily rollback the changes so that the query could run without defect as it had before.

Continuing with our example from above, we can imagine a situation where version 5 of the query was merged into the main branch and pushed into production as follows:

After determining that the new version broke production, you can easily revert the commit and set things back to the way they were as follows:

Promotes Better Collaboration

Collaboration is difficult when you don’t version code. Usually scripts are siloed and decentralized on personal computers. Many companies recognize this issue and try to accomplish better collaboration through the creation of shared drives. These centralized locations are a step in the right direction, but they don’t address other important collaboration issues specific to SQL code development such as accidentally overwriting changes and attributing credit (or blame) to team members.

Version control systems allow you to store code in a centralized location with the ability for team members to keep a local copy on their own computer. When one user makes a change and commits it to the remote repository, other team members can manually pull down the latest changes and keep their version current. In addition to these manual pulls, you can easily automated this process. IDEs like Visual Studio Code enable you to fetch changes periodically in the background.

Remote repository service providers like GitHub and GitLab extend the collaborative capabilities of version control systems. These platforms create a place for developers to manage projects, conduct code reviews, and push changes into production.

Conclusion

When starting out with new teams, I insist on getting all legacy SQL code versioned as soon as possible. There are so many quick wins that come by such a simple implementation of a version control system. After reading this post, you should now have an idea of what those advantages are. You may even want to start implementing a version control system ASAP! I recommend using Git. Its free, user friendly, and universally considered the gold standard. Git integrates with remote repository service providers such as GitHub and GitLab.

Review Your Cart
0
Add Coupon Code
Subtotal