The Modern Developer Part 5: Provisioning, Deployment, and Maintenance
In today’s post, we will take a look at the stages of a software product subsequent to the quality assurance (QA) stage, specifically provisioning, deployment, and maintenance.
This is the final post in the series—a big thank you to everyone who has stuck with me for the past few months. Hopefully, you’ve enjoyed the ride.
When I was starting out in software development, I thought that once the software was released, the job was done. Boy, oh boy, was I wrong. Often, it is just the beginning.
Before you release something, you need to make sure that you have the necessary infrastructure to do so. This is where provisioning comes into play.
Provisioning: Getting Infrastructure and Licenses
Despite being in the final post, provisioning can be done at any point after the design. In a nutshell, provisioning is getting all the infrastructure and licenses that you will need when launching something to production.
The two main options for provisioning are owning or renting.
Which one should you choose?
Some projects require you to own your infrastructure. Cloud storage and banking software are key examples. Usually, this type of requirement is dictated by legislation in the specific domain. Legislation differs by country, so this is something you want to get a head start on. Government contracts are another situation in which you will need to own your infrastructure.
You can get a special agreement with some providers about data security, but the general rule is that if you don’t own the infrastructure, you don’t have complete control. This isn’t all bad; there are benefits, as we will discuss later.
What if you don’t have any security constraints or contractual obligations to own your infrastructure? Should you still own it?
The first thing to consider is cost. Owning a server costs thousands of dollars, and having the required infrastructure in terms of heat management, electricity, internet, and so on is very expensive as well. Although you can place your server in a data center, it still costs a lot of money each month to maintain.
In contrast, rented infrastructure (e.g., DigitalOcean, Amazon Web Services, Google Cloud, etc.) is billed by usage, typically in increments of an hour or less. Furthermore, providers usually offer tools to better manage everything. You can start, stop, clone, destroy, or create machines with a few clicks, and they appear nearly instantly.
This is a far more flexible solution. If you want this flexibility for owned infrastructure, you will need one or more sysadmins, and the speed will probably be lacking in comparison.
Another pro for rented infrastructure is that you can replace a lot of the services you are using with premade solutions. For example, instead of setting up your own database server, you can use a database-as-a-service. Instead of hosting your code, you can use cloud functions, and so on.
On the other hand, an advantage of owned hardware is that you have complete control over everything. Need a specific RAID combination? Sure! Need a specific network-attached storage setup to suit this particular application’s needs? No problem. The potential performance of a machine tuned for one workload is hard to match with rented hardware, and the control over the hardware that makes up your server is unparalleled.
The key advantage of owned infrastructure is that you have complete control of the hardware and virtual machines. You can change and modify it as you wish. This can offer significant performance benefits that you can’t get from rented hardware. The key disadvantage is that the cost is high, and you may need additional people to monitor and maintain it.
The key advantage of rented infrastructure is that everything is automated and billed by usage. The initial costs are very low, and you can scale as you need. Many providers offer automated backups and disaster recovery. The key disadvantage is that you don’t have complete control. Although this shouldn’t be an issue for most projects, it can be a dealbreaker for some.
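To make the cost trade-off concrete, here is a minimal sketch of a break-even calculation. The function name and all the figures are hypothetical and only illustrate the arithmetic; real quotes for hardware, hosting, and staff will differ.

```python
from typing import Optional


def break_even_months(upfront_cost: float,
                      owned_monthly: float,
                      rented_monthly: float) -> Optional[float]:
    """Return how many months it takes for owning to become cheaper than renting.

    Returns None when renting costs no more per month than owning,
    because in that case the upfront cost is never recovered.
    """
    monthly_saving = rented_monthly - owned_monthly
    if monthly_saving <= 0:
        return None
    return upfront_cost / monthly_saving


# Hypothetical figures, for illustration only.
print(break_even_months(upfront_cost=6000, owned_monthly=150, rented_monthly=400))  # 24.0
```

If the project is unlikely to live longer than the break-even point, or if the load is too unpredictable to size the hardware up front, renting is usually the safer bet.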
In my work, I have dealt with both types of provisioning. Personally, I use rented infrastructure for my personal projects, as I can create and destroy infrastructure as I need it.
If I had to choose one, I would choose rented infrastructure for computation-heavy projects with inconsistent load. That way, I can make the application add more servers when the load is high and destroy them when they aren’t needed. For storage-heavy, high-investment projects, I would choose owned hardware, since small configuration and setup changes can have a great effect on a system’s performance and throughput.
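The "add servers when the load is high, destroy them when it drops" idea boils down to a small decision function. The sketch below is an assumption on my part about how such a rule could look: the names, the load unit (requests per second), and the limits are all hypothetical, and actually creating or destroying machines would go through your provider's own tooling.

```python
import math


def desired_server_count(current_load: float,
                         capacity_per_server: float,
                         min_servers: int = 1,
                         max_servers: int = 20) -> int:
    """Return how many servers are needed to handle current_load.

    The result is clamped between min_servers and max_servers so the
    system never scales to zero and never exceeds the budget.
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    needed = math.ceil(current_load / capacity_per_server)
    return max(min_servers, min(max_servers, needed))


# Hypothetical load figures: each server handles 100 requests per second.
print(desired_server_count(950, 100))   # 10
print(desired_server_count(0, 100))     # 1
print(desired_server_count(5000, 100))  # 20
```

The upper limit matters as much as the lower one: without it, a traffic spike or a bug can quietly turn into a very large bill.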
Other than the hardware, this is what else you should consider when provisioning the infrastructure:
- Are all domains bought, and do you have HTTPS certificates for all of them?
- Have you set up the internal network between your servers? Is it secured?
- Have you bought licenses that scale to your needs? (Usually, the license for services during development is cheap but requires an upgrade for production.)
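The certificate item on this checklist is easy to automate. Below is a minimal sketch using only Python's standard library. The helper names are my own, and `fetch_not_after` needs network access, so the example at the bottom exercises only the date arithmetic with a fixed, made-up expiry date.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to the host and return its certificate's 'notAfter' field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Return the number of whole days between now and the expiry date."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after),
                                     tz=timezone.utc)
    return (expires - now).days


# Fixed, made-up dates so the example runs without a network connection.
now = datetime(2018, 1, 1, tzinfo=timezone.utc)
print(days_until_expiry("Jan  5 09:34:43 2018 GMT", now))  # 4
```

In practice you would call `fetch_not_after` for each domain on a schedule and raise an alert when the remaining days drop below a threshold you are comfortable with.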
Provisioning is key to a project’s success. It can be done in parallel with development. As software requirements change, so will the requirements for provisioning. Just like in software, it is important to build things with the mindset that they may change.
Deployment: Get the Code, Libraries, and Services Online
Let’s dissect the typical software product:
- Source code
- Third-party libraries and frameworks (static files, binaries)
- Third-party services (living processes)
A deploy can change any combination of the three. For the second and third items, a change can take one of several forms:
- Add or remove a dependency
- Update its version
- Change the configuration
Each scenario will require slightly different steps for execution. It’s paramount to be aware of exactly what must be done for the changes to take effect immediately; otherwise, you can find yourself running a broken build.
In theory, deployment should be one of the easier things, since it is about moving source code around and restarting services, but in practice it is hard. Very hard.
The main reason is development culture. Although we now have automated solutions, and deployment is considered a first-class priority, this wasn’t always the case.
To this day, a lot of companies do manual deploys or use custom, homebrew solutions.
But why is this ineffective? What’s so bad about manual deploys and custom solutions?
Let’s consider the consequences of a bad deploy.
You can have downtime and lose money because of it. Worse yet, you upgrade half of your servers and have two running versions of the same software, which can lead to really nasty bugs that are sometimes impossible to debug. Or you can lose all your data and be forced to restore from a backup.
There are many more reasons I could list, but they lead to the same conclusion: Issues during deployment must be avoided at all costs, since they will hurt the end users—which, in turn, will hurt the organization.
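The half-upgraded scenario in particular is cheap to detect. Here is a minimal sketch, assuming each server can report the version it is running (for example through a health endpoint); the host names, version strings, and function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def find_version_drift(reported: Dict[str, str]) -> Dict[str, List[str]]:
    """Group hosts by the version they report.

    Returns an empty dict when every host runs the same version,
    otherwise a mapping of version -> hosts running it.
    """
    by_version: Dict[str, List[str]] = defaultdict(list)
    for host, version in sorted(reported.items()):
        by_version[version].append(host)
    return dict(by_version) if len(by_version) > 1 else {}


# Hypothetical report collected after a deploy.
report = {"web1": "2.4.0", "web2": "2.4.0", "web3": "2.3.9"}
print(find_version_drift(report))
# {'2.4.0': ['web1', 'web2'], '2.3.9': ['web3']}
```

Running a check like this as the last step of every deploy turns a nasty, hard-to-debug situation into an immediate and obvious failure.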
Manual labor is prone to error. Not only that, but having to deploy the new version of the software on 50 or 100 machines will take a lot more time if done manually than it would if it were automated. The possibility of error is also much smaller when a program executes the deploy.
Homebrew solutions are awesome, since they are suited to the needs of the specific software. My main issue with them is that they usually aren’t thoroughly tested and aren’t consistently supported.
Changes are made only when they are needed: the deploy has stopped working, or a new service must be added. As a result, no one is 100% sure how the code works, and adding even a small feature can introduce new errors. I know that in a perfect world, you would research the code before you edit it, but in reality, this is rarely the case.
The last option is to use a standardized solution. One of the best options currently available is Ansible. In Ansible, you describe your servers in an inventory, assign them to groups, and attach metadata (variables) to them. You then run tasks, typically organized into playbooks and roles, against those groups.
The tasks themselves fall into two categories. Standardized tasks cover common operations such as restarting a service, pulling code from a repository, installing a package, creating and editing a file, and rotating logs. Custom tasks cover anything specific to your project, e.g., running a cleanup script.
That is pretty much everything you will need for most deployments. Having the deploy procedure in a script makes it reproducible, traceable, loggable, and monitorable, all things that you want to eventually incorporate into your software to ensure maximum uptime.
Since you aren’t developing the solution, most of the maintenance work for this software will be done for you. Furthermore, the product you are using may have hundreds of thousands of users, so most bugs will have been found and fixed before they ever reach you. Overall, you will have a much more stable deployment than you would with a homebrew solution.
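To illustrate the idea of servers, groups, and tasks, here is a conceptual sketch in plain Python. It is not Ansible and does not use its API; the inventory, host names, and task functions are hypothetical placeholders. It only shows why a scripted deploy is reproducible and traceable: every step is logged, and the run stops at the first failure.

```python
import logging
from typing import Callable, Dict, List

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("deploy")

# Hypothetical inventory: group name -> hosts in that group.
INVENTORY: Dict[str, List[str]] = {
    "web": ["web1.example.com", "web2.example.com"],
    "db": ["db1.example.com"],
}

# A task takes a host name and returns True on success.
Task = Callable[[str], bool]


def run_tasks(group: str, tasks: List[Task],
              inventory: Dict[str, List[str]] = INVENTORY) -> List[str]:
    """Run each task, in order, on each host of the group.

    Stops at the first failing task and returns the hosts that were
    fully processed before the failure.
    """
    done: List[str] = []
    for host in inventory[group]:
        for task in tasks:
            log.info("running %s on %s", task.__name__, host)
            if not task(host):
                log.error("%s failed on %s; stopping deploy", task.__name__, host)
                return done
        done.append(host)
    return done


def pull_code(host: str) -> bool:
    return True  # placeholder: a real task would update the code on the host


def restart_service(host: str) -> bool:
    return True  # placeholder: a real task would restart the service


print(run_tasks("web", [pull_code, restart_service]))
# ['web1.example.com', 'web2.example.com']
```

A standardized tool gives you this structure, plus a large library of well-tested tasks, without you having to maintain any of it.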
Maintenance: Bug Fixing, Security Patches, and Future Proofing
The project is finally released. The end users are satisfied. Whenever a bug appears, it is addressed, and the fix is deployed within a few days. Even if there are no change requests for further development, there are several things that must be monitored and maintained.
The first thing is the hardware itself—without it you won’t have a running system. This doesn’t require constant work; rather, when something breaks, it must be fixed quickly.
Next, we must make sure that as the system grows, it can sustain its growth. This means that more powerful databases and application servers must be added. Or perhaps a higher bandwidth network is needed. Regardless of what is necessary, the system must support the extra hardware that will be added. Scalability is one of the main reasons that microservice architecture is favored these days.
Another thing to consider is operating system health. One of the most important things is being up to date with security patches. Also, some of the core metrics such as disk space, RAM, CPU, and network bandwidth should be monitored. Keeping the operating system healthy and the environment secure is a topic spanning several books. What’s important is to keep it in mind and not to ignore it.
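As a small example of metric monitoring, here is a sketch of a disk space check using Python's standard library. The function names and the 90% threshold are assumptions for illustration; a real setup would normally rely on a dedicated monitoring tool and cover RAM, CPU, and network as well.

```python
import shutil


def disk_usage_percent(path: str = ".") -> float:
    """Return the percentage of disk space used on the volume containing path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def is_disk_healthy(path: str = ".", threshold_percent: float = 90.0) -> bool:
    """Return True while disk usage stays below the threshold."""
    return disk_usage_percent(path) < threshold_percent


if not is_disk_healthy("."):
    print("WARNING: disk usage is above the threshold")
```

Run from a scheduler on every server, even a check this simple gives you a warning before a full disk takes the system down.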
Moving one layer up, we get to the supporting services and libraries—databases, web servers, and frameworks. They need to be up to date with security patches and bug fixes. This is the bare minimum; the optimal solution is to always update to the newest version.
Unfortunately, many projects have a lot of dependencies—so many that when the dependencies are upgraded, several parts of the software break. If that’s the case, having an automated test suite is an easy way to pinpoint the issues and fix them without introducing regressions in production. The other way is to use manual QA to find the bugs, but this is slower, and there is a higher possibility of missing bugs.
Lastly, we have software maintenance. Ninety-nine percent of the time, bug fixing is part of the initial deal, at least for the first 12 to 24 months. Change and feature requests are a bit more complicated: they are part of the maintenance phase despite requiring development. If the code is unmaintainable and without tests, there is little you can do post-release without spending immense resources. That is why writing high-quality, maintainable code is key.
Quite often, the team that implements the change requests and features post-release isn’t the same team that wrote the initial version of the software. This is primarily due to the high turnover in software companies. Documentation, build scripts, and tests are what will reduce the onboarding resources needed for new developers and help get the new features out in a reasonable amount of time.
Software Is Constantly Changing
Today’s requirements will inevitably change. The work on a software product will never end unless all technology freezes, which is highly unlikely. When working with software, it is crucial to make the code, subsystems, and hardware extendable.
Having a stable release process greatly reduces the risk when deploying bug fixes. Tests and high quality standards all around help the software achieve the maximum possible uptime.
It might feel challenging and complicated, but it’s an investment of time and effort that will be worth it long-term and will have a very positive effect on your career.