The Twelve-Factor App Methodology — Key to Amazing Distributed Web Applications
With the advent of cloud computing, the dynamics of developing web applications have changed dramatically. The transition has not been easy.
Developers who were previously comfortable deploying to a single server now had to manage multiple servers and distribute load across them. They also had to account for how files originating from different servers are combined with one another.
To resolve the confusion among developers, Adam Wiggins, a co-founder of Heroku, and his team proposed a set of practices called “The Twelve-Factor App Methodology.”
Because the methodology grew out of experience developing and deploying a large number of applications, its guidelines aim at both productivity and scalability. Its format is inspired by Martin Fowler’s books Patterns of Enterprise Application Architecture and Refactoring.
So why should you follow it, you might ask?
Not following it means repeating the mistakes that the creators of this methodology and other senior developers have already made, and wasting your time on systemic issues that have known solutions.
Conversely, following this methodology lets you build on existing, proven experience, which saves you time in the long run.
With these reasons in mind, let’s answer the most obvious question: What is the Twelve-Factor App Methodology, exactly?
What is the Twelve-Factor App Methodology?
The Twelve-Factor App Methodology is a set of guidelines that provides best practices and an organized approach to developing modern, complex web applications.
The principles it suggests are not tied to any particular programming language or database, so they can be applied to any technology stack.
This set of guidelines was designed with specific goals in mind, and it helps to look at them before discussing the twelve factors themselves. They answer two questions:
- What specifically does the Twelve-Factor App Methodology try to achieve?
- How does it try to achieve it?
Let us see what the official website has to say about these goals.
“Use declarative formats for setup automation, to minimize time and cost for new developers joining the project”
Automating the development setup with declarative formats minimizes the time and cost for new developers to set up the project and join its ecosystem.
This practice pays off as the team and the number of services grow.
“Have a clean contract with the underlying operating system, offering maximum portability between execution environments”
This aims to offer maximum portability among different execution environments by decoupling the software from the underlying operating system. An app that makes no assumptions about its host can be moved between machines and platforms with little effort.
“[The apps should be] suitable for deployment on modern cloud platforms, obviating the need for servers and systems administration”
Apps built this way deploy easily to modern cloud platforms without dedicated server administration, while remaining flexible enough to run on your own infrastructure.
“Minimize divergence between development and production, enabling continuous deployment for maximum agility”
Keeping the development and production environments as similar as possible makes continuous deployment (CD) practical. It also makes debugging easier, because issues seen in production can be reproduced in development.
“[The apps] can scale up without significant changes to tooling, architecture, or development practices”
This allows the software to scale up and down without any hassle, which is an essential capability in today’s software world.
Now, let us discuss how these goals are manifested through the factors, which are the core of this methodology.
The Twelve-Factor App
Each factor in this methodology plays a unique and important role in its proposed practices and architecture.
Let’s start by discussing each factor on its own, briefly.
1. Codebase
“One codebase tracked in revision control, many deploys”
This means your project source code should be maintained and tracked in a central version control system, which gives every developer access to the same source.
One should have one codebase with different environments (deploys).
This first factor advises against creating another codebase just to set up a different environment. For example, one should not have two repositories (production and development) for a single application. It is a bad practice.
Different environments represent different states of the same codebase. In practice, teams often map each state to a branch in a version control system such as Git.
Consider another example to get a better understanding: you are building a Todo App with three team members. A bad approach would be to keep three folders on each member’s system (todo-app-development, todo-app-staging, and todo-app-production), as it would be difficult to track the project and each team member’s progress.
A good approach would be to create a single repository named todo-app on a central hosting service such as GitHub or GitLab, and then create two branches, development and staging. The “master” branch then serves as the production/release branch (your opinion on branch names may differ, which is perfectly fine).
2. Dependencies
“Explicitly declare and isolate dependencies”
This primarily means that one should not assume the target system running the codebase already has all the dependencies (external libraries) installed.
It is a bad practice to commit the installed dependencies along with the codebase, as they can be tied to the underlying platform. For example, if the modules were installed and uploaded from a Windows PC and a developer on macOS downloaded the codebase, they could well have trouble running the project.
It is always a good practice to use a package manager such as npm or Yarn (depending on the technology stack) to download the dependencies on each system by reading a manifest file that lists the dependencies’ names and versions.
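As a minimal sketch of what an explicit declaration buys you, the Python snippet below reads a manifest of pinned versions and reports anything that is missing or at the wrong version. The `name==version` manifest format and the helper names `parse_pins` and `unmet` are assumptions made for illustration; a real project would rely on its package manager for this check.

```python
from importlib import metadata


def parse_pins(manifest):
    """Read 'name==version' lines from a manifest, ignoring comments and blanks."""
    pins = {}
    for line in manifest.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins


def _installed_version(name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def unmet(pins, lookup=None):
    """Return the declared packages that are missing or at a different version."""
    lookup = lookup or _installed_version
    return {name: wanted for name, wanted in pins.items() if lookup(name) != wanted}
```

Calling `unmet(parse_pins(text))` on a fresh machine lists everything that still has to be installed, which is exactly the assumption the app is no longer allowed to make silently.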
3. Config
“Store config in the environment”
Configuration should never be hard-coded in the codebase. We do not want sensitive information, such as database credentials or other secrets, to end up in version control where it could leak. Configuration also depends primarily on the environment the app runs in.
For instance, suppose you are developing a hospital management system. In production you use credentials for a cloud database service such as Amazon DynamoDB, but on your local machine you want to test against a local MongoDB server. Would you keep changing the code again and again, or read those values from variables so that you only have to change the environment (for example, with dotenv in Node.js) to get the same codebase running?
I’d go with the latter.
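Here is a minimal sketch of that second option in Python. The variable names `DATABASE_URL` and `APP_DEBUG` and the `Settings` class are hypothetical, chosen only for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    debug: bool


def load_settings(env=None):
    """Build settings from environment variables instead of hard-coded values."""
    env = os.environ if env is None else env
    # DATABASE_URL and APP_DEBUG are hypothetical variable names.
    database_url = env.get("DATABASE_URL")
    if not database_url:
        # Fail fast: a missing setting should stop the app at startup.
        raise RuntimeError("DATABASE_URL is not set")
    debug = env.get("APP_DEBUG", "false").lower() in ("1", "true", "yes")
    return Settings(database_url=database_url, debug=debug)
```

The same code runs locally and in production; only the values supplied by the environment differ.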
4. Backing Services
“Treat backing services as attached resources”
First of all, a backing service is any service the app consumes over the network as part of its normal operation. It could be a database like PostgreSQL, a message queue, a cache, or a mail service.
For the sake of simplicity, let’s assume that we are working on a project with Node.js and PostgreSQL. In the development environment, we have a local PostgreSQL server running on our machine. Now we want to go into production by running this database on another server.
What should we do?
Instead of changing the code, we should change the environment that holds the configuration. The only difference would be the database URL, which now points to the production server instead of the local one.
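A minimal sketch of this idea in Python: the connection details are derived entirely from one URL, so swapping the local database for a remote one needs no code change. The helper name `connection_params` and the example hosts are assumptions; the default port 5432 is PostgreSQL’s standard port.

```python
from urllib.parse import urlsplit


def connection_params(database_url):
    """Turn a database URL into the pieces a database driver typically needs."""
    parts = urlsplit(database_url)
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,  # fall back to PostgreSQL's default port
        "user": parts.username,
        "password": parts.password,
        "dbname": parts.path.lstrip("/"),
    }


# Development: postgresql://localhost/todo
# Production:  postgresql://app:secret@db.example.internal:5433/todo
```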
5. Build, Release, and Run
“Strictly separate build and run stages”
This factor separates the path from code to running app into three distinct stages:
- First, the build stage. It converts a specific version of the code into an executable bundle, fetching dependencies and compiling assets. Code changes belong here: developers fix bugs and tag new versions, and every change goes through a new build.
- Second, the release stage. It combines the build with the configuration of the target environment. The resulting release has a unique ID and is ready to be executed.
- Third, the run stage. This is the final stage, which runs the application in the target environment. Nothing is changed at this stage; a fix means going back to the build stage and creating a new release.
Imagine you are building a management system for a restaurant. If you ignore these stages and, say, patch code directly on the restaurant’s server, you lose track of what is actually running and waste time resolving differences between your system and the deployed one.
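The separation can be modeled with a few immutable records. This is only a sketch under assumed names (`Build`, `Release`, `make_release`, `run`), not how any particular platform implements it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Build:
    commit: str  # version of the code the artifact was built from


@dataclass(frozen=True)
class Release:
    version: int
    build: Build
    config: tuple  # sorted (key, value) pairs, frozen so a release cannot change


def make_release(version, build, config):
    """Release stage: combine one build with the config of one environment."""
    return Release(version, build, tuple(sorted(config.items())))


def run(release):
    """Run stage: only starts what the release describes and changes nothing."""
    env = dict(release.config)
    return (
        f"v{release.version}: commit {release.build.commit} "
        f"with DATABASE_URL={env['DATABASE_URL']}"
    )
```

One build can feed several releases (for example staging and production), and because the records are frozen, changing anything means producing a new release rather than editing a running one.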
6. Processes
“Execute the app as one or more stateless processes”
Application state should be kept separate from the processes, or application instances, that serve requests. It should only live in a database or other shared storage.
This matters when the app is deployed across multiple nodes on a cloud platform for scalability. Data must not be persisted on the nodes themselves, because it would be lost if any one of them crashed.
For example, session data should not be stored in an application process: if that node crashes, or the next request is routed to a different node, the user would have to log in again. It is always a good practice to store session data in a datastore such as Redis.
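The sketch below illustrates the point with two processes that share one session store. The `SessionStore` class is an in-memory stand-in for an external datastore such as Redis, used here only to keep the example self-contained; all class and method names are hypothetical.

```python
class SessionStore:
    """Stand-in for an external datastore such as Redis (in-memory for brevity)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class WebProcess:
    """A stateless process: it keeps no session data of its own."""

    def __init__(self, store):
        self.store = store

    def login(self, session_id, user):
        self.store.set(session_id, user)

    def current_user(self, session_id):
        return self.store.get(session_id)


store = SessionStore()
node_a = WebProcess(store)
node_b = WebProcess(store)
node_a.login("session-1", "alice")
del node_a  # simulate a crash of the first node
# node_b can still resolve the session because the state lives in the store
```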
7. Port Binding
“Export services via port binding”
Your application should be self-contained: it exposes its service by binding to a port and listening for requests, so other services can reach it through a URL. In this way, your service can act as a resource for other services when required.
You can build REST APIs for other applications using this concept.
For instance, if you want the posts in your social media web application to be accessible to another party, you can give them the API URL (something like: https://www.xyz:5000/api/posts) and some request tokens to access your data. Your web app has then become a backing service for the other party.
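A minimal sketch of port binding using only Python’s standard library. The `PORT` variable name, the sample data, and the helper `make_server` are assumptions; the `/api/posts` path and port 5000 mirror the example URL above.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

POSTS = [{"id": 1, "title": "Hello"}]  # hypothetical sample data


class PostsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/api/posts":
            body = json.dumps(POSTS).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=None):
    """Bind the service to the port given by the environment (default 5000)."""
    if port is None:
        port = int(os.environ.get("PORT", "5000"))
    return HTTPServer(("127.0.0.1", port), PostsHandler)
```

Starting the service is then a matter of calling `make_server().serve_forever()`; no separate web server has to be installed on the host for the app to be reachable.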
8. Concurrency
“Scale out via the process model”
Each process in your application should be able to scale, restart, or be cloned independently as demand requires, which improves scalability.
Using this approach, you can build your app to handle diverse workloads by assigning each workload to a process type. For example, HTTP requests could be handled by a web process, and long-running background tasks by a worker process.
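One common way to declare process types is a Procfile, a convention used on Heroku. The sketch below parses such a declaration and expands a scaling plan into individual processes; the commands `node server.js` and `node worker.js` and the helper names are hypothetical.

```python
def parse_procfile(text):
    """Map each process type to the command that starts it."""
    processes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, command = line.partition(":")
        processes[name.strip()] = command.strip()
    return processes


def plan(processes, formation):
    """Expand a formation such as {"web": 2, "worker": 1} into process names."""
    return [
        f"{name}.{index}"
        for name in processes
        for index in range(1, formation.get(name, 0) + 1)
    ]


PROCFILE = """
web: node server.js
worker: node worker.js
"""
```

Scaling out then means changing the numbers in the formation, not the code: `plan(parse_procfile(PROCFILE), {"web": 2, "worker": 1})` describes two web processes and one worker.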
9. Disposability
“Maximize robustness with fast startup and graceful shutdown”
This suggests that processes should be disposable: able to start and stop quickly. In addition to that, they should be able to handle failures. Nowadays, container tools like Docker make it easier to run processes this way.
For example, a robust queueing backend like RabbitMQ can be used to handle sudden deaths or shutdowns of processes. In this case, when a client disconnects or shuts down, the task at hand is returned to the queue.
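The following sketch shows the shutdown side of this factor. An in-memory `deque` stands in for a real queue such as RabbitMQ, and the `Worker` class and its method names are hypothetical: the worker finishes the job in hand when asked to stop, and puts a failed job back on the queue.

```python
import signal
from collections import deque


class Worker:
    def __init__(self, queue):
        self.queue = queue
        self.stopping = False

    def request_stop(self, signum=None, frame=None):
        """Signal-handler compatible: ask the loop to stop after the current job."""
        self.stopping = True

    def run(self, handle):
        done = []
        while self.queue and not self.stopping:
            job = self.queue.popleft()
            try:
                handle(job)
            except Exception:
                self.queue.appendleft(job)  # return the unfinished job to the queue
                raise
            done.append(job)
        return done


def install_sigterm_handler(worker):
    """Stop gracefully when the platform sends SIGTERM (call from the main thread)."""
    signal.signal(signal.SIGTERM, worker.request_stop)
```

Jobs that were never started simply stay in the queue, so another worker can pick them up after this one exits.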
10. Dev-Prod Parity
“Keep development, staging, and production as similar as possible”
Teams working on a project should use the same operating systems, backing services, and dependencies to keep differences between development and production minimal.
As a result, less time is lost to environment-specific bugs during development. This also promotes the idea of rapid application development (RAD).
By reducing the number of differences between the development and production environments, the process of continuous deployment becomes hassle-free.
11. Logs
“Treat logs as event streams”
Your application should not be responsible for the storage and management of logs. It should simply write its events, as they happen, to standard output, and leave it to the execution environment to capture, route, and store that stream.
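A minimal sketch with Python’s standard `logging` module: the app writes to standard output and has no log files or rotation logic of its own. The helper name `get_logger` and the message format are assumptions.

```python
import logging
import sys


def get_logger(name="app", stream=None):
    """Return a logger that writes one line per event to stdout (or a given stream)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout if stream is None else stream)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    logger.propagate = False
    return logger
```

In development the stream shows up in the terminal; in production the platform can forward the same stream to whatever log storage it uses.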
12. Admin Processes
“Run admin/management tasks as one-off processes”
This final factor recommends that admin tasks run in an environment identical to the one the production app runs in.
Admin tasks include running database migrations and collecting analytical data from the application to gain insights from it. These tasks run against a live release, using the same codebase and config, and their code ships with the application code.
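As an illustration, here is a sketch of a database migration written as a one-off task. It uses SQLite from Python’s standard library purely to stay self-contained (a swap-in for a production database such as PostgreSQL), and the table names and the `migrate` helper are hypothetical.

```python
import sqlite3

# Ordered schema changes that ship with the application code.
MIGRATIONS = [
    "CREATE TABLE todos (id INTEGER PRIMARY KEY, title TEXT NOT NULL)",
    "ALTER TABLE todos ADD COLUMN done INTEGER NOT NULL DEFAULT 0",
]


def migrate(conn):
    """Apply migrations that have not run yet; return how many were applied."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0
    for version, statement in enumerate(MIGRATIONS, start=1):
        if version > current:
            conn.execute(statement)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
    conn.commit()
    return len(MIGRATIONS) - current
```

Because the task records which versions it has applied, running it a second time against the same database does nothing, which makes it safe to execute as a one-off process on every release.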
The Benefits of the Twelve-Factor App Methodology
We have seen how the “Twelve-Factor App” methodology can make development hassle-free and speed up productivity. Additionally, the time invested in understanding and implementing these guidelines can significantly reduce the cost of building and maintaining the software.
There are scenarios where it makes sense to deviate from a few of the above factors, such as Logs, but it is best to adopt all twelve factors as far as possible.
If you are designing a microservices architecture, do consider these factors and take them seriously.
What may appear trivial now might be of great importance when you are running 10+ services across five separate environments. If you are already running microservices, check whether there are factors you missed; they may help you solve problems that you could not pinpoint earlier.
All in all, these guidelines provide a great foundation for building distributed web applications and microservices. This methodology helps you scale and maintain the application smoothly in the long run.
Good Luck and Happy Designing!