How many times have you tried to use an API only to find that you had to fill in some ridiculous number of parameters with values that you had no idea about?
If you’ve ever done Windows programming and had to call into some of the Win32 APIs, I’m sure you’ve experienced this pain. Do I even need this parameter? What the heck is this value for?
Even if you aren’t writing an external API, it makes a lot of sense to make your API as easy as possible to use without requiring a huge list of parameters. It is also a good idea to restrict the options for those parameters as much as possible.
In this post, I’m going to show you three ways you can take a complicated API and make it much simpler and easier to use.
Our offensive API method
Let’s start off by taking a look at an API method that could use some cleaning up. (Examples are in C#, but the idea applies to many languages.)
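Here is a representative example of the kind of method I mean. (The parameter names come from the discussion below; the class name and method body are assumptions for the sake of illustration.)

```csharp
public class AuthService
{
    // Four parameters, two of which (customerType, numberOfTries)
    // the caller probably knows nothing about.
    public bool Login(string userName, string password, string customerType, int numberOfTries)
    {
        // ...authentication logic elided; returns whether the login succeeded...
        return !string.IsNullOrEmpty(userName) && !string.IsNullOrEmpty(password);
    }
}
```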
Granted, this method signature doesn’t look so bad, but it is not the easiest to use. You have to fill in four different parameters, you have to know what customerType is, and you have to make a decision about the numberOfTries value without really knowing what it means.
It is pretty common to see methods like these throughout a code base. While they seem innocent enough, they end up introducing complexity, because they require the caller to do more work than is necessary to call the method and to understand more of the system than they might need to know.
In this scenario, I’m purposely not explaining what customerType or numberOfTries is, because you, as a developer trying to use this code, wouldn’t know either. You would have to go and figure it out.
Let’s see if we can improve this method signature a little bit.
1. Reducing parameter counts by creating objects
The first thing we can do to make this method signature a little simpler is to reduce the number of parameters it has.
The only problem is, we need all those parameters otherwise we wouldn’t have them, right?
That is probably true, so what can we actually do?
Well, one really common refactoring step (which probably won’t be much of a surprise to you) is to reduce parameters by combining them into an object.
Now, most developers do this the wrong way. They right-click, select the refactor option in their IDE, and turn all the parameters into a single class that gets passed in.

This is wrong, because it simply hides the problem: the caller now has to first create an object whose constructor takes those same parameters. If you do things this way, you are just pushing the problem upstream.
Instead, you should focus on grouping related parameters together, parameters that would always be right next to each other.
In our case the parameters that make sense would be userName and password. Together those parameters make up a user’s login. So, it makes sense to create a LoginInfo class that would contain those two parameters. We might even find that class has some additional functionality of its own or contains more data. But, for now we can simply refactor to make this method signature simpler.
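A sketch of that refactoring might look like this. (The LoginInfo name comes from the text; the property shapes and the surrounding Login signature are assumptions.)

```csharp
public class LoginInfo
{
    public string UserName { get; }
    public string Password { get; }

    public LoginInfo(string userName, string password)
    {
        UserName = userName;
        Password = password;
    }
}

public class AuthService
{
    // Down from four parameters to three:
    public bool Login(LoginInfo loginInfo, string customerType, int numberOfTries)
    {
        // ...authentication logic elided...
        return !string.IsNullOrEmpty(loginInfo.UserName) && !string.IsNullOrEmpty(loginInfo.Password);
    }
}
```

Calling it now looks like `new AuthService().Login(new LoginInfo("Joe", "$ecret"), "Standard", 3);`.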
It now requires more code to call our method, but the code is simplified and easier to understand. When someone tries to call the method and they see they need a LoginInfo object, they can very easily see that the LoginInfo object requires a userName and password and it is clear what the purpose of it is.
We’ve only slightly reduced the complexity by reducing the parameter count by one, but we’ve made the API more purposeful and clear.
2. Using enumerations to limit choices
We aren’t done yet though. We still can do quite a bit more to improve this simple API.
The next big issue with the API is that the customerType parameter is very vague. We can’t immediately deduce what it means and what value should be put in there. Is there a restricted list of values? Is it free form? Does it even matter? Or can I just put a blank string there?
We can solve this issue by restricting the possible values using an enumeration. In this example, we’ll assume the customerType can be one of three different values. Before, we were requiring the caller to type the string representation of those values manually, without making a mistake. By limiting the choices to just what is valid, we improve the experience and reduce the error rate.
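Sketched in code, with LoginInfo as the simple userName/password pair described above. (The three CustomerType values here are made up; the post only assumes there are three.)

```csharp
// Assumed values -- the point is that the choices are now closed and named.
public enum CustomerType
{
    Standard,
    Premium,
    Trial
}

public class LoginInfo
{
    public string UserName { get; }
    public string Password { get; }
    public LoginInfo(string userName, string password) { UserName = userName; Password = password; }
}

public class AuthService
{
    // The vague string parameter becomes a compile-time-checked choice:
    public bool Login(LoginInfo loginInfo, CustomerType customerType, int numberOfTries)
    {
        // ...authentication logic elided...
        return !string.IsNullOrEmpty(loginInfo.UserName);
    }
}
```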
Again, a fairly simple change, but a powerful one, because it really has reduced the complexity for the caller. Now, the caller doesn’t have to figure out what should go in that string parameter for customerType and can instead select from one of three possible options. By making this change, we’ve also enabled the caller to utilize the auto-complete feature of the IDE to assist in making a choice and we’ve greatly reduced the chance of error from a typo, since we are now validating the values at compile time instead of run time.
For this reason, anytime you can restrict choices for a parameter in a method, you should. Enumerations are the easiest way, but you can utilize other techniques as well.
For example, suppose you had a method that took a parameter maximumAge. You could make this an int, but in most cases you would be better served by creating an Age class that had its own validation to make sure the actual age was an integer between 0 and, say, 130. You wouldn’t catch errors at compile time, but you’d be greatly restricting the possible values and you’d make the intent of the parameter even more clear through the name of its type.
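A minimal sketch of that Age idea, assuming the 0–130 bounds from the text:

```csharp
using System;

public class Age
{
    public int Value { get; }

    public Age(int value)
    {
        // Validation lives in one place, so every method that takes an Age
        // can rely on the value already being in range.
        if (value < 0 || value > 130)
            throw new ArgumentOutOfRangeException(nameof(value), "Age must be between 0 and 130.");
        Value = value;
    }
}
```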
3. Using default values to reduce required parameters
In many cases there are reasonable defaults that can be provided for method parameters in an API. Whenever possible, it makes sense to use default values and optional parameters to further reduce complexity and only require the caller of an API to worry about what is important to them.
If your API is mostly used in one way or there is a reasonable default, provide the default and make the parameters optional. Callers who need extra functionality can always override the defaults.
In our example, we can get rid of the need to specify a customerType and the numberOfTries by providing a reasonable default for both.
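In C# that can be done with optional parameters. (The specific defaults chosen here are assumptions; the supporting types are the ones sketched earlier in the post.)

```csharp
public enum CustomerType { Standard, Premium, Trial }

public class LoginInfo
{
    public string UserName { get; }
    public string Password { get; }
    public LoginInfo(string userName, string password) { UserName = userName; Password = password; }
}

public class AuthService
{
    // Callers only specify what they care about; everything else has a default.
    public bool Login(LoginInfo loginInfo,
                      CustomerType customerType = CustomerType.Standard,
                      int numberOfTries = 3)
    {
        // ...authentication logic elided...
        return !string.IsNullOrEmpty(loginInfo.UserName);
    }
}
```

The common case is now just `svc.Login(loginInfo)`, while a caller who needs more control can write `svc.Login(loginInfo, CustomerType.Premium, numberOfTries: 5)`.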
Now, calling our method is dead simple for most cases. We can just provide the loginInfo and if we need more control, we can override the customerType or the numberOfTries.
It’s not rocket science
What we did here was pretty simple, but don’t let the simplicity fool you. We have made small–and arguably obvious–changes, but we’ve greatly reduced the mental burden for the caller of this method.
We would see a much larger cumulative benefit from applying these techniques to an entire API or code base. When you start restricting choices and grouping related data into objects instead of raw parameters, you end up with a synergistic effect in your code, because you are able to know more about the data that is being passed around.
For example, changing our customerType to an enumeration made it easier to call the method, but it also makes our logic inside the method simpler, because we don’t have to check for invalid values. Anytime we can rely on data being within certain bounds, we can simplify the logic for handling that data.
Test automation framework architecture efforts are often complete failures.
It’s true. I’ve worked with many companies who have given up on creating a good test automation framework architecture, because after investing a large amount of time, money, and resources in doing it the wrong way, they have incorrectly concluded that the entire effort is not cost effective.
In this post, I’m going to simplify the process of creating a test automation framework architecture and show you that—if you do it right—it can be a very good investment.
Test automation framework architecture basics
Let’s start off by talking about what the goals of a successful test automation framework architecture should be.
There are two goals I am interested in when creating a test automation framework:
- Having the ability to easily create simple automated tests that use the framework.
- Decoupling cosmetic application changes from the tests, so that when the application changes, the tests do not all have to be updated.
A good test automation framework architecture should at the very least provide these two important services. We create a test automation framework to allow us to support the creation of tests and to keep those tests from being dependent on the actual UI of our application—as much as possible.
I’m going to break down these goals just a little more to make sure we are all on the same page.
Creating simple automated tests
First, we’ll start with having the ability to easily create simple automated tests that use the framework. Why do we care about this goal? And why would this be the responsibility of the framework?
I have found that one of the biggest failure points of test automation efforts is test complexity. The more complex the tests are, the harder they are to maintain. If tests are difficult to maintain, guess what? They won’t be maintained.
So, when we create a test automation framework architecture, we really want to focus on making sure the framework makes it as easy as possible for someone to create tests using it.
It is always a better choice to put the complexity into the framework instead of into the tests. The framework is usually maintained by developers and does not contain repeated code. But tests are often created by QA or even business folks, and the code found in tests is usually repeated across many tests, so it is vital to reduce the complexity of the tests, even if it means a large increase in the complexity of the framework itself.
When I create a test automation framework architecture, my goal is to make it so the tests can read as close as possible to plain English. This requires some effort to pull off, but it is well worth it.
Decoupling the UI from the tests
Next, let’s talk about what I mean by decoupling cosmetic application changes from the tests.
This is also a vital goal of any test automation framework architecture, because if you fail to decouple the UI of an application from the tests, every single UI change in an application will cause possibly hundreds or thousands of tests to have to be changed as well.
What we really want to do is to make it so that our test automation framework architecture creates an abstraction layer that insulates the tests from having to know about the actual UI of the application.
At first this seems a little strange, especially when I tell automation engineers or developers creating an automated testing framework not to put any Selenium code into their actual tests.
All of the Selenium examples show you creating tests that directly use a web browser driver like Selenium, but you don’t want to write your tests that way—trust me on this one.
(By the way, I haven’t found a good book on creating an actual test automation framework architecture—I’ll probably write one—but, for now if you are looking for a good book on Selenium that starts to go into the topics I discuss here, check out Selenium Testing Tools Cookbook.)
Instead, what you want to do is to make it so the test automation framework is the only code that directly interacts with the UI of the application. The tests use the framework for everything they want to do.
So, for example:
Suppose you are creating a simple test to check to see if a user can log into your application.
You could write a test that looks something like this: (C# and Selenium in this example)
```csharp
var loginInput = driver.FindElement(By.Id("txtUsername"));
loginInput.SendKeys("Joe");

var passwordInput = driver.FindElement(By.Id("txtPassword"));
passwordInput.SendKeys("$ecret");

var loginButton = driver.FindElement(By.Id("btnLogin"));
loginButton.Click();
```
But, what is going to happen when you change the ID of your “User Name” field? Every single test that uses that field will break.
On the other hand, if you properly create a test automation framework architecture that abstracts the UI away from the tests themselves, you would end up with a much simpler and less fragile test, like this:
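Here is a sketch of what that might look like. (The LoginPage class and its names are my own illustration, not from the original post; it assumes the standard Selenium WebDriver API for C#.)

```csharp
using OpenQA.Selenium;

// Framework layer: the only code that knows about element IDs.
public class LoginPage
{
    private readonly IWebDriver driver;

    public LoginPage(IWebDriver driver)
    {
        this.driver = driver;
    }

    public void LoginAs(string userName, string password)
    {
        driver.FindElement(By.Id("txtUsername")).SendKeys(userName);
        driver.FindElement(By.Id("txtPassword")).SendKeys(password);
        driver.FindElement(By.Id("btnLogin")).Click();
    }
}

// The test itself no longer knows anything about the UI:
//   new LoginPage(driver).LoginAs("Joe", "$ecret");
```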
Now, if the ID of your “User Name” field changes, you’ll just change the code in your automated testing framework in one place, instead of changing 100 tests that all depended on the specific implementation of the UI.
A simple test automation framework architecture
Once you understand those two very important goals and why they are important, it is easier to think about how you should design a test automation framework architecture.
I’ve created quite a few test automation framework architectures, so I’ll give you a basic design I use for most of them.
Take a look at this diagram:
Here you can see that there are four layers to my test automation framework architecture.
First, we have the browser layer or the web app itself. This just represents your actual application.
Next, we have the Selenium or web driver layer. This layer represents your browser automation tool. Selenium is basically just a framework for automating a browser. It doesn’t know anything about testing; it is an API that lets you interact with the browser programmatically. (By the way, you don’t have to use Selenium. I just use it as an example here, because it is the most popular browser automation framework.)
After that, we have the framework layer. This is the actual framework you create which uses Selenium, or whatever web driver you want, to actually automate your application. The framework is acting as an abstraction between your application and the tests. The framework knows about the UI of your application and how to interact with it using Selenium. It is your job to create this layer.
Finally, we have the actual tests. The tests use the framework to manipulate your application and check the state of the application. These tests should be simple to understand and should not have to know anything about the actual implementation of the UI of your application. These tests should rely on the framework to give them the ability to do whatever they need to do in your application.
Obviously, creating a test automation framework architecture is more complicated than what I have shown you in this article, but with these basics you should be off to a good start.
If you want a more in-depth and detailed explanation as well as an example of how exactly to create a full test automation framework architecture, you might want to check out my Pluralsight video on the topic:
I also frequently blog about test automation topics and other topics related to becoming a better and more profitable software developer.
If you have any questions, post them in the comments below.
My daughter is learning how to read right now. As I was thinking about this blog post, I just walked past my wife and her working on some very basic reading skills. It is quite a bit of work to teach her everything she needs to know to read and write the English language.
In fact, it will be years of hard work before she’ll actually be able to read and write with any measure of competence—at least by our adult standards. We tend to take language for granted, but spoken and written languages are difficult—exceptionally difficult.
Even as an adult, writing this post is difficult. The words don’t flow perfectly from my mind. I strain to phrase things in the proper way and to use the proper punctuation.
But, even though it is difficult to learn a written language, we make sure our kids do it, because of the high value it gives them in life. Without the skills to read and write a language, most children’s future would be rather bleak.
The more I thought about this idea, the more I realized how simple programming languages are compared to the complexity of a written or spoken language.
The argument for more complexity
The irony of me arguing for more complexity and not less doesn’t escape me, but even though I strive to make the complex simple, sometimes we do actually need to make things more complicated to achieve the best results possible.
I’ve thought about this for a long time and I believe this is the case with programming languages. Let me explain.
Before I get into programming languages specifically, let’s start off by talking about human languages.
I speak and write English. English is considered to be the language with the largest total vocabulary and also one of the most difficult languages to learn, because of the flexibility in the ways in which you can compose sentences with it.
It is very difficult to learn English. I am fortunate that I am a native English speaker and grew up learning English, but for many non-native English speakers, the language continues to be a challenge—even years after they are “fluent” in the language.
There is a huge benefit though, to being fluent in the English language—expressiveness. I don’t profess to be an expert in foreign languages—I only know a little bit of Spanish, Brazilian Portuguese and Japanese, myself—but, I do know that English is one of the most expressive languages in existence today. If you want to say something in English, there is most likely a word for it. If you want to convey a tone or feeling with the language—even a pace of dialog, like I just did now—you can do it in English.
As I said, I can’t speak for other languages. But, having lived in Hawaii, I can tell you that Hawaiian is a very small language and it is difficult to express yourself in that language. Sign language is another example of a very small language which is fairly easy to learn, but is limited in what it can convey and the way it can convey it.
I say all this to illustrate a simple point. The larger the vocabulary of a language and the more grammatical rules, the more difficult it is to learn the language, but the greater power of expressiveness you have with that language.
Breaking things down even smaller
I promise I’ll get to programming languages in a little bit, but before I do, I want to talk about one more human language concept—alphabets or symbols.
The English alphabet has 26 letters in it. These 26 letters represent most of the sounds we use to make up words. 26 letters is not a small number of characters, but it is not a large amount either. It is a pretty easy task for most children to learn all the letters of the alphabet and the sounds they make.
The text you are reading right now is made up of these letters, but have you ever considered what would happen if we had more letters in the alphabet? For example, suppose instead of 26 letters, there were 500 letters. Suppose that we made actual symbols for “th”, “sh”, “oo”, and so forth. Suppose we made the word “the” into a symbol of its own.
If we added more letters to the alphabet, it would take you much longer to learn the alphabet, but once you learned it you could read and write much more efficiently. (Although, I’d hate to see what the 500 letter keyboard would look like.)
My point is that we are trading some potential in the expressiveness we can pack into a limited number of symbols for some ease in learning a useful set of symbols.
As you were reading this, you might have thought that this is exactly what languages like Chinese and Japanese do—they use a large number of symbols instead of a small alphabet. I don’t know enough about these languages to know the answer for sure, but I’d bet that it is much easier to read a Chinese or Japanese newspaper than it is to read an English one—or at least faster.
We could take the same exercise and apply it to the number system. Instead of using base 10, or having 10 symbols in our number system, we could have 100 or even 1000. It would take a long time to learn all our numbers, but we’d be able to perform mathematical operations much more efficiently. (A smaller scale example of this would be memorizing your times tables up to 99 x 99. Imagine what you could do with that power.)
What does all this have to do with programming languages?
You really are impatient aren’t you? But, I suppose you are right. I should be getting to my real point by now.
So, the reason I brought up those two examples before talking about programming languages is that I wanted you to see that the vocabulary and grammar of a language greatly influence its expressiveness, and that the basic constructs of a written language greatly influence its density: its ability to express things concisely.
Obviously, we can’t directly map human written languages to programming languages, but we can draw some pretty powerful parallels when thinking about language design.
I’ve often pondered the question of whether it is better to have a programming language that has many keywords or few keywords. But, I realized today that was an oversimplification of the issue.
Keywords alone don’t determine the expressiveness of a language. I’d argue that the expressiveness of a language is determined by:
- Number of keywords
- Complexity of statements and constructs in the language
- Size of the standard library
All of these things combined work together to make a language more expressive, but also more complicated. If we crank up the dial on any one of these factors, we’ll be able to do more with the language with less code, but we’ll also increase the difficulty of learning the language and reading code written in that language.
Notice, I didn’t say in writing the language. That is because—assuming you’ve mastered the language—the language actually becomes easier to write when it has more constructs. If you’ve ever run across someone who is a master of Perl, you know this to be true. I’ve seen some Perl masters that could write Perl faster than I thought possible, yet when they came back to their own code months later, even they couldn’t understand it.
Looking at some real examples
To make what I am saying a little more concrete, let’s look at a few examples. I’ll start with C#, since it is a language I am very familiar with. C# is a very expressive language. It didn’t start out that way, but with all the keywords that have been added to the language and the massive size of the base class libraries, C# has become very, very large.
C# is an evolving language. But, right now it has about 79 keywords. (Feel free to correct me if I am wrong here.) As far as languages go, this is pretty large. In addition to just keywords, C# has some complex statements. Lambda expressions and LINQ expressions immediately come to mind. For someone learning C#, the task can be rather difficult, but the reward is that they can be pretty productive and write some fairly concise code. (At least compared to a more verbose language like C or C++.) Java is pretty close in most of those regards as well.
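To make that expressiveness trade-off concrete, here is the same tiny query written both ways. (The data is made up; the point is how much the LINQ vocabulary buys you once you’ve learned it.)

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var ages = new List<int> { 12, 25, 31, 17, 48 };

// The verbose version: an explicit loop plus a sort.
var adults = new List<int>();
foreach (var age in ages)
{
    if (age >= 18)
        adults.Add(age);
}
adults.Sort();

// The LINQ version: one expression says the same thing.
var adultsLinq = ages.Where(a => a >= 18).OrderBy(a => a).ToList();
```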
But, take a language like Go. Go is a language with only 25 keywords. It makes up for this by having some fairly complex language constructs and having a pretty robust standard library. When I first learned Go, it took me perhaps a week to feel like I had a pretty good grasp of the language. But, it took much longer to learn how to use Go properly. (And I still have plenty to learn.)
At the far end of the spectrum, we have languages like BASIC. Different BASIC implementations have different keyword counts, but most of them are pretty low and the constructs of the language are very simple. BASIC is a very easy language to learn. But, because it is so easy to learn BASIC and BASIC is so simple, the average programmer quickly outgrows the capabilities of the language. BASIC isn’t very expressive and it takes many more lines of code to write the same thing you could write in a few lines of C# or Go.
For a much more comprehensive overview of differences between programming languages, I’d recommend Programming Language Pragmatics. It goes into detail about many different languages and the differences between them.
What more complex programming languages buy us
It feels really weird to be arguing for something to be more complex, since the mission of this blog is to make the complex simple, but in the case of programming languages, I think the tradeoff of increased complexity is worth the cost of extra learning time.
Consider how much more complicated the English language is than any programming language. To be able to read the very words you are reading now, you have to understand a vocabulary of several thousand words, recognize most of those words on sight, and understand a very complicated set of mechanics which govern the grammar of the language. There aren’t even concrete rules, much of what is “right” or “wrong” is based on context.
Yet, even with all this complexity, you are able to do it—our brains are amazing.
Now, imagine what would happen if we decided that English was too difficult of a language and that we needed to dumb it down. What if we dropped the vocabulary down to, say, 200 words and got rid of the complex rules? What you would have is basically a Dr. Seuss book or some other early-reader type of children’s book. It would be very difficult for me to convey the kinds of thoughts I am conveying to you right now with those restrictions.
When you compare even the most complex programming language to the English language, it is no contest. The English language is far more complex than any programming language we have ever conceived of. I don’t know of a programming language that the average person couldn’t learn reasonably well in a year’s worth of time. But, if you were to try and teach someone written English in a year—well, good luck to you.
If we created much more complex programming languages, we would have a much larger learning curve. But, in exchange, we’d have a language—that once mastered—would allow us to express algorithmic intent at a level we can’t even imagine now.
Not only would we be able to express our intent more clearly and more concisely, but we’d also greatly reduce the total lines of code and the potential for bugs in our software. Less code equals fewer bugs.
Now, I’m just playing around mentally here. I “half” believe what I am saying, because I am just exploring ideas and possibilities. But, even in this mental exercise of thinking about what would happen if we created programming languages as complex as written languages, I can’t ignore the drawbacks.
Obviously, the biggest drawback would be the learning curve required to learn how to program. Learning how to program—at least how to do it well—is pretty difficult now. I still think people make it more complicated than it needs to be, but software development is a much more difficult vocation to pick up than many other career choices.
If we created more complex programming languages, we’d have to count on many more years of learning before someone could even really write code or understand the code that is already written. It might take 4 or 5 years just to understand and memorize enough of the language to be able to use it effectively.
We could of course combat this to some degree by starting beginners on easier languages and advancing them up the chain to more complex ones. (In fact, writing this article has convinced me that would be the best way to learn today. We shouldn’t be starting developers with C# or Java, but instead should teach them very simple languages.)
We would probably also be forced down a smaller path of innovation, as far as programming languages go. The world can support hundreds of simple programming languages, but it can’t easily support that many complex ones. We might end up with one universal language that all programmers used. A language of this size would be very unwieldy and hard to advance or change. It would also take a massive effort to create it in the first place, since written languages developed naturally over hundreds of years.
That’s enough fun for now
After writing this article my brain is hurting. I’ve been considering writing this post for a while, but I wasn’t sure exactly where I stand on the issue. To be completely honest with you, I still don’t. I do think that more complex programming languages would offer us certain benefits that current programming languages do not, but I’m not sure if the drawbacks would be worth it in the end or even what a significantly more complex programming language would look like.
What about you, what do you think? Am I just crazy? Is there something significant I missed here?
Let’s just get right into it, shall we?
What is an abstraction?
Before we can talk about leaky abstractions, and why they are bad, let’s define what exactly an abstraction is.
An abstraction is when we take a complicated thing and we make it simpler by generalizing it based on the way we are using or talking about that thing.
We do this all the time. Our brains work by creating abstractions so that we can deal with vast amounts of information.
We don’t say “my Dell Inspiron 2676 with 2 GB of RAM and a 512 GB hard drive, serial number 12398ASD.” Instead we say simply “my computer.”
In programming, abstractions let us think at higher levels. We create abstractions when we take a bunch of lines of code that get the username and password out of a textbox and log the user into our system, and give those lines a name: login. And we use abstractions when we utilize APIs or write to the “file system.”
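That naming move can be as small as extracting a method. A toy sketch (all of the names here are invented for illustration):

```csharp
public class LoginForm
{
    // Stand-ins for the textbox contents in the example above (assumed values).
    public string UserNameText = "Joe";
    public string PasswordText = "$ecret";
    public bool LoggedIn { get; private set; }

    // The abstraction: several lower-level steps, given one name.
    public void Login()
    {
        var userName = UserNameText;   // "get the username out of the textbox"
        var password = PasswordText;
        LoggedIn = userName.Length > 0 && password.Length > 0;  // "log the user in"
    }
}
```

Callers now just say `form.Login()` and never think about textboxes again.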
Without abstractions we’d have a really hard time managing the complexity of any software system. Without abstractions, it would be pretty difficult to write any kind of non-trivial application.
If you want to feel what it is like to program with fewer abstractions, try serving up a web page with assembly code. If you want to feel what it is like to program with almost no abstractions, do the same exercise, but write your code in machine code… 0s and 1s.
How do abstractions “leak?”
An abstraction “leaks” when the details of the lower-level concept that is being abstracted away seep up through the higher-level concept.
Ever get a cryptic error message when an application you are using crashes? That is a leaky abstraction.
“Memory location 00FFE133 could not be written to” is just bad news. It means that in order to understand what is going wrong with your application, you now have to understand the inner workings of the application. Your abstraction has effectively “leaked.”
Abstractions leak whenever you are forced to pull back the curtains and see the gnomes running in hamster wheels spinning the cogs.
Why “leaking” is bad
I’m often surprised when I hear someone say “I like how you didn’t hide the details of what is really going on” about some code or some technology.
Not being able to hide the details of what is really going on points to an inability to create the proper abstraction, which, in my opinion, should be watertight.
It is not a good thing that we can see what is really going on. No, it is actually a very bad thing, because it forces us to think at the higher level some of the time, but to switch gears and drop down to the lower level at other times.
Leaky abstractions prevent us from being able to comfortably drive the bus; we have to constantly keep looking in our rear-view mirror to check whether those rowdy kids who insist on sitting at the back of the bus are causing trouble again.
A good abstraction hides the details so well that we don’t ever have to think about them. A good abstraction is like a sturdy step on a ladder, it allows us to build another step above us and climb up it. A leaky abstraction is like a wobbly rotted wood rung; put too much weight on it and you’ll fall through—it traps you at the current step—you can’t go any higher.
Shouldn’t we understand how things really work?
No! Stop already! Enough with the guilt trips and unnecessary burdens. You are defeating the purpose of abstractions and stunting further growth.
You only “need” to know the details if you want or need to know the details.
I know that many people want to know how things really work, myself included for many things, but in many cases I don’t need to understand how things really work.
For example, I can buy and use a watch without understanding how the watch works. It really is better that way. I don’t need those details floating around in my head. I get no benefit from knowing what is going on inside my watch and frankly, I don’t want to know.
When I buy a car, I just want to drive it. I don’t want to remember when to rotate tires, check fluids, etc. Waste of my time. I don’t want to know how the engine works or anything about it besides I have to put gas in, turn a key, and press a few pedals to make the thing go. I know I could save money servicing my own car, but you know what? I’d rather pay money than have to do it myself. And since that is the case, as long as I can find someone I trust to work on my car, I don’t want that abstraction to leak.
The same goes for machine code and compilers. Yes, it is interesting to understand how memory mapping and CPUs and logic gates work, and how my C# code gets compiled into MSIL, which the CLR then just-in-time compiles, and so forth and so on, until it ends up as electrons operating on registers in my CPU. And I learned how much of that works because I was interested, but let’s be completely honest here. That abstraction is really good. It is so good and so air-tight that I don’t need to know all of that crap in order to write a web app. If I did, I’d probably find another profession, because my brain would explode every time I tried to write the simplest piece of code.
By the way, if you do want to understand what is happening behind many of those abstractions, I highly recommend Charles Petzold’s book: “Code: The Hidden Language of Computer Hardware and Software”. (This is the kind of book you read for “fun,” not because you need to know this information.)
Leaking abstractions in technology
Too many technologies and frameworks today are leaky, and this is bad. Really bad.
Want an example? Whatever happened to ORM (object-relational mapping)?
Nothing. Nothing happened. It was invented, oh, like thousands of years ago and while lots of other technologies and frameworks grew up and got jobs and left the nest, ORM is still a fat unemployed balding 30 year old man playing WOW all day in his mum and dad’s basement.
Don’t get me wrong. I use ORMs; I sometimes like to think I like ORMs. But I can’t use them well, because I have to understand SQL and what is going on in my relational databases anyway. What good is that?!
Seriously. This is a completely failed abstraction. The leakiness of it actually makes it more work to learn an ORM framework, because in learning one you must also learn the hairy details of SQL and understand what N+1 means and so on. You can’t just fire up your ORM and put in the Konami code; you have to blow on the cartridge and fiddle with the reset button half a dozen times. Useless.
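To make the N+1 leak concrete, here is a toy sketch in Java (no real ORM library is used; all names are invented) showing how one innocent-looking loop turns into one query for the list plus one more query per row:

```java
import java.util.*;

// A toy "ORM" that lazily loads each customer's orders with a separate
// query, to show why the N+1 problem forces you to learn SQL anyway.
class Database {
    int queriesExecuted = 0;

    List<Integer> findCustomerIds() {
        queriesExecuted++;                   // 1 query for the customer list
        return List.of(1, 2, 3);
    }

    List<String> findOrdersFor(int customerId) {
        queriesExecuted++;                   // +1 query per customer
        return List.of("order");
    }
}

class Report {
    static int countOrdersNaively(Database db) {
        int total = 0;
        // Looks like plain object navigation, but behind the abstraction
        // it issues 1 + N queries: the classic N+1 problem.
        for (int id : db.findCustomerIds()) {
            total += db.findOrdersFor(id).size();
        }
        return total;
    }
}
```

With 3 customers, the loop quietly executes 4 queries instead of the 1 or 2 a hand-written SQL join would need; the abstraction hid the cost without removing it.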
I could go on and on with examples of leaky abstractions in technology and how they are holding us back, but I think you get the point by now.
Leaky abstractions are bad. They don’t do anything good for us besides require us to know more and add more points where things can go wrong. We’ve got to seal up these abstractions, if we want to be able to keep climbing the software development ladder. You can’t keep building a tower if you have to worry about every single block beneath you. You eventually get to a point where you can’t keep it all straight and your tower comes crashing to the floor.
Leaky abstractions and you
So, what can you do about it?
You really have to make a choice. Am I going to hide the details so that no one needs to know about them, or am I going to expose the whole thing? Either make a really good abstraction that doesn’t require someone using your code to understand what is going on underneath that abstraction or don’t attempt to make an abstraction at all.
It takes more effort to create good abstractions. Sometimes the effort required is orders of magnitude greater. But if other people are going to use your abstraction, saving them the mental burden of understanding what is happening beneath it will go a long way toward making up for your extra effort.
10 Steps to learn anything quickly course
By the way, I just released my 100% free 10 Steps to Learn Anything Quickly course. If you want free access to it, just sign up for my newsletter here. It is a totally free video course and optional workbook that reveals the process I use to quickly learn technologies.
I have another new course on Pluralsight; check it out:
I am very excited to finally get this course out. Many viewers have been asking me to make a comprehensive course that actually shows you how to build a real automation framework, and I finally did it in this course.
I reveal all my tricks and tips here and tell you exactly what I have done in the past to build successful automation frameworks. It was a lot of fun to create this course and it brought me back to the glory days of creating automated tests. (This is some serious fun!)
Anyway, check out the course and let me know what you think.
Here is the official course description:
Learning how to use a tool like Selenium to create automated tests is not enough to be successful with an automation effort. You also need to know how to build an automation framework that can support creating tests that are not so fragile that they constantly break. This is the real key to success in any automation effort.
In this course, I will reveal every secret I know from creating several successful automation frameworks and consulting on the creation of others. I will show you exactly, step-by-step how to create your own automation framework and I will explain to you the reasoning behind everything we are doing, so you can apply what you learn to your own framework.
We’ll start off this course by going over the basics of automation and talking about why it is so important as well as discuss some of the common reasons for success and failure.
Then, I’ll take you into the architecture of an automation framework and show you why you need to pay careful attention to the structure of any framework you build and give you some of the underlying design principles I use when creating an automation framework.
After that we’ll be ready to start creating a framework. In the next few modules, I’ll show you how to create a real automation framework capable of automating the WordPress blogging platform administrative console. We’ll start off by creating smoke tests and using those smoke tests to build out our initial framework.
Then, we’ll expand the capabilities of our framework as we create more tests and learn how to use techniques like dummy data generators to make our tests as simple and easy to read as possible.
Finally, I’ll take you through some best practices and tips that cover topics like scaling out, working in Agile environments and other important issues you are likely to face. If you are responsible for an automation project for a web application or you want to start using automation, you’ll definitely want to check this course out.
(Update: My view is starting to change from when I first wrote this article. Thanks to numerous people who have pointed out many things I overlooked and have helped me understand what responsive design really is and what goals it is trying to achieve, I am getting a better understanding of why it may indeed not be a waste of time. There are still some valid points that I brought up in this article, but I don’t stand by my original blanket statement that it is a waste of time. It especially seems like responsive design may be worth the “trouble” for ecommerce sites, where you are much more likely to make the sale if someone can easily browse and purchase your products from a mobile device. Thanks to everyone who has been giving me great advice and resources for better understanding responsive design, and for mostly doing it in a “nice” way.)
Responsive design seems like a good idea.
At first look, it makes sense. Why wouldn’t you want your same website to scale down to a mobile device and display differently?
For example, there are many WordPress themes now that offer responsive design baked right in. If you are using one of those themes, you might as well take advantage of the hard work someone else has put into making the design responsive.
But, if you are creating a new website or you are considering converting your existing website to a responsive design, I’d suggest you think twice before making the plunge.
The extra work of responsive design
The biggest problem with responsive design is the extra work involved. Sure, there are frameworks and tools to make responsive design easier, but no matter how good those frameworks and tools get, making a site adapt from the desktop to a mobile device is still going to require significant work.
If you have a really simple site with just a few pages, or pages that pretty much all use the same layout but just differ in content (like a blog, for example), then the work might not be that much, so go ahead, make it responsive.
But, if you have a website of any considerable size, you’ll probably find that responsive design is going to add a non-trivial amount of extra work to your project.
It actually turns out that adding responsive design to a website can be more work than creating a separate desktop site and mobile site, because you are creating a user interface that morphs or adapts to the mobile form factor–which, by the way, is ever-changing.
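As a rough illustration of where that extra work comes from: every breakpoint in a responsive stylesheet is another layout to design, build, and test. The selectors and widths below are invented for illustration, not a standard:

```css
/* Desktop layout. */
.sidebar { float: left; width: 30%; }

@media (max-width: 768px) {
  /* Tablet: the sidebar collapses under the main content,
     a second layout to verify on every change. */
  .sidebar { float: none; width: 100%; }
}

@media (max-width: 480px) {
  /* Phone: secondary content disappears entirely,
     a third layout to verify on every change. */
  .sidebar { display: none; }
}
```

Every future tweak to the page now has to be checked against all three layouts, which is the hidden maintenance cost the frameworks don’t remove.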
The problem with one-size-fits-all
Consider this. Which is more difficult to create: a car that turns into a motorcycle, or a car and a motorcycle?
Here is another one. Which is more difficult to create: a multi-function printer or a printer and a scanner?
When you try to make a product that can do multiple things, the complexity of creating that product increases exponentially. It is almost always easier to create two separate products that do two different functions or have different form factors, than it is to create an all-in-one product that is able to do many things.
Besides the increased difficulty, trying to create products that do more than one thing usually results in products that do neither function particularly well.
I’m not saying you can’t design a great responsive website that serves desktop clients and mobile clients wonderfully and makes their eyes light up with joy every time they see your beautiful site. I’m just saying you won’t.
I’d also rather not be handcuffed to my original design, because changing anything will also incur the cost of changing the responsive layout that scales it down to mobile.
The reuse fallacy
What are you really trying to do by creating a responsive design, instead of just creating a desktop version and a mobile version of the site? Reuse some HTML and CSS?
When you think about it, it doesn’t make much sense. I’m all for code reuse, but responsive design is like putting ifdefs all over your code base. Are you really saving much time and effort by reusing your front end code for both platforms?
It seems to me that if you have to have your site display differently on a mobile device, you are better off forgetting about trying to reuse the HTML markup and CSS, and instead focusing on reusing the backend code for both the mobile and desktop versions of your site; that is where you’ll actually get the biggest bang for your buck.
Again, I don’t know about you, but I’d much rather maintain two front end codebases that are simple than one monstrous complicated front end codebase.
Isn’t it good enough already?
Most desktop websites actually look fine on a mobile device without any responsive design. To me it makes much more sense, in most cases, to focus on just creating a desktop design that looks good on mobile, or even a mobile design that looks good on the desktop as well.
Sure, responsive design is pretty cool, but do you really need it?
It is the browser’s responsibility anyway
Aside from the practical reasons I’ve talked about so far, there is another major reason why responsive design might just be a waste of time anyway: this should all be the browser’s responsibility, not the web developer’s.
This is already starting to take place. Consider the special input types in HTML5, like the number input type, which brings up the numeric keyboard in mobile browsers.
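For example, a single attribute is enough to trigger that behavior; the field name below is made up for illustration:

```html
<!-- The number input type hints mobile browsers to show a numeric
     keyboard; desktop browsers render a normal text box with spinners.
     No developer-written responsive code is involved. -->
<label for="qty">Quantity</label>
<input type="number" id="qty" name="qty" min="1" max="10">
```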
I predict as more and more websites are viewed from mobile devices, the types of things that web designers are doing to make desktop sites responsive right now will be incorporated directly into the HTML markup and the browser itself. So, creating a responsive design manually might end up being a big waste of time.
I’ll leave you with one final point, then I’ll get off my responsive design soapbox.
We are in a time of constant flux right now
Things are changing rapidly. Mobile devices are getting bigger screens and higher resolutions and becoming much more powerful. While two years ago it was a pain to visit a desktop site from a mobile browser, it is much less painful now. With my new Galaxy S4, I hardly feel the difference at all and I even find myself specifically requesting the desktop version of a mobile site from time to time.
The trend says that mobile browsing will eventually overtake desktop browsing as the main source of visitors to websites. If this becomes reality—which I see no reason to doubt—responsive design won’t make much sense at all. In that case, it will make much more sense to build websites optimized for mobile and not worry so much about desktop.
If you are going to do responsive design, at least get a good book on the subject. There are many subpar books floating around, because responsive design is so hot right now.
What do you think? Am I off my rocker? If you have a convincing argument as to why responsive design is worth the effort or why it isn’t as much extra effort as I have painted it to be, I’d love to hear it. Let me know in the comments below.
The goal of software development is to solve problems.
At its heart, software development is really about solving problems through automation.
Many times we have the tendency to make software development about creating solutions. It may seem that these are the same thing, but there is a subtle difference.
The difference is the focus
When we are trying to solve problems, we are obsessed with the question of why. It is only by understanding why that we can know what to build.
When we are trying to build a solution, we are obsessed with the question of what. We want to know what to build so that we can build it.
This is fairly natural for software developers since the what is something we can control. Any fairly skilled software developer can build any what you can describe to them. Just like any skilled carpenter can build any type of wooden furniture you desire.
The difference is that software development is about more than just building things. A carpenter doesn’t have to focus much on the why to build you a piece of furniture. But a truly skilled carpenter will ask you why. A skilled carpenter might even make a house call to custom build an entertainment center suited to your needs.
A truly skilled craftsman of any craft needs to know why.
It is odd that we constantly seem to neglect the why when building software (at least I do), primarily because we think we don’t have time for the why, when in the end solving the why is the only thing that really matters.
Do you really want a carpenter to build you a perfectly crafted entertainment center with plenty of custom shelves for all your music CDs when you don’t own any music CDs?
Why focusing on what doesn’t work
Consider for a moment what would happen if your GPS system stopped showing your route and the directions you were going to take ahead of time, but instead abruptly told you to turn right when you were supposed to turn.
One of the early GPS systems I used for navigation did exactly this. It was very frustrating! I would constantly shout at it that telling me what to do exactly when I am supposed to do it wasn’t any help. I needed some forewarning so I could switch lanes and prepare to turn.
Also, have you ever asked your GPS the question “why are you taking me this way?”
Focusing only on the what results in split-second decisions and avoidable mistakes. When you are focusing on the what you are not thinking, just doing.
Contrast this with a good GPS navigation system.
What makes a good navigation system good?
I’ve found that good systems will alert me far in advance to what my next move is. This gives me time to switch lanes or to have a better understanding of the bigger picture of the trip.
Now this analogy won’t carry us very far in the software development world. You don’t really need to know why your GPS system is taking you a particular route in order to get there, but it does demonstrate how focusing only on the immediate what can lead you astray.
Perhaps a more direct analogy related to software development is outsourcing.
If you’ve ever worked on a software development project that has had a portion of its work outsourced, you may have felt the pain of what happens when perfectly competent programmers focus completely on the what without even the slightest clue about the why.
I’m not trying to knock outsourcing, and I’m not even specifically talking about outsourcing to different countries. (The same problems exist in outsourcing whatever the conditions are.)
Oftentimes outsourced projects are treated like a set of blueprints that need to be built. The poor developers working on the outsourced project tend to get thrown under the bus when they build something based off just the what and can’t anticipate the why.
Why focusing on why is better
Focusing on the why is focusing on providing holistic cures to software development problems rather than treating each symptom individually.
Just like you’d like to have a doctor set a broken leg and put it in a cast so that it can be wholly cured, rather than give you some pain medicine to manage the pain, give you crutches to help you walk, and send you to therapy where you can learn to live without the use of your leg, your customers would probably rather you built something that actually is focused on solving their problem, not something that makes their symptoms easier to live with.
If we start building software from the what that is given to us, versus the why, we are at a distinct disadvantage because we only have the ability to treat the symptoms of a problem that we don’t even understand.
Focusing on the why is vitally important because it helps us to better think about and design the what.
Reuse is also much more likely to be present on why focused solutions than what focused ones.
Which software component is more likely to be adopted for reuse: a component that addresses a specific set of account balancing steps or a component that addresses the general problem your customer has of doing accounting?
Why focused solutions are much more likely to be shared across many customers, while what focused solutions are often very specific to a single customer.
Let’s not forget one of the most important reasons why it is important to focus on the why.
Missing the why is costly!
You can build the what someone is asking for, and you can do it perfectly, but at the end of the day, if what you built doesn’t solve the problem (the why), it’s going to have to be redone or thrown out completely.
You may think you are saving time by starting with the what, and you may actually build your solution faster by not taking time for everyone on the project to understand the why, but chances are you’ll build the wrong thing, and when you do, you’ll lose any time benefit you may have accrued.
Focusing on the what is also hard on a team. It tends toward segregating the team’s responsibilities: the team members end up owning only their own parts of the solution, rather than solving the problem as a whole. So, when a team that focuses on the what ends up with a problem, finger pointing and shirking of responsibility inevitably ensue.
By instead focusing the team on the why, every member of that team becomes responsible for solving the problem rather than solving their part of the problem.
The most compelling reason to focus on the why?
For many problems, just having a thorough understanding of the problem takes you 90% of the way to the answer. I have spent countless hours trying to design and architect a clever solution to a problem without having a really good grasp of the problem that I’m trying to solve, only to find that 20 minutes spent actually fully understanding the why of the problem resulted in the answer becoming completely apparent.
How to know if you’re focusing on what
I’ve got a pretty good idea of how to know when I am focusing on what instead of why. Not because of any special wisdom on my part. No, it is because I fall down in this area quite a bit myself.
I’ve learned the hard way, and I am still learning how to identify when my focus needs to shift from what to why.
Here are some simple questions that you may find helpful for evaluating your own situation. I’d be glad to hear any others that you can think of or use routinely yourself.
- Do your backlogs, stories or general work items state a solution instead of a problem?
- Do you know how to actually use the thing you are building? (Whoops, I am so guilty of this one so many times.)
- Can you define in simple terms the problem you are trying to solve? (Einstein said “If you can’t explain it simply, you don’t understand it well enough.”)
- Can you clearly define why the what you are building is needed and what purpose it will serve?
- Can you walk through the manual steps of the process you are automating, understanding each step and why it is necessary?
Taking the time to focus on the why instead of jumping into the what can be difficult to get accustomed to, but it is worth the effort. You’ll find much more satisfaction in your work when you understand the purpose it is going to serve, the true problem you have solved. In addition, your customers will thank you!
What slows down the development of software?
Think about this question for a bit. Why is it that as most software evolves it gets harder and harder to add features and improve its structure?
Why is it that tasks that would have at one point been simple are now difficult and complex?
Why is it that teams that should be doing better over time seem to get worse?
Don’t feel bad if you don’t have an immediate answer to those questions. Most software practitioners don’t. They are hard questions after all.
If we knew all the answers, we wouldn’t really have these problems to begin with.
Regardless though, you’ll find many managers, business owners, customers and even software developers themselves looking for the answers to these questions, but often looking in the wrong place.
Process is almost always the first to be blamed. It stands to reason that a degradation of process or problems with the software development process are slowing things down.
Often there is some merit to this proposition, but I’ve found that it is often not the root cause. If your team is not sitting idle and the work that is important is being prioritized, chances are your process is not slowing you down.
Now don’t get me wrong here. I am not saying that these are the only two important aspects to judge a software development process, but I am saying that if generally your team is working hard on important stuff most of the time, you can’t magically improve process to the point of increasing productivity to any considerable order of magnitude. (In most cases.)
Often questions are asked like:
- Should we pair program or not pair program?
- Should we be using Scrum instead of Kanban?
- Should we be changing the way we define a backlog?
- Should we use t-shirt sizes or story points or make all backlogs the same size?
- Do we need more developers or more business analysts?
- Do we need to organize the team differently?
Now these are all great questions that every software team should constantly evaluate and ask itself, but I’ve found over and over again that there is a bigger problem staring us in the face that often gets ignored.
Let’s do a little experiment.
Forget about process. Forget about Scrum and backlogs and story points and everything else for a moment.
You are a developer. You have a task to implement some feature in the code base. No one else is around, there is no process, you just need to get this work done.
It might help to think about a feature you recently implemented or one that you are working on now. The important thing with this experiment is that I want to take away all the other “stuff” that isn’t related directly to designing and implementing that feature in the code base.
You will likely come to one of these conclusions:
1. The feature is easy to implement, you can do it quickly and know where to go and what to modify.
Good! That means you don’t really have a problem.
2. It is unclear what to do. You aren’t sure exactly what you are supposed to implement and how it fits into the way the system will be used.
In this case, you may actually have somewhat of a process problem. Your work needs to be more clearly defined before you begin on it. It may be that you just need to ask more questions. It may be that half baked ideas are ending up in your pipeline and someone needs to do a bit more thinking and legwork, before asking a developer to work on them.
3. It’s hard to change the code. You’ve got to really dig into multiple areas and ask many questions about how things are working or are intended to work before you can make any changes.
This is the most likely case. Actually, it is usually a combination of 2 and 3. And they both share a common problem: the code and the system either do not have a design or have departed from that design.
I find time and time again with most software systems experiencing a slow down in feature development turnaround that the problem is the code itself and the system has lost touch with its original design.
You only find this problem in successful companies though, because…
Sometimes you need to run with your shoelaces untied
I’ve consulted for several startups that eventually failed. There was one thing those startups had in common with many other startups in general—they had a well-maintained and cared-for codebase.
I’ve seen the best designs and best code in failed startups.
This seems a bit contradictory, I know, but let me explain.
The problem is that these startups with pristine, well-maintained code often don’t make it to market fast enough. They are basically making sure their shoelaces are nicely tied as they stroll down the block, carefully judging each step before it is taken.
What happens is they have the best-designed and most maintainable product, but either it doesn’t get out there fast enough, and the competition comes in with some VB6 app that two caffeine-fueled not-really-programmers-but-I-learned-a-bit-of-code developers wrote overnight, or they don’t actually build what the customer wants, because they don’t iterate quickly enough.
Now am I saying that you need to write crap code with no design and ship it or you will fail?
Am I saying that you can’t start a company with good software development practices and a clean well maintainable codebase and succeed?
No, but what I am saying is that a majority of companies that are successful are the ones that put the focus on customers and getting the product out there first and software second.
In other words if you look at 10 successful companies over 5 years old and look at their codebase, 9 of them might have some pretty crappy or non-existent architecture and a system that departed pretty far from the original design.
Didn’t you say something about pulling up your pants?
Ok, so what am I driving at with all this?
Time for an analogy.
So these companies that are winning and surviving past year 5, they are usually running. They are running fast, but in the process of running their shoelaces come untied.
They might not even notice the shoelaces are untied until the first few times they step on one and trip. Regardless, they keep running. And to some degree, this is good; it is what makes them succeed while some of their failed competitors do take the time to tie their shoelaces and end up far behind in the race.
The problem comes pretty soon after that 5 year mark, when they want to take things to the next level. All this time they have been running with those shoelaces untied, and they have learned to do a kind of wobble run where they occasionally trip on a shoelace, but they try to keep their legs far enough apart to not actually step on one.
It slows them down a bit, but they are still running. Still adding those features fast and furious.
After some time though, their pants start to fall down. They don’t really have time to stop running and pull up those pants, so as they are running those pants slip further down.
Now they are really running funny. At this point they are putting forth the effort of running, but the shoelaces and pants are just too much, they are moving quite slow. An old woman with ankle weights power walks past them, but they can’t stop now to tie the shoelaces and pull up those pants, because they have to make up for the time they lost earlier when the pants first fell down.
At this point they start looking for ways to fix the problem without slowing down and pulling up the pants. At this point they try running different ways. They try skipping. Someone gets the idea that they need more legs.
I think you get the idea.
What they really need to do at this point though is…
Stop running, tie your shoes and pull up your pants!
Hopefully you’ve figured out that this analogy is what happens to a mature system’s code base and overall architecture.
Over time when you are running so fast, your system ends up getting its shoelaces undone, which slows you down a little. Soon, your system’s pants start to fall down and then you really start to slow down.
It gets worse and worse until you are moving so slow you are actually moving backwards.
Unfortunately, I don’t have a magic answer. If you’ve gotten the artificial speed boost you can gain from neglecting overall system design and architecture, you have to pay the piper and redesign that system and refactor it back into an architecture.
This might be a complete rewrite, it might be a concerted effort to get things back on track. But, regardless it is going to require you to stop running. (Have you ever tried to tie your shoelaces while running?)
Don’t feel bad, you didn’t do anything wrong. You survived where others who were too careful failed. Just don’t ignore the fact that your pants are at your ankles and you are tripping over every step, do something about it!
I worked on an interesting problem this week that might have looked like I was running around in circles if you just looked at my SVN commits.
The problem, and the eventual solution, reminded me of an important part of software development—of building anything really.
Sometimes you must tear it down!
No really, sometimes you build a structure only to tear it down the very next day.
It’s not a mistake. It is intentional and productive and if you are not doing it, you very well might be making a real mistake.
That is highly illogical
Imagine for a moment that you are tasked with the job of repairing the outside walls of a 2 story building.
There are of course many ways you could go about doing something like this.
- Use a ladder and just reach the part of the wall the ladder allows you to, then move the ladder, repeating the process as needed to repair the entire wall.
- Lower yourself down from different windows to reach as much of the wall as possible.
- Tear down the entire building and rebuild the building and walls.
I am sure there are plenty of other methods besides what I listed here.
Yet, a very simple approach would be to build a scaffolding.
A scaffold is a temporary structure, used to help repair or initially construct a building, that is built for the very purpose of eventually being torn down.
Software is different; we don’t contend with physics!
You are absolutely right!
We contend with a much more powerful force…
Conceptually, anything you can create in software could be created without any kind of scaffolding. Unfortunately, the complexity of a system’s logic, and the limited amount of it we humans can hold in our heads, imposes a very real limitation on our ability to even see what the final structure of something as intangible as software should be.
So what am I saying here then?
I’m just saying that it sometimes helps to remember that you can temporarily change some code or design in a way you know will not be permanent, to get yourself to a place in the codebase where your head is above the clouds and you can look down and see the problem a little better.
Just as there are ways to repair a wall on a 2 story building without wasting materials on something that will be torn down, there are ways to build software without ever building scaffolding; in either case, though, it is not the most efficient use of time or effort.
Don’t be afraid to take a blind hop of faith
Source control is awesome!
Source control lets us make changes to a code base, track those changes and revert them if needed.
Working on a code base without source control is like crawling into a small crevice in an underground cave that has just enough room to fit your shoulders—you are pretty sure you can go forward, but you don’t know if you can crawl back out.
But, with source control, it is like walking around a wide open cave charting your trail on a map as you go. You can get back to where you want to any time.
So as long as you have source control on your side (especially distributed version control, where you can make local commits all day long), you don’t have to be afraid to step out onto a software ledge and see where things go.
Oftentimes I will encounter some problem in code that I know needs to be fixed and I know what is wrong with it, but I just don’t know how to fix it.
When I encounter a problem like this, I will often have an idea of at least one step I could take that should take me in the direction I believe the code needs to go.
A real example
To give you a real example, I was working recently on some code that involved creating a lot for a product. (Think about the lot number you might see on a box of Aspirin that identifies what batch of ingredients it came from and what machine produced it when.)
Up to the point at which my teammate and I had been refactoring this code, there had been only one way to produce the lots.
We were adding some code that would add an additional way to create the lots and the labels for those lots. We also knew that there would be more additional ways in the future that would need to be added.
Well, we knew we wanted a generic way of specifying how lots should be created and labeled, but we didn’t know what that code would look like.
We could have rewritten the entire lot handling code and made it generic, but it was quite complex code and that would be equivalent to tearing down a building to fix a wall.
It was apparent there were some parts of the existing code that were specific to the way that the current lots were being produced, so we knew those parts of the code had to be changed.
So what did we do?
We started with what we knew needed to be changed and what we thought would move us in the direction we wanted to go, and we built some structures to allow us to refactor those parts of the code, knowing that we would probably end up deleting those structures.
The scaffolds that we created gave us a midway point in the long journey across the Pacific Ocean of complexity that was our refactoring trip.
The point here is that we had to take a bit of a blind hop in the direction we thought things needed to go, and part of that hop involved creating some scaffolding-type structures to get the code to a place where it still worked and we could examine it again for the next refinement.
The final refinements ended up deleting those scaffolds and replacing them with a more elegant solution, but it is doubtful we would have been able to see the path to the elegant solution without building them in the first place.
Show me the code!?
You may wonder why I am talking about code in such an abstract way instead of just showing you a nice little “scaffold pattern” or a real code example.
I’m not going to show the code, because it isn’t relevant to my point and it is so situational that it would detract from what I am trying to say here.
The point is not what our problem or solution was, but how we got there.
There isn’t a “scaffolding” pattern that you can just apply to your code; rather, it is a tool that you can use, and shouldn’t be afraid to use, to move your code forward to a better design. (To its happy place.)
Are you really sure you want to create the cancel button for your application?
You know, I might click it.
Not only might I click it, I might click it at the most inopportune time.
I might click it right in the middle of that large file that you are copying, right after you spun up that 2nd thread.
Cancel is a commitment
The next time you consider creating a cancel button, I suggest you think of it as a commitment.
In my world, the cancel button makes two promises:
- Stop what is currently happening in a non-destructive way.
- Stop RIGHT the F NOW!
I’ve been encountering a good number of cancel buttons that don’t obey the basic laws of cancelling that any user would expect.
I have actually been randomly clicking “cancel” whenever I see it, much to my coworker’s dismay.
I started doing this, because I wanted to see how widespread the cancel cancer is.
And it is pretty widespread. The majority of the cancel buttons I have been testing neither stop right away nor prevent destruction.
I found instead that clicking the “cancel” button in most programs is likely to hang the program for an extended period of time, ultimately forcing you to kill the process, and it tends to leave operations half done without rolling them back.
Clearing things up
Let me be clear here. I am not talking about the cancel buttons you see in an “Ok / Cancel” dialog. Most of the time those cancel buttons actually work, because they are really operating as “back” buttons; they aren’t actually cancelling a process that is happening live.
I am talking mainly about cancel buttons that cancel an active ongoing activity. For example, cancel buttons in an installer or during an SVN update.
We could call these kinds of cancel buttons “asynchronous cancel buttons.”
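To make that concrete, here is a minimal, hypothetical sketch (in Python, though the idea is language-agnostic) of what an honest asynchronous cancel requires: the work is chunked, and the worker checks a cancel flag between chunks, so a cancel request takes effect promptly and non-destructively. All of the names here are invented for illustration.

```python
import threading

def copy_large_file(chunks, cancel_requested):
    """Do the work in small chunks, checking the cancel flag between chunks."""
    copied = []
    for chunk in chunks:
        if cancel_requested.is_set():
            # Promise 1: non-destructive -- partial output is discarded.
            # Promise 2: right now -- we never start another chunk.
            return None
        copied.append(chunk)  # stand-in for the real chunked I/O
    return copied

# The UI would run this on a worker thread, and the Cancel button's
# click handler would call cancel_requested.set() from the UI thread.
cancel_requested = threading.Event()
```

The key point is that cancellation has to be cooperative: the long-running code must check for it at safe points, or clicking Cancel can do nothing but hang.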
But, I need to provide a way for the user to cancel
Good, but don’t lie about it.
There are certain things that just can’t be cancelled.
When I get on a plane, I can cancel my trip while I am sitting there waiting for the door to close. I can even cancel my trip while the plane is taxiing to the runway, if I am willing to make enough of a scene.
But, I can’t cancel my trip, once the plane has lifted off the ground. If I try to cancel it then… well, bad things would happen. Very bad things.
So how come when I am installing your software, or your software is updating its database, I have this shiny little cancel button I can click at any time during that process?
Surely I cannot cancel at just any time!
Surely there are parts of the process where cancelling would be fatal, or where it is too late to roll back.
My point is, if the user truly can’t cancel, don’t present a button that says they can. More specifically, if you can’t keep promises 1 and 2 from above, don’t even show a button. Just say, “Sorry, you can’t cancel this process at this time.”
I don’t even need to know why I can’t cancel. I mean, it will make me feel better if you tell me that the Unicorn Glitter Engine is in a critical state and any disruptions could end life as we know it, but I’ll settle for you just greying out that cancel button or not displaying it at all.
Putting back on the developer hat
I’m guilty of it myself. I know I have created cancel buttons in the past that have caused pain and anguish.
But what can we do about it as developers?
First off, we should think carefully about breaking a long-running process into steps. At each step of the way, we should consider whether it is possible to cancel that step without destroying data or hanging the application.
In any long-running process, we should be able to identify the parts that are cancellable and the parts that do not make sense to cancel.
It is your job as the developer to ensure that, if you decide to allow cancelling, the cancel undoes the in-progress work immediately.
I cannot stress this enough!
This is the behavior that most users expect and the very meaning of the word cancel.
To do this might take extra work. You might have to think a bit more about the threading situation of your application. You might have to think about rollback. But if you don’t do it, your cancel button will become just like the boy who cried wolf, and no one will believe it.
And if you’re not willing to go through this effort, at the very least, be kind enough to your users to just remove the cancel button, because you can bet I’ll be clicking it!