Recently I wrote code for a sorting problem. At the time I didn’t realize it but it’s a called the “Dutch Flag Problem” and Dijkstra came up with it back in 1976 in his book “A Discipline of Programming”.
The problem is as follows:
Given an array known to contain only the numbers 0, 1 and 2 sort it in O(n).
So for example if you had the input:
Then the output would be:
The time complexity requirement rules out the possibility of using an untuned general sort from most languages standard library (most sorts of that sort are O(n log n) or thereabouts).
There are actually 2 ways of solving the problem.
The first method of solving it is sort of cheating but is simpler; Just iterate the array and maintain a hash of (digit => count). This will get you the information you’re after but you didn’t really sort the data. (In a real world scenario this is probably sufficient though since you probably aren’t going to actually want to iterate the same digit N times if you know there are N of them in the array.)
The 2nd method of solving the problem is the more interesting one; Given that we have known constraints on the incoming data we can develop an algorithm specifically tuned for the situation. The problem becomes one of partitioning:
1 def sort(data) 2 return nil if !data 3 answer = data.dup 4 return answer if answer.length <= 1 5 6 left = 0 7 mid = 0 8 right = answer.length - 1 9 10 while mid <= right 11 suspect = answer[mid] 12 if suspect == 0 13 answer[mid], answer[left] = answer[left], answer[mid] 14 left += 1 15 mid += 1 16 elsif suspect == 2 17 answer[mid], answer[right] = answer[right], answer[mid] 18 right -= 1 19 else 20 mid += 1 21 end 22 end 23 24 answer 25 end
The general idea here is that we iterate through the array once and during that iteration if the current element under inspection is a 0 then we kick it to the left just beyond where a 0 was last placed. Otherwise if it’s a 2 then kick it to the right just before where a 2 was last placed. In the end you get the partitioned data.
Tonight was the monthly Seattle.rb meetup, held monthly on the first tuesday of the month at Substantial’s office in Seattle’s Capitol Hill neighborhood. Something occurred to me while watching tonight’s talks; It seems obvious now but the talks that I took the most away from were generally the ones that used comedy to educate.
I come from the .NET world (I’m new to the Ruby world) and one of the louder figures from that world is Scott Hanselman and he definitely uses comedy to make his talks more interesting. I’ve met Scott a few times at local Nerd Dinner meetups and his comedy is actually just part of his personality. I suspect others who give talks don’t have this intrinsic comedy personality but try their best to include it because it really is a great way to keep your audience engaged.
It’s 2013 so I hope most of you by now believe in having clear method/class/function/variable names, but in case not I present to you the following arguments.
The point of naming is abstraction for humans. Naming is not abstraction for compilers because the compiler isn’t going to care about names at all. Naming is not just some procedure invented by language designers to make you box up your code. I’ll say it again: The point of naming is abstraction for humans.
Let’s say someone on your team submits a code review to you and you open it up and find this method call:
Your teammate probably didn’t give this line of code a 2nd thought. Why would he? To him it’s entirely obvious what’s going on because he’s the one who wrote the code. But is it obvious to you? Do you know what ‘process’ means in this case? What are we doing to ‘foo’? What is being done to the system with our ‘foo’ that we passed in?
The problem with the above code is that it failed to abstract the inner concepts from humans. The practical implication of this poorly named method is that you and every other programmer who comes across it will have to dive inside it to understand what it’s doing. The method name wasn’t clear so you don’t know what you’re getting into when you call it. It’s as if there is no difference to having no method at all and just inlining it’s contents.
What if the above code instead looked like this:
Do you need to dive into the method to understand it now? No, because the abstraction is clear. You don’t need to know the specifics of the internals of the method because the person who wrote it wanted you to understand what was going on by expressing it in a clear name. Time is saved because you now have less surface area of the codebase to grok and the implementation of that method is free to be refactored so long as it upholds the contract.
Lot’s of people bandy around this idea of Technical Debt which sounds great because it lines up to the metaphor of using credit wisely. It’s become such a common phrase that we hardly even think much about its true meaning and expect everyone to nod in agreement at its wisdom. But what does technical debt really mean? Does it mean that it’s ok to write a bunch of crappy code and add a bunch of complexity because “we’ll clean it up later” ? That’s the commonly accepted meaning. I think that’s wrong.
Let’s look at the analogy of credit usage again. When you use your credit card, do you just spend it irresponsibly on a bunch of stuff you ought not to? Do you charge that on your credit card do you just say ”I’ll clean this up later” ? Well you might if you’re used to making poor financial decisions, but if you are smart you make sure that you are using your credit for wise purchases and that you’ll actually be able to pay that debt off quickly (because you know carrying a balance forward is really bad for your credit score). From the time you incur the debt until the time its payed off you are extremely diligent about paying it to zero, and you have an active plan for doing so.
Are there parallels that we can see in software? I think so. If it’s not OK for the person who has maxed out their credit cards on bad purchases, is it ok for you to do the same with your code? I don’t think so. All you’ll end up doing is bankrupting your code base (and who knows, as a startup potentially your business).
I’ve been meaning to write this blog post for a very long time because I spend a pretty significant amount of my waking hours pondering the subject. This is a post about The Great Divide in software development between the 2 types of programmers in the world. I’ll nickname them Bob and Joe.
Bob is a smart guy. He’s got intimate knowledge of a great deal of frameworks, libraries, algorithms, and data structures. His natural mode of operation is to find smart or clever solutions to problems and to use lots of tools and techniques to do so. He is prolific with abstractions and grand solutions but light on testing. It’s sometimes hard to read and understand his code, but nobody can argue that its usually correct and performant.
Joe is also a smart guy. He’s got a lot of experience in building software and working with other people’s code. His natural mode of operation is to write clean code that’s optimized for human readability and long term maintainability. When choosing tools and libraries he tends to go with the simplest solution possible. When submitting code it’s not uncommon to see his commits remove more code than they add. His code is always tested because he uses a test-first (TDD) approach.
How valuable is smart/clever/complex code if the author, Bob in this case, is the only one who can adjust the value it adds to the organization? It’s certainly valuable for Bob because any time the code needs adjusting or maintaining he gets pulled in to deal with it, but that’s not a scalable situation. What if the code was written for humans instead of the compiler? What if the code was written in such a way that 2012-09-25-software-craftsmanship-and-company.html 2012-09-25-software-craftsmanship-and-company.html.md developer could come along and know inside 30 seconds exactly what it did and how to extend or change it? The value of that code is, in my humble opinion, orders of magnitude more valuable to the organization, and it’s author (Joe) much more valuable in the long run.
The thing about Bob and Joe is that there are vastly more Bobs in the world than there are Joes. Most companies today have interview processes that are heavily geared towards measuring ability with data structures and algorithms but very light on finding out if the candidate writes code that others find easy to work with. I’ve almost never heard of an interview where the interviewer wanted to find out how well the candidate writes tests, or decomposes functions, or determines if the candidate prefers simplicity over complexity. I think that’s a real shame. I also think it results in hundreds of thousands of wasted man-hours spent unwinding Bob-authored code, or maintaining systems that are so convoluted that its impossible to extend or change them.
So why does this trend of hiring and dealing with the Bob’s of the world continue if it’s so damaging? We as an industry haven’t made the connection between quality code (simple and easy to work with) and competitive advantage. The reason is because organizations don’t feel the pain of complex/poor code until years after the code is authored by a Bob, and therefor do not connect the pain with that type of developer.
Our industry is very young (relatively speaking, to other industries) and we have a lot of room to grow yet. Movements like Software Craftsmanship are starting to steer us in the right direction. I think we’ll eventually figure this stuff out, with people like Uncle Bob (blog), Martin Fowler and Ward Cunningham leading the way. I think in the meantime we’ll just be working towards bridging The Great Divide and trying to get organizations to connect the pain to its source.
I’d love to hear your thoughts on this in the comments.
“Any darn fool can make something complex; it takes a genius to make something simple.” -Pete Seeger
“If you can’t explain it to a six year old, you don’t understand it yourself.” -Albert Einstein
“Simplicity is the ultimate sophistication.” -Leonardo da Vinci
“Our life is frittered away by detail. Simplify, simplify.” -Henry David Thoreau
“Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius — and a lot of courage to move in the opposite direction.” -E.F. Schumacher
“The greatest ideas are the simplest.” -William Golding, Lord of the Flies
“Besides the noble art of getting things done, there is the noble art of leaving things undone. The wisdom of life consists in the elimination of non-essentials.” -Lin Yutang
“Nature is pleased with simplicity.” -Isaac Newton
“Simplicity is prerequisite for reliability. ” -Edsger Dijkstra
“The sculptor produces the beautiful statue by chipping away such parts of the marble block as are not needed – it is a process of elimination.” -Elbert Hubbard
“Three Rules of Work: Out of clutter find simplicity; From discord find harmony; In the middle of difficulty lies opportunity.” -Albert Einstein
“Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away.” -Antoine de Saint-Exupery
(c) 2013 Ben Lakey
The words here do not reflect those of my employer.