Code should be short, simple, and to the point. This is particularly true of methods on a class. Each method should be as short as possible, and it should be obvious what is happening. That comes from a combination of using the simplest code constructs and good naming. One metric you can use is the number of lines of code in the method. I frequently use seven or eight as a limit, but ten can work pretty well too. Sometimes it depends upon the language in use.
Why is this?
I don't know about you, but I have a very short attention span. I like simple things I can plug into my brain with other simple things to make more complicated things. If I have too many complicated things it becomes difficult to use the pieces.
How do you get there?
A technique I've used for a very long time is to try to place every statement in its own space. So given a condition, the then clause is a method, the else clause is a method, and so on. If I have a loop, the loop body is (where possible) a method. If you think about your code in terms of 'If this then that', 'that' must be one simple consequence, and therefore lives in its own method. A loop should say 'While this, do that', and again, 'that' must be one simple method.
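Here is a minimal sketch of that breakdown, assuming a trivial report printer (the class and method names are mine, purely for illustration):

    # A sketch of 'every statement in its own space': the then clause,
    # the else clause, and the loop body each get their own tiny method.
    class ReportPrinter
      def initialize(lines)
        @lines = lines
      end

      def print_report
        if @lines.empty?
          print_empty_notice
        else
          print_lines
        end
      end

      private

      # the 'else' consequence, whose loop body is also its own method
      def print_lines
        @lines.each { |line| print_line(line) }
      end

      def print_line(line)
        puts line.strip
      end

      # the 'then' consequence
      def print_empty_notice
        puts '(no data)'
      end
    end

Every method stays well under the seven or eight line limit, and each one reads as a single 'that'.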
Naming Things
You've certainly heard by now that naming things is one of the hardest parts of programming (and computer science). When you have lots of little methods, naming things gets harder. Harder only because you have to name more things. If you structure things well, however, the names should be pretty obvious. If they are not obvious, that is a smell you should pause to consider. If you can't name the simple thing, maybe it isn't that simple. That probably means you've got something complicated on your hands and maybe you need to consider the Single Responsibility Principle. Or possibly you've just found an overly complicated way to go about what you are doing. Either way, this is a good place to pause and reconsider what you are doing.
Visibility
When I test drive, I only write tests for the public or protected interface of a class. Since I don't use inheritance that often, mostly I'm only testing a public interface. Since my classes are very small, I'm often only testing one or maybe two methods. As a result, all those small methods I mentioned are usually private. This gives you a great deal of flexibility in breaking things down. It also allows you to move and rename things easily while you are looking for maximum understandability.
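As a sketch of how that plays out (again with hypothetical names, using Minitest, and assuming the ReportPrinter class from the sketch above), only the public method gets tests; the private helpers are exercised through it and remain free to move or be renamed:

    require 'minitest/autorun'

    class ReportPrinterTest < Minitest::Test
      # Only the public interface is tested; print_lines, print_line, and
      # print_empty_notice stay private and can be reshuffled at will.
      def test_prints_each_line_stripped
        out, _err = capture_io { ReportPrinter.new([' a ', ' b ']).print_report }
        assert_equal "a\nb\n", out
      end

      def test_prints_a_notice_for_an_empty_report
        out, _err = capture_io { ReportPrinter.new([]).print_report }
        assert_equal "(no data)\n", out
      end
    end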
This is a mini-blog. I'm working to find a compromise between a tweet and a lengthy essay. I find it difficult to complete longer documents because of an obsession with perfection. So this little experiment is to see if I can create a blog of mini articles. Herein I will talk about many technical things generally related to software development and Agile practices.
30 June 2017
28 June 2017
Comments are not welcome
Think about it: a programmer who knows their language and environment shouldn't really need your commentary on the code. Your code should be clear and expressive. When it isn't clear or expressive, your test suite should explain it. There should really be no need for your comments.
If there is a need for comments, it should be to express something that is so phenomenally complicated that there just wasn't an easier way. Or maybe an expression of non-intuitive behavior in the code driven by business requirements. Even then, the tests contain the answer of what is true.
Naming things, good method names, good class names, good variable and argument names, is where the clarity of what the code does comes from.
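As a tiny, hypothetical illustration, a well-named constant and method can carry the information a comment would otherwise have to:

    # Before: the comment does the explaining
    def price_before(base)
      # rush orders carry a 20% surcharge
      base + (base * 0.2)
    end

    # After: the names do the explaining and no comment is needed
    RUSH_SURCHARGE_RATE = 0.2

    def rush_surcharge(base)
      base * RUSH_SURCHARGE_RATE
    end

    def price_with_rush_surcharge(base)
      base + rush_surcharge(base)
    end

    puts price_with_rush_surcharge(100.0) # => 120.0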
There are cases when you have some code you aren't sure about. Code you aren't 100% sure you can delete. SCM is your tool here. Delete it, get rid of the noise. If you need to restore the code you deleted, use your source control tools to bring it back from the dead. Don't ever check in code that is dead or commented out. It just creates confusion.
Thanks to Mike Gantz for reminding me of this.
26 June 2017
The Free in Freelancing doesn't mean Free
So a friend on Facebook recently asked me about an article he saw. In the article the author describes a situation in which he was asked to build some software (website basically) for a product. He responds with a bid of something like $600. Seems legit right? Well, the person he was talking to wanted him to do it for free, for the exposure. Of course the programmer refused.
You should always refuse to work for free.
I've been at this for a while. I've had plenty of offers to do work for various amounts of money and other incentives. I have never worked for free. I have worked for equity, also not the best idea, but never for free. Working for free is a fool's game. If someone values your work enough to approach you and has enough confidence in their idea to think your contribution will help it succeed, you deserve to get paid.
If you choose to work for equity (as I have done on occasion) then go right ahead. If there is merit in an idea and you think it will turn into a profitable business, then get yourself a good lawyer and make an agreement to get some of the equity. Or, just take cash up front. Your downside on cash up front is missing out on a percentage of the next Facebook, but that is a risk you can at least take under your own control. Giving away the fruit of your labor is just nonsense.
Final note, if you do decide to do something on the cheap, don't sell yourself short. Tell your customer what you would normally charge and then what kind of discount you are willing to give them in exchange for a piece of the action. If their idea is real and they believe in it, you are negotiating. If they don't have that confidence, move on to the next opportunity.
23 June 2017
0xDEADC0DE
Deadcode is perilous. When we work, we should be careful to avoid letting deadcode accumulate. I had a recent experience where my pair partner and I modified some deadcode thinking it was the correct place to inject our new feature. We then tested it thoroughly and deployed it into our development environment. Then we tested it more there to make sure it was working right, and finally we had an acceptance review of the code. Everyone gave us the thumbs up and we moved on to the integration environment. We thought we were done.
Turns out, this was deadcode and it never gets run when you follow the process for deployment. Of course we didn't know this, and we'd forced its execution in our testing, causing us to break several machines. Fortunately not critically. Once we discovered our mistake we backtracked, unwound the change in each environment we'd touched, and reworked the code in the correct place. Along the way, we removed the deadcode.
We were really glad we discovered this in the integration environment and not production. We would have exposed ourselves to some pretty significant risk. Granted, I don't think anyone would have been hurt, but it would have been egg on our faces for having done this, and the feature we wanted would not have worked correctly anyway.
So, cautionary tale, clean up after yourself to avoid mistakes like this.
21 June 2017
Measure Once, Cut Twice
So I was thinking about how I go about solving problems. In many cases the consequences of a solution are non-terminal. That is, a software problem can often be solved without doing any real damage. I'm not talking about production hot fixes here, or doing things in a mission critical environment, but rather on your workstation or in a development environment.
As it turns out, you can try anything you want and do little or no harm. So rather than be exacting, you should just try it. You can always roll back the change. You can always make a backup before doing something risky. You can use a REPL to test out ideas. So rather than spending a bunch of time picking through the details of risk, just go for it. Worst case you restore the database or file or hit Ctrl-C. Best case, you've found a solution.
If you do find a solution, learn from that solution and decide if the solution is good enough. That is, it might be super hacky. Or it might consume huge quantities of RAM and therefore isn't viable on your target platform. But you can learn from what you did; even if what you learned was that it isn't a good idea.
If your solution doesn't solve the problem, you can probably still learn from it. You can learn more than 'this didn't work'; you can also learn that it was inefficient, it uses lots of RAM, it's slow, it talks on the network too much, etc.
The faster you execute these experiments the more you can learn and the sooner you can get a problem solved. But I'm not suggesting you do this blindly. You should spend a few minutes at least thinking about a proposed solution. I suggest that if you have an idea and no idea how to implement it, move on to the next idea. But if you kinda know how to solve the problem then try it. No real damage can be done.
Of course the cautionary part of this is to not let yourself obsess over a possible solution. You need to know when to quit and try again. Once upon a time I had what I thought was a good idea for bootstrapping a system, and I became obsessed with it. It was long and laborious and in the end it didn't work out. I spent way too much time trying to force a round peg into a square hole and I should have given up sooner. But I did learn something, actually a number of things; most important, don't try to force it, it should be smooth, easy, and understandable.
19 June 2017
Problem Presentation
Apropos of nothing: you must propose a solution or you shouldn't point out the problem. More specifically, if you see a problem that you feel needs to be addressed, you should present the problem in one of two ways.
First, I see this problem, and I have a solution.
Second, I see what I think is a problem, but I don't have a solution, do you have a solution? Is it possible this is not a problem?
Just complaining about stuff isn't really productive.
16 June 2017
Tables in Your Cukes, Please Stop!
You've probably read some of my posts on Loops, and Parameterized Tests and other things where I talk about clarity in the test suite. I thought I should pay attention to Cucumber (and other similar frameworks) for a minute.
If you aren't familiar with it, Cucumber is a BDD testing tool that uses Gherkin to structure English-like text into executable tests. It wouldn't be uncommon to find a test like this someplace:
    Feature: Reverse Words in a String
      In order to read backwards
      readers must have the words in their text reversed

      Scenario: Empty String Reversal
        Given a String Reverser
        When I reverse the string ""
        Then the result is ""
In some cases these tests are just plain awesome for communicating clearly what the expected behavior of some code is.
Cucumber provides a mechanism for using tables to pump data through a test. This is more or less the equivalent of using a loop or a parameterized test. It's not a great idea.
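For reference, the table mechanism looks something like this; a hypothetical Scenario Outline reworking of the feature above, not something from a real project:

    Scenario Outline: Reversing several strings through one table
      Given a String Reverser
      When I reverse the string "<input>"
      Then the result is "<expected>"

      Examples:
        | input         | expected      |
        |               |               |
        | hello world   | world hello   |
        | one two three | three two one |

Three behaviors, one scenario: when a row fails, the report points at the table rather than at a named expectation.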
Using a table to load several instances of a structure in order to feed a method some data is fine. That's setup. But when you are in essence saying that this table represents several independent assertions about the aforementioned code (which also had several tables to create this output), you're doing worse than a loop: you're mashing the setup and the assertion of several tests into one place.
Best of Luck.
If you think the aforementioned is OK, best of luck. I've seen this run rampant more than once and the result is that nobody knows what is specifically broken, just that something is broken.
A Clear Path
For each condition that should be true (at any level of testing) there should be one or more tests that verify that condition explicitly, and that is it. So, rather than use a set of tables to keep your Gherkin short, have lots of little tests that explicitly call out the conditions that should be true. Mind your testing triangle of course, but for those things that are important, maintaining each one explicitly will serve better for clarity, understanding, and maintenance.
14 June 2017
I Hate Parameterized Tests
This is totally a personal thing. Well, it's a bit 'real' too. I hate parameterized tests.
Parameterized tests obscure what is going on in the code. I have yet to see a clean, understandable, elegant example of a parameterized test. They are close to as bad as looping around an assertion in a test. The one redeeming quality they might have is that a better (I won't say well) done version of a parameterized test will at least run all the tests even if one fails.
They still stink!
One of the most common issues I've run into with parameterized tests is that it is unclear which combination of conditions caused the failure. That is, some collection of parameters caused the test to fail. What were their values? What semantics are associated with that particular grouping? Most of the time nobody knows, and nobody can easily tell. In the rare case that we can quickly isolate the issue, we are often still at a loss as to 'What does it mean?'
With enough effort, you might survive.
In a few cases I've put a bunch of effort into fixing parameterized tests for people. By fix, I really mean, making it tolerable to have in the test suite. Usually I add some extra parameters that suggest names for conditions and give meaning to the collection of parameters. I tried making parameter objects with explicit names once. I even used an Enumeration in Java to 'name' the parameters. It helped, but it was perilous and fraught with danger. In all cases I think I ultimately surrendered to exhaustion rather than satisfaction.
Better Choices are...Better!
My recommendation about Parameterized Tests is, don't. Rather, find all the nifty edges, give them names, and use those to create named test cases. Type out 1000 tests if you have to, but know explicitly what each test does by its name. When one of these fails you will have a big red arrow pointing at your problem. No hunting, no weird structures, no head scratching. Just immediate feedback.
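As a rough sketch of what that looks like (Minitest, with a made-up shipping_cost function standing in for the real code), here is the named-test version of what might otherwise be one parameter table:

    require 'minitest/autorun'

    # A made-up function standing in for the code under test.
    def shipping_cost(weight_kg)
      return 0.0 if weight_kg <= 0
      weight_kg <= 2 ? 5.0 : 5.0 + (weight_kg - 2) * 1.5
    end

    class ShippingCostTest < Minitest::Test
      # Each interesting edge gets its own named test; a failure points
      # straight at the broken rule, no parameter hunting required.
      def test_zero_weight_ships_free
        assert_equal 0.0, shipping_cost(0)
      end

      def test_small_parcel_pays_the_flat_rate
        assert_equal 5.0, shipping_cost(2)
      end

      def test_heavy_parcel_pays_per_extra_kilogram
        assert_equal 8.0, shipping_cost(4)
      end
    end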
The labor we put into a test suite is the real asset to the software. Without the tests, the software's quality is dubious at best; without the source code, the test suite still holds the answers to what should be. Making the choice to be careful and explicit is the right one. Throwing spaghetti at the wall to see what sticks is a poor way to ensure quality and will make change harder over time rather than easier.
12 June 2017
Prejudicial Refactoring
This is a repost from my old blog that fits the short form. I had some conversations recently about 'rewrites' that got me thinking about this topic again.
We are all familiar with the concept of refactoring. I think generally we agree that refactoring can be a useful thing when changes are needed in our applications; but what about making changes for change's sake? I've seen recently an abuse of the word; every change becomes a refactoring effort. Properly applied, refactoring is an effective way to reduce code debt, to change from one implementation to another, or to adjust the underlying architecture of an application as the needs of an IT organization evolve.
Sometimes, though, refactoring is undertaken for no good reason at all. A recent example that I experienced was an 8-month .NET project that had been underway for 4 months. The initial implementation used the CSLA approach to data persistence. It was determined (somehow) that CSLA was a 'bad thing' and the code was refactored to use nHibernate and Spring.NET. The justification for this change (given to management) was technology standardization; not a very convincing argument since this was the only .NET application in-house. Other technical arguments were made about the ease of testing provided by nHibernate and Spring.NET, but these were more academic and preference-based arguments. The end result was 7 weeks of lost progress by the project team. The cost in time and money was not determined, and the impact on the delivery schedule was not accounted for in the project plan. Estimates vary about the dollar impact of this change, but for argument's sake let's say it cost 90,000 USD.
This event was an act of prejudicial refactoring. Prejudicial refactoring is the act of refactoring for some poorly justified reason. In our example case, 18% of the budget and 25% of the time was expended on an unnecessary change. The objective of any software project is to deliver the software. There are numerous reasons to argue for or against any one technology decision; CSLA v. nHibernate, Oracle v. MySQL, etc. Once those decisions have been made and the project is in motion, changes to these underlying technologies must be made with caution.
Software is tricky business; an unforeseen technology issue or ever-changing requirements can derail a project from its budget at a moment's notice. Knowingly consuming the budget for an arbitrary reason is grossly irresponsible. Change for the sake of change is a 'bad thing', and therefore refactoring just because we can is also a 'bad thing'. Careful consideration of the reason for change, and the impact of that change, are a must in order to responsibly develop software.
09 June 2017
Finally, Redundancy in TDD
It took me a while, but I finally remembered a case of true redundancy in a test case. I was at the Codemash pre-compiler learning Ruby from Jim Weirich. I had just finished a half-day Ruby Koans session and was hot to try out my new skills, so I plopped down on a couch with a few others and we started banging away on a problem. We decided as a group to each build the same application in different languages and see what the differences were. I'm not going to name names here, but I'll say we had Ruby, Python, Perl, Grails, and Scala all rolling in the same small space for a little while. Our problem was a Sudoku solver, if I remember correctly.
I was having a wonderful time banging away on this problem. First off, I love puzzles, second I love automating their solutions. We were working our way through test cases on the board and the topic came up about checking for duplicates in a row. If you aren't familiar with Sudoku, using the numbers 1-9 you populate an 81 cell board such that no number appears twice in the same row or column and each sub-grid contains only the numbers 1-9 without duplicates. Therefore, you could test each column, row, and sub-grid for these conditions and if all of these things are true you've solved the puzzle.
But think about this in terms of your code and the abstractions of the playing board. Does your row checker need to check every row? Does your column checker need to check each column? I think a conscientious design suggests not. What you need is a Nine-Unique-Numbers checker. So in my test suite there are a couple tests for 'row has digit' and 'column has digit' but by no means did I check every single column. You can check out my code here.
Just to be clear, I'm not holding my solver up as the bastion of awesome, I'm just using the example. After rereading that code for this post I'm thinking it's a little embarrassing and I should rewrite it ;-).
So all of this led to some discussion. How thorough do you need to be in order to ensure proper behavior of your solver? One of the other people at the table had very thoroughly written explicit tests for each row, column, and grid (subsection) of the board and tested each case to ensure that he had 100% coverage. I think he went too far. I'd call it 1000% coverage. I say that because after the first few proofs, those others are redundant. Unless you have an impossibly flawed design, the conditions don't vary much from column to column or row to row. Looking at my code you will see I isolated the get_col, get_row, and get_section code. To check for numbers in those structures I flatten the content into an array and check to see if the number is present (see row_has_digit, col_has_digit, and section_has_digit) [oops, I just found a damp spot in that code, do you see it?]
My point is, once I have row_has_digit isolated from get_row I can test each independently and in total isolation. I minimize the number of tests while maximizing the clarity of what I'm doing as an expression in tests.
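A rough reconstruction of that separation (my own sketch, not the original solver's code; the method names are borrowed from the post):

    # get_row only knows how to slice the board; row_has_digit only knows
    # how to ask a row whether a digit is present. Each can be tested on
    # its own, so neither needs to be re-proven for all nine rows.
    class Board
      def initialize(cells)
        @cells = cells # nine arrays of nine entries
      end

      def get_row(index)
        @cells[index]
      end

      def row_has_digit(index, digit)
        get_row(index).flatten.include?(digit)
      end
    end

    board = Board.new(Array.new(9) { (1..9).to_a })
    puts board.row_has_digit(0, 5) # => true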
My cohort had rather brutally tested every possible combination regardless of it being redundant. He used parameterized tests and loops (two things I disdain in a test) and ensured that there was no way his code would ever get it wrong. Now if I'm launching a Mars Lander I *might* take it that far, but for Sudoku, no way. To be honest, I couldn't justify it for a Mars Lander either.
What we know about boolean logic (and others) is that there are only so many conditions that can be tested across an operator. There are a finite (because bits) number of numbers below, equal to, and above any other number in a computer. So if we have a condition like '< 5' we can use two tests for complete proof that our logic goes the right way. Testing the other numbers is wasteful and redundant. [I made this mistake in 2002 when I was just getting into unit tests; my test ran across a 256-bit number, every single combination of bits, and took 13 hours to complete on my clunker IBM Thinkpad. A better solution would have been about 16 cases.]
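For example, a minimal sketch of the '< 5' case; two tests straddling the boundary are enough:

    require 'minitest/autorun'

    def small?(n)
      n < 5
    end

    class SmallTest < Minitest::Test
      # Two cases on either side of the boundary prove the comparison goes
      # the right way; testing 6, 7, 8, and so on adds nothing new.
      def test_four_is_small
        assert small?(4)
      end

      def test_five_is_not_small
        refute small?(5)
      end
    end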
So, it took me a while to think of some cases of true redundancy and to try to answer (at least in part) Tim's question. If you are like the guy in that group, writing a test for every conceivable case rather than the minimum number of logical cases possible, then yes, you are being redundant and you should stop.
Hopefully I can provide a cogent explanation of how to unwind redundancy. I'll start thinking about that next.
07 June 2017
Loops in Tests, BAD!
Elsewhere I've mentioned my disdain for loops in tests (along with Parameterized Tests). I wanted to put together a quick note on loops in tests just to get it out there.
Loops in tests are sloppy!
But let's be clear, I'm not talking about a setup method that populates data into a table or something; those make sense as long as they are well written and clear. I'm talking about tests that loop over an execution to make assertions.
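To make the distinction concrete, here is a hypothetical sketch of the kind of test I mean, with a made-up temperature conversion standing in for the real calculation:

    require 'minitest/autorun'

    def to_fahrenheit(celsius)
      celsius * 9.0 / 5.0 + 32.0
    end

    class ConversionTest < Minitest::Test
      # The sloppy version: one test, 141 hidden assertions. The first
      # failure stops the loop and the name tells you almost nothing.
      def test_conversion_over_a_range
        (-40..100).each do |c|
          assert_in_delta c * 1.8 + 32, to_fahrenheit(c), 0.001
        end
      end

      # The explicit alternative: each interesting point is its own test
      # with its own name, so a failure points at a specific behavior.
      def test_minus_forty_is_where_the_scales_meet
        assert_in_delta(-40.0, to_fahrenheit(-40), 0.001)
      end

      def test_freezing_point_of_water
        assert_in_delta 32.0, to_fahrenheit(0), 0.001
      end

      def test_boiling_point_of_water
        assert_in_delta 212.0, to_fahrenheit(100), 0.001
      end
    end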
I think we've all done it at some point. I know when I was learning TDD I didn't think anything of it. But I eventually learned the hard way. My compact little test that had a loop in it was clean, clear, and fast. Reading that test gave you a satisfying feeling that it would prove the code valid quickly and without concern.
That was true until I broke the code!
The specific test was pretty high level; it had about six lines of setup and then a simple loop that pushed numbers in a long range through a function. I had a verification calculation that, given any number in that range, checked that the result was the correct answer within a nice epsilon value. However, the calculation was pretty complicated and its specification came from engineering. It wasn't entirely incomprehensible, but it was a bit weird. Anyway, we got an adjustment to the calculation and we thought we understood it pretty well. So we made the modification to the module and all was well. According to its own tests everything was super-green.
Then we plugged our new component in and this test blew up. First thing we did was ensure that we'd adjusted the verification calculation and that all seemed to check out. We still had failing cases. As it turned out, not all of them failed though. The first hour we spent reducing and adjusting the input range trying to see if it was a particular case, or on the tail of some curve. But we couldn't get it to budge. We then called the engineers and they came and did the engineer thing. But they couldn't see the issue either. After another 2 hours we decided to trap the failed assertions and list them rather than pick away one by one like we'd been doing.
What we eventually saw was a small fluctuation in the calculation that we hadn't correctly anticipated in our verification function. The change to the module caused a module twice removed to put a 'tremor in the signal' and we didn't adjust for that correctly. So roughly one in a hundred tests passed and the rest failed by varying multiples of our epsilon value.
Once we saw this issue (there were 5 of us by this point) the engineers were able to confirm that the variance was what they could tolerate and helped us adjust our verification calculation. We then patched our test back up and went back to green.
This was a terrible choice. The same test broke a few months later and we went through similar (though accelerated) gyrations to fix the issue, and again, it was something we could have avoided. We didn't learn and we put things back into that darned loop.
What we should have done, what we eventually learned, was: first, you have to make sure you're adjusting the verification correctly (sometimes really hard to do) and second, don't use a loop! The loop was causing us pain. The loop was making it harder to see the issue. A bunch of individual tests (even if there were a few hundred) would have lit up the build with a big arrow that screamed (metaphorically) 'Hey Dummy! There is Tremor in the Signal. Check module 2'. And it would have done so explicitly, or at least more explicitly than what we were doing, which was relying on tribal knowledge of our prior experiences.
Now I've run into tests elsewhere where loops are used that are vastly less complicated than the aforementioned project. Less complicated doesn't necessarily mean easier to grasp in the case of the insidious loop; it means easier to fix once you spot it. Sloppy loops in your tests suggest that you are too lazy to put forth the effort to explain your code in tests clearly. Rather you say, 'if all this works, we're good, no need to explain'. That reeks of unprofessionalism to me.
So I guess my point is, never put a loop in a test case.
05 June 2017
Test Pyramid Inversion
So I was talking to Schmonz about Redundancy in the test suite the other day. I asked him for some examples, trying to loosen the grit in my brain, help me see what I couldn't see. He put forth a number of good examples/illustrations of what he sees as redundancy/smells in test suites. Very helpful guy that Schmonz.
I'm going to tackle something that he triggered with his first response. 'lots of acceptance tests instead of lots of unit tests, and the acceptance tests naturally have lots of overlap,'
That sounds like Test Pyramid Inversion to me!
When I think about code and TDD in particular, I'm drawn to the notion of making all the parts right and then putting the parts together well. Unit Testing is my assurance that all the parts are right. Or rather, all the parts have an expected, predictable, and tested behavior. At a minimum, when they behave a certain way I should not be shocked. But TDD as a general principle goes beyond that.
Above my layer of well formed parts I can have more layers of tests. Tests that tell me how collections of units of code work together. I call these integration tests, I've heard a dozen names for them, but one way or another, if I glue several atomic units of code together under one test, it escapes the 'microtest' paradigm and becomes something else. Those tests must test things that have been tested before. Does that make them redundant? I say no.
I don't see these things as redundant because at the microtest level all I'm really doing is proving that the code under test behaves in a certain way. I'm not saying anything about how it might interact with other interesting things in the system. An example of this (real world analog) might be something like a screwdriver.
A screwdriver is supposed to do only two things, put screws in and take them back out. So realistically it might have three or four tests. Say, in_with_screw, out_with_screw, in_without_screw, out_without_screw. There are no tests for used_as_chisel or used_as_crowbar. That should tell you something right there. This object was not meant to integrate with hammer or fulcrum. So you shouldn't be surprised when it doesn't work well as a chisel or crowbar.
Now, that said, we are all prone to tool abuse. I've certainly used my screwdriver as a hole punch or a crowbar or something, and you probably have too. So let's say you are building a Rube Goldberg Machine and you decide to use a screwdriver as a weight on a string to turn on a fan. That is not the intended use of the screwdriver. But if you have a Test Driven Rube Goldberg Machine, you might write a higher level test that checks that the screwdriver is the right weight, length, etc. to behave as you need it to. This is not redundancy in the test suite (yet).
Let's say you then choose to tape a screwdriver onto the shaft of the fan and use that screwdriver to turn a long screw that moves a marble up an inclined plane. That's like what a screwdriver does, it turns screws. We have 'in_with_screw' to show us that. Is this test redundant? It's the same thing, isn't it?
It's not the same thing. One, the screw is not pulling itself into a block of material (or a block of material toward it). Two, the condition we would assert to be true is something along the lines of 'marble moves up the inclined plane', not 'screw disappears into wood'. So it's not redundant at all. It is a matter of perspective.
So given this assertion that redundancy is a matter of perspective I probably need to go think up things that are genuinely redundant and draw some examples from that. I have a story I'll share on the topic in an upcoming post.
02 June 2017
A Bit More on TDD and Redundancy
So I started this big post with lots of code talking about design and how it impacts redundancy. It got out of hand and needs some significant editing before I can publish it. I'm going to try to summarize the point without the code here and then follow up with some examples.
As I think about redundancy in a test suite I imagine conditions where I have poorly organized code that causes me to repeat tests as setup to other tests. This tells me that the code is too complicated. If you look around, people will tell you things like a cyclomatic complexity greater than five is too big, too complicated. Well, a function with that much complexity is going to need quite a few tests, and in many cases it will get repetitive. That is a manifestation of redundancy by bad design.
Proposal: break things down. Let's try a cyclomatic complexity of two per method. Test each of those things independently. Then test their aggregates. We end up with many more tests but less redundancy. I haven't run the math, so I can't give you a straight number here, but what I experienced was two layers of tests, almost 50% more tests, but absolute clarity about what each function does and how all functions are grouped together. I was also able to be very expressive in naming, and I feel like the code (despite its poor contextual description) was easy to understand.
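A hypothetical sketch of the shape I mean; each small method carries at most one decision, and the aggregate composes them (the fare rules here are invented for illustration):

    # Each helper carries one decision (cyclomatic complexity of two);
    # each can get its own tests, and fare() gets the aggregate tests.
    class FareCalculator
      BASE_FARE = 2.0

      def fare(distance_km, rider_age)
        (BASE_FARE + distance_charge(distance_km)) * age_discount(rider_age)
      end

      def distance_charge(distance_km)
        distance_km > 10 ? 5.0 + (distance_km - 10) * 0.25 : distance_km * 0.5
      end

      def age_discount(rider_age)
        rider_age >= 65 ? 0.5 : 1.0
      end
    end

    calc = FareCalculator.new
    puts calc.fare(4, 30)  # => 4.0
    puts calc.fare(4, 70)  # => 2.0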
So despite my efforts to address redundancy in tests as a 'here is where you should cut things out', I can't find an answer to that. Every time I try, I find a new set of tests to add.
After I work out my code-based example of this (next post or two, I hope), I'm going to try to address issues of Integration Tests and Acceptance Tests as redundant to Unit Tests. I'll put forth the argument that what you see as redundancy is not redundant; it's overlapping concerns; further justification to not delete anything, I suspect.