24 June 2018

Flow Control with Exceptions

Recently the topic of using exceptions for flow control reared its ugly head. This topic seems to show up in my life every few years so I thought I'd share some things I've learned over the past 25 years of dealing with exceptions.

Don't Do It!

OK, first, just don't. Don't use Exceptions explicitly for flow control. In fact, don't use Exceptions if you can help it. Exceptions should be the result of something essentially beyond your control. The name says it all: Exceptions are exceptional -- your handling of an exception should be to deal with the unexpected, however cynical you might be.

General Handling of...

So you really should try to avoid handling exceptions. That is, you should only handle an exception you can do something about.  A typical good pattern for any piece of software is to have one exception handler at the top (closest to the invocation point) and handle everything there, usually with a polite message indicating that a system error has occurred.
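
A rough sketch of that pattern, in Ruby just for illustration (run_application is a made-up stand-in for whatever your real entry point is):

require 'logger'

LOGGER = Logger.new($stderr)

# Stand-in for the real entry point; imagine the rest of the system underneath it.
def run_application
  raise "something unexpected happened deep in the stack"
end

# The one top-level handler, closest to the invocation point.
# Everything below it just lets exceptions propagate up to here.
begin
  run_application
rescue StandardError => e
  LOGGER.error("#{e.class}: #{e.message}")   # the detail, for whoever can act on it
  puts "Sorry, a system error has occurred." # the polite message, for the user
end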

I Can Handle It

There are some exceptions you can handle. File Not Found Exception is a pretty common one that you can generally handle. Now by handle, what do I mean? Well, in some cases it might mean printing a helpful error message for the user. In other cases it might mean creating or downloading the missing file, or using a default configuration.

When you are doing this, you are not using an Exception for flow control. You are using an exception to identify and handle an unexpected (but possible) condition in your application. 
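
For instance, the 'use a default configuration' case might look something like this (the file name and default values here are invented):

require 'yaml'

# Fallback used when the real config file is missing (values made up for this sketch).
DEFAULT_CONFIG = { 'log_level' => 'info' }

def load_config(path = 'config.yml')
  YAML.load_file(path)
rescue Errno::ENOENT
  # The file being absent is unexpected but possible; handle it right here,
  # concisely, and let normal flow resume with sensible defaults.
  DEFAULT_CONFIG
end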

Some other pointers for handling exceptions: handle them immediately and concisely. That is, don't try to over-generalize the handling of possible exceptions (other than the aforementioned top-level handler). When exceptions are handled, get to the point, handle them quickly and without too many gyrations, then resume normal flow.

Where Does It Get Messy?

Things usually get messy in highly modularized code bases. For example, you might have 20 libraries as dependencies of your application, but you wrote all of those libraries and the application, so all of it is your code. This can make it hard to tell when you are using an exception for flow control and when you are dealing with things outside of your control.

An easy way to work through this is to ask: if the library throwing the exception were an open source library, would you still throw the exception? If you wouldn't do this to a stranger on the internet, don't do it to yourself.

Similarly, if the library throwing the exception was some OSS library you'd pulled off Maven Central how would you handle the exception? Same rule applies to the library you wrote.

Don't Over Complicate

As with most things, it is best if we don't overcomplicate the matter at hand. Exceptions are part of our languages. There are penalties to using them, but there are also advantages. When considering how to use an exception, think about the developer who comes after you. What will make sense to them? That is what you should do. When in doubt, ask someone how they would expect things to work. 





19 June 2018

Automate Everything

Stop me if you have heard this one before. No, don't -- read this one again.

Back in 1988-89 I had a job as an assistant systems operator working for a really cool guy named Jason. Mostly I ran backups and did other really simple SysOp work and I probably spent more time learning csh and making patch cables for the machine room than doing much else. But I still learned a lot in this job. 

The most important lesson I learned was, automate everything.

It came up one day that there seemed to be a lot of idle time in the life of a SysOp. Roughly 80% of the time was available for projects like 'make patch cables' or 'clean the attic'. So I asked Jason,

"How is it that we have so much spare time? When are we doing to do some SysOp-ing.?"

He said, "We are! Everything is automated. When I come into the office in the morning I check my email. I review the reports generated by the automated scripts, and if nothing is wrong I have to make stuff up for us to do all day." 

At the time it was sort of a "Ha Ha" moment and I didn't think about it too much. Years later I realized that Jason and the other Real SysOps™ had automated every single task they had to perform on a regular basis. They needed guys like me to change the tapes in the Exabyte drive, but not much else. And as long as things went well, there wasn't much to do.

That left lots of free time for other pursuits. Like thinking about how to make things better, more automated. They were basically working to eliminate their own jobs. As a consequence they could work on more interesting things (homework, pet projects, etc.). I wish I'd had a clue back then, but I have one now. 

By automating away all the mundane things we can create more space to think through tough problems, innovate, or just generally sleep better.

So I have been applying this sort of thinking since back in the day, generally with good success. I admit, sometimes it takes me a long time to figure out how to automate things. I certainly have grown to despise things that are hard to manipulate with scripts and macros. What I've gotten in the end is a fairly simple life.

One example is a side project I'm working on. I've automated nearly everything. I did it in the Unix Way (small, atomic/acidic scripts that only do one thing). I can use all that automation to my advantage. When my partners in mischief call with an issue I can usually bang out two or three simple commands to 'fix things'. Or send instructions like "Run script X. Delete thing Y. Then restart with command such-and-such". Honestly, if I could anticipate the contortions in advance, I could get most of this down to one script.

What this has given me is the opportunity to think about the Hard Parts™ of the system and then arrive at clever solutions. Rather than spend days trying to build a DAL (data access layer) for the application, I spent a day deriving a generic library that works across all of the domain objects and tables. How'd I do that? Well, I didn't spend all day manually coding up a bunch of one-off objects; I automated the construction, testing, and deployment of those things. The test cycle is about 6 seconds. I was able to iterate over my clever solution so fast that it was almost (but not quite) painless to create.

Automation is your friend. It may not be sexy and glorious, but it will enable you to do great things. So go out there and automate everything.

18 June 2018

Clever is the Enemy of Good, Part send(f(time.now)+hostname)

So in a recent coding adventure I came across some really super things. One of my favorites worked as follows. </snark>

* Get a reference to a production domain class that contains a list of event types
* Get the names of the event types as strings
* Split the strings on '.'
* Use the last element of the returned list to create a snake case string (from the camel case value)
* Use send to find a method on the current object with the same name as the string
* Assemble the results into a list
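
Roughly, the shape of it (with the class and method names changed to protect the guilty) was something like this:

let(:events) do
  # EventCatalog::TYPES stands in for the production class that lists the event types
  EventCatalog::TYPES.map do |type|
    name = type.to_s.split('.').last                          # "some.namespace.Event1" -> "Event1"
    snake = name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase   # camel case -> snake case, "Event1" -> "event1"
    send(snake)                                               # look up a let/helper of the same name on self
  end
end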

Now I'm down for some good old-fashioned reflection/introspection and general meta-programming. There are plenty of times where it's the right thing and it makes sense.

Your test setup code is not this place. 

This example I've laid out took about 20 lines of setup code and resulted in roughly this:

let(:event1) { Event1.new }
let(:event2) { Event2.new }
let(:event3) { Event3.new }
let(:events) { [ event1, event2, event3 ] }

Why would you put all this complicated junk in your test? 

I have only one guess: Future Proofing. The only genuine motivation I can see for using a complicated setup for such a simple thing is a presumption that one day there will be more events and we will want to test them all.

This is wrong thinking. First, don't future-proof your test code. It will be necessarily vague and not result in anything very helpful or useful in a future that might never come. Second, you've now made a simple thing very complicated, to the detriment of readability.

Our first goal in TDD is to understand our system; to determine what code must be created by explaining it in terms of test code. Something like this is clearly not the development of understanding. I'm pretty confident that it's an example of test-after development, although I didn't check.

One of the secondary effects of TDD is that we leave behind an explanation of how the system works. Not of how it was implemented necessarily, but of what we expect it to do. Having a let() that is 20 lines long and uses reflection to assemble a list of 3 items is not clear, concise, or helpful. 

So in both cases a test like this misses the mark for good TDD. 

15 June 2018

TDD Preconditions, Moar Design Pressure

As test drivers we need to listen to that design pressure and simplify.

I recently spent several days dissecting a single RSpec file that was 1300+ lines long. My pair partner and I extracted a single context of 250 lines into a new file and hauled 105 lines of setup code along for the ride. There were 103 let statements and two subjects. That's not to mention the event machine testing mix-in and the various event mothers.

In the end we got it working, but it took far longer than it should have. There was plenty of time spent questioning our understanding of the system and how it should actually behave. "Had we extracted the correct setup?" and "Did this test work before we did the extract?" became our repeated refrains. So we were constantly flipping back and forth to another branch and running the test suite to ensure that we weren't screwing things up.

We got the job done, but here are some things we learned.

1) Tests with preconditions aren't really helpful in explaining anything to the reader. It seems like they should be, but they just kept confusing us. In fact, once we became familiar with the test configuration (getting the file trimmed down to < 300 lines) they were redundant. This is clear evidence that, if the test module is properly formed, the preconditions aren't necessary; hence they are a smell.

Have you ever seen a test that looks like this:
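
(This example is made up -- the names queue, event, and publish are invented -- but it has the shape I mean.)

it 'adds the event to the queue' do
  expect(queue.events).to be_empty   # precondition: assert the starting state

  queue.publish(event)               # execute the code under test

  expect(queue.events).to eq([event])
end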


I don't like this test. The precondition (the assertion before the execution) is telling me something is wrong. Mostly what it is telling me is that the system is complicated enough that I need to establish the current state before I can even start executing. 

That's a design smell if there ever was one.

What that precondition is telling me is that our test has become so complicated we are unsure of how the setup works, and therefore our test code needs a test. That's bad.

2) (off topic but important) Defaults that seem reasonable to you aren't necessarily reasonable to anyone else. When you are dealing with 1000 lines of test code and numerous external factories and fixtures, you can get lost and confused very quickly. It doesn't help if an external testing library sets up conditions that aren't explicit but have significant consequences. Clever is the enemy of good. Don't use an unusual setting or configuration just for fun in your defaults, and if you do, make it super obvious that you are doing so or the developer who comes after you might spend a day chasing their tail.

3) Most importantly: listen to the design pressure your tests provide. If you feel compelled to make an assertion about the state of the system before you execute the code under test, your code is telling you 'Hey, I'm complicated!'. Part of our goal is to not have complicated things. So do something about it.


14 June 2018

Sekuretee!

Good morning internets. Just wanted to point out that this blog is now available via https, so update those bookmarks to https://blog.noradltd.com 

13 June 2018

Listen to Setup Pressure

Joining an existing project can be an overwhelming experience. It seems that most often when I start a new engagement there are a half dozen technologies that I haven't got installed, or have the wrong version of, or maybe haven't ever used. There are configuration files, manual setup steps, and other churn to work through just so you can run a build, let alone do any meaningful work at all. In many cases this configuration period can take an entire week and it is often very difficult to tell if you have done it correctly.

My best-ever project in this regard took 2 hours to configure. We spent over a month building an automated script to assemble everything. On joining the project the instructions were these:

* Download and install Java (I think it was 1.2)
* Download and install ANT
* Connect to source control and pull the project
* Run ant build_all
* Wait 90 minutes
* Done.

Our build script was pretty smart. It would download, install, and configure your IDE, Database, Application Server, test suite, and supporting tools; compile, package, and build everything; and run the full test suite.

I wish that every project could be like this. It paid off tremendously. Every team member who joined went through this process, and every time we replaced a laptop we went through it again. Each time it saved days of work we would otherwise have done manually.

On top of that, everything was (for some definition of it) documented by the ANT script and the files that supported it. Everything inherently had configuration management via the ANT script. And, it wasn't even that complicated to follow along with what it was doing. 

We religiously maintained that script through the entire project.

Almost (but maybe not quite) every project I have ever been on has required me to kill at least a day doing setup work. There are passwords to exchange, keys to set up, IP restrictions to modify, not to mention all those tools, packages, and settings to adjust. To make it worse, if you work with more than one client you might have to figure out how to get multiple configurations to cooperate. It's a tragic mess.

I recently started with a client where I spent the better part of four days configuring my laptop. I recklessly trashed other configurations on my system to ensure that I was set up for them, so I spent no time trying to make things work in two worlds. Several weeks later I **think** my environment and configuration are correct, but I'm not really sure. Nobody is really sure. To make matters less comfortable, the configuration is touchy. Because it isn't fully automated, if I delete the Docker volume for the testing database there are six or seven manual steps that I need to execute to restore it. I had to put those into Evernote because they aren't documented well and they aren't particularly obvious (to me).

The team doing this work are a great bunch of people and they mean well, but they have a lot of pressure to deliver and have not had the time to automate these parts of their configuration. They are overburdened with new work, defects, operational maintenance, and other tasks, and have not found the time to go back and clean these things up. That's despite their very strong desire to do so.

I suspect that they, as a group, have become numb to the time consuming activities of fixing their configurations and setting things up; likely they are only working on this one thing and don't get clobbered by having multiple projects on their systems. For them, it's become a distant memory of a growing pain from years back. 

This configuration friction is more than a growing pain. It is a design pressure and it's trying to tell us something. Mostly, your system is complicated. That complexity needs to be dealt with. 

We can talk about the monetary cost of all this, but that is somewhat academic. What is more interesting is the psychological impact this has on a team. It plays out one of two ways, both bad. 

A team might avoid updates/upgrades/changes, or relegate that work to 'system experts' in order to avoid the distant but remembered pain of dealing with the complicated setup, OR the team will become more and more lax about how the configuration works and is documented; basically they will ignore the issues in the hopes that they go away. Both of these are costly in terms of time and money, and risky in terms of stability and sustainability. 

If all the work is pushed off to 'system experts' there are huge risks. One, that the team doesn't really understand its own system's configuration, and two, that the experts will fly away with the information, leaving the team with an archaeology project when change is necessary. Furthermore, it constrains the breadth of decision making to a select few, possibly risking opportunities for innovation and growth. Both not good.

More obviously, if the team simply ignores the problem there is a high potential for the system to become stale and prone to security exploits. It's important to keep systems up to date with respect to the technology stack, both from a security perspective and an operational cost perspective. If the libraries get too far out of date, or the OS requirements get sunsetted, what will you do? Spend a month doing an upgrade? Do you trust your test automation that much?

Lastly, and maybe most importantly, working with such systems can be demoralizing. Starting up on a project where you feel 'dumb' for a few days or weeks while you try to work out the configuration isn't a lot of fun. Worse, when you start to feel inhibited by the difficulties of configuration, like you can't change things because of the risk of breaking them, you may start to lose your drive to innovate and make the best possible solutions.

Setup and build automation are serious aspects of good software development and should be treated as such. It may not be the sexy and glorious activity that you want to engage in daily, but it supports those activities, and when done correctly it gives you more freedom to do what's most fun about your project.