Tuesday, August 12, 2008

Simple and obvious first

Logically, when you're trying to debug an issue, you should start at the bottom. The simple answers are the easiest and the quickest, and let you get back to solving the real problems. Unfortunately, something usually comes between you and the easy answers: assumptions.

In our ever-present quest to go faster and get through things quicker, we cut corners, we make dumb mistakes, and we take on faith that others have performed due diligence so that we can start with high-level troubleshooting.

Having been a tech support agent for years, I'm acutely aware of this. And having been a sysadmin for over half a decade, I should be used to this situation by now, but that's not the case. Oh, how I wish it were.

The reason that I bring this up is that yesterday I spent a couple of hours troubleshooting a missing set of quotes. I know, I know, how can you spend a couple of hours fixing something that obvious? It started with a simple issue: A shell script wasn't passing the right filename to a program that it called.

I was brought in to figure out how to get an argument given to the script passed along to the program correctly. I was given this example:

$ shellscript.sh arg1 arg2 "/path/to/file with spaces in name"

Inside shellscript.sh, a program was called like this:

/path/to/program $1 $2 $3

although I didn't know that at the time. I solved the first problem, that the argument to the shell script wasn't being interpreted correctly.

I eventually wrote a slick for loop to handle the arguments without breaking the filename into separate tokens. After testing it, I presented it to the person who asked me to look at it. After he installed it into the program, it still didn't work.
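I don't have that loop to show here, so what follows is only a rough sketch of the general idea, assuming a POSIX shell. The function name print_args is made up for illustration. The point is that iterating over "$@" in double quotes hands you each original argument as a single token, spaces and all.

```shell
#!/bin/sh
# Hypothetical sketch, not the loop from the story.
# Iterating over "$@" (in double quotes) keeps every argument intact,
# even when it contains spaces.
print_args() {
    n=0
    for arg in "$@"; do
        n=$((n + 1))
        printf '%d: %s\n' "$n" "$arg"
    done
}

print_args arg1 arg2 "/path/to/file with spaces in name"
```

The third line of output is the whole filename, not five separate words.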

By this time, he was tired of dealing with it and asked me to work on the actual script. Of course, I looked at the script, saw the above line of code, and added quotes around the $3.

Obviously, the script worked perfectly after that.
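For anyone wondering why a pair of quotes matters so much, here's a minimal sketch of what the shell does with that line, assuming a POSIX shell with the default IFS. count_args is a made-up stand-in for /path/to/program (not the real program) that only reports how many arguments it received, and set -- fakes the three arguments the script was called with.

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for /path/to/program:
# it just reports how many arguments it received.
count_args() {
    echo "$#"
}

# Simulate the positional parameters the script was called with.
set -- arg1 arg2 "/path/to/file with spaces in name"

# Unquoted: the shell splits the expanded values on whitespace.
unquoted=$(count_args $1 $2 $3)

# Quoted: each parameter is passed along as exactly one argument.
quoted=$(count_args "$1" "$2" "$3")

echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted arguments"
```

Unquoted, the program sees 7 arguments because the filename is split into five words. Quoted, it sees the 3 it was meant to get.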

My mistake was to assume that the script was written correctly and that the real issue was the one I was given. It might have taken 30 extra seconds to look at the contents of the script in the beginning. But I didn't, and I lost a couple of hours for my trouble.

Let this be a lesson to me. And you. Learn from my example, and verify your assumptions.