I'm going to talk a bit about design in this post. More specifically, I'm going to talk about sensitivity to design issues. But, before that, I'd like to start with an exercise.
Grab some paper and a pencil. Here's an example that is based on some code I saw last week ("Ripped from the Codebase!"). Spend 30 seconds looking at it, and then write down the answers to the questions that follow the code:
private void CreateDataSource(DataSource dataSource)
{
    if (dataSource == null)
    {
        throw new ArgumentNullException();
    }
    if (string.IsNullOrEmpty(dataSource.CatalogName) ||
        string.IsNullOrEmpty(dataSource.ServerName) ||
        dataSource.Username == null)
    {
        throw new NullReferenceException();
    }
    if (dataSource.Username != String.Empty &&
        string.IsNullOrEmpty(dataSource.Password))
    {
        throw new NullReferenceException();
    }
    using (LegacyService legacyService = new LegacyService("http:/service.example/com/datasource"))
    {
        DataSourceLegacy dataSourceLegacy = new DataSourceLegacy
        {
            ServerName = dataSource.ServerName,
            Username = dataSource.Username,
            Password = dataSource.Password,
            CatalogName = "Legacy" + dataSource.CatalogName
        };
        legacyService.DataSourceCreate(dataSourceLegacy);
    }
}
How good is this code, and what would you do if you had to write tests for it? Write your answers down. I'll be coming back to those answers a bit later.
*****
In my first two posts, I talked about two things that could get in the way of the TDD flow. This post is about the ability to recognize design feedback – and know what to do about it – in good circumstances. And it also brings me back to what motivated this series.
My assertion is very simple.
Your success using TDD is directly tied to your skill at identifying design issues in your code and addressing them in ways that both make testing easier and make your code better.
Or, to state this in the opposite way, if you are not able to see the design issues in your code and fix them, TDD isn't going to work for you.
If you aren't on the "TDD is great" side of the discussion, then you may wonder if I am casting aspersions on your design skills. Yes, I am.
But, before I talk about that, I need to take a brief digression into the theory of knowledge. To simplify greatly, in any field there are:
- Things that we know. For example, I know that C# generics are implemented directly in the .NET runtime while Java generics use a technique known as erasure.
- Things that we know we don't know. I know that when you are writing code that runs on a GPU, you might write pixel shaders, but I'm totally vague on how the overall system works.
- Things that we don't know we don't know.
I can't come up with a current example for the third category, and worse, I can't even tell how big that category is. But I can give you an example of something that used to be in that category for me:
There is a code smell known as "primitive obsession". Before I became aware of it, I wrote a lot of code that had primitive obsession in it, but I was ignorant of it – I was not *sensitive* to that particular issue. Now I see it everywhere.
My experience is that most developers do not see design issues. And, if you are not able to correct design issues, TDD isn't going to be a lot of fun, because you aren't going to get to better designs.
How do we improve our design skills? Here are a few ideas:
- Pair with somebody that is good at design (though keep in mind that both of you are subject to the third category above).
- Do speculative refactoring of a codebase you own – work to make something better while planning on throwing away the first attempt.
- Work on a kata or exercise and compare what you have done with others.
- Find code that feels less-than-optimal but that you don't know how to improve, create an example out of it, and ask somebody how to make it better.
- Pick a code smell, and look for it in your code.
The first one is much much better than the alternatives; if you are effectively pairing you will learn things while you are getting other work done.
Let's go back to my code example and treat it as an exercise. To make it easier, I created a project here. The method of interest is in DataServer.cs; the rest is just stubbed out. You can follow along in my changes in the Eric-Refactoring branch.
I suggest that you go off and try your hand at making it better, and then come and see what approach I took. I'm asking you to do this because I want to see if you are sensitive to the same things I am sensitive to and whether we will address issues in the same way.
Make it testable
As you probably noticed looking at the code, the current implementation is not testable. The crux of the problem is the following line:
using (LegacyService legacyService = new LegacyService("http:/service.example/com/datasource"))
Because the LegacyService is created in code, there is no way to "get behind" it with a test. So, I did the usual thing:
- Created an ILegacyService interface
- Modified LegacyService to implement it.
- Created LegacyServiceMock
- Used the mock in the tests
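The steps above can be sketched roughly as follows. This is a minimal illustration rather than the code in the repository: the `DataSource` and `DataSourceLegacy` stand-ins only carry the properties shown in the post, constructor injection into `DataServer` is my assumption, and validation is left out to keep the focus on the seam. The real `LegacyService` would also be changed to implement `ILegacyService`.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-ins for the post's data classes (property names taken from the post).
public class DataSource
{
    public string CatalogName { get; set; }
    public string ServerName { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
}

public class DataSourceLegacy
{
    public string ServerName { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
    public string CatalogName { get; set; }
}

// The seam: DataServer talks to this interface instead of constructing LegacyService.
public interface ILegacyService
{
    void DataSourceCreate(DataSourceLegacy dataSourceLegacy);
}

// Hand-rolled test double that records every call so a test can inspect it.
public class LegacyServiceMock : ILegacyService
{
    public List<DataSourceLegacy> Created { get; } = new List<DataSourceLegacy>();

    public void DataSourceCreate(DataSourceLegacy dataSourceLegacy)
    {
        Created.Add(dataSourceLegacy);
    }
}

public class DataServer
{
    private readonly ILegacyService _legacyService;

    // Assumption: the service is handed in through the constructor.
    public DataServer(ILegacyService legacyService)
    {
        _legacyService = legacyService;
    }

    public void CreateDataSource(DataSource dataSource)
    {
        // Validation omitted in this sketch.
        var dataSourceLegacy = new DataSourceLegacy
        {
            ServerName = dataSource.ServerName,
            Username = dataSource.Username,
            Password = dataSource.Password,
            CatalogName = "Legacy" + dataSource.CatalogName
        };
        _legacyService.DataSourceCreate(dataSourceLegacy);
    }
}
```

A test can now construct `new DataServer(new LegacyServiceMock())`, call `CreateDataSource`, and inspect `Created` without touching the real service.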
I have now met the minimum bar for TDD; the class is testable. And this is where I see a lot of developers stop. But, I don't like the current state for a couple of reasons:
- Tests for the DataSource validation code seem like they belong somewhere else, not in the tests for the DataServer class.
- Copying from the DataSource class to the DataSourceLegacy class also doesn't seem like it belongs in the DataServer class, and to test it, I'm going to have to create a mock for the LegacyService. That seems unwarranted.
DataSource validation
What I'm seeing here is a variant of the primitive obsession code smell; we have code that is directly related to a data source but it lives somewhere else. That is unfortunate.
So, I created a method from the validation code, moved it to the DataSource class, and changed it to an instance method. Now the server code looks like this:
public void CreateDataSource(DataSource dataSource)
{
    if (dataSource == null)
    {
        throw new ArgumentNullException();
    }
    dataSource.Validate();
    DataSourceLegacy dataSourceLegacy = new DataSourceLegacy
    {
        ServerName = dataSource.ServerName,
        Username = dataSource.Username,
        Password = dataSource.Password,
        CatalogName = "Legacy" + dataSource.CatalogName
    };
    _legacyService.DataSourceCreate(dataSourceLegacy);
}
This is better.
I tend towards data classes that are pretty dumb – ideally, they just hold data, and it makes me a bit uncomfortable that this class now does more.
I'm also not really happy with having to remember to call Validate(), and checking whether a data source is valid doesn't seem like it's something this code should have to do. Can we improve that?
Well, the crux of the problem is one of validity. Can we make a data source that is guaranteed to be valid?
Right now, it's problematic because of how construction is done:
DataSource dataSource = new DataSource
{
    CatalogName = "Fred",
    ServerName = "http://example.com",
    Username = "Mister Slate",
    Password = "Rock on"
};
What I care about is the final state of the data source, but this creation is equivalent to creating an empty data source and setting the properties, so it will be invalid during construction. That is bad.
Let's switch to a constructor-based initialization pattern, and make all of the properties read-only. We will call our validate function in the constructor, it will throw if there are issues, and then we will know that every data source instance is always valid. We also get immutability out of it, which makes me happy.
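Here is a minimal sketch of what that constructor-based, always-valid version might look like. The validation rules are the ones from the original method; the parameter order and the switch from NullReferenceException to ArgumentException are my assumptions for the sketch, not something taken from the repository.

```csharp
using System;

public class DataSource
{
    // Read-only properties: once constructed, an instance cannot change.
    public string CatalogName { get; }
    public string ServerName { get; }
    public string Username { get; }
    public string Password { get; }

    public DataSource(string catalogName, string serverName, string username, string password)
    {
        CatalogName = catalogName;
        ServerName = serverName;
        Username = username;
        Password = password;

        // Throws if the values are not acceptable, so no invalid instance can exist.
        Validate();
    }

    private void Validate()
    {
        if (string.IsNullOrEmpty(CatalogName) ||
            string.IsNullOrEmpty(ServerName) ||
            Username == null)
        {
            // Assumption: ArgumentException instead of the original NullReferenceException.
            throw new ArgumentException("CatalogName and ServerName must be non-empty, and Username must not be null.");
        }

        // A non-empty username requires a non-empty password.
        if (Username != string.Empty && string.IsNullOrEmpty(Password))
        {
            throw new ArgumentException("A password is required when a username is supplied.");
        }
    }
}
```

With this shape, the construction shown earlier becomes `new DataSource("Fred", "http://example.com", "Mister Slate", "Rock on")`, and code that receives a DataSource no longer has to remember to call Validate().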
The problem we had with validation here is one I see a lot; most typically we see a string that has certain constraints that are checked in multiple places in the codebase (and often not checked when they should be). It's the textbook example for primitive obsession, and if you create a type and make it always valid, it can significantly improve the code.
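As a small illustration of that string case, here is a hypothetical `CatalogName` wrapper type. Neither the type nor its rule comes from the codebase in the post; it is just one way the "create a type and make it always valid" idea could look for a single constrained string.

```csharp
using System;

// Hypothetical wrapper: a catalog name that can never be null or empty.
public sealed class CatalogName
{
    public string Value { get; }

    public CatalogName(string value)
    {
        // The constraint is checked once, here, instead of at every call site.
        if (string.IsNullOrEmpty(value))
        {
            throw new ArgumentException("A catalog name must be non-empty.", nameof(value));
        }
        Value = value;
    }

    public override string ToString() => Value;
}
```

Any method that takes a `CatalogName` instead of a `string` can skip the check, because the only way to get an instance is through the constructor.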
Copying new to legacy
Next up is dealing with the copying code; we can extract it to a method, and then move it to a class. But where should it go?
Our options are:
- Put it on DataSource
- Put it on DataSourceLegacy
- Put it in a new class
The choice here is mostly about coupling and how we are structuring our dependencies. In the code that inspired this, the DataSource and DataSourceLegacy classes live in two different worlds that really shouldn't know about each other, so I'm going to choose the third option. In other cases, it is simpler to have the code live in DataSource or DataSourceLegacy. Since we are talking about TDD, I suggest that you be guided by how hard it is to write the tests.
Now that the copying code is pulled out into a separate method, we can easily write tests for that method.
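A sketch of the extracted copying code, assuming the third option. The class and method names match the call that appears in the final version of CreateDataSource; making the method static is my reading of that call, and the two data classes are trimmed stand-ins (validation omitted) so the example is self-contained.

```csharp
using System;

// Trimmed stand-in for the constructor-initialized DataSource (validation omitted here).
public class DataSource
{
    public string CatalogName { get; }
    public string ServerName { get; }
    public string Username { get; }
    public string Password { get; }

    public DataSource(string catalogName, string serverName, string username, string password)
    {
        CatalogName = catalogName;
        ServerName = serverName;
        Username = username;
        Password = password;
    }
}

public class DataSourceLegacy
{
    public string ServerName { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
    public string CatalogName { get; set; }
}

// Lives in its own class so that neither data class has to know about the other.
public static class DataSourceCopyer
{
    public static DataSourceLegacy CopyDataSourceToDataSourceLegacy(DataSource dataSource)
    {
        return new DataSourceLegacy
        {
            ServerName = dataSource.ServerName,
            Username = dataSource.Username,
            Password = dataSource.Password,
            CatalogName = "Legacy" + dataSource.CatalogName
        };
    }
}
```

A test for it needs no mock at all: build a DataSource, call the method, and check the four properties on the result, including the "Legacy" prefix on the catalog name.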
At this point, we have pushed a bunch of scenarios to other classes, so we only have two cases left:
- The check for a null DataSource
- One test to verify that a DataSource that gets passed in actually makes it through to the legacy service.
Are we done?
This state seems like a pretty good one, and I would probably stop here. But can we go farther?
One option is to lose the null test. When I suggest this sort of thing, people generally look at me as if I had grown a second head (and, perhaps, a third arm…), but I bring it up because I want to challenge some preconceptions.
There are a lot of people out there that got told that you always have to validate your parameters. Some of them confuse the end – detecting mistakes or enforcing security requirements – with the means – adding checks.
If there are security or correctness issues, then you need the check. If the user would be confused by the exception that would be thrown and end up wasting time, then you need the check. But if it's obvious what the issue is without the check, then I advocate getting rid of it.
In this case, I don't see a ton of utility in the null check; the DataSource instance is the only thing passed into the function, and in this usage it's passed from other code. I think that pretty much everybody is going to look for a null argument if it throws a NullReferenceException.
So, I pulled out the null check. And, yeah, I think we're done, which leaves us with the following:
public void CreateDataSource(DataSource dataSource)
{
    var dataSourceLegacy = DataSourceCopyer.CopyDataSourceToDataSourceLegacy(dataSource);
    _legacyService.DataSourceCreate(dataSourceLegacy);
}
How sensitive are you to design feedback? Did you see the same things that I saw? Did you address them the same way, or did you do something different? Feel free to share your experience in the comments section.