markdown keeping_unit_test_setup_essential.md
Posted
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了markdown keeping_unit_test_setup_essential.md相关的知识,希望对你有一定的参考价值。
# Essential & Relevant: A Unit Test Balancing Act
I have never been a fan of "DRYing" out unit tests. I have always preferred to keep all of my test setup inside each individual test, and I pined about how this made my test suite more readable, isolated, and consistent; despite all of the duplication. I've never been any good at articulating _why_ I preferred to do things this way, but I felt that it was better than the alternative: a test suite full of shared setup methods that forced me to scan many lines of code to try to understand how the tests work.
Then, I began reading [xUnit Test Patterns](https://www.amazon.com/xUnit-Test-Patterns-Refactoring-Code/dp/0131495054/) by Gerard Meszarso. In his book, he codified some of the most profound formulas for writing unit tests. Throughout, the recurring theme around test setup was to keep things **essential & relevant**, and as I continued read, I began to understand that writing good unit tests is much more nuanced than I bargained for.
First, we'll define what unit test setup is. Next, we'll take a dive into examples of when we would want to extract test setup into a method. First, we'll look at when setup is **essential but irrelevant**, exposing symptom called [Irrelevant Information](http://xunitpatterns.com/Obscure%20Test.html#Irrelevant%20Information). Then, we'll a look at then the setup for a test that has **essential & relevant information *missing***, something called [Mystery Guest](http://xunitpatterns.com/Obscure%20Test.html#Mystery%20Guest). After we get our feet wet, we'll audit some code from the [Rails](https://github.com/rails/rails/) codebase to see how these ideas apply in practice.
## The Four-Phase Test
Of all testing paradigms, the most well known is probably [The Four-Phase Test](http://xunitpatterns.com/Four%20Phase%20Test.html). Later disseminated as a distilled variant, "[Arrange, Act, Assert](http://wiki.c2.com/?ArrangeActAssert)," (AAA) (and it's BDD variant "[Given, When, Then](https://martinfowler.com/bliki/GivenWhenThen.html)"), the core of it remains the same: all unit tests, in all programming languages, can take the following form:
```pseudocode
test do
setup
exerise
verify
teardown
end
```
In the **setup** step, we instantiate our [system under test](http://xunitpatterns.com/SUT.html), or SUT, as well as the _minimum number of dependencies_ it requires to ensure it is in the correct state:
```pseudocode
user = User.new(first_name: "John", last_name: "Doe")
```
In the **exercise** step, we execute whatever behavior we want to verify, often a method on our subject, or a function we're passing our subject into:
```pseudocode
result = user.full_name()
```
In the **verify** step, we assert that the result of the exercise step matches our expectation:
```pseudocode
assert(result == "John Doe")
```
Finally, in the **teardown** step, we restore our system to its pre-test state. This is _usually_ taken care of by the language or framework we're using to write our tests.
All together, our test ends up like so:
```pseudocode
// Example 1
...
describe("User#full_name") do
it("returns the full name of the user") do
user = User.new(first_name: "John", last_name: "Doe")
result = user.full_name()
assert(result == "John Doe")
end
end
...
```
It's in the "setup" step where we want to establish the **essential & relevant** information needed throughout the test case. That's when it dawned on me: perhaps the solution isn't to be wholly against removing duplication from test setup, but instead to learn how to do it well.
In Example 1, we're verifying that a user's full name is the concatenation of their first and last, therefore, including their first and last name explicitly within the test setup is both _essential_ and _relevant_.
## Irrelevant Information
As an example of **essential but irrelevant** test setup , we could tweak our original assertion like this:
```pseudocode
// Example 2
...
describe("User#is_logged_in?") do
it("returns false by default") do
user = User.new(first_name: "John", last_name: "Doe")
result = user.is_logged_in?()
assertFalse(result)
end
end
...
```
Here, instead of testing `user.full_name()` as the concatenation of `first_name` and `last_name`, we're testing that the user returned by `User.new()` responds to the `is_logged_in?()` message with `false`.
Is having a `first_name` and `last_name` *relevant* to `is_logged_in?()`? Perhaps not, but perhaps a user is only valid with a `first_name` and `last_name`, which is what makes that setup *essential* to the test. In this case, the only *essential & relevant* setup we need explicitly in our test is a valid user who is not logged in.
Having this irrelevant setup makes for an Obscure Test of the [Irrelevant Information](http://xunitpatterns.com/Obscure%20Test.html#Irrelevant%20Information) variety.
> ...[Irrelevant Test] can also occur because we make visible all the data the test needs to execute rather than focusing on the data the test needs to be understood. When writing tests, the path of least resistance is to use whatever methods are available (on the SUT and other objects) and to fill in all the parameters with values whether or not they are relevant to the test.
>
> -*[xUnit Test Patterns](https://www.amazon.com/xUnit-Test-Patterns-Refactoring-Code/dp/0131495054/)*
We fix this by extracting a setup function/factory method:
```pseudocode
// Example 3
...
describe("User#is_logged_in?") do
it("returns false by default") do
user = default_not_logged_in_user() // setup function
result = user.is_logged_in?()
assertFalse(result)
end
end
...
```
The relevant information is here by way of the method name, and the essential setup is on the other side of the `defalt_not_logged_in_user()` method.
## Mystery Guests
Assuming there are a lot tests with similar setup, it's common to pull duplicated setup code into a setup function like the example above. This is also the solution to writing tests which have a verbose setup, and it helps us to ensure that we don't include any essential but irrelevant information in our tests:
```pseudocode
// Example 4
...
describe("User#full_name") do
it("returns the full name of the user") do
user: User.new(
first_name"" "John"
last_name: "Doe"
street_address: "1000 Broadway Ave"
city: "New York"
state: "New York"
zip_code: "11111"
phone_number: "555555555"
)
result = user.full_name()
assert(result == "John Doe")
end
end
...
```
In this case, it may be *essential* to instantiate a valid user with a `first_name`, `last_name`, `street_address`, etc., but it is *irrelevant* to our assertion!
However, like in Example 1, we're asserting against `user.full_name()`, and we established that including the `first_name` and `last_name` in the setup was in fact relevant to our test. So, if we used a setup function like that of Example 2, our setup would not be wholly essential & relevant:
```pseudocode
// Example 5
...
describe("User#full_name") do
it("returns the full name of the user") do
user = valid_user() // setup function
result = user.full_name()
assert(result == "John Doe")
end
end
...
```
This is a case where there is **essential & relevant information _missing_** from the test. This type of Obscure Test is called [Mystery Guest](http://xunitpatterns.com/Obscure%20Test.html#Mystery%20Guest).
> When either the fixture setup and/or the result verification part of a test depends on information that is not visible within the test and the test reader finds it difficult to understand the behavior that is being verified without first having to find and inspect the external information, we have a *Mystery Guest* on our hands.
>
> -*[xUnit Test Patterns](https://www.amazon.com/xUnit-Test-Patterns-Refactoring-Code/dp/0131495054/)*
The solutions here are to 1) create an explicitly-named setup function that returns the user we need, 2) create a setup function that returns a mutable user that we can update before our assertion, or 3) alter our setup function to accept parameters:
```pseudocode
// Example 6
...
describe("User#full_name") do
it("returns the full name of the user") do
user = valid_user(first_name: "John", last_name: "Doe") // new setup function
result = user.full_name()
assert(result == "John Doe")
end
end
...
```
This is called a [Parameterized Creation Method](http://xunitpatterns.com/Creation%20Method.html#Parameterized%20Creation%20Method) and we use it execute all of the essential but irrelevant steps for setting up our test. With it, we're able to keep our test setup DRY by creating a reusable method that keeps *essential* information inline.
## Let's Audit Production Code
With some this idea of _essential & relevant_ under our belt, let's take a look at some production code. The following examples are taken from the [Rails](https://github.com/rails/rails/) codebase. Make no mistake about the things I point out about the tests; not only do they work extremely well, they have stood the test of time. Please treat the following only as a thought experiment in order to hone our understanding of _essential & relevant_ unit tests. Lastly, it's a huge thanks to being open sourced that we can stand on the Rails maintainers' shoulders and learn from their work.
### Irrelevant Information
Let’s take a look at another test of in the codebase, this time from `ActiveRecord::Base`, where we [assert that unwanted characters are escaped from columns names](https://github.com/rails/rails/blob/master/activerecord/test/cases/base_test.rb#L70):
```ruby
# rails/activerecord/test/cases/base_test.rb
def test_column_names_are_escaped
conn = ActiveRecord::Base.connection
classname = conn.class.name[/[^:]*$/]
badchar = {
"SQLite3Adapter" => '"',
"Mysql2Adapter" => "`",
"PostgreSQLAdapter" => '"',
"OracleAdapter" => '"',
"FbAdapter" => '"'
}.fetch(classname) {
raise "need a bad char for #{classname}"
}
quoted = conn.quote_column_name "foo#{badchar}bar"
if current_adapter?(:OracleAdapter)
\# Oracle does not allow double quotes in table and column names at all
\# therefore quoting removes them
assert_equal("#{badchar}foobar#{badchar}", quoted)
else
assert_equal("#{badchar}foo#{badchar * 2}bar#{badchar}", quoted)
end
end
# ...
```
Here, we might be running into the **essential but irrelevant** issue, Irrelevant Information.
In this case, it seems important that we account for the different behavior based on the class name of the `ActiveRecord::Base.connection`. This is _essential_ to the test, as is the conditional further down, checking for a specific case. However, as an exploration, I would question how _relevant_ this information is to the setup portion of our test case.
Ultimately, the test is verifying that a call to `#quote_column_name()` with a `string`, will return a new `string` with certain characters escaped. Further more, the exact behavior of the `#quote_column_name()` method is based upon which database we are using, (this is _essential_ to how `ActiveRecord` remains database agnostic).
With that said, could this setup be hidden behind a well named setup method? Would it be easier to understand the assertion without the lengthy setup? What would it look like if we used our _essential & relevant_ methodology to refactor this test case? And, of course, which would you prefer to work with? Why? And does it matter?
It's worth noting that in this example, the importance of what this test verifies is extremely high! This test ensures that behavior which prevents SQL injection, (escaping quotes in column names), is working properly.
Given this context, does the lengthy setup information included within the test case become more _relevant_?
### Mystery Guests
This example is from `ActiveSupport`'s `hash_with_indifferent_access`, which allows us to [access Ruby hashes with keys that either strings or symbols](https://github.com/rails/rails/blob/master/activesupport/test/hash_with_indifferent_access_test.rb#L40), interchangeably:
```ruby
# rails/activesupport/test/hash_with_indifferent_access_test.rb
def setup
@strings = { "a" => 1, "b" => 2 }
@symbols = { a: 1, b: 2 }
@mixed = { :a => 1, "b" => 2 }
# ... omitted for brevity
end
def test_symbolize_keys_for_hash_with_indifferent_access
assert_instance_of Hash, @symbols.with_indifferent_access.symbolize_keys
assert_equal @symbols, @symbols.with_indifferent_access.symbolize_keys
assert_equal @symbols, @strings.with_indifferent_access.symbolize_keys
assert_equal @symbols, @mixed.with_indifferent_access.symbolize_keys
end
# ...
```
In this file, there are many more test cases like the one above, and they all rely on the same `setup()` method to instantiate instance variables.
Before we jump in and evaluate whether or not using a shared setup callback fits within our _essential & relevant_ exploration, it’s worth mentioning that these tests were originally written in January 2005, and the pattern they've introduced have stood up ever since!
With that said, we have some Mystery Guests on our hands: without scanning up and down the test file, we cannot easily extrapolate the cause and affect relationship of the assertion. This is a case of **_missing_ relevant & essential** information from our test case.
Consider that with this test alone, it can only be verified that seemingly _any_ object that we call `#with_indifferent_access.symbolize_keys()` on, returns whatever is stored in the `@symbols` variable. We can’t tell what `@symbols` is, nor can we easily deduce what state the `@strings` and `@mixed` objects are in _before_ we call `#with_indifferent_access.symbolize_keys()` on them.
This may not be a huge deal, after all, we can just easily look up to the `setup()` method and piece together the puzzle, but what about this assertion, over 400 lines below?
```ruby
# rails/activesupport/test/hash_with_indifferent_access_test.rb
def test_indifferent_to_hash
# ...
assert_equal @strings, @mixed.with_indifferent_access.to_hash
# ...
end
# ...
```
So, what gives? These tests have survived for two decades, but do they fit our _essential & relevant_ criteria? Are they difficult to understand? Does it matter? What about the multiple assertions in one test case?
Think about how you might refactor these tests to ascribe to the idea of _essential & relevant_, and ask yourself which version you’d prefer to work with and why.
------
After all of this exploration, I wish I was left with an easy answer about wether or not I should extract my unit test setup into a method or not, however, what I’m left with instead, is so much more nuanced and interesting. Test writing, like production code writing, requires us to ask questions about the impact on the code _reader_. After all, the longevity of software is centered on our code both being accurate and communicating effectively.
Overall, when judging when to DRY the setup of unit tests, I've found it important to consider what is **essential** for our setup vs **relevant** to our test reader. Of course there are thousands of pages more about what makes good unit tests, but I find this topic particularly nascent as the focus begins to shift from "_why_ should we TDD," to "_how_ do we TDD well." Being able to articulate what is **essential & relevant** to a test, might be the key to finding the balance between people like me, who always opposed DRY unit tests, to people who have more often prefer to keep things tidy. There are smells in both directions, but I think that **essential & relevant** is the middle ground.
以上是关于markdown keeping_unit_test_setup_essential.md的主要内容,如果未能解决你的问题,请参考以下文章