Long methods vs simple ones. What’s the problem?
Long methods’ issues:
- usually does too much and “knows” too much
- as a consequence, it breaks the Single Responsibility Principle
- it is harder to grasp what it does when you read it
- more changes needed to add or amend functionality
- you have to write pretty complex tests
- and you need a lot of them to cover all of the possible permutations
Benefits of simple methods:
- does one thing
- as a consequence, it follows the Single Responsibility Principle
- it is easy to grasp what it does from the first glance
- a small number of changes, usually by adding another simple method
- short and simple tests
- that covers all the cases
Let’s take a look at a (totally made-up) example:
class Item < ApplicationRecord
DEFAULT_AGGR_LAT = 0.5
DEFAULT_AGGR_LONG = 0.5
def synchronize_with(incoming_data_item)
# update simple attributes
self.attr_short_name = incoming_data_item.attr_short_name
self.attr_fancy_name = incoming_data_item.attr_fancy_name
self.attr_just_usual_name = incoming_data_item.attr_just_usual_name
# update complex attributes
self.coordinates = [incoming_data_item.lat, incoming_data_item.long]
self.complex_attr1 = {a: incoming_data_item.simple1, b: incoming_data_item.simple2, c: incoming_data_item.simple3}
# calculate aggregated values
aggregated_latitude =
if incoming_data_item.lat > 123
(incoming_data_item.lat + 0.123) / 3
elsif incoming_data_item.lat < 0.23
(incoming_data_item.lat + 24) * 1.5
else
DEFAULT_AGGR_LAT
end
aggregated_longitude =
if incoming_data_item.long > 4
(incoming_data_item.long - 0.25) * 2.3
elsif incoming_data_item.long < -3
incoming_data_item.long.abs / 1.1
else
DEFAULT_AGGR_LONG
end
# update aggregated attributes
self.aggregated_coordinates = [aggregated_latitude, aggregated_longitude]
self.coordinates_text = "lat: #{aggregated_latitude}, long: #{aggregated_longitude}"
save!
end
end
We are lucky if the author has added those explaining comments. At least we have something to begin our understanding with.
Now, try to imagine how many tests you would need to cover all the possible edge cases,
nil
values, etc., and their permutations.
Most likely this method would have only one - success path - test. It would be long and require explaining comments in it.
Maybe there will be one or two additional tests that were added later, when some bug has been caught in the production.
Those are usually called “anti-regression” tests.
So how we should fix this long method?
I would use the explaining comments as hints for the new methods. Looking at our example, the resulting method should look something like this:
class Item < ApplicationRecord
# ...
def synchronize_with(incoming_data_item)
@incoming_data_item = incoming_data_item
#
update_simple_attributes
update_complex_attributes
calculate_aggregated_values
update_aggregated_attributes
#
save!
end
# ...
end
As a consequence:
- you don’t need the explaining comments
- the code itself “reads” as documentation now
- you can write simple tests for each of those simpler methods
- the test for the
synchronize_with
method now is dead simple
“But how do you hand over the intermediate state between the methods?” you would ask.
I would use either instance variables for that, or better yet, the attribute accessors for them.
There are many good articles out there explaining why you should prefer attribute accessors. I’ll skip this part here instead of just instance variables in most cases.
So the first one is the incoming incoming_data_item
data.
We only read its value, so the attr_reader
is enough.
class Item < ApplicationRecord
attr_reader :incoming_data_item
# ...
end
This way, the first method could be just a copy-past of the existing code:
# before
# update simple attributes
self.attr_short_name = incoming_data_item.attr_short_name
self.attr_fancy_name = incoming_data_item.attr_fancy_name
self.attr_just_usual_name = incoming_data_item.attr_just_usual_name
# after
def update_simple_attributes
self.attr_short_name = incoming_data_item.attr_short_name
self.attr_fancy_name = incoming_data_item.attr_fancy_name
self.attr_just_usual_name = incoming_data_item.attr_just_usual_name
end
As you can imagine, the testing of the update_simple_attributes
method is now dead simple
and it is really easy to cover all the nil
and other edge cases.
Ok, what about the update_complex_attributes
method?
For starters, we can do the same copy-paste and that already would be a gain.
# before
# update complex attributes
self.coordinates = [incoming_data_item.lat, incoming_data_item.long]
self.complex_attr1 = {a: incoming_data_item.simple1, b: incoming_data_item.simple2, c: incoming_data_item.simple3}
# after
def update_complex_attributes
self.coordinates = [incoming_data_item.lat, incoming_data_item.long]
self.complex_attr1 = {a: incoming_data_item.simple1, b: incoming_data_item.simple2, c: incoming_data_item.simple3}
end
But now looking at the resulting method, it is apparent that
the new method “knows” too much about the incoming_data_item
attributes.
We definitely have the “Tell, Don’t Ask” code smell here.
Ideally, we want something like this instead:
def update_complex_attributes
self.coordinates = incoming_data_item.coordinates
self.complex_attr1 = incoming_data_item.complex_attr1
end
This way the tests for the update_complex_attributes
also become dead simple now.
And here are the new methods that we would add to the incoming_data_item
’s class:
class IncomingDataItem
# ...
# incoming_data_item -> self
def coordinates
[self.lat, self.long]
end
def complex_attr1
{a: self.simple1, b: self.simple2, c: self.simple3}
end
end
Obviously, you don’t need to use self
here.
I’m just showing an intermediate result of the entire method’s extraction.
class IncomingDataItem
# ...
def coordinates
[lat, long]
end
def complex_attr1
{a: simple1, b: simple2, c: simple3}
end
end
Here is a word of caution though.
When we “move” this way functionality into the “children” classes, eventually, they become “God Objects” that do too much.
This is the opposite side of the “Tell, Don’t Ask” principle.
There are different methods to address this issue, but it is a topic for another article. Now let’s get back to the rest of the extracted methods.
Here is the original calculate aggregated values
code:
# calculate aggregated values
aggregated_latitude =
if incoming_data_item.lat > 123
(incoming_data_item.lat + 0.123) / 3
elsif incoming_data_item.lat < 0.23
(incoming_data_item.lat + 24) * 1.5
else
DEFAULT_AGGR_LAT
end
aggregated_longitude =
if incoming_data_item.long > 4
(incoming_data_item.long - 0.25) * 2.3
elsif incoming_data_item.long < -3
incoming_data_item.long.abs / 1.1
else
DEFAULT_AGGR_LONG
end
If we move the existing code into the new method, we would need to write tests that cover both “branches” of the new method. They would be long and hard to read.
Instead, we can either extract two sub-methods smth like this:
def calculate_aggregated_values
calculate_aggregated_latitude
calculate_aggregated_longitude
end
def calculate_aggregated_latitude
# ...
end
def calculate_aggregated_longitude
# ...
end
That would already be a good win.
Or we can get rid of the calculate_aggregated_values
method at all
if we think that the resulting code’s readability remains to be good.
class Item < ApplicationRecord
# ...
def synchronize_with(incoming_data_item)
@incoming_data_item = incoming_data_item
#
update_simple_attributes
update_complex_attributes
# here is the changed part: instead of the `calculate_aggregated_values`, we have two methods
calculate_aggregated_latitude
calculate_aggregated_longitude
update_aggregated_attributes
#
save!
end
# ...
def calculate_aggregated_latitude
# ...
end
def calculate_aggregated_longitude
# ...
end
end
And here are the new methods themselves:
DEFAULT_AGGR_LAT = 0.5
attr_reader :aggregated_latitude
def calculate_aggregated_latitude
@aggregated_latitude =
if incoming_data_item.lat > 123
(incoming_data_item.lat + 0.123) / 3
elsif incoming_data_item.lat < 0.23
(incoming_data_item.lat + 24) * 1.5
else
DEFAULT_AGGR_LAT
end
end
DEFAULT_AGGR_LONG = 0.5
attr_reader :aggregated_longitude
def calculate_aggregated_longitude
@aggregated_longitude =
if incoming_data_item.long > 4
(incoming_data_item.long - 0.25) * 2.3
elsif incoming_data_item.long < -3
incoming_data_item.long.abs / 1.1
else
DEFAULT_AGGR_LONG
end
end
Notice that we need two new attr readers for the intermediate state handover.
Also, it is mostly a matter of taste and the team’s preferences, we can move the constants next to the methods where they are used.
Again, now after we’ve moved the code into the new separate methods,
it becomes apparent, that the methods “know” too much about the incoming_data_item
and also they “ask” it many times, hence we have the “Tell, Don’t Ask” code smell.
Let’s fix it by moving the functionality to the incoming_data_item
’s class.
Here is the new code for the IncomingDataItem
class.
Notice that we also moved the constants.
class IncomingDataItem
# ...
DEFAULT_AGGR_LAT = 0.5
def aggregated_latitude
if lat > 123
(lat + 0.123) / 3
elsif lat < 0.23
(lat + 24) * 1.5
else
DEFAULT_AGGR_LAT
end
end
DEFAULT_AGGR_LONG = 0.5
def aggregated_longitude
if long > 4
(long - 0.25) * 2.3
elsif long < -3
long.abs / 1.1
else
DEFAULT_AGGR_LONG
end
end
# ...
end
Now we don’t need those two new attr readers in the parent’s class.
And the resulting synchronize_with
method is also simpler now:
class Item < ApplicationRecord
def synchronize_with(incoming_data_item)
@incoming_data_item = incoming_data_item
#
update_simple_attributes
update_complex_attributes
# no need for a separate "calculate aggregated values" stage/method
update_aggregated_attributes
#
save!
end
end
Ok. Now let’s tackle the last update aggregated attributes
part.
Here is the initial code for it:
# update aggregated attributes
self.aggregated_coordinates = [aggregated_latitude, aggregated_longitude]
self.coordinates_text = "lat: #{aggregated_latitude}, long: #{aggregated_longitude}"
Here we also can just copy-paste the code into a new method,
but instead of the aggregated_latitude/_longitude
accessors,
we use the new methods of the incoming_data_item
instance like this:
def update_aggregated_attributes
self.aggregated_coordinates = [incoming_data_item.aggregated_latitude, incoming_data_item.aggregated_longitude]
self.coordinates_text = "lat: #{incoming_data_item.aggregated_latitude}, long: #{incoming_data_item.aggregated_longitude}"
end
And again, let’s address the “Tell, Don’t Ask”.
Here is the new -dead simple- code for the IncomingDataItem
class:
class IncomingDataItem
# ...
def aggregated_coordinates
[aggregated_latitude, aggregated_longitude]
end
def coordinates_text
"lat: #{aggregated_latitude}, long: #{aggregated_longitude}"
end
# ...
end
As you can imagine, the tests for these new methods also would be pretty short and dead simple.
And here is the usage of these new methods:
def update_aggregated_attributes
self.aggregated_coordinates = incoming_data_item.aggregated_coordinates
self.coordinates_text = incoming_data_item.coordinates_text
end
Now we have three pretty simple and pretty similar-looking methods in the parent class:
def update_simple_attributes
self.attr_short_name = incoming_data_item.attr_short_name
self.attr_fancy_name = incoming_data_item.attr_fancy_name
self.attr_just_usual_name = incoming_data_item.attr_just_usual_name
end
def update_complex_attributes
self.coordinates = incoming_data_item.coordinates
self.complex_attr1 = incoming_data_item.complex_attr1
end
def update_aggregated_attributes
self.aggregated_coordinates = incoming_data_item.aggregated_coordinates
self.coordinates_text = incoming_data_item.coordinates_text
end
We can combine them into one:
def update_attributes # <= new method
self.attr_short_name = incoming_data_item.attr_short_name
self.attr_fancy_name = incoming_data_item.attr_fancy_name
self.attr_just_usual_name = incoming_data_item.attr_just_usual_name
self.coordinates = incoming_data_item.coordinates
self.complex_attr1 = incoming_data_item.complex_attr1
self.aggregated_coordinates = incoming_data_item.aggregated_coordinates
self.coordinates_text = incoming_data_item.coordinates_text
end
But looking at the whole picture, it is apparent that we don’t need that intermediate method.
Neither do we the attr_reader :incoming_data_item
.
Here is the updated synchronize_with
method:
def synchronize_with(incoming_data_item)
self.attr_short_name = incoming_data_item.attr_short_name
self.attr_fancy_name = incoming_data_item.attr_fancy_name
self.attr_just_usual_name = incoming_data_item.attr_just_usual_name
self.coordinates = incoming_data_item.coordinates
self.complex_attr1 = incoming_data_item.complex_attr1
self.aggregated_coordinates = incoming_data_item.aggregated_coordinates
self.coordinates_text = incoming_data_item.coordinates_text
save!
end
The AR’s update!
method does the same:
- updates the attribute’s values
- and uses the
save!
under the hood
So let’s use it instead:
def synchronize_with(incoming_data_item)
update!(
attr_short_name: incoming_data_item.attr_short_name,
attr_fancy_name: incoming_data_item.attr_fancy_name,
attr_just_usual_name: incoming_data_item.attr_just_usual_name,
coordinates: incoming_data_item.coordinates,
complex_attr1: incoming_data_item.complex_attr1,
aggregated_coordinates: incoming_data_item.aggregated_coordinates,
coordinates_text: incoming_data_item.coordinates_text,
)
end
Again, “Tell, Don’t Ask”.
Let’s use the attributes
convention:
class IncomingDataItem
# ...
def attributes
{
attr_short_name: attr_short_name,
attr_fancy_name: attr_fancy_name,
attr_just_usual_name: attr_just_usual_name,
coordinates: coordinates,
complex_attr1: complex_attr1,
aggregated_coordinates: aggregated_coordinates,
coordinates_text: coordinates_text,
}
end
# ...
end
This way the synchronize_with
method becomes a simple one-liner:
def synchronize_with(incoming_data_item)
update!(incoming_data_item.attributes)
end
With this one-liner, we can consider getting rid of it for good. Here is what it would look like:
# find all the "before" usage
item_instance.synchronize_with(incoming_data_item)
# and change it to the "after"
item_instance.update!(incoming_data_item.attributes)
Personally, I would prefer to use the synchronize_with
one-liner.
The “outer” tests would be more straightforward with it.
Here is the resulting complete code for both classes to see the entire picture:
class Item < ApplicationRecord
def synchronize_with(incoming_data_item)
update!(incoming_data_item.attributes)
end
end
class IncomingDataItem
def attributes
{
attr_short_name: attr_short_name,
attr_fancy_name: attr_fancy_name,
attr_just_usual_name: attr_just_usual_name,
coordinates: coordinates,
complex_attr1: complex_attr1,
aggregated_coordinates: aggregated_coordinates,
coordinates_text: coordinates_text,
}
end
def coordinates
[lat, long]
end
def complex_attr1
{a: simple1, b: simple2, c: simple3}
end
def aggregated_coordinates
[aggregated_latitude, aggregated_longitude]
end
def coordinates_text
"lat: #{aggregated_latitude}, long: #{aggregated_longitude}"
end
DEFAULT_AGGR_LAT = 0.5
def aggregated_latitude
if lat > 123
(lat + 0.123) / 3
elsif lat < 0.23
(lat + 24) * 1.5
else
DEFAULT_AGGR_LAT
end
end
DEFAULT_AGGR_LONG = 0.5
def aggregated_longitude
if long > 4
(long - 0.25) * 2.3
elsif long < -3
long.abs / 1.1
else
DEFAULT_AGGR_LONG
end
end
end
As a result, we’ve ended up with all of the functionality moved to the “child” class to address the “Tell, Don’t Ask”. Of course, this is not always like this.
Please, keep in mind, that this is a pretty simple and totally made-up example.
The real-life production code frequently has loops, map
, each
, etc., collections-related usage,
which brings its own additional questions.
But overall, the main principle remains the same.
Extract everything into smaller methods that are simpler to test, read, and reason about.