Prefer simple methods

screenshot

Long methods vs simple ones. What’s the problem?

Long methods’ issues:

  • usually does too much and “knows” too much
  • as a consequence, it breaks the Single Responsibility Principle
  • it is harder to grasp what it does when you read it
  • more changes needed to add or amend functionality
  • you have to write pretty complex tests
  • and you need a lot of them to cover all of the possible permutations

Benefits of simple methods:

  • does one thing
  • as a consequence, it follows the Single Responsibility Principle
  • it is easy to grasp what it does from the first glance
  • a small number of changes, usually by adding another simple method
  • short and simple tests
  • that covers all the cases

Let’s take a look at a (totally made-up) example:

class Item < ApplicationRecord
  DEFAULT_AGGR_LAT = 0.5
  DEFAULT_AGGR_LONG = 0.5

  def synchronize_with(incoming_data_item)
    # update simple attributes
    self.attr_short_name = incoming_data_item.attr_short_name
    self.attr_fancy_name = incoming_data_item.attr_fancy_name
    self.attr_just_usual_name = incoming_data_item.attr_just_usual_name

    # update complex attributes
    self.coordinates = [incoming_data_item.lat, incoming_data_item.long]
    self.complex_attr1 = {a: incoming_data_item.simple1, b: incoming_data_item.simple2, c: incoming_data_item.simple3}

    # calculate aggregated values
    aggregated_latitude =
      if incoming_data_item.lat > 123
        (incoming_data_item.lat + 0.123) / 3
      elsif incoming_data_item.lat < 0.23
        (incoming_data_item.lat + 24) * 1.5
      else
        DEFAULT_AGGR_LAT
      end

    aggregated_longitude =
      if incoming_data_item.long > 4
        (incoming_data_item.long - 0.25) * 2.3
      elsif incoming_data_item.long < -3
        incoming_data_item.long.abs / 1.1
      else
        DEFAULT_AGGR_LONG
      end

    # update aggregated attributes
    self.aggregated_coordinates = [aggregated_latitude, aggregated_longitude]
    self.coordinates_text = "lat: #{aggregated_latitude}, long: #{aggregated_longitude}"

    save!
  end
end

We are lucky if the author has added those explaining comments. At least we have something to begin our understanding with.

Now, try to imagine how many tests you would need to cover all the possible edge cases, nil values, etc., and their permutations.

Most likely this method would have only one - success path - test. It would be long and require explaining comments in it.

Maybe there will be one or two additional tests that were added later, when some bug has been caught in the production.

Those are usually called “anti-regression” tests.

So how we should fix this long method?

I would use the explaining comments as hints for the new methods. Looking at our example, the resulting method should look something like this:

class Item < ApplicationRecord
  # ...
  def synchronize_with(incoming_data_item)
    @incoming_data_item = incoming_data_item
    #
    update_simple_attributes
    update_complex_attributes
    calculate_aggregated_values
    update_aggregated_attributes
    #
    save!
  end
  # ...
end

As a consequence:

  • you don’t need the explaining comments
  • the code itself “reads” as documentation now
  • you can write simple tests for each of those simpler methods
  • the test for the synchronize_with method now is dead simple

“But how do you hand over the intermediate state between the methods?” you would ask.

I would use either instance variables for that, or better yet, the attribute accessors for them.

There are many good articles out there explaining why you should prefer attribute accessors. I’ll skip this part here instead of just instance variables in most cases.

So the first one is the incoming incoming_data_item data. We only read its value, so the attr_reader is enough.

class Item < ApplicationRecord
  attr_reader :incoming_data_item
  # ...
end

This way, the first method could be just a copy-past of the existing code:

# before
# update simple attributes
self.attr_short_name = incoming_data_item.attr_short_name
self.attr_fancy_name = incoming_data_item.attr_fancy_name
self.attr_just_usual_name = incoming_data_item.attr_just_usual_name

# after
def update_simple_attributes
  self.attr_short_name = incoming_data_item.attr_short_name
  self.attr_fancy_name = incoming_data_item.attr_fancy_name
  self.attr_just_usual_name = incoming_data_item.attr_just_usual_name
end

As you can imagine, the testing of the update_simple_attributes method is now dead simple and it is really easy to cover all the nil and other edge cases.

Ok, what about the update_complex_attributes method? For starters, we can do the same copy-paste and that already would be a gain.

# before
# update complex attributes
self.coordinates = [incoming_data_item.lat, incoming_data_item.long]
self.complex_attr1 = {a: incoming_data_item.simple1, b: incoming_data_item.simple2, c: incoming_data_item.simple3}

# after
def update_complex_attributes
  self.coordinates = [incoming_data_item.lat, incoming_data_item.long]
  self.complex_attr1 = {a: incoming_data_item.simple1, b: incoming_data_item.simple2, c: incoming_data_item.simple3}
end

But now looking at the resulting method, it is apparent that the new method “knows” too much about the incoming_data_item attributes.

We definitely have the “Tell, Don’t Ask” code smell here.

Ideally, we want something like this instead:

def update_complex_attributes
  self.coordinates = incoming_data_item.coordinates
  self.complex_attr1 = incoming_data_item.complex_attr1
end

This way the tests for the update_complex_attributes also become dead simple now.

And here are the new methods that we would add to the incoming_data_item’s class:

class IncomingDataItem
  # ...

  # incoming_data_item -> self
  def coordinates
    [self.lat, self.long]
  end

  def complex_attr1
    {a: self.simple1, b: self.simple2, c: self.simple3}
  end
end

Obviously, you don’t need to use self here. I’m just showing an intermediate result of the entire method’s extraction.

class IncomingDataItem
  # ...
  def coordinates
    [lat, long]
  end

  def complex_attr1
    {a: simple1, b: simple2, c: simple3}
  end
end

Here is a word of caution though.

When we “move” this way functionality into the “children” classes, eventually, they become “God Objects” that do too much.

This is the opposite side of the “Tell, Don’t Ask” principle.

There are different methods to address this issue, but it is a topic for another article. Now let’s get back to the rest of the extracted methods.

Here is the original calculate aggregated values code:

# calculate aggregated values
aggregated_latitude =
  if incoming_data_item.lat > 123
    (incoming_data_item.lat + 0.123) / 3
  elsif incoming_data_item.lat < 0.23
    (incoming_data_item.lat + 24) * 1.5
  else
    DEFAULT_AGGR_LAT
  end

aggregated_longitude =
  if incoming_data_item.long > 4
    (incoming_data_item.long - 0.25) * 2.3
  elsif incoming_data_item.long < -3
    incoming_data_item.long.abs / 1.1
  else
    DEFAULT_AGGR_LONG
  end

If we move the existing code into the new method, we would need to write tests that cover both “branches” of the new method. They would be long and hard to read.

Instead, we can either extract two sub-methods smth like this:

def calculate_aggregated_values
  calculate_aggregated_latitude
  calculate_aggregated_longitude
end
def calculate_aggregated_latitude
  # ...
end
def calculate_aggregated_longitude
  # ...
end

That would already be a good win.

Or we can get rid of the calculate_aggregated_values method at all if we think that the resulting code’s readability remains to be good.

class Item < ApplicationRecord
  # ...

  def synchronize_with(incoming_data_item)
    @incoming_data_item = incoming_data_item
    #
    update_simple_attributes
    update_complex_attributes

    # here is the changed part: instead of the `calculate_aggregated_values`, we have two methods
    calculate_aggregated_latitude
    calculate_aggregated_longitude

    update_aggregated_attributes
    #
    save!
  end

  # ...

  def calculate_aggregated_latitude
    # ...
  end
  def calculate_aggregated_longitude
    # ...
  end
end

And here are the new methods themselves:

DEFAULT_AGGR_LAT = 0.5
attr_reader :aggregated_latitude
def calculate_aggregated_latitude
  @aggregated_latitude =
    if incoming_data_item.lat > 123
      (incoming_data_item.lat + 0.123) / 3
    elsif incoming_data_item.lat < 0.23
      (incoming_data_item.lat + 24) * 1.5
    else
      DEFAULT_AGGR_LAT
    end
end

DEFAULT_AGGR_LONG = 0.5
attr_reader :aggregated_longitude
def calculate_aggregated_longitude
  @aggregated_longitude =
    if incoming_data_item.long > 4
      (incoming_data_item.long - 0.25) * 2.3
    elsif incoming_data_item.long < -3
      incoming_data_item.long.abs / 1.1
    else
      DEFAULT_AGGR_LONG
    end
end

Notice that we need two new attr readers for the intermediate state handover.

Also, it is mostly a matter of taste and the team’s preferences, we can move the constants next to the methods where they are used.

Again, now after we’ve moved the code into the new separate methods, it becomes apparent, that the methods “know” too much about the incoming_data_item and also they “ask” it many times, hence we have the “Tell, Don’t Ask” code smell.

Let’s fix it by moving the functionality to the incoming_data_item’s class.

Here is the new code for the IncomingDataItem class. Notice that we also moved the constants.

class IncomingDataItem
  # ...
  DEFAULT_AGGR_LAT = 0.5
  def aggregated_latitude
    if lat > 123
      (lat + 0.123) / 3
    elsif lat < 0.23
      (lat + 24) * 1.5
    else
      DEFAULT_AGGR_LAT
    end
  end

  DEFAULT_AGGR_LONG = 0.5
  def aggregated_longitude
    if long > 4
      (long - 0.25) * 2.3
    elsif long < -3
      long.abs / 1.1
    else
      DEFAULT_AGGR_LONG
    end
  end
  # ...
end

Now we don’t need those two new attr readers in the parent’s class. And the resulting synchronize_with method is also simpler now:

class Item < ApplicationRecord
  def synchronize_with(incoming_data_item)
    @incoming_data_item = incoming_data_item
    #
    update_simple_attributes
    update_complex_attributes
    # no need for a separate "calculate aggregated values" stage/method
    update_aggregated_attributes
    #
    save!
  end
end

Ok. Now let’s tackle the last update aggregated attributes part.

Here is the initial code for it:

# update aggregated attributes
self.aggregated_coordinates = [aggregated_latitude, aggregated_longitude]
self.coordinates_text = "lat: #{aggregated_latitude}, long: #{aggregated_longitude}"

Here we also can just copy-paste the code into a new method, but instead of the aggregated_latitude/_longitude accessors, we use the new methods of the incoming_data_item instance like this:

def update_aggregated_attributes
  self.aggregated_coordinates = [incoming_data_item.aggregated_latitude, incoming_data_item.aggregated_longitude]
  self.coordinates_text = "lat: #{incoming_data_item.aggregated_latitude}, long: #{incoming_data_item.aggregated_longitude}"
end

And again, let’s address the “Tell, Don’t Ask”. Here is the new -dead simple- code for the IncomingDataItem class:

class IncomingDataItem
  # ...
  def aggregated_coordinates
    [aggregated_latitude, aggregated_longitude]
  end

  def coordinates_text
    "lat: #{aggregated_latitude}, long: #{aggregated_longitude}"
  end
  # ...
end

As you can imagine, the tests for these new methods also would be pretty short and dead simple.

And here is the usage of these new methods:

def update_aggregated_attributes
  self.aggregated_coordinates = incoming_data_item.aggregated_coordinates
  self.coordinates_text = incoming_data_item.coordinates_text
end

Now we have three pretty simple and pretty similar-looking methods in the parent class:

def update_simple_attributes
  self.attr_short_name = incoming_data_item.attr_short_name
  self.attr_fancy_name = incoming_data_item.attr_fancy_name
  self.attr_just_usual_name = incoming_data_item.attr_just_usual_name
end
def update_complex_attributes
  self.coordinates = incoming_data_item.coordinates
  self.complex_attr1 = incoming_data_item.complex_attr1
end
def update_aggregated_attributes
  self.aggregated_coordinates = incoming_data_item.aggregated_coordinates
  self.coordinates_text = incoming_data_item.coordinates_text
end

We can combine them into one:

def update_attributes # <= new method
  self.attr_short_name = incoming_data_item.attr_short_name
  self.attr_fancy_name = incoming_data_item.attr_fancy_name
  self.attr_just_usual_name = incoming_data_item.attr_just_usual_name
  self.coordinates = incoming_data_item.coordinates
  self.complex_attr1 = incoming_data_item.complex_attr1
  self.aggregated_coordinates = incoming_data_item.aggregated_coordinates
  self.coordinates_text = incoming_data_item.coordinates_text
end

But looking at the whole picture, it is apparent that we don’t need that intermediate method. Neither do we the attr_reader :incoming_data_item.

Here is the updated synchronize_with method:

def synchronize_with(incoming_data_item)
  self.attr_short_name = incoming_data_item.attr_short_name
  self.attr_fancy_name = incoming_data_item.attr_fancy_name
  self.attr_just_usual_name = incoming_data_item.attr_just_usual_name
  self.coordinates = incoming_data_item.coordinates
  self.complex_attr1 = incoming_data_item.complex_attr1
  self.aggregated_coordinates = incoming_data_item.aggregated_coordinates
  self.coordinates_text = incoming_data_item.coordinates_text
  save!
end

The AR’s update! method does the same:

  • updates the attribute’s values
  • and uses the save! under the hood

So let’s use it instead:

def synchronize_with(incoming_data_item)
  update!(
    attr_short_name: incoming_data_item.attr_short_name,
    attr_fancy_name: incoming_data_item.attr_fancy_name,
    attr_just_usual_name: incoming_data_item.attr_just_usual_name,
    coordinates: incoming_data_item.coordinates,
    complex_attr1: incoming_data_item.complex_attr1,
    aggregated_coordinates: incoming_data_item.aggregated_coordinates,
    coordinates_text: incoming_data_item.coordinates_text,
  )
end

Again, “Tell, Don’t Ask”. Let’s use the attributes convention:

class IncomingDataItem
  # ...
  def attributes
    {
      attr_short_name: attr_short_name,
      attr_fancy_name: attr_fancy_name,
      attr_just_usual_name: attr_just_usual_name,
      coordinates: coordinates,
      complex_attr1: complex_attr1,
      aggregated_coordinates: aggregated_coordinates,
      coordinates_text: coordinates_text,
    }
  end
  # ...
end

This way the synchronize_with method becomes a simple one-liner:

def synchronize_with(incoming_data_item)
  update!(incoming_data_item.attributes)
end

With this one-liner, we can consider getting rid of it for good. Here is what it would look like:

# find all the "before" usage
item_instance.synchronize_with(incoming_data_item)

# and change it to the "after"
item_instance.update!(incoming_data_item.attributes)

Personally, I would prefer to use the synchronize_with one-liner. The “outer” tests would be more straightforward with it.

Here is the resulting complete code for both classes to see the entire picture:

class Item < ApplicationRecord
  def synchronize_with(incoming_data_item)
    update!(incoming_data_item.attributes)
  end
end

class IncomingDataItem
  def attributes
    {
      attr_short_name: attr_short_name,
      attr_fancy_name: attr_fancy_name,
      attr_just_usual_name: attr_just_usual_name,
      coordinates: coordinates,
      complex_attr1: complex_attr1,
      aggregated_coordinates: aggregated_coordinates,
      coordinates_text: coordinates_text,
    }
  end

  def coordinates
    [lat, long]
  end

  def complex_attr1
    {a: simple1, b: simple2, c: simple3}
  end

  def aggregated_coordinates
    [aggregated_latitude, aggregated_longitude]
  end

  def coordinates_text
    "lat: #{aggregated_latitude}, long: #{aggregated_longitude}"
  end

  DEFAULT_AGGR_LAT = 0.5
  def aggregated_latitude
    if lat > 123
      (lat + 0.123) / 3
    elsif lat < 0.23
      (lat + 24) * 1.5
    else
      DEFAULT_AGGR_LAT
    end
  end

  DEFAULT_AGGR_LONG = 0.5
  def aggregated_longitude
    if long > 4
      (long - 0.25) * 2.3
    elsif long < -3
      long.abs / 1.1
    else
      DEFAULT_AGGR_LONG
    end
  end
end

As a result, we’ve ended up with all of the functionality moved to the “child” class to address the “Tell, Don’t Ask”. Of course, this is not always like this.

Please, keep in mind, that this is a pretty simple and totally made-up example.

The real-life production code frequently has loops, map, each, etc., collections-related usage, which brings its own additional questions.

But overall, the main principle remains the same.

Extract everything into smaller methods that are simpler to test, read, and reason about.