By Ghouse Mohamed
Ruby on rails certainly makes it easier for developers to bootstrap applications and POCs at a pace which is hard to imagine achieving using any other framework. The MVC architecture of the framework ties up together many pathways which speeds things up in terms of development speed. And as a consequence, it brings along with a few antipatterns, which if not taken care of, can over time turn into some nasty unmanageable technical debt. Either in terms of Code maintainability, code readability or complexity, it is important to keep the code well-reasoned, simple and DRY.
Here are some of the patterns to look out for when looking at your code critically
Say that you've properly encapsulated your application functionality inside different models, and you've broken up each of those functions into smaller methods. Say you have code that looks like the following
1
2
3
4
5
6
7
8
9
10
11
12
class Address < ActiveRecord::Base
belongs_to :customer
end
class Customer < ActiveRecord::Base
has_one :address
has_many :invoices
end
class Invoice < ActiveRecord::Base
belongs_to :customer
end
And the view code to display the address of the invoice as follows
1
2
3
4
5
<%= @invoice.customer.name %>
<%= @invoice.customer.address.street %>
<%= @invoice.customer.address.city %>,
<%= @invoice.customer.address.state %>
<%= @invoice.customer.address.zip_code %>
Here, you can see that the Invoice model dives deep inside the Customer to reach across to the Address model. This should ideally be avoided. Because if, for example, in the future your application were to change so that a customer has both a billing address and a shipping address, every place in your code that reached across these objects to retrieve the street would break and would need to change.
To avoid this problem, we can use the Law of Demeter, or the Law of Least Knowledge to get around this antipattern.
Since we do not want the Invoice model to access the Address model directly through the Customer model, we can create methods so that the address is retrieved like so in the view
1
2
3
4
5
<%= @invoice.customer_name %>
<%= @invoice.customer_street %>
<%= @invoice.customer_city %>,
<%= @invoice.customer_state %>
<%= @invoice.customer_zip_code %>
And the methods: customer_name
, customer_street
, etc can be created in the Invoice model.
But now the problem is your Invoice model is going to be littered with individual wrapper methods.
We can solve this by using the delegate method which rails provide. Here's how that would look
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
class Address < ActiveRecord::Base
belongs_to :customer
end
class Customer < ActiveRecord::Base
has_one :address
has_many :invoices
delegate :street, :city, :state, :zip_code, to: :address
end
class Invoice < ActiveRecord::Base
belongs_to :customer
delegate :name,
:street,
:city,
:state,
:zip_code,
to: :customer,
prefix: true
end
Now all the address fields are available in the Invoice model. Let's say you want to access the street of the customer related to the invoice. You can do so simply by
1
invoice.customer_street
Now if you had included two separate addresses in the future, you only have to change attributes you've included in the delegate method in the appropriate models.
It is a standard convention to keep your controllers slim allowing it as much breathing room as possible. Most of the business logic must be moved into either library classes or model classes of their own. The code has to be properly encapsulated and has to follow the Law of Single Responsibility. A single class should only contain methods related to the function of that class. Any other methods should be abstracted out of the class to its own class. This is also an application of separation of concerns. Keeping this principle makes it easier to scale and maintain code. If code blocks are properly encapsulated in their own domains, it's clear what the function of each module is.
Let's take a look at this controller action
1
2
3
4
5
6
7
8
9
10
class UsersController < ApplicationController
def index
@user = User.find(params[:id])
@memberships = @user
.memberships
.where(active: true)
.limit(5)
.order("last_active_on DESC")
end
end
You can see that the index action is making a find call on the memberships model. This find call can easily be moved into the User model where the find call is appropriately performed. But moving the find call on the memberships model to the user model does not make sense as the user model should not have anything to do with logic performed on the memberships model. So it makes more sense to move the find call to the membership model itself. What about when the application demands for users with active memberships in actions other than the index action. We should not be rewriting the 'where' find call. Here, we can use scopes to keep it DRY.
Here's how the refactored code would look like
1
2
3
4
5
6
7
8
9
10
11
12
13
14
class User < ActiveRecord::Base
has_many :memberships
def find_recent_active_memberships
memberships.only_active.order_by_activity.limit(5)
end
end
class Membership < ActiveRecord::Base
belongs_to :user
scope :only_active, where(active: true)
scope :order_by_activity, order('last_active_on DESC')
end
Here's another example of the Principle of Single Responsibility. Let's say you have model that looks something like this
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
class Order < ActiveRecord::Base
def self.find_purchased
#...
end
def self.find_waiting_for_review
#...
end
def self.find_waiting_for_sign_off
#...
end
def self.find_waiting_for_sign_off
#...
end
def self.advanced_search(fields, options = {})
#...
end
def to_xml
#...
end
def to_json
#...
end
def to_csv
#...
end
def to_pdf
#...
end
end
The converter methods (tocsv, topdf, to_json) can reach 1000s lines of code, and all have a single scope
in terms of the function they serve. They all serve as converter to any order record.
So here, we can move these converter methods to a class of their own, say, OrderConverter
which has the order
as one of its attributes. And as we saw previously, we can delegate these methods using the delegate
method
Here's how the refactored code would look like:
1
2
3
4
5
6
7
class Order < ActiveRecord::Base
delegate :to_xml, :to_json, :to_csv, :to_pdf, to: :converter
def converter
OrderConverter.new(self)
end
end
The views can sometimes get messy if there is a lot of conditional rendering or displaying of data based on some transformations. These conditionals and data transformations can appropriately be moved into the views helper classes.
Let's say you have a Job model which contains a title attribute. But you do want to display it as it is in the record. You want to perform some transformations to data to display it how you want it.
You might choose to do it such a way in the view itself
1
<%= job.title.split(/\s*\/\s*/).join(" / ") %>
But the above line of code can be abstracted out into its own method in the view helper like so
1
2
3
def display_title(job)
job.title.split(/\s*\/\s*/).join(" / ")
end
Now all you have to do is call the display_title method in the view.
1
<%= display_title(job) %>
This is concise, clear and much easier to understand what exactly this helper method is doing.
This is only a crude example. But there are scenarios where you want to display links based on the state and type of the profile a user is viewing. You might choose to do something like this for displaying the links
1
2
3
4
5
6
7
8
9
10
11
<div class="feed">
<% if @project %>
<%= link_to "Subscribe to #{@project.name} alerts.",
project_alerts_url(@project, format: :rss),
class: "feed_link" %>
<% else %>
<%= link_to "Subscribe to these alerts.",
alerts_url(format: :rss),
class: "feed_link" %>
<% end %>
</div>
The above code different links based on the user subscription state. It can easily be abstracted like so
1
2
3
4
5
6
7
8
9
10
11
def subscribe_link(project = nil)
if project
link_to "Subscribe to #{project.name} alerts.",
project_alerts_url(project, format: :rss),
class: "feed_link"
else
link_to "Subscribe to these alerts.",
alerts_url(format: :rss),
class: "feed_link"
end
end
Now just mentioning <%= subscribe_link(project) %>
in the view tells us exactly what the line of code intends to do.
We can see clearly how abstracting code to its own in the view helper improves code readability.
Often, performance issues in an application using the MVC framework can be traced back to the database layer. Improving the performance of your application here means writing optimized queries.
Defining models without indexes can slow the process of querying by a large margin when the records for that model grow uncontrollably out of hand. So it's best to add indexes. A fair amount of thought must be given to which fields in the model you want to index. If a primary key is specified for a table, an index is automatically created. The same is not true for a foreign key. So explicitly defining an index for a foreign key might be favourable for a suitable use case.
Figuring out which fields need indexes and adding an index is an easy task of code housekeeping For example,
you might have a User model, which validates the uniqueness of the field email
like so,
1
2
3
class User < ActiveRecord::Base
validates :email, unique: true
end
In this model, every time you save a User record, Active Record runs a query to try to find other rows that have the same data in the email field. It is much faster to do this comparison on indexed columns than on non-indexed columns. Therefore, this column should have an index. Figuring out which columns need to have an index usually means better performance.
There are several Rails plugins you can use tools to identify missing indexes. The simplest
of them is Limerick Rake, which provides a rake
task db:indexes:missing
. When run on your application, this task examines your database and identifies
obvious missing indexes, primarily missing foreign key indexes.
Another pattern to look out for in the database layer is the use of Ruby for querying instead of using SQL directly where ever possible.
1
2
3
@article.comments.count
@article.comments.length
@article.comments.size
Here, @article.comments.count
will first perform a SQL query after which the ruby method is executed on top of that.
In such cases, using SQL methods can result in better optimization.
Here's an example which demonstrates this
1
2
3
4
5
@account = Account.find(3)
@users = @account
.users
.sort { |a,b| a.name.downcase <=> b.name.downcase }
.first(5)
Instead of using ruby, you can rewrite this like so
1
@users = @account.users.order('LCASE(name)').limit(5)
The above code is much better written since it is using SQL query methods directly to logically fetch records.
Another prime suspect which makes a contribution to bad performance is the presence of N+1 databases queries. In rails, it is effortless to setup associations between different models. But it brings along with one caveat.
Example
1
2
3
4
5
6
7
8
9
10
11
12
class Article < ActiveRecord::Base
has_many :comments
end
class Comment < ActiveRecord::Base
belongs_to :article
end
comments = Comment.limit(100) # 1 DB Query
comments.each do |comment|
puts comment.article.title # 100 DB Query
end
In the above example, every instance of comment.article
is a DB query. So assuming if we have 5 Article comprising
100 comments, this would result in 100+1 query. This is the basic concept of N+1 queries.
This can be easily avoided by using Eager loading when such associated items are accessed. Here Comments
are fetched
matching the condition, post which ActiveRecord fetches all the Article
associated with those comments. Internally
ActiveRecord does an IN
SQL query SELECT * FROM articles WHERE id IN (1,2,3, etc)
. Making a total of just
2 queries. When accessing comment.article
ActiveRecord magically fetches the data from teh cached results.
This can be achieved by rewriting like so with the use of the includes method.
1
comments = Comment.includes(:article).all
In the above logs, you can see that first all the comments are fetched, after which it gets all the associated
articles for those comments. Doing comment.article
does not make a DB query, which makes the code more
performant here.
Memoization is used to cache the result of a code block temporarily and use the cached result when called upon subsequently in the request lifecycle.
1
2
3
4
5
6
7
8
9
10
11
def fetch
load_user_settings
load_user_settings
load_user_settings
end
private
def load_user_settings
do_some_db_query
end
The above code snippet fetches the user_settings
thrice, thereby making atleast 3 DB queries. This can be avoided
with the use of memoization so that after making the DB query once, subsequent query would use instance variable.
1
2
3
4
5
6
7
8
9
10
11
def fetch
load_user_settings
load_user_settings
load_user_settings
end
private
def load_user_settings
@load_user_settings ||= do_some_db_query
end
In the above improved code, calling load_user_settings
does 1 DB query no matter how many times it's being called.
Abused and improper use of DB queries is an antipattern that should be avoided.
Do not fire and forget. When using external services, there are three main response strategies
The third strategy is called 'fire and forget'. In your development configuration file you will have this
1
config.action_mailer.raise_delivery_errors = false
With this config, no errors are raised for both connection errors and bad email addresses. Fishing out such such 'fire and forget' default strategies and handling them appropriately, if needed, for client-side error display, debugging, and better logging is often advised.
When first encountering the wide selection of gems and plugins available, it’s common for new Rails developers to grab any of these third-party tools that will help reduce the amount of code they have to write by hand. This results in the 'Vendor junk drawer' Antipattern. This happens when care is not taken to keep the number of gems and plugins in an application manageable.
It's best to re-evaluate the worth of each gem if the number of gems becomes an overgrown garden. Gems must be pruned regularly to ensure the garden stays tidy and more manageable.
In some cases, newer versions of existing gems might result in breaking changes in your application. You can choose to opt for older versions but in some cases the features that come along with the newer version are desired more than the stability of the older version without you existing application. In such cases, 'monkey-patching' the gem is the right way to go.
Monkey patching is a programming technique wherein you modify or extend existing code at runtime,
without modifying the original source code. You can place the monkey patch in the lib
directory of your application.
Let's a gem called OurGem
always returns true for positive numbers and false for negative numbers.
But in the newer version of the gem's release, it returns true for all numbers.
This would certainly result in some undesirable changes in code. We can monkey patch this like so
1
2
3
4
5
6
7
8
9
10
module OurGem
class HelperMethods < OurGemBase
def valid?(number)
if number < 0
return false
else number >= 0
return true
end
end
end
Placing this code in the lib folder with the file name format such as #{gem_name}_extensions.rb
will
override the valid?
method for the OurGem
gem.
If there are many breaking changes in the newer versions, choosing to refactor the whole application in light of such changes is time consuming, especially when it can done away with monkey patching.
Unorganized tests over time can soon turn into a jungle that can be hard to navigate. When tests are organized into bundles or contexts where the feature that is being tested is properly encapsulated within that context, the test suite becomes much more well reasoned.
Turning this
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
should "should post" do
#
end
should "should share post" do
#
end
should "should delete post" do
#
end
should "should like post" do
#
end
should "should go to homepage" do
#
end
should "should show homepage" do
#
end
To this
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
context "Post" do
should "should post" do
#
end
should "should share post" do
#
end
#
end
context "Homepage" do
should "should go to homepage" do
#
end
should "should show homepage" do
#
end
end
Here, tests are properly encapsulated and the code reviewer knows exactly what each context is referencing.
Never build beyond the application requirements at the time you are writing the code.
If you do not have concrete requirements, don’t write any code.
Don’t jump to a model prematurely; there are often simple ways, such as using Booleans and denormalization, to avoid using adding additional models.
If there is no user interface for adding, removing, or managing data, there is no need for a model. A denormalized column populated by a hash or array of possible values is fine.