Things to Consider when Metaprogramming in Ruby
10 Apr 2016
Metaprogramming in Ruby is a polarizing topic. The most common purpose of Ruby metaprogramming is for code to alter itself at runtime. Metaprogramming can be used to achieve terser and more flexible code. However, it is not without its cost. As with most things, nothing of value is free, even metaprogramming.
Undoubtedly, there is a time and place for metaprogramming, but it is important to be aware of the concessions needed to support a metaprogrammed solution.
Code Discovery and Readability
One problem with metaprogramming solutions is their obstruction of code discovery. When entering a new project or simply trying to re-familiarize oneself with an existing one, tracing code execution in a text editor can be quite difficult if method definitions do not exist.
For example, we can assume that a User class exists with a set of metaprogrammed methods:
class User
  [
    :password,
    :email,
    :first_name,
    :last_name
  ].each do |attribute|
    # Each predicate returns true when the attribute has a value (is not nil).
    define_method(:"has_#{attribute}?") do
      !send(attribute).nil?
    end
  end
end
Although a little contrived, this code defines a set of simple convenience methods on a User class. This solution is easily extended to include additional attributes without a full method definition per attribute.
However, these methods cannot be found using grep, The Silver Searcher, or other “find all” tools. Since the method has_password? is never explicitly defined in the code, a text search for its name finds nothing.
A Workaround:
To combat this issue, some developers choose to write a comment listing the defined method names above the metaprogramming block. This simple solution can greatly help the readability of the code:
class User
  # has_password?, has_email?, has_first_name?,
  # has_last_name? method definitions
  [
    :password,
    :email,
    :first_name,
    :last_name
  ].each do |attribute|
    define_method(:"has_#{attribute}?") do
      !send(attribute).nil?
    end
  end
end
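When a text search comes up empty, the running program can still report what was defined. The following is a minimal sketch, assuming a simplified User class with plain attr_accessor attributes (an assumption made for illustration). It relies only on Ruby's standard reflection calls instance_methods, instance_method, and source_location.

```ruby
class User
  attr_accessor :password, :email

  %i[password email].each do |attribute|
    define_method(:"has_#{attribute}?") do
      !send(attribute).nil?
    end
  end
end

# List only the methods defined directly on User, filtered by naming pattern.
predicates = User.instance_methods(false).grep(/\Ahas_.*\?\z/).sort
# predicates => [:has_email?, :has_password?]

# source_location points at the define_method block rather than a literal "def".
file, line = User.instance_method(:has_password?).source_location
puts "has_password? is defined at #{file}:#{line}"
```

Running something like this in a console complements the comment: the comment helps readers of the file, while reflection helps when the list of attributes is built elsewhere.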
Performance
Depending on the number of times a piece of code is executed, performance considerations can be extremely important. “Hot code” is a term used to describe code that is called frequently during an application’s request cycle. Since not all code is created equal, understanding the performance implications of different metaprogramming approaches is imperative when writing or modifying hot code.
The Setup
An example application needs to handle incoming data at scale. Upon receiving the data, it must make it accessible to the rest of the application. Myriad options exist to solve this problem, but we can assume that only a few are feasible for this Ruby codebase.
The incoming data looks like the following:
{
  "user": {
    "name": "Some User",
    "phones": [
      "818-555-5555",
      "415-555-5555"
    ],
    "email": "email@whatever.com",
    "birthday": "12-12-1900"
  }
}
Note: This data will be referred to in the following examples as incoming_data, and we can assume it has been decoded from JSON into a Ruby Hash.
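For readers who want to follow along, here is one way incoming_data could be produced. This is a sketch that assumes the payload arrives as a JSON string; the variable name raw is made up for illustration. JSON.parse comes from Ruby's standard json library.

```ruby
require 'json'

# Hypothetical raw payload, matching the structure shown above.
raw = <<~JSON
  {
    "user": {
      "name": "Some User",
      "phones": ["818-555-5555", "415-555-5555"],
      "email": "email@whatever.com",
      "birthday": "12-12-1900"
    }
  }
JSON

# JSON.parse returns a Hash with String keys by default.
incoming_data = JSON.parse(raw)

puts incoming_data['user']['email']
puts incoming_data['user'].keys.inspect
```

Because the keys are Strings rather than Symbols, the examples that follow index with 'user' and 'email'.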
1. All Methods
One way to accept and integrate this incoming data would be to create a class which maps all attributes under the 'user' key to methods:
class UserMetaMethods
  def initialize(hash)
    hash.each_pair do |key, value|
      # Define a reader and writer on the class, then assign the value.
      self.class.send(:attr_accessor, key)
      self.send(:"#{key}=", value)
    end
  end
end

user = UserMetaMethods.new(incoming_data['user'])
user.email
# => email@whatever.com
This solution makes accessing the incoming data very consumer friendly. All attributes appear as methods, so respond_to? returns true for each of them, and each attribute has a corresponding instance variable.
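One consequence worth seeing in action: because attr_accessor is sent to the class, the accessors are shared by every instance, not just the one whose data contained the key. The sketch below repeats the class so it runs on its own; the two sample hashes are made up for illustration.

```ruby
class UserMetaMethods
  def initialize(hash)
    hash.each_pair do |key, value|
      self.class.send(:attr_accessor, key)
      send(:"#{key}=", value)
    end
  end
end

first  = UserMetaMethods.new('email' => 'email@whatever.com')
second = UserMetaMethods.new('name' => 'Some User')

# The accessor defined while building `first` lives on the class,
# so `second` responds to it too, even though it never received an email.
puts second.respond_to?(:email) # true
puts second.email.inspect       # nil
puts first.email                # email@whatever.com
```

Whether this sharing matters depends on how much the incoming keys vary between payloads.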
2. method_missing
A group of metaprogramming solutions would not be complete without one utilizing method_missing. With method_missing, a non-existent method call can be intercepted on an object and evaluated with additional data unbeknownst to the original caller.
class UserMethodMissing
  def initialize(hash)
    @hash = hash
  end

  def method_missing(method_name, *arguments, &block)
    key = method_name.to_s
    if @hash.key?(key)
      @hash[key]
    else
      super
    end
  end

  def respond_to_missing?(method_name, include_private = false)
    @hash.key?(method_name.to_s) || super
  end
end

user = UserMethodMissing.new(incoming_data['user'])
user.email
# => email@whatever.com
The respond_to_missing? method is also defined to enable respond_to? and method calls to execute successfully. Read more information about respond_to_missing? here.
Note: Patterns equivalent to this are used in some popular libraries like OpenStruct and Hashie to achieve similar results.
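To make the role of respond_to_missing? concrete, the sketch below repeats the class and exercises respond_to?, method, and an unknown key. The sample hash and the nickname key are made up for illustration.

```ruby
class UserMethodMissing
  def initialize(hash)
    @hash = hash
  end

  def method_missing(method_name, *arguments, &block)
    key = method_name.to_s
    @hash.key?(key) ? @hash[key] : super
  end

  def respond_to_missing?(method_name, include_private = false)
    @hash.key?(method_name.to_s) || super
  end
end

user = UserMethodMissing.new('email' => 'email@whatever.com')

puts user.respond_to?(:email)    # true, answered by respond_to_missing?
puts user.respond_to?(:nickname) # false, the key is not in the hash
puts user.method(:email).call    # email@whatever.com

# Keys that are absent fall through to super and raise as usual.
begin
  user.nickname
rescue NoMethodError => e
  puts "unknown key raised #{e.class}"
end
```

Without respond_to_missing?, user.email would still work, but respond_to?(:email) would return false and method(:email) would raise NameError.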
3. “Regular” Object
As a control, a regular Ruby object can be created with specific attributes defined:
class UserRegular
  attr_reader :name,
              :phones,
              :email,
              :birthday

  def initialize(hash)
    @name = hash['name']
    @phones = hash['phones']
    @email = hash['email']
    @birthday = hash['birthday']
  end
end

user = UserRegular.new(incoming_data['user'])
user.email
# => email@whatever.com
An immediate downside to this approach is that if the contract of the external service changes, this object may not be initialized with all pertinent data.
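A short sketch of that downside, using a hypothetical changed payload in which email has been renamed to email_address and a nickname field has been added (both names are invented for illustration):

```ruby
class UserRegular
  attr_reader :name, :phones, :email, :birthday

  def initialize(hash)
    @name = hash['name']
    @phones = hash['phones']
    @email = hash['email']
    @birthday = hash['birthday']
  end
end

# Hypothetical payload after the external contract changed.
changed = {
  'name' => 'Some User',
  'email_address' => 'email@whatever.com',
  'nickname' => 'su'
}

user = UserRegular.new(changed)

puts user.email.inspect          # nil, the renamed key is silently missed
puts user.respond_to?(:nickname) # false, the new key is silently dropped
```

Nothing raises here, which is the point: the object is built without complaint but no longer reflects the data it was given.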
4. A Hash
No additional code is required for this approach. A consumer would simply use the resulting Ruby Hash after the received JSON is parsed:
incoming_data['user']['email']
# => email@whatever.com
This is not a metaprogramming solution, but it is still a valid way of handling the passed-in data. Using a simple Hash does not grant the flexibility of the other solutions, but it can be a great baseline for performance testing.
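A plain Hash still offers a few access styles with different failure behavior. The sketch below uses a trimmed-down, made-up incoming_data; Hash#dig requires Ruby 2.3 or newer.

```ruby
# Trimmed-down stand-in for the parsed payload.
incoming_data = {
  'user' => { 'name' => 'Some User', 'email' => 'email@whatever.com' }
}

# Bracket access returns nil for a missing key.
puts incoming_data['user']['nickname'].inspect

# dig walks nested keys and returns nil if any level is missing.
puts incoming_data.dig('user', 'email')
puts incoming_data.dig('account', 'email').inspect

# fetch raises KeyError, which surfaces contract changes early.
begin
  incoming_data.fetch('user').fetch('nickname')
rescue KeyError => e
  puts "fetch raised #{e.class}"
end
```

Which style fits depends on whether a missing key should be tolerated or treated as an error.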
How They Compare
Finally, the exciting part: potentially relevant performance benchmarks.
The library we will use to measure each of these solutions is benchmark/ips (provided by the benchmark-ips gem).
This library makes it simple to define different implementation sections and then compare them:
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report('UserMetaMethods') do
    1000.times do
      u = UserMetaMethods.new(incoming_data['user'])
      u.email
    end
  end

  x.report('UserMethodMissing') do
    1000.times do
      u = UserMethodMissing.new(incoming_data['user'])
      u.email
    end
  end

  x.report('UserRegular') do
    1000.times do
      u = UserRegular.new(incoming_data['user'])
      u.email
    end
  end

  x.report('Hash') do
    1000.times do
      u = Hash(incoming_data['user'])
      u['email']
    end
  end

  x.compare!
end
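Each report corresponds to a solution described above. The Hash report did not need to initialize a new object, but for consistency’s sake, everything under the 'user' key of the original Hash is passed through Hash(). Note that Kernel#Hash returns an argument that is already a Hash unchanged, so this report does not actually allocate a new object.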
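One detail worth checking before reading too much into the Hash numbers: Kernel#Hash converts its argument, and when the argument is already a Hash it hands back the very same object rather than a copy. The sketch below uses a made-up user_hash to show the difference from dup.

```ruby
user_hash = { 'email' => 'email@whatever.com' }

same = Hash(user_hash) # conversion: already a Hash, so returned as-is
copy = user_hash.dup   # shallow copy: a new Hash with the same contents

puts same.equal?(user_hash) # true, identical object
puts copy.equal?(user_hash) # false, separate object
puts copy == user_hash      # true, equal contents
```

If the intent is for the Hash report to pay an allocation cost comparable to the other reports, dup would be one way to do that, at the price of different numbers than those shown here.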
Running this code results in:
Calculating -------------------------------------
     UserMetaMethods     79.294  (± 2.5%) i/s -    399.000
   UserMethodMissing      1.531k (± 1.2%) i/s -      7.791k
         UserRegular    913.295  (± 1.4%) i/s -      4.628k
                Hash      3.141k (± 1.0%) i/s -     15.860k

Comparison:
                Hash:     3141.2 i/s
   UserMethodMissing:     1530.5 i/s - 2.05x slower
         UserRegular:      913.3 i/s - 3.44x slower
     UserMetaMethods:       79.3 i/s - 39.61x slower
Wow! Aside from simply using a Hash, the method_missing implementation is the fastest. Quick, everyone go change all the code to use method_missing! No. Stop. Do not do that.
While it might be faster than the UserRegular implementation, it is not without its drawbacks. Part of the gap likely comes from construction cost, since each iteration builds a new object: UserRegular assigns four instance variables, while UserMethodMissing stores a single reference to the Hash. A method_missing solution certainly has value in a variety of situations, but a simple benchmark should not persuade anyone to switch their code around to gain the “speed up”.
What about existing libraries that have similar behaviour to method_missing (i.e. Hashie and OpenStruct)?
To add them to the existing benchmark, the corresponding classes must be made:
require 'ostruct'
class UserOpenStruct < OpenStruct
end
require 'hashie'
class UserMash < Hashie::Mash
end
Then two new report calls can add them to the existing benchmark:
Benchmark.ips do |x|
  # ...
  x.report('OpenStruct') do
    1000.times do
      u = UserOpenStruct.new(incoming_data['user'])
      u.email
    end
  end

  x.report('UserMash') do
    1000.times do
      u = UserMash.new(incoming_data['user'])
      u.email
    end
  end
  # ...
end
The results of these two additions are a bit of a surprise:
Calculating -------------------------------------
     UserMetaMethods     79.050  (± 2.5%) i/s -    399.000
   UserMethodMissing      1.537k (± 1.3%) i/s -      7.752k
         UserRegular    914.824  (± 1.4%) i/s -      4.576k
          OpenStruct     49.954  (± 6.0%) i/s -    250.000
            UserMash    194.411  (± 1.5%) i/s -    988.000
                Hash      3.140k (± 0.9%) i/s -     15.759k

Comparison:
                Hash:     3140.1 i/s
   UserMethodMissing:     1536.9 i/s - 2.04x slower
         UserRegular:      914.8 i/s - 3.43x slower
            UserMash:      194.4 i/s - 16.15x slower
     UserMetaMethods:       79.0 i/s - 39.72x slower
          OpenStruct:       50.0 i/s - 62.86x slower
Despite OpenStruct and Hashie seeming very similar to our homegrown method_missing solution, both yielded worse results. However, like other metaprogramming solutions, both OpenStruct and Hashie make up for this speed deficiency with flexibility.
If this were a real problem in a production application, OpenStruct and Hashie could certainly both be viable solutions. Unless the code path that uses these libraries were scorching hot, their performance issues might not be a factor.
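As a small illustration of that flexibility, the sketch below uses OpenStruct from Ruby's standard library with a made-up hash; Hashie is left out because it requires installing a gem. The nickname attribute is invented for the example.

```ruby
require 'ostruct'

user = OpenStruct.new('name' => 'Some User', 'email' => 'email@whatever.com')

# Method-style and key-style access both work.
puts user.email
puts user[:email]
puts user.respond_to?(:email)

# New attributes can be added after construction.
user.nickname = 'su'
puts user.nickname

# Keys are stored as Symbols, and the data can be turned back into a Hash.
puts user.to_h.keys.inspect

# Unknown attributes return nil instead of raising.
puts user.unknown_attribute.inspect
```

Note that OpenStruct returns nil for unknown attributes, which differs from the homegrown method_missing class above, where unknown keys raise NoMethodError.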
Why the Slowdown?
The reason that some metaprogramming solutions are slow has partly to do with the Ruby inline method cache. In Ruby, the inline method cache stores the result of a method lookup so that the costly lookup operation is not repeated on every call. Metaprogramming interferes with this built-in cache by invalidating its cache key.
Every time a class is reopened or a method is defined on a class, pieces of the inline method cache key change, resulting in a cache miss and a fresh method lookup. Metaprogrammed code (especially code that executes at every Object.new like UserMetaMethods) does not benefit from inline method caching in the same way that “normal” code does. For much more information, a man much smarter than I wrote a great article explaining Ruby inline method caching in detail.
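If method definition at every Object.new is the expensive part, one option is to define each accessor only the first time a key is seen. The class below is a hypothetical variant, not something benchmarked in this post, so whether it narrows the gap in practice would need to be measured. It assumes every key is a valid Ruby method name.

```ruby
class UserDefineOnce
  def initialize(hash)
    hash.each_pair do |key, value|
      # Only modify the class when the accessor does not exist yet,
      # so repeated instantiation stops redefining methods.
      self.class.send(:attr_accessor, key) unless self.class.method_defined?(key)
      instance_variable_set(:"@#{key}", value)
    end
  end
end

first  = UserDefineOnce.new('email' => 'email@whatever.com')
second = UserDefineOnce.new('email' => 'other@whatever.com')

puts first.email  # email@whatever.com
puts second.email # other@whatever.com
```

The accessors are still shared across all instances of the class, so this only changes how often they are defined, not where they live.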
Use the Right Tool for the Job
When iterating through a list of options, no single data point is sufficient to declare one option superior to all others. Benchmarks should be treated as a single data point that brings depth to a comparison, not one that rules it. After all, who cares how slow a piece of code is if it is never run?
Metaprogramming is a very powerful tool in the Ruby language that should be wielded with care. Like anything, using metaprogramming too much can lead to unmaintainable code. This sort of code might be great for job security, but could be less performant, unreadable, and unmaintainable by coworkers.
When used correctly and in appropriate circumstances, metaprogramming can be a great asset. The trick is knowing when to use it and when to refrain. Sometimes just using a Hash might be the best solution.