@deanrad
Last active July 10, 2019 19:00
has_many is a lie

has_many is a SOLID violation

TL;DR: has_many is an anti-pattern that leads straight to monolithic applications.

First - what defines a monolithic application? It is one that cannot be split apart because everything depends on everything else. No part can be extracted to an external library - the library would have to be the size of our entire app. Let this be our working definition of a monolith. Ignore, for the moment, whether it's deployed on more than one machine; ask instead whether it could be separated if you wanted to.

One major reason an application cannot be split up is a dependency cycle.

Feedback loop

Why are dependency cycles so bad? Because when there is a dependency cycle, every module in the cycle ultimately depends on every other: the monolith problem. So, what explains the persistence of dependency cycles in almost every Rails app, ever?

has_many

has_many. It has seduced you with its allure. First it woos you with its eerily easy way of getting to child entities - user.posts, customer.orders, etc. Then it distracts you with its apparent complementary nature with belongs_to. You can't split up this dynamic duo, like peanut butter & jelly, right? Wrong. has_many is the Siren that lured your ship straight into the Cliffs of Monolith.
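To make the cycle concrete, here is a minimal plain-Ruby sketch. It does not use Rails; the two hashes are hypothetical stand-ins for "which model file mentions which other model", and cyclic? is an illustrative helper, not part of any library. It only shows that the belongs_to edge alone forms no cycle, while adding the has_many edge closes the loop.

```ruby
# Hypothetical dependency graphs: each key depends on the models in its value.
WITH_HAS_MANY = {
  "User" => ["Post"],   # User has_many :posts
  "Post" => ["User"]    # Post belongs_to :user
}.freeze

WITHOUT_HAS_MANY = {
  "User" => [],         # User knows nothing about Post
  "Post" => ["User"]    # Post belongs_to :user
}.freeze

# Depth-first search. A cycle exists if we reach a node that is still
# being visited further up the current path.
def cyclic?(graph)
  state = {} # node => :visiting or :done
  visit = lambda do |node|
    return true if state[node] == :visiting
    return false if state[node] == :done

    state[node] = :visiting
    found = graph.fetch(node, []).any? { |dep| visit.call(dep) }
    state[node] = :done
    found
  end
  graph.keys.any? { |node| visit.call(node) }
end

puts cyclic?(WITH_HAS_MANY)    # true
puts cyclic?(WITHOUT_HAS_MANY) # false
```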


The Cliffs of Monolith

What is the first model you added to your application? Probably User, right? So, once you wrote user.rb and its corresponding tests, and committed it - why did you ever open that file up again to tell it about something that it did not need to know existed? Rails keeps you from reopening user.rb if you add a column to the User table, and this is good, right? So why, when you added a Posts table far away, did you open up User again to make it aware of Posts? Did the definition of being a user change? Did you not realize you were violating the Open-Closed Principle, one of the 5 principles of SOLID design? Somewhere inside I bet you knew it felt dirty to keep opening up User and making it aware of things that it had been blissfully unaware of. But you did it anyway, and so did I for a long time, but it's possible to quit.

belongs_to, on the other hand, is a SOLID, dependable creature, like a loyal Saint Bernard. Its necessity follows from the need to enforce the rule that a Post must never exist without a User. In the language of databases, Post is described as being "functionally dependent" on User. It is a smell, in a relational database model, for a parent table to know about its children, except in strategic cases like creating a denormalized post_count field on User. Edgar Codd, the seriously smart and analytical creator of the Relational Model, defined the first degrees of normalization up to 3rd Normal Form, and with Raymond Boyce strengthened it into Boyce-Codd Normal Form. Stronger forms, up to 5th Normal Form, followed later, and the usual recommendation is that we achieve at least 3NF. This is good guidance to heed.

Edgar Codd

Now, some seriously smart people designed Rails as well, but nobody could have predicted, way back in 2005, how long-lived Rails applications would become. And thus we're in the predicament we're in today. has_many is established practice, yet harmful. It is mostly to blame for user.rb having the worst thrash in your codebase. It has turned your data model into a web of dependency cycles.

But now - the easy, rewarding part - seeing that doing without has_many will impact you less than you once thought. Which of the following will reduce thrash of the user.rb file, allow you to follow SOLID principles, and break dependency cycles in your application?

posts = user.posts
posts = Posts.for(user)

Now that you know, it's obvious. And the implementation of for, while not done yet, would be trivially simple. I hope to see an implementation of for soon, possibly even with deprecation warnings on has_many.
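Since ActiveRecord has no such method, here is one way it might look: a minimal plain-Ruby sketch, not the eventual implementation. The Structs, the in-memory @rows array, and Posts.create are all hypothetical stand-ins so the example runs without Rails. In a real model the body of for would presumably be a query along the lines of where(user_id: user.id). The point is the direction of knowledge: Posts knows about User, and User never mentions posts.

```ruby
# User is written once and never reopened to learn about posts.
User = Struct.new(:id, :name)

# Post carries the foreign key, mirroring belongs_to :user.
Post = Struct.new(:id, :user_id, :title)

# Hypothetical query module; @rows is an in-memory stand-in for a posts table.
module Posts
  @rows = []

  # Stores a new post owned by the given user and returns it.
  def self.create(user, title)
    post = Post.new(@rows.size + 1, user.id, title)
    @rows << post
    post
  end

  # Returns every post whose foreign key points at the given user.
  # Assumption: with ActiveRecord this would be a where(user_id: user.id) query.
  def self.for(user)
    @rows.select { |post| post.user_id == user.id }
  end
end

alice = User.new(1, "alice")
bob = User.new(2, "bob")
Posts.create(alice, "First post")
Posts.create(bob, "Hello")
Posts.create(alice, "Second post")

puts Posts.for(alice).map(&:title).inspect # ["First post", "Second post"]
```

Adding a Comments or Orders module later would follow the same shape, and user.rb would stay closed.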

Happy Coding!

@arnaudsj

Good write-up, Dean! But is has_many the only reason for the monolithic trap in Rails?

@eigenhombre

I like it -- funny, strong focus on one point, well delivered. I'd be curious about performance of user.posts and Posts.for(user) -- does AR turn them into the same SQL?

I agree with Sébastien's point. Also, the Rich Hickey enthusiast in me wants a mention of why BBMs/monoliths are bad. Why are the O (and S) in SOLID important? Why does complexity kill? Even a single sentence or phrase justifying this would make the overall point stronger.

Finally, I'd drop the colon in the title -- everyone else since Dijkstra has.

I find it interesting that so many of the good ideas in programming have to do with removing things from a language or framework: GOTO, explicit memory management/pointers, mutation, side effects, has_many. My gut take on this now is that any language or technology that wasn't made with a hard look at what could be eliminated is probably problematic.

@blischalk

Another talk that I find related to your post is:

Keeping Your Massive Rails App From Turning Into a S#!t Show with Benjamin Smith from Pivotal Labs. (Dependency stuff around 18min)

In this talk Benjamin discusses circular dependencies in Rails. He points out that as soon as a Post belongs_to a user and a User has_many posts, you have created a circular dependency. He suggests using the DataMapper pattern to prevent circular dependencies. In the User/Posts example, you would have a UserManager class that you could call find_by_id on, e.g.

user_manager = UserManager.new
user = user_manager.find_by_id(@post.user_id)

What are your thoughts on DataMapper and its merits compared to the Post.for option you suggest in your post?

@deanrad
Author

deanrad commented Feb 15, 2015

  • ActiveRecord does not actually contain the for method I described; it's a change I think needs to be written.
  • I never said has_many explains all monoliths. Just that for this criterion of being a monolith - having a dependency cycle - has_many is to blame. If other, measurable criteria exist, let me know, I'll write about those too 😀

Though I agree with it, I couldn't write as strongly about the Single Responsibility Principle, though perhaps I should. Every app I've ever worked on had an average responsibility per module greater than 1.0 - to itself, and to at least one other.

@blatyo

blatyo commented Feb 20, 2015

I don't disagree that cycles are problematic. However, I don't think you've removed any by taking away has_many and adding for. You need to decide which direction you want to use and only use that direction. The for solution has a number of drawbacks related to performance. For example, how would someone perform an includes?

@vrybas

vrybas commented Feb 23, 2015

Great! 👍
