Event-driven Modular Monolith

Strategies for keeping legacy Rails apps maintainable.

The main Rails app I currently work on has just turned eight. It’s not a huge app. It doesn’t deal with web-scale traffic or large volumes of data. Only six people work on it now. But eight years of pushing new code adds up.

This is a quick overview of some of the strategies we use to keep the codebase maintainable.

Background

After the first few years, our codebase suffered from typical ailments: tight coupling between domains, complex database queries spread across various parts of the app, overgrown models, a maze of side effects triggered by ActiveRecord callbacks, endlessly chained associations (e.g. Current.user.agency.invoices) – with an all-encompassing User model sitting on top of the pile.

The patterns I’m sharing here are meant to address those issues, without radical rewrites and microservices.

Modular Monolith

At the heart of this approach lies the Monolith. A single Rails app broken up into smaller modules, all using the same database, all sitting in the same Git repository.

It’s an idea borrowed from DDD and Shopify’s Modular Monolith. Each module (domain) contains a mini-app with its own namespace.

For example, the Invoicing module handles all invoice-related functionality. Other modules ideally know nothing about the existence of Invoices. If two modules depend on each other, the relation should be one-sided whenever possible: the Payments module knows all it needs to know about Invoices, but not the other way around¹. If there’s a need for different modules to interact, it should happen via events (more on that later).

We don’t use Rails Engines or any third-party libraries. The split is purely based on pathname lookup and namespaces. Initially, it involved just moving files around.

The directory structure looks like this:

├── app
├── bin
├── config
├── db
├── domains
│   ├── accounts
│   │ ├── app
│   │ │   ├── events
│   │ │   ├── graphql
│   │ │   ├── jobs
│   │ │   ├── models
│   │ │   ├── repositories
│   │ │   └── services
│   │ └── test
│   ├── agencies
│   ├── analytics
│   ├── geography
│   ├── invoicing
│   ├── leads
│   ├── lead_distribution
│   ├── marketing
│   ├── partnerships
│   └── payments

It involves a simple change in the application config:

# config/application.rb
module MyApp
  class Application < Rails::Application
    # ...
    PATHS = %w[
      app app/controllers app/channels app/helpers app/models app/mailers
      lib lib/tasks
    ].freeze

    Dir.glob("domains/*").each do |component_root|
      PATHS.each do |path|
        config.paths[path] << Rails.root.join(component_root, path)
      end
    end
  end
end

We also have an ENV var set to include test files in those modules when running rails test:

DEFAULT_TEST="{test/**/*_test.rb,domains/*/test/**/*_test.rb}"

A frustrating aspect of having sub-apps with namespaces is that you end up with lots of redundant subdirectories just to comply with the autoloader:

domains/invoicing/app/models/invoicing/invoice.rb

We don’t enforce strict boundaries. There’s nothing preventing you from calling Invoicing code in other modules, but namespaces make it clear when the boundary is crossed.

Pub/Sub (Events)

To decouple the modules, we make heavy use of events. Rather than a sequence triggered by callbacks:

form = ContactForm.create
lead = Lead.create_from_contact_form(form)
assigned_agency = lead.assign_to_agency
AgencyMailer.with(agency: assigned_agency).new_lead.deliver

We have the following:

1. The ContactForms module stores a form record and publishes an event called ContactFormSubmitted.
2. The Leads module listens for that event, creates a new Lead record in the database and then publishes a LeadCreated event.
3. The LeadDistribution module listens for that event, decides which Agency gets the lead based on distribution rules and then publishes a LeadAssigned event.

The ContactForms module knows nothing about Leads. Leads know nothing about LeadDistribution. Each module handles its own responsibilities. At some point, there will be a LeadAccepted event and the Invoicing module will kick in, followed by Payments.

We chose RailsEventStore, but even a basic Pub/Sub implementation will get the job done. We use both synchronous and asynchronous handlers.
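To make the flow above concrete, here is a minimal in-process Pub/Sub sketch in plain Ruby. It is not RailsEventStore and not our production code: the EventBus class, the event fields and the hard-coded IDs are all assumptions made for illustration, and every handler runs synchronously.

```ruby
# A tiny synchronous event bus. Handlers are blocks keyed by event class.
class EventBus
  def initialize
    @handlers = Hash.new { |hash, event_class| hash[event_class] = [] }
  end

  # Register a block to run whenever an event of the given class is published.
  def subscribe(event_class, &handler)
    @handlers[event_class] << handler
    self
  end

  # Deliver the event to every handler registered for its class, in order.
  def publish(event)
    @handlers.fetch(event.class, []).each { |handler| handler.call(event) }
    event
  end
end

# Event names follow the post; the fields are hypothetical.
ContactFormSubmitted = Struct.new(:form_id, :email, keyword_init: true)
LeadCreated          = Struct.new(:lead_id, :email, keyword_init: true)
LeadAssigned         = Struct.new(:lead_id, :agency_id, keyword_init: true)

# Wires up three "modules" and returns a trail of what happened, in order.
def run_contact_form_flow(email)
  bus = EventBus.new
  trail = []

  # Leads: reacts to a submitted form, knows nothing about distribution.
  bus.subscribe(ContactFormSubmitted) do |event|
    trail << "Leads: created lead for #{event.email}"
    bus.publish(LeadCreated.new(lead_id: 1, email: event.email))
  end

  # LeadDistribution: reacts to a new lead (agency 7 stands in for real rules).
  bus.subscribe(LeadCreated) do |event|
    trail << "LeadDistribution: assigned lead #{event.lead_id}"
    bus.publish(LeadAssigned.new(lead_id: event.lead_id, agency_id: 7))
  end

  # Agencies: reacts to the assignment, e.g. by sending a notification.
  bus.subscribe(LeadAssigned) do |event|
    trail << "Agencies: notified agency #{event.agency_id}"
  end

  # ContactForms: only publishes; it has no idea who is listening.
  bus.publish(ContactFormSubmitted.new(form_id: 1, email: email))
  trail
end
```

Calling run_contact_form_flow("jane@example.com") walks through all three steps, yet the publishing side never references Leads, LeadDistribution or Agencies directly.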

Events have many benefits if not taken to the extreme. They naturally describe the flow (free audit log) and it’s easy to add new functionality by subscribing to existing events. Let’s say you want to push users to your marketing email list. You can subscribe to the user events (UserSignedUp, UserUpdated, UserDeleted) and react accordingly. At no point does the UserAccounts module need to know about the Marketing module.

Now for the not-so-great part. Events do make the code more difficult to follow. At a glance, it’s not obvious what happens after the event, especially if event handlers are asynchronous (background jobs).

For this reason, we have a rule to avoid adding event handlers within the same module. For example, if the Invoicing module publishes an InvoiceIssued event, it’s fine if other modules don’t handle it. But the Invoicing module should not listen for this event, for example, to send email notifications or schedule background jobs. This should happen explicitly in the service that issues invoices.

Testing event-based code has its challenges too: often the coupling you worked to remove is reintroduced in test files. This is something we don’t yet have clear rules for, because many test files precede the refactoring and we rely on them to ensure everything works.

Patterns

Modules and Pub/Sub are the foundation of this approach. However, we have several other non-standard patterns that make a big difference in maintainability.

Service Objects

Services can be a controversial topic in the Rails community and I’m not here to defend them. You can achieve a similar effect with PORO models.

For us, a service class is part of the business logic layer. It performs operations on one or more models and explicitly triggers side effects (events, background jobs, email/SMS notifications).

Examples from the codebase:

UserAccounts::DeleteUserAccount.call(user:)
Invoicing::IssueInvoice.call(object:)
Agencies::InviteUser.call(agency:, email:)
Partnerships::GenerateAgreementForm.call(agency:)

One potential drawback of this approach is that it can produce many small classes that could just as well be methods in a model or controller.

But the big upside is that it gives us manageable, logical units of code that are clear and easy to test. When a class has too many lines of code, it’s a telltale sign that it has too much responsibility and needs to be broken up.
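As a rough sketch of the shape such a class can take, the example below issues an invoice and triggers its side effects explicitly instead of relying on callbacks or same-module event handlers. Everything beyond the IssueInvoice and InvoiceIssued names is hypothetical: the keyword arguments differ from the call(object:) signature listed above, the Invoice struct stands in for an ActiveRecord model, and the notifier and publisher are injected lambdas so the code runs without Rails.

```ruby
module Invoicing
  # Stand-ins for the real model and event (fields are assumptions).
  Invoice       = Struct.new(:id, :status, keyword_init: true)
  InvoiceIssued = Struct.new(:invoice_id, keyword_init: true)

  class IssueInvoice
    # Class-level entry point, so callers write IssueInvoice.call(...).
    def self.call(invoice:, notify:, publish:)
      new(invoice: invoice, notify: notify, publish: publish).call
    end

    def initialize(invoice:, notify:, publish:)
      @invoice = invoice
      @notify  = notify   # anything responding to #call, e.g. a mailer wrapper
      @publish = publish  # anything responding to #call, e.g. an event bus
    end

    def call
      # 1. The operation itself.
      @invoice.status = :issued

      # 2. Side effects inside this module happen here, in plain sight.
      @notify.call(@invoice)

      # 3. Other modules find out through the event.
      @publish.call(InvoiceIssued.new(invoice_id: @invoice.id))

      @invoice
    end
  end
end
```

Reading the call method top to bottom tells you everything that happens when an invoice is issued, and a test can pass in lambdas that simply record what they received.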

Repositories for Database Queries

This rule raised some eyebrows initially.

No ActiveRecord queries outside of repositories. If you need to query the database, you should do it in a repository class.

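# Nested module syntax, so constants such as Invoice are also looked up
# inside the Invoicing namespace before falling back to the top level.
module Invoicing
  class InvoiceRepository
    def self.find(id)
      Invoice.find(id)
    end

    def self.find_overdue_invoices
      # Date.current respects the app's time zone; Date.today uses system time.
      Invoice.where(status: :overdue).where("due_date < ?", Date.current)
    end

    def self.create_invoice(attrs)
      # ...
    end
  end
end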

For simple queries, it’s hard to see the benefit, because you’re essentially wrapping the standard ActiveRecord functionality. However, the true value shows when you isolate complex queries with conditional WHERE clauses, joins and filters.
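Here is a small sketch of what "conditional clauses" look like when they live in one place. To keep it runnable without a database, the repository filters an in-memory array of structs; in a Rails app each step would instead chain a where call onto an ActiveRecord relation. The search method, its parameters and the Invoice fields are assumptions for illustration.

```ruby
require "date"

# Stand-in for an ActiveRecord model (fields are hypothetical).
Invoice = Struct.new(:id, :agency_id, :status, :due_date, keyword_init: true)

class InvoiceRepository
  def initialize(records)
    @records = records
  end

  # Each filter is applied only when the caller passes a value. With
  # ActiveRecord, every line would be `scope = scope.where(...) if ...`.
  def search(status: nil, agency_id: nil, due_before: nil)
    scope = @records
    scope = scope.select { |invoice| invoice.status == status } if status
    scope = scope.select { |invoice| invoice.agency_id == agency_id } if agency_id
    scope = scope.select { |invoice| invoice.due_date < due_before } if due_before
    scope.sort_by(&:due_date)
  end
end
```

Callers ask for repo.search(status: :overdue, agency_id: 1) and never see how the filters are combined, so the filtering logic can change without touching services or GraphQL resolvers.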

The Repository pattern itself isn’t that important. The main objective is to avoid littering the codebase with ActiveRecord queries. It also prevents us from bloating model classes.

Each domain can and should have its own repository class that handles the same underlying ActiveRecord model:

Agencies::UserRepository
UserAccounts::UserRepository

In theory, repositories allow us to replace ActiveRecord with another library if we ever need to. It’s unlikely that will ever happen, but the app isn’t coupled to the ORM to the same extent as before.

Slim and Dumb Models

We keep models minimal. A model class will usually contain validations, scopes, associations and a couple of convenience methods like:

def deletable?
def non_exclusive?
def first_name

There should be no after_save callbacks² and the record should not perform operations on itself.

SQL queries should always happen in a repository and the business logic will usually go to a service class.

We keep a set of concerns like NormalizablePhone, PhoneValidations, Notifiable, but we avoid using concerns to decompose models.

Bonus: A Separate Frontend App

Rails has always had a weak frontend story³. I personally don’t mind ERB views, partials, helpers and JavaScript sprinkles. But the world moved on in a different direction.

A few years back, finding a frontend developer willing to work with vanilla Rails views⁴ proved to be a big challenge. Hiring for React? A whole different story.

That was the main reason we decided to extract our frontend into a Next.js app (with TypeScript, Tailwind). The old Rails app was turned into a GraphQL server⁵.

It was the most radical rewrite we did: we set up the new Next.js app, migrated routes one-by-one and kept both frontends running (thanks to nginx path rules) until the whole thing was moved to React.

Having a separate frontend app comes with its own set of headaches, but it had the considerable benefit of simplifying the Ruby codebase. The main application no longer bothers with presentation details, CSS, JavaScript, assets etc. It’s much cleaner this way.

How Do I Start?

The idea of implementing this in a legacy app might sound daunting. In our case, it happened bird by bird, one step at a time, and it took months before we saw any visible results. In many parts of the app, these guidelines are still aspirational, but we do our best to write new code following them.

Many of these changes can be done incrementally. Some don’t even make sense for smaller apps. If you’re just starting out and don’t have a lot of code, modularizing the app should not concern you. You can get most of the benefits by introducing namespaces in the global app/ directory.

That said, if I had to start all over again, here’s what I’d do:

1. Create the domains/ (or modules/) directory and start with one or two obvious modules (user accounts, subscriptions). Others will emerge naturally later on. Move relevant files to the newly created module directories, without doing anything else. From the Rails standpoint, the change is neutral: tests will pass, the code will be the same, the only difference is that some files are in a different location.
2. Start publishing domain events (ContactForms::FormSubmitted, UserAccounts::UserSignedUp, UserAccounts::UserLoggedIn), even if nothing listens for them. Store those events somewhere. Consider it an audit log at first.
3. Extract service objects from models/controllers. A good candidate is a piece of code that operates on one or more models and triggers side effects (grep for after_commit calls).
4. Introduce namespaces to reflect the domain the file is in. There’s no need to rename existing model classes. With this approach, you don’t call models directly anyway, so focus on service and repository namespaces.
5. Extract queries to repositories. Start by putting new SQL queries in a repository class. Gradually extract existing ActiveRecord code.
6. Be on the lookout for any code that calls different domains. Try to turn those calls into event handlers.
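The "publish events before anything listens" step can start very small. The sketch below appends each event as one JSON line to any IO object, which is enough to serve as an audit log. The AuditLog class, its method names and the payload fields are hypothetical; a real setup would more likely write to a database table or use an event store.

```ruby
require "json"
require "time"
require "stringio"

# Append-only event log: one JSON document per line.
class AuditLog
  def initialize(io)
    @io = io
  end

  # Records the event name, an arbitrary payload and a UTC timestamp.
  def publish(name, **payload)
    entry = { event: name, payload: payload, recorded_at: Time.now.utc.iso8601 }
    @io.puts(JSON.generate(entry))
    entry
  end

  # Reads entries back as hashes with string keys, oldest first.
  def self.read(io)
    io.each_line.map { |line| JSON.parse(line) }
  end
end

# Usage with an in-memory IO; a File opened in append mode works the same way.
io = StringIO.new
log = AuditLog.new(io)
log.publish("UserAccounts::UserSignedUp", user_id: 1)
log.publish("ContactForms::FormSubmitted", form_id: 9)
```

Once subscribers appear later, the publish calls are already in the right places and only the destination of the events needs to change.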

These patterns naturally enforce structure. You can no longer just dump new code in the global app/, so you have to ask yourself questions, such as: “Where does this belong?”, “Is this module the right place for it?”, “Should I create a new one?”, “How do I avoid this coupling?”

Obviously, there’s a lot more to it and I left out many implementation details. If you’re curious about any specifics, feel free to email me and I’ll be happy to elaborate.

¹ This may sound obvious if you’re coming from a different language, but in Rails, dividing applications into modules is not a common practice.
² We do have a couple of before_save or before_validation callbacks, like before_save :normalize_phone.
³ Remember RJS?
⁴ Hotwire did not yet exist at the time.
⁵ GraphQL is not something I’d choose again today and it’s the part of the stack that bothers me most with its absurd complexity. But at this point, we’re in too deep and I haven’t found a good alternative.

We considered OpenAPI and Inertia.js, but didn’t find an elegant solution for generating TypeScript types for the frontend, which is the main selling point of our current GraphQL setup.