
A ledger’s primary role is to serve as the definitive source of truth for your financial product or records. It’s a system designed to track and report data accurately, but above all, it must answer one critical question at all times: "What is the balance?"

Whether it’s the current position of an asset, a user, or an entity, knowing the balance is the top priority of any ledger. How do we ensure this question is answered correctly and efficiently? The architecture you choose depends on your preferences, needs, and constraints.

In this article, I’ll explore the two main approaches used today, running balances and aggregated balances, and help you decide which might work best for your system.

Running balances

A running balance is a continuously updated total that changes as transactions occur, much like the balance in a checkbook or bank account. For example, if you deposit $100 into a savings account with a starting balance of $500, the running balance immediately updates to $600.

Here’s what defines this approach:

  • Reacts to events: The balance adjusts in real-time as transactions (deposits, withdrawals) are recorded.
  • Concurrency challenges: Concurrency refers to multiple actions happening at once. If two transactions, like a deposit and a withdrawal, read the same balance at the same moment and each writes back its own result, one update silently overwrites the other. Preventing this requires careful design, such as locking mechanisms or atomic updates.
  • Rebuildable: You should be able to reconstruct the running balance by replaying the transaction history, providing a safety net if something goes wrong.
  • Fast retrievals: Since the balance is precomputed and updated live, fetching it is quick and resource-light.

Trade-offs: Running balances excel in scenarios needing instant access (an ATM checking your account, for instance). However, they demand robust systems to handle concurrency and can be costly to scale if transaction volume spikes.
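
To make these properties concrete, here is a minimal in-memory sketch of a running balance. The class name RunningBalanceAccount, the use of integer cents, and the no-overdraft rule are assumptions made for illustration; a production ledger would more likely rely on database transactions or row locks than on a Python lock.

```python
import threading


class RunningBalanceAccount:
    """Hypothetical account that keeps a live balance next to its history."""

    def __init__(self):
        self.balance = 0    # current total, in cents (assumed unit)
        self.history = []   # signed amounts, in the order they were applied
        self._lock = threading.Lock()

    def apply(self, amount):
        """Record a signed amount: positive for deposits, negative for withdrawals."""
        with self._lock:  # serialize the read-modify-write so no update is lost
            if self.balance + amount < 0:  # assumed rule: no overdrafts
                raise ValueError("insufficient funds")
            self.history.append(amount)
            self.balance += amount
            return self.balance

    def rebuild(self):
        """Recompute the balance by replaying the transaction history."""
        with self._lock:
            self.balance = sum(self.history)
            return self.balance


account = RunningBalanceAccount()
account.apply(50_000)   # starting balance of $500
account.apply(10_000)   # deposit of $100
print(account.balance)  # 60000 cents, i.e. $600
```

Reading account.balance is a single attribute lookup, which is the "fast retrieval" property. The lock is the price paid for it: every write to the same account has to wait its turn.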

Aggregated balances

An aggregated balance calculates the total by summing all transactions (credits and debits) at a given moment, rather than maintaining a live tally. Imagine a payroll system: instead of updating an employee’s total earnings with every paycheck, it logs each payment and sums them up at the end of the month.

Key characteristics include:

  • No real-time reaction: Transactions are recorded without immediately updating a balance, simplifying the logging process.
  • Concurrency-friendly: Each transaction is an independent append, so simultaneous writes don’t compete to update a shared running total.
  • Aggregation process: The balance is computed by adding all credits and subtracting all debits. For large datasets, this can get slow without optimization.
  • Retrieval considerations: Fetching the balance requires summing transactions on demand, so performance may degrade as data grows unless you use techniques like precomputed summaries.
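
As a rough sketch of the same idea in code, the ledger below only appends entries and computes the balance when asked. The names Entry and AggregatedLedger, the account identifier, and the integer-cents amounts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    account: str
    kind: str     # "credit" or "debit"
    amount: int   # cents, always positive (assumed convention)


class AggregatedLedger:
    """Hypothetical append-only ledger: no stored balance, only entries."""

    def __init__(self):
        self._entries = []

    def record(self, account, kind, amount):
        if kind not in ("credit", "debit"):
            raise ValueError("kind must be 'credit' or 'debit'")
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._entries.append(Entry(account, kind, amount))  # nothing else to update

    def balance(self, account):
        """Add all credits and subtract all debits for one account."""
        credits = sum(e.amount for e in self._entries
                      if e.account == account and e.kind == "credit")
        debits = sum(e.amount for e in self._entries
                     if e.account == account and e.kind == "debit")
        return credits - debits


ledger = AggregatedLedger()
ledger.record("employee-1", "credit", 200_000)  # first paycheck
ledger.record("employee-1", "credit", 200_000)  # second paycheck
ledger.record("employee-1", "debit", 50_000)    # a deduction
print(ledger.balance("employee-1"))             # 350000
```

Note that balance() scans every entry on each call, which is exactly the cost that grows with the dataset.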

Trade-offs:

Aggregated balances are simpler to implement and maintain, which makes them ideal for high-throughput systems that don’t need real-time updates. However, computing the balance on demand can be too slow for apps like trading platforms or financial tools, where users demand up-to-the-second accuracy.

Performance suffers as data grows: summing millions of transactions can slow queries or demand costly hardware. Without optimization (indexing, caching), retrievals may stall under load, and scaling can require heavy investment in storage and compute power.
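
Indexing is one of the optimizations mentioned above. The sketch below uses Python’s built-in sqlite3 module with an in-memory database purely for illustration; the entries table, the idx_entries_account index, and the record and balance helpers are hypothetical, and signed integer cents are an assumed convention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (
        id      INTEGER PRIMARY KEY,
        account TEXT    NOT NULL,
        amount  INTEGER NOT NULL   -- signed cents: credits > 0, debits < 0
    );
    -- Index on (account, amount) so a per-account sum need not scan every row.
    CREATE INDEX idx_entries_account ON entries (account, amount);
""")


def record(account, amount):
    """Append one entry; no balance column is touched."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO entries (account, amount) VALUES (?, ?)",
            (account, amount),
        )


def balance(account):
    """Aggregate on demand; COALESCE returns 0 for accounts with no entries."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM entries WHERE account = ?",
        (account,),
    ).fetchone()
    return row[0]


record("acct-1", 50_000)
record("acct-1", 10_000)
record("acct-1", -2_500)
record("acct-2", 7_000)
print(balance("acct-1"))  # 57500
```

An index narrows the work to one account’s rows, but the sum still grows with that account’s history, so very active accounts may also need precomputed summaries.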

Historical balances

Answering "What is the balance?" is only half the job. A good ledger must also answer "What was the balance?" to provide a complete financial history. Both approaches handle this differently.

Running balances

  • Snapshotting: You can take periodic snapshots of the running balance, saving the total at midnight each day, for example. If your savings account was $600 on Monday and $550 on Tuesday, snapshots capture those states.
  • Combining data: Pairing snapshots with transaction history lets you reconstruct the balance at any past point, like figuring out your balance mid-day after a withdrawal.
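
A minimal sketch of combining the two: find the latest snapshot at or before the requested time, then replay the transactions that came after it. The data, the balance_at helper, and the convention that a snapshot already includes every transaction up to its own timestamp are assumptions for illustration.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical data, both lists sorted by time. Amounts are in cents.
snapshots = [  # (taken_at, balance)
    (datetime(2024, 1, 1, 0, 0), 60_000),  # Monday midnight: $600
    (datetime(2024, 1, 2, 0, 0), 55_000),  # Tuesday midnight: $550
]
transactions = [  # (occurred_at, signed amount)
    (datetime(2024, 1, 1, 10, 30), -7_000),  # Monday morning withdrawal
    (datetime(2024, 1, 1, 15, 0), 2_000),    # Monday afternoon deposit
]


def balance_at(when):
    """Balance at `when`: latest earlier snapshot plus the transactions since."""
    times = [taken_at for taken_at, _ in snapshots]
    i = bisect_right(times, when) - 1  # index of the last snapshot <= when
    if i < 0:
        raise ValueError("no snapshot at or before the requested time")
    snap_time, balance = snapshots[i]
    for occurred_at, amount in transactions:
        if snap_time < occurred_at <= when:  # only what the snapshot lacks
            balance += amount
    return balance


print(balance_at(datetime(2024, 1, 1, 12, 0)))  # 53000: mid-day, after the withdrawal
```

The more frequent the snapshots, the fewer transactions each lookup has to replay, at the cost of the extra storage.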

Aggregated balances

  • Effective dates: Aggregation filters on each transaction’s effective date, the date it actually occurred, rather than the date it was logged. If you got paid on March 1st but it was recorded on March 3rd, the effective date ensures the balance reflects March 1st’s reality.
  • Caching: Storing precomputed totals (last month’s balance, for instance) and building from there speeds up historical queries without recalculating everything.
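
Both ideas fit in a few lines. In this sketch, the entries, the balance_as_of helper, and the checkpoint tuple are hypothetical, and amounts are signed integer cents by assumption.

```python
from datetime import date

# Hypothetical entries: (effective_date, recorded_date, signed amount in cents)
entries = [
    (date(2024, 2, 15), date(2024, 2, 15), 100_000),
    (date(2024, 3, 1), date(2024, 3, 3), 200_000),   # paid March 1st, logged March 3rd
    (date(2024, 3, 10), date(2024, 3, 10), -50_000),
]


def balance_as_of(day, checkpoint=None):
    """Sum entries whose effective date is on or before `day`.

    `checkpoint` is an optional cached (checkpoint_day, balance) pair; when it
    is not later than `day`, only entries after checkpoint_day are summed.
    """
    base, after = 0, None
    if checkpoint is not None and checkpoint[0] <= day:
        after, base = checkpoint
    return base + sum(
        amount
        for effective, _recorded, amount in entries
        if (after is None or effective > after) and effective <= day
    )


# Cache last month's closing balance once, then build on it.
feb_end = (date(2024, 2, 29), balance_as_of(date(2024, 2, 29)))

print(balance_as_of(date(2024, 3, 1)))                       # 300000, despite the March 3rd logging
print(balance_as_of(date(2024, 3, 31), checkpoint=feb_end))  # 250000
```

One caveat with this kind of cache: an entry logged later with an effective date on or before the checkpoint would make the cached total stale, so the checkpoint would need to be recomputed.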

Trade-offs:

Running balances make historical lookups fast with snapshots but require extra storage. Aggregated balances are flexible with effective dates but may need caching to avoid slow recalculations.

Choosing the right approach

So, how do you decide between running balances and aggregated balances? There’s no one-size-fits-all answer; it hinges on your use case. Here are some questions to guide you and your team:

  1. How critical is fast retrieval with minimal resources? If users, like bank customers, need instant balance updates, running balances shine. Aggregated balances might frustrate them with delays.
  2. Are we prepared to handle concurrency issues? If your system sees heavy simultaneous activity (an e-commerce platform during a sale, for example), aggregated balances avoid concurrency headaches. Running balances require more engineering effort.
  3. Do we trust our database or data warehouse’s aggregation capabilities? Modern databases like PostgreSQL or warehouses like Snowflake can handle massive sums efficiently, favoring aggregated balances if you’ve invested in them.
  4. Who needs access to this data across the organization? Real-time dashboards for executives might lean toward running balances, while periodic reports for accountants align with aggregated balances.

Example decision:

A bank ATM system might choose running balances for speed and real-time accuracy, despite concurrency challenges. A payroll system might pick aggregated balances for simplicity and batch reporting, accepting slower on-the-fly queries.

Regardless of your choice, every ledger must answer:

  1. "What is the balance?" quickly and accurately.
  2. "What was the balance?" with equal precision.

Conclusion

In summary, choosing between running and aggregated balances depends on your specific needs. Running balances offer real-time accuracy, ideal for scenarios requiring instant updates.

Aggregated balances, on the other hand, simplify concurrency management and are easier to implement for high-throughput systems. Consider your use case, concurrency demands, and performance requirements to determine the best fit. Ultimately, the right approach is one that aligns seamlessly with your system's goals and constraints.