The Agile Echo

Continuous Integration and Trunk-Based Development: coding patterns and real-world examples

I often describe the advantages of CI and TBD: today, I will make them more concrete by sharing some real-world use cases and coding patterns from my experience. This issue aims to be a concrete guide for those who want to start applying these practices, first in your practice sessions and then in your daily work - the coding pattern tips are simple but clear enough to clarify the approach and let you start practicing it.

Dan the Dev
continuous-integration
trunk-based-development

Introduction

Hello, developers! 🚀

Since I started this newsletter, I’ve talked about a lot of topics related to programming, Agile practices, and the work of a Software Engineer in general - but for sure, I have talked about some of them much more than others, and the top 3 are probably Test-Driven Development, Continuous Integration, and Trunk-Based Development.

Today, we talk again about the last two: after a quick recap of what they are about, I will share with you the coding patterns that enable CI and TBD by hiding work in progress and some real-world examples where I applied those practices.

We will first introduce the topics in short, and then deep dive into the coding patterns I want to talk about. After that, I will share some real-world examples where I applied those patterns in my experience - with my final reflections on the topic.

I wanted to include complete code examples, but this kind of pattern requires a lot of context to build a meaningful use case, and it would still be hard to understand only by reading, so any code in this issue is limited to short, simplified sketches.

The principles

Continuous Integration is one of the most powerful concepts to master in the Software Development World, probably the one that most impacted how I see this job.

CI is the activity of very frequently integrating work to the trunk of version control and verifying that the work is, to the best of our knowledge, releasable.

Implementing Continuous Integration implies some practices that are embedded into it, and one of those practices is Trunk-Based Development.

Trunk-Based Development is a methodology where changes to code are integrated directly in the main branch, without any other branch in the middle, at least once per day.

If you want to implement CI in your workflow, you must start using the set of practices that together make CI, including Trunk-Based Development: in other words, you can’t say you do CI if you don’t do TBD.

Sadly, that’s what most companies do, at least in my experience - they use feature branches, typically long-lived ones that last multiple days, and then think they are doing CI just because they have an automated pipeline.

That’s just wrong.

Anyway, when it comes to the idea of very frequently integrating work into the trunk, where very frequently means daily or even multiple times per day, the typical concern is: how do I hide work in progress if I have to merge unfinished work to master (and even release it to production, if I also want CD)?

There is a great quote I agree with, about this topic. The quote comes from Bryan Finster, who as a guest on an issue of the Crafting Tech Teams newsletter once said:

Software Developers have been trained to deliver complete features. That’s the problem. It’s one of the hardest habits to break.

That is so true!

90% of the software developers I know have this bias: we were taught to release fully completed features, and to see the release of the software as the last piece of a long process instead of part of the daily work; this habit makes the question of how to hide work in progress seem very hard and complicated to answer.

The reality is that this is a problem that has already been solved: there are multiple patterns that we can use to keep the work in progress hidden from the user, allowing us to release our code without any issue on their side.

We will dive into the most common patterns in a while, but I want to point out an important thing first; let me share a quick example with the best-known pattern to hide work in progress: Feature Flags. A basic implementation could be as simple as a boolean value in the configuration, at a user level, that allows you to decide if a user has access to that feature or not; this has some consequences on code implementation, of course, because you’ll have to handle this configuration value to decide whether to execute the behavior or skip it.
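To make the idea tangible, here is a minimal sketch in PHP of such a user-level boolean. The function name `checkoutLabel` and the `new_checkout` configuration key are assumptions made up for illustration, not taken from a real project.

```php
<?php
declare(strict_types=1);

// Hypothetical per-user configuration: the 'new_checkout' key is an
// assumption for this sketch.
function checkoutLabel(array $userConfig): string
{
    // A missing or false flag keeps the existing behavior.
    if (($userConfig['new_checkout'] ?? false) === true) {
        return 'new checkout (work in progress)';
    }

    return 'current checkout';
}

echo checkoutLabel(['new_checkout' => true]), PHP_EOL; // user with the flag ON
echo checkoutLabel([]), PHP_EOL;                       // everyone else
```

Defaulting to the old path when the flag is missing means that merging the unfinished code changes nothing for users who were never configured.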

From this example we can highlight a characteristic that is shared by all the “work in progress hiding patterns”: hiding the work in progress will probably require some additional work; we have to think about how to hide it and write some code to achieve this purpose.

So the question here is: is it worth it?

Of course it is, and the reason is quite simple: the investment we make to hide our work in progress and enable the release of our code is completely repaid by the unplanned work we will avoid after the release; it has been shown that implementing CI in our daily work drastically reduces the unplanned work we would otherwise face later on while completing the feature.

Maybe you remember it from our Lean Software Development issue, but let’s say it again: unplanned work is one of the biggest sources of waste in software development, therefore reducing it should be one of our targets.

Here, we are reducing unplanned work thanks to a little additional, but plannable, work: it means we are investing 1 today to be sure not to waste 10 tomorrow. We accept a bit of additional, conscious work, which we are aware of and can therefore plan, to avoid unplanned, unexpected work later, which would cause waste, issues, waiting time, and context switching.

As always, if you want to deep dive into such topics, you will find a lot of useful resources in the Go Deeper section at the end of this issue.

Coding patterns

Keep the API Hidden

The pattern

The easiest way to prevent code from being executed or noticed in production is to hide its interface (this pattern is also known as Keystone Interface): just make sure that the path to the feature is either hidden from the user or the last piece you release. In a well-designed system, such interface elements should be minimal and thus simple to add.

Practical Tips

  1. Hide (or release last) the route of the API - I typically like to do this with a stubbed response that respects the real contract

  2. Hide (or release last) the UI that enables the user to interact with the feature (it can even mean hiding an entire component or page)
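A minimal sketch of the first tip, written in plain PHP without any framework: the handler with its stubbed response is already deployed, but its route is registered last. The route names, the `dispatch` function, and the response fields are assumptions for illustration only.

```php
<?php
declare(strict_types=1);

// Hypothetical handler for an unfinished endpoint: it returns a stub shaped
// like the agreed contract (the field names are assumptions).
function listCouponsStub(): array
{
    return ['coupons' => [], 'total' => 0];
}

// An endpoint that is already live.
function listOrders(): array
{
    return ['orders' => [['id' => 1]], 'total' => 1];
}

// Only registered routes are reachable. The coupons handler is deployed,
// but its route stays out of the table until the feature is ready.
$routes = [
    'GET /orders' => 'listOrders',
    // 'GET /coupons' => 'listCouponsStub', // released last
];

function dispatch(array $routes, string $route): array
{
    if (!isset($routes[$route])) {
        return ['status' => 404, 'body' => ['error' => 'Not found']];
    }

    return ['status' => 200, 'body' => $routes[$route]()];
}

echo json_encode(dispatch($routes, 'GET /coupons')), PHP_EOL; // hidden: 404
echo json_encode(dispatch($routes, 'GET /orders')), PHP_EOL;  // visible: 200
```

Registering the route becomes the last, tiny change of the whole feature, which is exactly what makes it easy to release or to revert.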

Feature flags

The pattern

A boolean value, typically based on a constant, a parameter, or a configuration value, that a class can use to decide if we want to execute path A or B: this is basically what Feature Flags are. This is a simple way to be able to activate the new behavior for testing, while it’s still disabled in production.

Practical Tips

  • Build a Feature Flag class or use an existing library - some tips:

    • prefer Strategy pattern to implement the two different paths

    • be careful to pick the granularity you need for the flag: if it’s too big it can become useless, if it’s too small it can become too complex to maintain, so make sure it’s worth it

    • Feature Flags are temporary most of the time (at some point only one of the 2 paths usually survives, whether it was an experiment or a change), so be sure that the implementation is also easy to remove

  • Make sure that you can enable/disable the flag quickly: if your release process is fast enough, even changing code can be fine - otherwise favor a DB table configuration or something else that allows you to do it in seconds
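As a minimal sketch of these tips, the PHP below combines a tiny hand-rolled `FeatureFlags` class with the Strategy pattern. All the class names, the `new_discount` flag, and the 10% discount rule are assumptions for illustration; a real project could just as well use an existing library.

```php
<?php
declare(strict_types=1);

interface DiscountStrategy
{
    public function apply(float $total): float;
}

final class CurrentDiscount implements DiscountStrategy
{
    public function apply(float $total): float
    {
        return $total; // existing behavior: no discount
    }
}

final class NewDiscount implements DiscountStrategy
{
    public function apply(float $total): float
    {
        return round($total * 0.9, 2); // new behavior, still in progress
    }
}

final class FeatureFlags
{
    /** @var array<string, bool> */
    private array $flags;

    public function __construct(array $flags)
    {
        $this->flags = $flags;
    }

    public function isEnabled(string $name): bool
    {
        return $this->flags[$name] ?? false; // unknown flags are OFF
    }
}

// The only place that knows about the flag: removing the flag later means
// deleting this function and one of the two strategies.
function discountFor(FeatureFlags $flags): DiscountStrategy
{
    return $flags->isEnabled('new_discount') ? new NewDiscount() : new CurrentDiscount();
}

$production = new FeatureFlags([]);
$testing = new FeatureFlags(['new_discount' => true]);

echo discountFor($production)->apply(100.0), PHP_EOL;
echo discountFor($testing)->apply(100.0), PHP_EOL;
```

Keeping the flag check in a single selection point is what makes the flag easy to remove once only one path survives.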

Parallel change

The pattern

Also known as expand/contract, this pattern suggests expanding the current code instead of changing it: for example, if we have to rename a database field, we should:

  1. create a new one instead

  2. start writing in both fields

  3. copy data from old to new field

  4. start reading from the new field

  5. remove the old field, now unused

This makes each of those steps easily reversible, something impossible if we directly change the name of the column.
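The steps above can be sketched in PHP with a row modelled as a plain array, so the example runs without a database. The field names `name` (old) and `full_name` (new) and all function names are assumptions for illustration; in a real system each step would be its own small release.

```php
<?php
declare(strict_types=1);

// Step 1 is the schema change itself (adding 'full_name'); arrays need none.

// Step 2: write to both fields, so old and new readers keep working.
function saveName(array $row, string $value): array
{
    $row['name'] = $value;
    $row['full_name'] = $value;

    return $row;
}

// Step 3: backfill rows that were written before step 2.
function backfill(array $row): array
{
    if (!isset($row['full_name']) && isset($row['name'])) {
        $row['full_name'] = $row['name'];
    }

    return $row;
}

// Step 4: read from the new field only.
function readName(array $row): ?string
{
    return $row['full_name'] ?? null;
}

// Step 5: drop the old field once nothing reads it (and after saveName
// has stopped writing it).
function dropOldField(array $row): array
{
    unset($row['name']);

    return $row;
}

$legacyRow = ['id' => 1, 'name' => 'Ada'];  // written before the change
$freshRow = saveName(['id' => 2], 'Grace'); // written after step 2

$rows = array_map('backfill', [$legacyRow, $freshRow]);
$rows = array_map('dropOldField', $rows);

foreach ($rows as $row) {
    echo $row['id'], ': ', readName($row), PHP_EOL;
}
```

Until step 5 is executed, every step can be undone without losing data, because the old field is still complete and up to date.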

Practical Tips

  • The main objective when making a change is to make it safe, ensuring we can roll back in case of issues, and to avoid the need for synchronous changes with clients.

  • Another example is creating a new version of a public API:

    • release the new version first, while still keeping the old one

    • ensure that no one uses the old one anymore (providing a deprecation date might be good enough in most cases)

    • remove the old version of the API

  • The idea of Parallel Change is pretty flexible: in general, stop thinking of software development as changing code, and think more of it in terms of expanding what exists first, and then removing what’s not needed anymore

Dark Launch

The pattern

When you have to add a new behavior, you can make the code execute this behavior without impacting the existing one: this way, it will be invisible to the user, because nothing has changed from their point of view - but in reality, you are executing that code in production to test its impact and monitor things like performance or saved data.

Add your behavior, and just ignore the results from the application point of view - but log or save any info about what has been done when it goes well, or what happened when it broke.

Practical Tips

  • This pattern best suits use cases where the new behavior is enhancing something already existing that doesn't require any additional interaction from the user.

  • Ensure that the “dark” code is under a try/catch that handles all errors: this way, you can save all the info you need from the error (in logs or wherever else you might need it) and allow the code to move on without breaking

    function foo(): void {
       // do something before
       try {
           // execute the new behaviour here
       } catch (Throwable $exception) {
           // save the data you need about the error, then let the code move on
       }
    }
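
To give the skeleton above a bit more flesh, here is a small, self-contained sketch. The shipping cost functions and the array used as a log are assumptions made up for illustration; in a real application the log would be your logger or a database table.

```php
<?php
declare(strict_types=1);

// Existing behavior: the only result the user ever sees.
function currentShippingCost(float $weightKg): float
{
    return 5.0;
}

// New behavior under dark launch (hypothetical): it may be wrong or throw.
function newShippingCost(float $weightKg): float
{
    if ($weightKg <= 0) {
        throw new InvalidArgumentException('Weight must be positive');
    }

    return round(3.0 + 0.5 * $weightKg, 2);
}

function shippingCost(float $weightKg, array &$log): float
{
    $current = currentShippingCost($weightKg);

    try {
        // Execute the new behavior and record what it would have returned.
        $candidate = newShippingCost($weightKg);
        $log[] = sprintf('dark launch ok: current=%.2f new=%.2f', $current, $candidate);
    } catch (Throwable $exception) {
        // Save what we need about the error, then let the code move on.
        $log[] = 'dark launch failed: ' . $exception->getMessage();
    }

    return $current; // the new result is recorded, never returned
}

$log = [];
shippingCost(4.0, $log);
shippingCost(0.0, $log);
print_r($log);
```

The caller always gets the current result, so even a bug in the new code only shows up in the collected log entries.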
    

Branch by abstraction

The pattern

The idea here is that we can introduce a different alternative behavior, but instead of making it used by everyone immediately, we use an abstraction layer to route only part of the requests to it: it can mean that only a specific component uses the new behavior at first, or just a percentage of requests.

Thanks to this approach, we can iteratively roll out the change and increment the usage until it replaces the old behavior completely.

This pattern has some similarities with Parallel Change: the main difference is that here we progressively go from old to new behavior, while in Parallel Change it’s more of a direct switch after the two behaviors live together.

Practical Tips

  • If the code you need to change is used directly by others, build the abstraction layer first

  • If you are doing this on some legacy or badly tested code, you can take advantage of the abstraction layer to add tests to that behavior
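Here is a minimal sketch of such an abstraction layer in PHP, routing a configurable percentage of requests to the new behavior. The `OrderExporter` interface, the class names, and the bucketing rule based on the order id are all assumptions for illustration.

```php
<?php
declare(strict_types=1);

// The abstraction that callers depend on.
interface OrderExporter
{
    public function export(int $orderId): string;
}

final class LegacyOrderExporter implements OrderExporter
{
    public function export(int $orderId): string
    {
        return "legacy export of order $orderId";
    }
}

final class NewOrderExporter implements OrderExporter
{
    public function export(int $orderId): string
    {
        return "new export of order $orderId";
    }
}

// The routing layer: the share of traffic sent to the new implementation
// can grow step by step without touching any caller.
final class RoutingOrderExporter implements OrderExporter
{
    private OrderExporter $legacy;
    private OrderExporter $replacement;
    private int $newPercentage;

    public function __construct(OrderExporter $legacy, OrderExporter $replacement, int $newPercentage)
    {
        $this->legacy = $legacy;
        $this->replacement = $replacement;
        $this->newPercentage = max(0, min(100, $newPercentage));
    }

    public function export(int $orderId): string
    {
        // Deterministic bucketing: the same order always takes the same path.
        $bucket = abs($orderId) % 100;
        $target = $bucket < $this->newPercentage ? $this->replacement : $this->legacy;

        return $target->export($orderId);
    }
}

$exporter = new RoutingOrderExporter(new LegacyOrderExporter(), new NewOrderExporter(), 10);
echo $exporter->export(5), PHP_EOL;  // bucket 5 is below 10: new path
echo $exporter->export(50), PHP_EOL; // bucket 50: legacy path
```

Raising the percentage to 100 completes the rollout; at that point the routing class and the legacy implementation can be deleted.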

Real-World examples I applied

A food delivery API backend

The context

In a food-tech company, I was working on a Food Delivery B2B product, and I was able to work on that almost from the beginning. We had a couple of customers who were ready to take advantage of our food delivery services to offer them to their employees as an additional lunch benefit option, and we had to build the entire platform where they could place orders for the day.

As I mentioned, it was “almost” the beginning: we had a few months of work already done on a generic set of APIs that could serve any generic e-commerce/food delivery app, with way more API routes and features than we needed - but still no frontend and no clear idea of what the core of the product should be.

The use case

As you can imagine, a lot of work had to be done, and a lot of it was outside of code - but for this issue, we will focus on coding. We had to sustain the business in a better way than was done before, therefore we had three main objectives:

  • build the frontend MVP

  • build the missing backend APIs for the MVP

  • clean up existing code (clean up here means several things: the code had a lot of code smells; the only tests were integration tests at controller level, which were very brittle and weren’t testing enough, most often only checking for a 200 HTTP response code; and some features weren’t even needed at all)

Of course, the clean-up had to be done sustainably, because we still had to serve the business to achieve the MVP release successfully and with as little waste as possible (zero waste was impossible at that point).

The approach

We applied mainly two patterns here: Parallel Change and Dark Launch.

Since I joined, I pushed to move to an MVP version of the software for the first release - of course, the backend would have to live with the existing useless code somehow, but at least we could build the frontend from an MVP perspective and also use the set of features used in the frontend as the list of “actually required” features.

Taking inspiration from Dark Launch, we moved all the existing routes of the APIs under a useless route group that no user had access to - and then built a clean version of the active group of APIs including only useful routes. The routes were still there in the codebase, but no one could access them. This made it easy to make decisions about how to improve the code, especially when we had to clean up some code that was used in multiple places: every feature that didn’t enter the MVP phase was removed safely unless we were already sure that it was a feature that we had to include in a new release very soon (example: special rules in discount).

In some cases, we had to keep a feature: the behavior was fine, but we had to refactor it to get code and tests we could trust. In those cases, we first ensured the tests were good, then copy-pasted the feature, and finally refactored the copy in a Parallel Change, testing that the behavior stayed the same.
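A simplified sketch of that last check, comparing the original feature with its refactored copy on the same inputs. The discount rule and the function names are invented for illustration; they are not the real code of that project.

```php
<?php
declare(strict_types=1);

// Hypothetical legacy feature we must keep: 10% off orders of 50.00 or more
// (amounts are in cents).
function legacyTotal(int $cents): int
{
    if ($cents >= 5000) {
        $cents = $cents - intdiv($cents, 10);
    }

    return $cents;
}

// The copy being refactored in parallel: it must behave like the original.
function refactoredTotal(int $cents): int
{
    $discount = $cents >= 5000 ? intdiv($cents, 10) : 0;

    return $cents - $discount;
}

/** Returns the inputs for which the two implementations disagree. */
function mismatches(callable $old, callable $new, array $inputs): array
{
    return array_values(array_filter($inputs, fn ($input) => $old($input) !== $new($input)));
}

$inputs = [0, 1, 4999, 5000, 5001, 12345];
echo count(mismatches('legacyTotal', 'refactoredTotal', $inputs)), ' mismatches', PHP_EOL;
```

Only when the list of mismatches stays empty on a representative set of inputs is the old version replaced by the refactored one.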

This way, we were able to face this issue iteratively and progressively, keeping it sustainable and completely transparent to the users.

Smart Fridges iterative rollout

The context

In the same company and product context as the previous example, we also offered Smart Fridges to our customers, and one specific use case was to use them as vending machines for healthy, high-quality products for both lunch and snacks. The UX was very smooth: you could just open the fridge via QR code with your phone, take what you wanted, and go away, and we automatically calculated what you had to pay and took it from your wallet.

The use case

After the first version release, we needed more marketing-related features like using coupons, discount codes, special offers, highlighting favorite products, etc. From the user perspective, what we wanted to do was add a pre-opening screen: after using the QR code, the fridge wouldn’t open immediately anymore - instead, we would show a page dedicated to the fridge, with an “OPEN” button; that page enabled new interactions with the customer. While from the user's perspective it wasn’t a big change, technically it required some changes to the fridge opening and payments, basically the two most delicate pieces of the puzzle.

The approach

We approached this with two patterns in mind: Parallel change and Feature flags; but then also Branch by Abstraction came in handy.

On the backend, we applied Parallel change by creating a completely new API version for both the opening and the payment, starting from a copy of the existing one. Since the opening API became “pre-opening”, we also had to create a new API specifically dedicated to opening. These APIs were also released in production, but no one could use them because the front end wasn’t using them and we didn’t give access to anyone to those routes.

On the front end, we applied Parallel change by creating the new “pre-opening” page, which was a brand new page - it simply wasn’t used by the user flow, so no one could see it. We didn’t hide it completely (meaning that if someone guessed the URL they could have seen it) because it was in any case tied to the QR code and auth strategy with the fridge, so it wouldn’t have worked anyway, and that was enough for us. If needed, we could have added a way to hide it completely.

Then, once released in production, we had to test it from a user perspective (we had unit and integration tests of course), so we implemented a feature flag at a fridge level: this allowed us to decide what to do when a user tried to open a fridge based on a variable related specifically to that fridge.

This was also a simple implementation of Branch-by-abstraction: the interface that decided which open strategy to use was based on the feature flag in this case, but still, we had an interface that was taking that decision and redirecting the user to the right strategy - we could also have a more granular approach to send only a percentage of users of a single fridge into the new strategy, but we didn’t have thousands of users so we didn’t need this level of safety.
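A simplified sketch of that decision point in PHP: a flag per fridge selects the opening strategy. The class names, the fridge ids, and the returned strings are assumptions for illustration, not the real implementation.

```php
<?php
declare(strict_types=1);

interface OpeningStrategy
{
    public function start(string $fridgeId): string;
}

final class DirectOpening implements OpeningStrategy
{
    public function start(string $fridgeId): string
    {
        return "unlock:$fridgeId"; // old flow: the QR code opens the fridge
    }
}

final class PreOpeningPage implements OpeningStrategy
{
    public function start(string $fridgeId): string
    {
        return "show-page:$fridgeId"; // new flow: page with the OPEN button
    }
}

final class OpeningSelector
{
    /** @var array<string, bool> flag per fridge id */
    private array $preOpeningEnabled;

    public function __construct(array $preOpeningEnabled)
    {
        $this->preOpeningEnabled = $preOpeningEnabled;
    }

    public function strategyFor(string $fridgeId): OpeningStrategy
    {
        // Fridges without a flag keep the old behavior.
        $enabled = $this->preOpeningEnabled[$fridgeId] ?? false;

        return $enabled ? new PreOpeningPage() : new DirectOpening();
    }
}

$selector = new OpeningSelector(['office-test-fridge' => true]);

foreach (['office-test-fridge', 'customer-fridge-1'] as $fridgeId) {
    echo $selector->strategyFor($fridgeId)->start($fridgeId), PHP_EOL;
}
```

Rolling out to one more fridge, or rolling back, is then only a change to that fridge's flag.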

Thanks to this flag, we were able to roll out first on our test fridge in our office and run the first tests - then, one by one, we activated the new opening on all fridges. The iterative rollout was already simplifying things, but the flag was also helpful to roll back immediately in case something didn’t work well.

Once the new feature was stable enough (in our case, we waited for a couple of weeks), we just removed the old code and the feature flags.

Feature Flag for a progressive release

The context

In a more recent experience, I had to work on a project dedicated to upgrading the database of the historical monolith from MySQL 5 to 8. In case you don’t know, MySQL 8 no longer includes the query cache, an automatic caching feature that was deprecated and then removed, and our application was taking so much advantage of it that upgrading was causing a drop in performance on some of our main pages for SEO: we couldn’t afford that.

The use case

From the beginning, I was very concerned about the work to do because the codebase wasn’t covered by enough tests, and we couldn’t trust the existing ones - I also didn’t have confidence in that codebase because I was new to it.

What we had to do in practice was replace the (dozens of) queries from the most important pages for SEO with a single GET read operation from Redis: an async data calculation would prepare the required JSON structure to replace the queries.

Some problems: no one had enough knowledge to build a JSON sample, so we had to build it step by step, and no one was aware of what data from those queries was used - we were pretty sure that a lot of fields were unnecessary since most queries were getting all the fields from all the tables, but we couldn’t understand which ones were unnecessary upfront (and even if we could do it, it would have taken ages).

The approach

An iterative, exploratory approach was the only safe way here but we also needed a way to test the changes (non-regression kind of testing, since it’s like refactoring work here).

Again, Parallel Change, Dark Launch, and Feature Flags were our safety net here.

We put together those patterns to build a cascade structure:

  • via Feature Flags at a page level (a simple boolean value on a database configuration table) we could decide for each page if the data was coming from DB or Cache (with a Strategy Pattern implemented on code through the Feature Flag)

  • via Parallel Change, we handled some shared code that was unsafe to refactor by copy-pasting it in a version dedicated to cache, where we replaced DB queries with data from cache - once it was stabilized as a good solution, we removed the duplication

  • and, last but not least, in a sort of Dark Launch approach, we also implemented the Cache strategy in a way that catches any Exception that might happen in the process: if that happens, we log the exception (so that we can collect all the data issues to fix) and then we transparently fall back to the DB strategy so that the page keeps working from the user perspective
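The cascade above can be sketched in a few lines of PHP. Everything here is a simplified assumption for illustration: the class names are invented, an array stands in for Redis, and another array stands in for the database configuration table holding the page-level flags.

```php
<?php
declare(strict_types=1);

interface PageDataSource
{
    public function load(string $page): array;
}

final class DatabaseSource implements PageDataSource
{
    public function load(string $page): array
    {
        return ['page' => $page, 'source' => 'db']; // stands in for the queries
    }
}

final class CacheSource implements PageDataSource
{
    /** @var array<string, string> page => JSON, a stand-in for Redis */
    private array $entries;

    public function __construct(array $entries)
    {
        $this->entries = $entries;
    }

    public function load(string $page): array
    {
        if (!isset($this->entries[$page])) {
            throw new RuntimeException("No cached data for page '$page'");
        }

        return json_decode($this->entries[$page], true, 512, JSON_THROW_ON_ERROR);
    }
}

final class PageData
{
    public array $errors = []; // stand-in for the real logger
    private array $cacheEnabled;
    private PageDataSource $cache;
    private PageDataSource $database;

    public function __construct(array $cacheEnabled, PageDataSource $cache, PageDataSource $database)
    {
        $this->cacheEnabled = $cacheEnabled;
        $this->cache = $cache;
        $this->database = $database;
    }

    public function load(string $page): array
    {
        if (!($this->cacheEnabled[$page] ?? false)) {
            return $this->database->load($page); // flag OFF: old path
        }

        try {
            return $this->cache->load($page);
        } catch (Throwable $exception) {
            $this->errors[] = $exception->getMessage(); // collect data issues
            return $this->database->load($page);        // transparent fallback
        }
    }
}

$cache = new CacheSource(['home' => '{"page":"home","source":"cache"}']);
$pages = new PageData(['home' => true, 'product' => true], $cache, new DatabaseSource());

foreach (['home', 'product', 'about'] as $page) {
    echo $page, ' <- ', $pages->load($page)['source'], PHP_EOL;
}
```

The flag decides the strategy per page, while the catch-all fallback keeps every page working even when the cached data is missing or malformed.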

Again, some additional work during development put us in a situation where we were able to prevent any really big issue from reaching production - we had some small issues, but it was easy to handle them because the user wasn’t seeing anything weird, and we could always turn the flag OFF if needed.

Conclusions

As you might have noticed, one pattern does not appear in the examples I chose: Keep the API Hidden, also known as Keystone Interface. The reason is that I mostly used it in simpler situations.

For example, I sometimes used the Keystone Interface to release a new API without setting up the route where the Controller would respond; sometimes it’s even enough that the frontend is built after the backend, so you can just release the new API but no one can use it. This is useful when we work with separate frontend and backend tasks (I usually share the API contract first, and release a stub respecting that contract, to enable async releases of backend and frontend - just be careful to replace the stub with the real implementation before allowing the users to see it) but it’s also helpful when we work in Pair/Mob Programming (in this case, we can just build the API on the backend and then work on the frontend after that).

Branch by abstraction has a small example, but it’s the best I can provide from real experience - in any case, you can use it together with Parallel change to build a class that behaves as a sort of Load Balancer between old and new logic, and even configure how many requests to direct to the new logic.

In general, such techniques are powerful and pretty useful, and I strongly suggest that everyone who wants to become a Senior gets confident with them, because they are great ways to release safely and avoid issues in production.


We must avoid unplanned work at any cost: a little planned cost today is far better than a much higher unplanned cost later.


Until next time, happy coding! 🤓👩‍💻👨‍💻

Go Deeper 🔎

📚 Books

  • Continuous Integration: Improving Software Quality and Reducing Risk - The authors first examine the concept of CI and its practices from the ground up and then move on to explore other effective processes performed by CI systems, such as database integration, testing, inspection, deployment, and feedback.

  • Trunk-Based Development And Branch By Abstraction - An all you need to know reference book about trunk-based development, Branch by abstraction, and related software development practices. Many diagrams throughout, and a section on working out how your company can get from where you are to trunk-based development, CI, CD, and all that comes with it.

  • Feature Flag Best Practices - With this practical book, software engineers will learn eight best practices for using feature flags in production, including how to configure and manage a growing set of feature flags within your product, maintain them over time, manage infrastructure migrations, and more.
