Forget Mono Repo vs. Multi Repo - Building Centralized Git Workflows in Python
Updated March 5, 2024.
About
This content is brought to you by Jit - a platform that simplifies continuous security for developers, enabling dev teams to adopt a ‘minimal viable security’ mindset, and build secure cloud apps by design from day 0, progressing iteratively in a just-in-time manner.
This blog article summarizes a talk given by David Melamed, Jit CTO, at PyCon DE & PyData 2022 in Berlin.
In every software development project, before even writing the first line of code, you have to pick an architecture for your repositories.
Picking an architecture is not easy. There are many tradeoffs to consider, and the choice will shape future development:
- Should I place all the code in one place?
- Should I use microservices and follow the single responsibility principle?
- Should I manage all the codebases in a single repository, allowing for easy refactoring and creating a common culture across teams?
- Should I allow for independent deployments, independent versioning, and more autonomous work in each team, letting each team decide which language or web framework they are most comfortable with?
Developers ask themselves these questions over and over when they are about to start a new project - I asked myself these questions while working on jit.io.
Choosing the right model seems like a never-ending debate. Naturally, each model has its pros and cons.
Introducing centralized CI
What if you could have a central place to keep your CI/CD pipeline configuration while finding a way to trigger it from every repository? What would it look like? Basically, you would need the ability to listen to every PR, using a backend service that triggers the CI pipeline automatically, with the relevant context, so it can clone the branch and run your tests on it.
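At a high level, that listener only has to do three things for every PR event. The sketch below is a deliberately rough outline of the flow; the three helpers are hypothetical stubs (not code from the talk) that the implementation steps further down give concrete shape to:

```python
def clone_branch(repo: str, branch: str) -> str:
    """Clone the PR branch somewhere and return the working directory (stub)."""
    ...

def run_pipeline(workdir: str) -> bool:
    """Run the centralized tests/linters against the checkout (stub)."""
    ...

def report_status(repo: str, sha: str, passed: bool) -> None:
    """Feed the result back to the PR in the original repo (stub)."""
    ...

async def on_pull_request(payload: dict) -> None:
    """Every PR event carries enough context to trigger the centralized pipeline."""
    repo = payload["repository"]["full_name"]        # e.g. "my-org/service-a"
    branch = payload["pull_request"]["head"]["ref"]  # the branch to clone
    sha = payload["pull_request"]["head"]["sha"]     # the commit to report back on

    workdir = clone_branch(repo, branch)
    passed = run_pipeline(workdir)
    report_status(repo, sha, passed)
```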
Watch the video below to explore a concrete example:
One way to achieve some form of centralized CI is a feature recently introduced by GitHub: reusable workflows. Think of these as a way to reference external workflow files from each repo instead of copying the same file over and over. However, right now this feature is limited to public repositories only, so it is not relevant for us. The approach I am suggesting also provides more granular control, since there is a central place where logic you define decides how to trigger the centralized workflows.
Implementing CI architecture in real life - what do you need?
Here's what you need to do in order to implement the CI architecture in real life:
#1: Create a GitHub application
A GitHub application gives you access to the GitHub API and the ability to register for and receive all events about open PRs, so you can trigger your CI workflow remotely.
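Once the app is created, the backend needs to authenticate as it before calling the API on behalf of each installation. The snippet below is a minimal sketch of that handshake, assuming PyJWT and requests; the app ID, private-key path, and installation ID are placeholders you get from the app's settings page, not values from the talk:

```python
import time

import jwt        # PyJWT
import requests

APP_ID = "123456"              # placeholder: the App ID from the app's settings page
PRIVATE_KEY_PATH = "app.pem"   # placeholder: the private key downloaded when creating the app

def make_app_jwt() -> str:
    """Sign a short-lived JWT that identifies us as the GitHub App itself."""
    with open(PRIVATE_KEY_PATH) as f:
        private_key = f.read()
    now = int(time.time())
    claims = {"iat": now - 60, "exp": now + 9 * 60, "iss": APP_ID}
    return jwt.encode(claims, private_key, algorithm="RS256")

def get_installation_token(installation_id: int) -> str:
    """Exchange the app JWT for a token scoped to one installation (org or repo)."""
    resp = requests.post(
        f"https://api.github.com/app/installations/{installation_id}/access_tokens",
        headers={
            "Authorization": f"Bearer {make_app_jwt()}",
            "Accept": "application/vnd.github+json",
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["token"]
```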
#2: Create a simple backend to listen for PRs
- Async library: aiohttp + aiohttp-devtools
- GitHub API wrapper
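A minimal sketch of such a listener with aiohttp could look like the following; the route path, port, and what happens inside the handler are illustrative assumptions rather than the exact code from the talk:

```python
from aiohttp import web

async def handle_webhook(request: web.Request) -> web.Response:
    """Receive GitHub webhook deliveries and react to pull request events."""
    event = request.headers.get("X-GitHub-Event", "")
    payload = await request.json()

    if event == "pull_request" and payload.get("action") in ("opened", "synchronize"):
        repo = payload["repository"]["full_name"]
        branch = payload["pull_request"]["head"]["ref"]
        sha = payload["pull_request"]["head"]["sha"]
        # This is where the centralized pipeline gets triggered with the PR's context:
        # clone `branch`, run the checks, and report back against `sha`.
        print(f"PR event on {repo}: branch={branch}, sha={sha}")

    return web.json_response({"ok": True})

app = web.Application()
app.add_routes([web.post("/webhook", handle_webhook)])

if __name__ == "__main__":
    # `adev runserver` from aiohttp-devtools gives you hot reload during development.
    web.run_app(app, port=8000)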
#3: Use ngrok to get a public URL for local development
A public URL is important since GitHub needs somewhere to send the open-PR events. ngrok creates a tunnel that provides you with a public URL pointing to your local server.
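The ngrok CLI (`ngrok http 8000`) is enough for this; as a Python-flavored alternative, the pyngrok wrapper can open the tunnel programmatically. This is a minimal sketch assuming the listener above runs on port 8000, and the resulting URL is what you paste into the GitHub App's webhook settings:

```python
from pyngrok import ngrok

# Tunnel public traffic to the local listener started above (port 8000).
tunnel = ngrok.connect(8000)
print(f"Webhook URL for the GitHub App settings: {tunnel.public_url}/webhook")
```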
Building a centralized Linter - a detailed tutorial
Watch this video to learn how to build a centralized Linter:
And let's see it in action:
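The videos carry the full walkthrough. As a rough, simplified sketch of the loop they describe - clone the PR branch, lint it, and push the result back as a commit status - something like the following would do; flake8, the temporary clone, the status context name, and the installation token parameter are illustrative choices, not the exact code from the talk:

```python
import subprocess
import tempfile

import requests

def run_central_lint(repo_full_name: str, branch: str, sha: str, token: str) -> None:
    """Clone the PR branch, lint it, and report the result back as a commit status."""
    with tempfile.TemporaryDirectory() as workdir:
        # `token` is an installation token obtained as shown in step #1.
        clone_url = f"https://x-access-token:{token}@github.com/{repo_full_name}.git"
        subprocess.run(
            ["git", "clone", "--depth", "1", "--branch", branch, clone_url, workdir],
            check=True,
        )
        # Any linter fits here; flake8 is just an illustrative choice.
        result = subprocess.run(["flake8", workdir], capture_output=True, text=True)

    state = "success" if result.returncode == 0 else "failure"
    first_line = (result.stdout.splitlines() or ["Lint failed"])[0]
    description = "Lint passed" if state == "success" else first_line[:140]

    # The feedback loop: update the commit status on the PR in the original repo.
    requests.post(
        f"https://api.github.com/repos/{repo_full_name}/statuses/{sha}",
        headers={
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github+json",
        },
        json={"state": state, "context": "centralized-ci/lint", "description": description},
        timeout=10,
    )
```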
Summary
In summary, the short answer to the question of whether you can choose an architecture without worrying about multi-repo vs. mono-repo is YES.
I have demonstrated, in detail, how to create a centralized linter with a feedback loop that updates the PR status in the original repo. The main advantage of a centralized linter is that it frees developers from having to add it to each and every repository. Centralized CI can help developers overcome the never-ending debate of choosing a specific app architecture. Armed with this knowledge, you're ready to choose the right architecture for your app. Good luck! 😉