Challenges
A small engineering team needed to move fast but found themselves increasingly slowed by microservice complexity
Solution
A low-friction solution that enhanced visibility, improved data quality, and simplified engineering governance
Results
Scaling successfully, without microservice chaos or slowing down
Learn how Marshmallow has used OpsLevel's Catalog, Checks, & Campaigns to keep development velocity high while scaling their engineering organization.
Marshmallow is a tech-driven InsurTech focused on disrupting legacy insurance providers in a multi-trillion-dollar industry. Marshmallow exists to increase access to affordable financial services. To do it, they need to iterate quickly and deliver results fast, all while providing a seamless digital experience to their customers.
David Goaté, Co-founder and Chief Architect, has been with the company since its inception. He and the Marshmallow engineering team embraced microservices from their founding because of the flexibility and agility they provide.
But as the company has grown, so has the number of microservices. With 100+ services managed by a relatively small team, it’s been challenging to manage microservice complexity as they've scaled. To address these challenges, Marshmallow turned to OpsLevel.
Challenges
A small engineering team needed to move fast but found themselves increasingly slowed by microservice complexity
Marshmallow, which was founded in 2017, embraced microservices from the start. As a result, they haven’t battled tech debt from legacy systems, which has made it possible for them to iterate quickly and deliver a high-quality, customer-centric experience that’s disrupting legacy insurers. “Using microservices has enabled us to reshape teams and change ownership in a frictionless way – it’s helped us scale from three engineers to 50 in a relatively short timeline,” said David.
Embracing microservices has been the right call for the team, but as the company scaled, these microservices introduced complexity and tech debt of their own.
The team didn’t always have the necessary instrumentation and observability in place. For example, if they saw higher error rates, they lacked holistic distributed tracing that could identify how errors in one service impacted other services.
“Embracing a microservices architecture has offered us so much flexibility, but of course some complexity comes from that as well.” – David Goaté, Co-founder and Chief Architect
Additionally, as Marshmallow was iterating quickly to prove they had product-market fit, tech debt in some of their early microservices was unintentionally being copied into new services. They encountered things like data model compatibility issues between services or unintended dependencies that prevented some services from being deployed independently.
"When you start to see that debt be replicated and proliferate throughout your ecosystem, it becomes a bit more difficult to get on top of those things. The activation energy required to go and actually address those things becomes a barrier," said David.
David and team knew that maintaining their velocity and agility, even while scaling the company, was vital. So they were determined to address their emerging microservice tech debt before it became a serious headwind.
To start, the team focused on improving visibility into their microservices. They experimented with a variety of homegrown solutions for tracking service ownership, such as repository tags and READMEs. They also had collections of Notion pages for teams that outlined the services they were responsible for, as well as documentation about how the services ran and fit together.
These homegrown solutions were an improvement, but they didn't remain up to date over time and didn't address related pains like onboarding new engineers to on-call rotations. Plus, they didn't help the team drive change (e.g., standardizing observability tooling) across the whole ecosystem in a systematic way. Marshmallow needed something that would not only give visibility into their microservices, but also help establish governance as they continued to scale.
Solution
A low-friction solution that enhanced visibility, improved data quality, and simplified governance
To better manage their microservices, David and his team turned to OpsLevel. They saw that OpsLevel would let them create a trustworthy service catalog and help them implement better governance with less manual toil.
Instead of relying on ad-hoc documentation for ownership, Marshmallow quickly adopted a config-as-code approach, managing service metadata within OpsLevel via YAML files.
And with Service Maturity's automated Checks and Campaigns, they saw a clear path to improving standardization across their architecture, which would be a game changer for the small engineering team. No more manually communicating platform changes or manually collating tool usage or migration statuses.
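To make the idea concrete, here is a minimal Python sketch of how metadata kept as code can feed automated checks. It is an illustration only, not OpsLevel's schema or API: the `Service` fields, the `tracing` tag, the check functions, and the service names are all hypothetical, and in practice the records would be parsed from each repository's YAML file rather than written inline.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Service:
    """Metadata for one service, as it might be parsed from a per-repo YAML file."""
    name: str
    owner: Optional[str] = None
    tags: Set[str] = field(default_factory=set)


def has_owner(service: Service) -> bool:
    """Check: the service declares an owning team."""
    return bool(service.owner)


def uses_tracing(service: Service) -> bool:
    """Check: the service is tagged as having adopted distributed tracing."""
    return "tracing" in service.tags


def adoption_report(services, check):
    """Split service names into those that pass a check and those that do not."""
    passing = [s.name for s in services if check(s)]
    failing = [s.name for s in services if not check(s)]
    return passing, failing


# Hypothetical catalog entries, invented for this example.
catalog = [
    Service("quotes", owner="pricing", tags={"tracing"}),
    Service("claims", owner="claims-team"),
    Service("billing", tags={"tracing"}),
]

adopted, pending = adoption_report(catalog, uses_tracing)
print("tracing adopted:", adopted)
print("tracing pending:", pending)
```

Because each check is just a function over the metadata, the same report can be rerun for any standard the team wants to roll out, replacing the manual collation of tool usage and migration status.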
David was particularly impressed with the support he received from the OpsLevel team during the implementation as they offered specific recommendations for Marshmallow, as well as a plethora of examples of how other companies had approached similar problems.
“Implementing OpsLevel was a frictionless process – the team was very friendly, proactive, and helpful. They met with us to discuss our specific needs and gave us recommendations so that we could get maximum value as quickly as possible,” said David.
Results
Scaling successfully, without microservice chaos or slowing down
OpsLevel has given the team more visibility, better data quality, and ultimately better insight into their microservices environment. It’s helped them standardize their services so that engineers can quickly iterate on new features or confidently handle on-call rotations, no matter where within the architecture they're working.
“If we want to make a holistic change across, say, 60 different services, it's really good to be able to get that fast visibility. Which services have already adopted this, or are yet to?
“And then we can put a Campaign in place to set a timeline around what a migration should look like, give people access to the relevant information about how to perform such a migration, and critically why we want to do that. That was where we started to find Checks and Campaigns extremely helpful.”
Not only that, but OpsLevel has brought positive cultural shifts, giving the team expectations of higher standards and continuous improvement. According to David, it’s made the team more deliberate about defining what “good” looks like:
“We are having more rigor in our discussions around what constitutes a ‘good’ microservice. These discussions are happening much more often now that we have OpsLevel in place.”
Even in its current high-growth mode, Marshmallow has been able to manage tech debt and avoid microservice chaos thanks to its partnership with OpsLevel. Most engineering organizations slow down as they shift from startup to scale-up, but Marshmallow has been able to maintain its engineering velocity and agility.