How Hootsuite Created a Robust Service Catalog for 700+ Microservices Using OpsLevel
Challenge
The team struggled with visibility into their 700+ microservices, which left Hootsuite vulnerable
Solution
A catalog that would provide the foundation for more secure and compliant services
Results
An accurate and functional services catalog, more insight into orphaned services, and the foundation for better service maturity
With over 700 microservices, spread across more than 50 engineering teams, Hootsuite needed a foundation for improved visibility, security, and compliance.
Hootsuite is the global leader in social media management, igniting social media for brands and organizations around the world. With tools and expertise that span social media management, social insights, employee advocacy and social customer care, Hootsuite empowers organizations to strategically grow their brands, business and customer relationships using social media.
Shawn Wowk, Manager of Development Operations at Hootsuite, manages the Pod Compute Platform Team, which is responsible for the infrastructure and related tooling—such as Kubernetes and Terraform—that power Hootsuite’s services. Shawn’s goal is to optimize for ease of development and support the operations of existing services.
In total, Hootsuite has 700+ microservices and 50+ teams working on them internally. This relatively large services footprint previously made it difficult for Shawn and his team to maintain visibility into what was running, leaving the company vulnerable to orphaned services. In 2021, the team turned to OpsLevel to better understand how many services existed, who owned them, and whether each service met their high standards.
Challenges
The team struggled with visibility into their 700+ microservices, which left Hootsuite vulnerable
More than 14 years after the company was founded, Hootsuite had reached an inflection point. It had grown substantially since its inception and could no longer act like a startup. According to Shawn, this shift meant the team could no longer hack together solutions or product updates without considering how to safely scale the business.
With so many microservices, Shawn and his team struggled to gain visibility. They had a large spreadsheet that served as a catalog of services, but maintaining this spreadsheet was very manual. As time went on, teams merged and changed. Individual contributors got promoted or left the company. Because of this, the spreadsheet wasn’t always top of mind.
“We had some visibility into our services via a spreadsheet, but it was very manual. Because it was so manual, it was usually out of date. We’d go to the spreadsheet and find that some of the teams listed didn’t exist anymore or that some of the services hadn’t been updated.”
Shawn Wowk
Manager of Development Operations at Hootsuite
Although the Hootsuite team recognized the importance of having a healthy and robust service catalog, their existing solution did not accurately represent their microservices. This left the organization at risk of orphaned services.
"Not having accurate information left us at risk for orphaned services. If a service is orphaned with no one maintaining it and there is an outage because of it, you’ve got an enormous problem. You have no one with the technical knowledge to fix that service. This has massive ramifications for the customer experience and no business can tolerate that kind of risk."
As Hootsuite hit a certain maturity level, the team wanted to make sure that they were building and maintaining services thoughtfully, while eliminating any potential risk to the business. They found that orphaned services caused a big drain on their organizational capacity to properly staff teams and ensure that critical pieces of infrastructure were working correctly.
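To make the orphaned-service risk concrete, here is a minimal sketch of how one might flag catalog rows whose owner is missing or no longer exists. It assumes a hypothetical spreadsheet export with a service name and an owning team per row; the `ServiceRow` type, the `find_orphaned` helper, and all service and team names are illustrative and are not taken from Hootsuite's systems or from OpsLevel.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ServiceRow:
    # One row of a hypothetical service spreadsheet.
    name: str
    owner_team: Optional[str]


def find_orphaned(rows: Iterable[ServiceRow], active_teams: Iterable[str]) -> List[str]:
    """Return names of services with no owner, or an owner team that no longer exists."""
    active = {team.strip().lower() for team in active_teams}
    orphaned = []
    for row in rows:
        owner = (row.owner_team or "").strip().lower()
        if not owner or owner not in active:
            orphaned.append(row.name)
    return orphaned


# Illustrative data only.
rows = [
    ServiceRow("billing-api", "Payments"),
    ServiceRow("legacy-scheduler", "Growth"),  # team no longer exists
    ServiceRow("thumbnailer", None),           # owner never recorded
]
active_teams = ["Payments", "Platform"]

print(find_orphaned(rows, active_teams))  # ['legacy-scheduler', 'thumbnailer']
```

A check like this is only as good as the data behind it, which is the weakness of a manually maintained spreadsheet that the team describes.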
Solution
A catalog that would provide the foundation for more secure and compliant services
The team at Hootsuite knew that there was a better way to streamline their manual processes, and took it upon themselves to find a way to gain more visibility into their microservices.
"When the team saw what OpsLevel offered, they immediately understood that it would not only serve as a service catalog, but also help create it."
Initially, the team thought they could make some minor modifications to their deployment pipelines, essentially adding some tagging to their scripts. However, they soon saw that OpsLevel could scan their clusters and collect the pertinent information automatically.
"We didn’t need to already have a full inventory of our services to use OpsLevel, which would've actually been a bit of a problem considering where we were coming from. Having a tool that could help us fill in missing information excited us."
The team was also excited about the ability to quantify and track service maturity. Using OpsLevel’s rubrics, they were able to establish a set of standards or minimum expectations for how their services are built and managed, as well as consider stretch goals for long-term operational excellence.
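The idea of minimum expectations plus stretch goals can be sketched as a tiered rubric: a service reaches a level only if it passes every check at that level and all levels below it. The example below is an illustration under assumptions, not OpsLevel's API or Hootsuite's actual standards; the level names, the checks, and the service fields are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Service = Dict[str, object]
Check = Tuple[str, Callable[[Service], bool]]

# Levels ordered from minimum expectations to stretch goals (all hypothetical).
RUBRIC: List[Tuple[str, List[Check]]] = [
    ("bronze", [
        ("has an owner", lambda s: bool(s.get("owner"))),
        ("has an on-call rotation", lambda s: bool(s.get("on_call"))),
    ]),
    ("silver", [
        ("has a runbook", lambda s: bool(s.get("runbook_url"))),
    ]),
    ("gold", [
        ("dependencies reviewed in last 90 days",
         lambda s: s.get("days_since_dep_review", 10**6) <= 90),
    ]),
]


def maturity_level(service: Service) -> Tuple[str, List[str]]:
    """Return (highest level reached, names of the failed checks blocking the next level)."""
    reached = "none"
    for level, checks in RUBRIC:
        failed = [name for name, check in checks if not check(service)]
        if failed:
            return reached, failed
        reached = level
    return reached, []


# Illustrative service record: passes the bronze checks but has no runbook.
svc = {"owner": "Platform", "on_call": True, "runbook_url": ""}
print(maturity_level(svc))  # ('bronze', ['has a runbook'])
```

Returning the failed checks alongside the level gives each team a concrete next step rather than just a score.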
Results
Now that the team has fully implemented OpsLevel, they are enjoying the results. In particular, they now have an accurate and functional services catalog, more insight into orphaned services, and a foundation for better, more standardized service maturity.
"I would recommend OpsLevel to anyone in a similar place as Hootsuite! It's meeting our needs and our teams are getting excited about what it can do for the organization at large."
An accurate and functional services catalog
Before OpsLevel, Hootsuite did not have an accurate and functional services catalog. Today, they use OpsLevel to gain an understanding of who owns which services and dig into the associated metadata. This is especially helpful when issues arise, as Shawn can now immediately find the right person to help resolve challenges.
More insight into orphaned services
Prior to onboarding OpsLevel, Hootsuite had a number of orphaned services. Although someone at Hootsuite knew about these services, there was a disconnect in communication and Shawn’s group was not always aware of them.
OpsLevel brought these orphaned services to the surface for the DevOps team. “We had a whole swath of services that we are aware of now, because of OpsLevel,” said Shawn. “We were lucky that these services didn’t cause any issues, but we were vulnerable.”
The foundation for better service maturity
Hootsuite’s DevOps team is seeing OpsLevel start to gain a lot of traction, as other engineering teams recognize how helpful and valuable it is. According to Shawn, teams are beginning to come together to define what service maturity actually means at Hootsuite, and that’s in large part because they have OpsLevel to help answer that question. As the team moves into the future towards more mature services, they predict they’ll continue to leverage OpsLevel for increased visibility.