This is part 2 of a 2-part blog post on robotics observability platforms.
Many of our customers came to us because they were struggling to scale their in-house robot monitoring system. Other customers came to us because they wanted to avoid building a home-brew solution right from the start. But customers of both types invariably asked us how Formant would be a better choice than a home-brewed solution.
We love being asked this question.
To properly dissect the ‘build vs. buy’ question when it comes to robot monitoring and operations, it is instructive to look at the status quo of the robotics industry. The next wave of robotics is only just now moving from the lab (and highly controlled environments) to the unpredictable real world, which requires an entirely new layer of oversight and complex predictive thinking.
The industry is relatively young — best practices, standards, and operations playbooks have not been fully established. As a result, many robot companies attempt to build their own robot monitoring and operations tools on top of existing cloud offerings. These in-house solutions exhibit a common set of shortcomings:
Large implementation costs: A robotics team must now include not just expert roboticists and hardware specialists, but also cloud architects, frontend and backend software engineers, and data science/analytics/AI experts.
Large maintenance costs: In addition to the upfront cost of designing and building a robot operations toolset, there are ongoing maintenance costs to manage that toolset, requiring resources from your devops / infrastructure team effectively until the end of time.
Non-scalable solutions: Home-brewed monitoring and observability solutions are typically designed to solve the immediate needs of approximately 5-10 engineers in a lab working with 1-2 prototype robots. This results in tools that do not scale to meet the needs of 10, 100, or 1,000-robot fleets. And if they do, they will require significant additional resources to manage.
Difficult ingestion: Non-production-hardened solutions tend not to be robust enough to handle local connectivity issues such as WiFi failures, poor cellular coverage, or those (many) times the robot simply passes behind a metal wall.
Robot ‘illiteracy’: Standard cloud platforms tend to cater to server monitoring scenarios — they aren’t designed for the workflows common in the robotics field.
Lack of support for rich robot media: Existing cloud monitoring products typically support text and scalar data only, and are incompatible with rich sensor data such as LiDAR point clouds, geometric poses, and high-resolution camera streams.
Security vulnerabilities: Home-brewed solutions often take security shortcuts. Any solution that requires remote SSH access to your robots is probably a liability.
Lastly, ‘distraction’: Robotics teams can and should focus on what makes their technology and their business great: using robots to provide a valuable service. Any time and money that a team spends on building infrastructure and tooling means resources not spent on what makes their offering unique.
Build vs. Buy in Robotics
Much has been written about build vs. buy decisions generally in blogs, articles, and newsletters. I won’t attempt to rehash those discussions, but will summarize a few key points.
Build vs. buy decisions generally revolve around three main decision points:
Market Factors:
- Will this get me to market more quickly?
- Will this enable me to innovate faster than my competitors?
- Is there a tool in the ecosystem that satisfies my needs?
Cost Considerations:
- Does the long-term cost of building exceed the cost of buying?
- Does the yearly maintenance cost of an internal solution exceed the cost of buying?
Competitive Considerations:
- Is the thing I’m building part of my core competency? Does it create competitive advantage?
- Does my robot application have unique enough requirements that no commercially available platform satisfies them?
In general, the decision to build vs. buy should be made based on what creates competitive advantage for you as a company. Unfortunately, many early-stage companies in immature markets interpret this to mean that their entire tech stack is a competitive differentiator. And it’s true – early on, when the market is immature, it can seem like building technical infrastructure is the only way to grow. But soon enough this runs out of steam. A quick look at the websites of Sumo Logic, Datadog, and Splunk reveals logos of many technically capable companies. These companies have decided that personally maintaining their entire monitoring stack is a resource strain that distracts from their overall mission.
However, it’s hard to make this decision without context. One thing we do find is that companies have trouble estimating the total cost of ownership of these types of platforms. Let’s break this down.
Total Cost of Ownership
It’s not straightforward to estimate the amount of time a company would spend building an observability and operations platform…but we can break it down into engineering staff, maintenance costs, and opportunity costs. For the purposes of this exercise, we’ll assume negligible infrastructure costs.
Engineering Costs – Build – $600K to $1.6M+
To build a robotics monitoring platform that scales with your business, you need at minimum 3 engineers for a minimal solution, and closer to 8 engineers to build the necessary telemetry pipelines, data infrastructure, and a frontend that serves your operations staff. The full 8-person team looks like this:
- QA Engineer
- Cloud Engineer (2)
- UI/UX Resource
- Back End Engineer (2)
- Front End Engineer (2)
At an average cost estimate of $200K/engineer, that’s $600K to $1.6M/year.
In general, it would take at least a year to build and QA this platform, and depending on the complexity of your application, it may take longer. Shorter build cycles are possible, but not without sacrificing scalability and increasing future maintenance costs.
Maintenance Costs – $600K to $1.2M/year
Unfortunately, like any other part of your software stack, your monitoring and observability stack will require maintenance. Stability issues will arise, new features and capabilities will be requested, and underlying technologies may change. Assuming ongoing maintenance for your platform only requires your cloud and infrastructure teams, you will still need 3-6 engineers at a yearly cost of $600K to $1.2M.
Maintenance costs continue to add up over time. Over a 5-year time horizon, you should expect to spend $3M to $6M on maintenance alone.
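For readers who want to plug in their own numbers, the arithmetic above can be sketched in a few lines of Python. This is a rough illustration, not a pricing model: the helper names (`CostRange`, `staffing_cost`) are made up for this example, the $200K figure is the average used in this post, and the combined total assumes one build year followed by five maintenance years, which is our own simplification.

```python
from dataclasses import dataclass

# Average yearly cost per engineer, taken from the estimate in this post.
COST_PER_ENGINEER = 200_000


@dataclass(frozen=True)
class CostRange:
    """A low/high cost estimate in dollars (hypothetical helper type)."""
    low: int
    high: int

    def __add__(self, other: "CostRange") -> "CostRange":
        return CostRange(self.low + other.low, self.high + other.high)

    def __str__(self) -> str:
        return f"${self.low / 1e6:.1f}M to ${self.high / 1e6:.1f}M"


def staffing_cost(min_engineers: int, max_engineers: int, years: int = 1,
                  cost_per_engineer: int = COST_PER_ENGINEER) -> CostRange:
    """Cost of keeping a team of min..max engineers staffed for `years`."""
    return CostRange(min_engineers * cost_per_engineer * years,
                     max_engineers * cost_per_engineer * years)


# Build: 3 to 8 engineers for roughly one year.
build = staffing_cost(3, 8)

# Maintenance: 3 to 6 engineers, per year and over a 5-year horizon.
maintenance_per_year = staffing_cost(3, 6)
maintenance_5y = staffing_cost(3, 6, years=5)

# Assumption: one build year followed by five maintenance years.
total = build + maintenance_5y

print("Build:", build)
print("Maintenance per year:", maintenance_per_year)
print("Maintenance over 5 years:", maintenance_5y)
print("Build plus 5 years of maintenance:", total)
```

Swap in your own headcount, salary, and time horizon to see how quickly the range moves. Infrastructure and opportunity costs are deliberately left out of this sketch.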
Opportunity Costs
It’s not just the cost of building and maintaining. The harder cost to model is the opportunity cost of dedicating resources to something that delays rather than accelerates your time to market. You tell us: what does getting to market 6 months earlier mean to you?
Let’s say your team built a minimum version of monitoring that got you through your first pilots. Now you have an order for a dozen deployments, or like some of our customers, hundreds or even thousands. Your cloud infrastructure grinds to a halt, security and privacy concerns become a blocking issue, and you discover the tool designed by your engineering team doesn’t make sense to your operations team. You need to rebuild your alerting, monitoring, and operations stack before you’re able to address your opportunity.
We’ve seen this play out as companies take the next step in scaling their business, with the unfortunate consequence of delaying their time to market. This is a dangerous place for a company to be, considering the market is moving fast with newly minted startups coming online every day.
But my robot operation isn’t cookie cutter…
It’s true, many robot operations are unique. You may be generating unique data sets, using your data in a new way, or needing to interface your data with a specific set of systems. There’s also the very practical need to control your company’s destiny. All of these are valid concerns. However, we’ve found that customers with unique, market-making use cases are the very same ones that, by necessity, must focus on their core robotic service in order to maintain their differentiation. Maintaining cloud infrastructure does not serve that differentiation. Ask yourself these questions:
- What specific integration or functionality do I need to be successful?
- Is this part of my core competency? Or something that will be better served by the market long term?
- Will I be able to improve my cloud feature set at the same rate as a dedicated service provider with many customers as data points?
- Is building this myself a distraction?
Focus on your Application
Ultimately the build vs. buy decision isn’t an easy one. The default choice in most organizations is to build, but in the long run buying monitoring, observability, and operations software can be a time and money saver for your team and allow you to focus on what you do best – making awesome robots.
If the answer to any of these questions isn’t straightforward, we’d encourage you to have a conversation with us.