To implement impactful machine learning, it’s critical to do one thing exceptionally well: manage anomalies. Detecting and managing errors is a core function of tuning your algorithm. If your algo thinks a human is a dog and you don’t catch it… what’s the point of using machine learning? In other words, there’s a direct correlation between how fast you can spot and fix errors and how good/useful your ML model is. But here’s the friction: detecting anomalies with available ML tools is onerous! Saying “tools” might even be a misnomer. Inspecting results requires developers to do a lot of tedious data prep and custom coding. And it takes time. Lots of it. So is one of the leading tools in AI stuck in a logjam? We need better tools.
At Formant, our core beliefs are that humans are intuitive, visual and great at pattern recognition. With this in mind, we approached the ML tuning problem from the assumption that it could be done differently: in an intuitive, visual way. So we built it. And we think the resulting solution has the potential to revolutionize how the power of ML is harnessed.
This post takes you through a concrete example, an experiment that we’re actually running right now at our HQ, so you can see what a faster, more visual tuning workflow could look like. Frankly, we hope you want to try it out.
Tuning Machine Learning Algorithms the Hard Way
Context
We’re running a 4K security camera from the roof of our office building, focused on the entryway. We’re interested in understanding the flow of people past our office, and also into it (our own interest is space utilization rates, but the same setup extends to many other use cases). So our project’s objective is simply to count humans. We are using the NVIDIA Jetson TX2 to run fast detection on-device. Additionally, we’re making use of pre-trained object detection models running on TensorFlow Lite, as well as OpenCV for object tracking.
Common Pitfalls
The machine learning algorithm we’re using is like so many others: it’s great at bootstrapping our application but requires refinement to provide reliable results. For example, sometimes it double counts people who linger. And sometimes it counts animals as humans. It even gets thrown off by shadows and by inanimate objects like hubcaps. So, clearly, it needs some babysitting, right? Machine learning doesn’t deliver ROI out of the box. It shines with humans in the loop.
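Two of these failure modes can be partly handled in code before a human ever looks at the results. The sketch below is a minimal illustration, not our production pipeline: the `Detection` record, the `track_id` field, the label names and the 0.6 confidence cutoff are all assumptions made for the example. It counts each tracked person once, however long they linger, and drops non-person and low-confidence detections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detection in one frame (hypothetical schema for illustration)."""
    frame: int
    track_id: int   # ID assigned by the object tracker
    label: str      # class name from the detector, e.g. "person"
    score: float    # detector confidence in [0, 1]


def count_people(detections, min_score=0.6):
    """Count unique tracks that were confidently labelled 'person'."""
    seen = set()
    for det in detections:
        if det.label != "person" or det.score < min_score:
            continue  # skip other classes and weak detections
        seen.add(det.track_id)  # a lingering person keeps one track ID
    return len(seen)


# Made-up detections covering the cases described above.
detections = [
    Detection(frame=1, track_id=7, label="person", score=0.91),
    Detection(frame=2, track_id=7, label="person", score=0.88),  # same person lingering
    Detection(frame=2, track_id=8, label="dog", score=0.80),     # correctly labelled animal
    Detection(frame=3, track_id=9, label="person", score=0.35),  # weak hit, e.g. a shadow
    Detection(frame=4, track_id=10, label="person", score=0.77),
]
print(count_people(detections))
```

A filter like this cannot catch a dog that the detector confidently labels as a person, or a tracker that hands a lingering person a new ID. Those are exactly the cases that still need a human to review them.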
A Traditional Exception Resolution Workflow
If you were the lead engineer on this project, you’d start the analysis workflow by retrieving files from the camera and downloading them to a network drive or a laptop. For the 4K camera we’re using, the files are big: 35 MB per file (frame). We’re shooting 1 frame per second, so if the camera records around the clock that is 86,400 frames, or about 3 TB a day. Next, you’d use Python to get at the inference results and dump them out as a CSV. Then there’s a lot more custom coding to combine the data and the images in a way that lets you explore the results. Then more custom code to get the data into a query-ready format. Nothing about this process is easy or fluid, and there is no way around it being a slog. So, what are your options?
- Keep doing manual prep and data wrangling (aka nothing)
- Productize your custom code (aka create a massive ‘side project’)
- Use a pre-built solution that handles this automatically (outsource)
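For a sense of what the "dump the inference results out as a CSV" step in the traditional workflow involves, here is a minimal sketch. The shape of the inference results (a list of per-frame dictionaries with `timestamp`, `image` and `detections` keys) and the sample values are assumptions for illustration. Real output depends on the model and wrapper you use.

```python
import csv
import io


def results_to_csv(results, out):
    """Flatten per-frame inference results into one CSV row per detection."""
    writer = csv.writer(out)
    writer.writerow(["timestamp", "image", "label", "score"])
    rows = 0
    for frame in results:
        for det in frame["detections"]:
            writer.writerow(
                [frame["timestamp"], frame["image"], det["label"], det["score"]]
            )
            rows += 1
    return rows


# Hypothetical results for two frames; field names and values are assumed.
results = [
    {"timestamp": "2019-01-01T09:00:00", "image": "frame_0001.png",
     "detections": [{"label": "person", "score": 0.91}]},
    {"timestamp": "2019-01-01T09:00:01", "image": "frame_0002.png",
     "detections": [{"label": "person", "score": 0.88},
                    {"label": "dog", "score": 0.80}]},
]

# An in-memory buffer keeps the example self-contained; to write a real file,
# pass open("results.csv", "w", newline="") instead.
buffer = io.StringIO()
row_count = results_to_csv(results, buffer)
print(buffer.getvalue())
```

Even this small step leaves the images and the rows in separate places, which is why the next round of custom code (joining data back to frames) is usually where the time goes.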
Tuning ML Algos Using A ‘DVR for Data’
The Formant Visual Interface (the DVR)
Formant’s approach can be communicated through the idea of a video player. Behind the scenes, data has been programmatically joined (device data and metadata as a time series) and fed into a video player. In the screenshot below you’ll see the video feed, related data (tags) and the video player’s timeline. Want to go to a specific time? One click. Want to scrub along the timeline? One click. Want to flip through occurrences of “unrecognized”? One click. This transforms the tuning process because it eliminates the slog and serves up an interface that fosters investigation and exploration — and insight.
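To make the "programmatically joined" idea concrete, here is a rough sketch of the kind of join involved. It is not Formant's implementation: the `(timestamp, value)` tuples, the tag strings and the helper names are assumptions for the example. Each frame is paired with the most recent tag at or before its timestamp, and a second helper lists the timestamps carrying a given tag, which is what a "next occurrence" control would step through.

```python
from bisect import bisect_right


def join_frames_with_tags(frames, tags):
    """Attach to each frame the latest tag at or before its timestamp.

    frames: list of (timestamp_seconds, frame_name), sorted by timestamp
    tags:   list of (timestamp_seconds, tag), sorted by timestamp
    """
    tag_times = [t for t, _ in tags]
    joined = []
    for ts, name in frames:
        i = bisect_right(tag_times, ts)  # number of tags at or before ts
        tag = tags[i - 1][1] if i else None
        joined.append({"timestamp": ts, "frame": name, "tag": tag})
    return joined


def occurrences(timeline, tag):
    """Timestamps of every frame carrying the given tag, in order."""
    return [row["timestamp"] for row in timeline if row["tag"] == tag]


# Made-up streams: one frame per second and a sparser stream of tags.
frames = [(0, "f0.png"), (1, "f1.png"), (2, "f2.png"), (3, "f3.png")]
tags = [(0, "person"), (2, "unrecognized"), (3, "person")]

timeline = join_frames_with_tags(frames, tags)
print(occurrences(timeline, "unrecognized"))
```

Carrying the last known tag forward is only one possible alignment rule. Nearest-in-time matching or a tolerance window would be reasonable alternatives, depending on how the streams are sampled.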
Intuitive, Visual ML Tuning — Now What?
Save time by inviting others to help
As with many aspects of technology, when it becomes very easy to do something, more people will do that something. Specifically, in this case, more types of people can now be involved in machine learning. Who can you invite in to help? How can you help your algorithm learn faster? We think “faster learning” is a goal everyone can rally around, because it gets at the core of why you advocated bringing ML into your company in the first place: automation and optimization. Can you assign exception handling to someone who doesn’t have advanced ML skills? It’s time to start experimenting.
Save time by using a visual query builder
With the data wrangling done automatically, an ML engineer, business analyst or roboticist can now spend their time slicing data across any/multiple dimensions to find answers to their questions. Pick one or many devices, independent streams, a time, an exception… Have at it. It’s a radically simple solution to what used to take days to accomplish — and it’s no longer the domain of a single person. This is the democratization of machine learning.
Save time by getting notified when X#%! happens
Not everyone wants to sift through raw feeds, even if they are enriched and easily accessible. That’s why we make it simple to set up alerts. We built an alert builder that allows anyone to set up an alert on any dimension of their data. Set a threshold, get an alert. And you can even tie the alerts into third-party tools like Slack, PagerDuty and more.
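The "set a threshold, get an alert" idea can be sketched in a few lines. This illustrates the concept only and is not Formant's alert builder or API: the `ThresholdRule` class, the `people_count` stream name and the `notify` callback are all hypothetical, and the callback stands in for an integration such as a Slack or PagerDuty webhook.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class ThresholdRule:
    """Fire when a stream's value rises above a limit (hypothetical rule type)."""
    stream: str
    limit: float

    def check(self, samples: Iterable[Tuple[int, float]],
              notify: Callable[[str], None]) -> int:
        """Call notify once each time the value crosses above the limit."""
        fired = 0
        above = False
        for ts, value in samples:
            if value > self.limit and not above:
                notify(f"{self.stream} = {value} exceeded {self.limit} at t={ts}")
                fired += 1
            above = value > self.limit
        return fired


# Made-up samples of (timestamp_seconds, value); a list stands in for the
# notification channel so the example runs without any network access.
messages: List[str] = []
rule = ThresholdRule(stream="people_count", limit=10)
samples = [(0, 4), (1, 12), (2, 15), (3, 6), (4, 11)]
fired = rule.check(samples, notify=messages.append)
print(messages)
```

The rule fires on the crossing rather than on every sample above the limit, so a value that stays high does not flood the channel with repeat alerts.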
Save time by watching your streams at 16X
A handy precept in data science is “know your data”. It’s especially true in machine learning. And once you have a ‘DVR for data’ it becomes apparent that an easy way to get intimate with your data is to watch it unfold over time. And then watch it again. Then rewind and rewatch key areas of interest over and over until you discover something. Formant allows you to watch your feeds in real time, or play them back at 2x, 4x, 8x and 16x. At 16x, an hour of footage takes under four minutes to review. Imagine being able to go to a point in time in one click.
Summary: It’s Time To Unlock the Value of Machine Learning
We started off with two primary assumptions — that most of ML tuning is a manual, bespoke process, and that productizing that process is a steep, long journey. If these assumptions are correct, then many ML implementations are underperforming because it’s taking too much time to prepare and analyze the data.
We then discussed Formant’s vision and described how it handles ML algo tuning in a new way — by allowing fast, intuitive exploration along a visual timeline. Our thesis is that we can save ML engineers a game-changing amount of time. Not 10%. Not 20%. Our testing indicates it could be closer to 90%. And if we help the engineer, then we’re helping the company. In other words, we think we can help companies derive a much better ROI on their ML implementations by accelerating the learning and by lowering the bar for entry.
What improvements to this process have you envisioned? How would you use a ‘DVR for data’? Let us know how you’re streamlining your ML processes, and get in touch if you want to explore Formant’s ground-breaking solutions.