In Turning Video into Traffic Data Part One, I wrote about Miovision’s systematic method for processing the large amount of video that is uploaded to our system. I detailed our three step process for video configuration, quality assurance, and data validation, and explained how computer vision is used to detect vehicle movements from video. If you haven’t yet read Part One, I would recommend you start there.
In this second and final post, I will dive into the details of data accuracy: how we account for error, how we develop our best-in-class algorithm, and how that helps our customers rely on the quality of Miovision data for any project of any size.
Deconstructing a Frame of Video into Spatial Regions for Counting
When video is uploaded to Miovision, cardinal direction and number of lanes are required inputs. That is because each video is split into video segments to be processed individually.
Each video segment is defined by spatial region, lane, and approach. Segments are then distributed across a number of processes on a cloud computing service and queued for a computer vision task.
When computer vision tasks are complete, each video segment is queued for human review and verification. Humans manually count a 12% cross-section from each hour of video to ensure that the computer vision algorithm is properly producing counts and the data is accurate.
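As a sketch of how that cross-section might be drawn, the snippet below samples 12% of an hour's segments at random. The function name and sampling strategy are our own illustration; the post does not specify how the overlap is actually selected.

```python
import random

def sample_for_review(segment_ids, fraction=0.12, seed=None):
    """Pick a random cross-section of one hour's video segments for
    human verification. The uniform-random strategy here is an
    assumption; only the 12% figure comes from the text."""
    rng = random.Random(seed)
    k = max(1, round(len(segment_ids) * fraction))
    return sorted(rng.sample(list(segment_ids), k))

# e.g. 50 segments in an hour -> 6 are flagged for manual counting
print(len(sample_for_review(range(50), seed=1)))  # 6
```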
How Miovision Defines Data Accuracy: ±5 / 95%
- For volumes of up to 100 vehicles within a 15 minute period, the data will be accurate to ±5 vehicles.
- For volumes greater than 100 vehicles within a 15 minute period, the data will be accurate within 5%.
- Accuracy is guaranteed with proper setup of the Scout Video Collection Unit or other video devices.
Why not 100%?
Producing data that is verified and reconciled to be 100% accurate is time consuming. With multiple measurements, we can eventually converge on ground truth; however, that comes with high overhead and is a trade-off between cost, turnaround time, and an acceptable accuracy threshold.
At Miovision, we verify all computer vision counts with a 12% human verification overlap that is divided across every hour of video. In our experience over the past 1.5M hours of video, 12% has been the proper balance between appropriately verifying the computer vision algorithm and keeping overheads contained.
When computer vision detection produces an error, it does so logically and consistently. Humans, however, are quite good at making random one-off errors. For this reason, we validate that the two datasets are within 5% of each other, and publish the computer vision data as truth data. In areas of low confidence, a human makes a more comprehensive review of the video segment.
Why ±5 in low volumes?
At volumes of fewer than 100 vehicles of a single classification within a 15 minute bin, the 5% accuracy threshold can be less than a whole vehicle, and is therefore not an applicable measure.
For example, suppose the computer vision algorithm counts 1147 cars and ten articulated trucks in 15 minutes, while our human overlap verifies 1159 cars (a 1% error) but only nine articulated trucks (a 10% error). The one-truck discrepancy is well within ±5 vehicles, even though the percentage error looks alarming.
At this point, we have a choice: delay turnaround and perform additional verification measures to converge on 100% accuracy, or publish the data for our customers immediately at a consolidated accuracy of 99%.
In our experience, transportation professionals prefer to receive their data quickly and cost-effectively rather than wait while we search for the possibility of one extra vehicle in a low-volume classification.
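Under one plausible reading of this example, the consolidated accuracy figure falls out of weighting each classification's error by its verified volume (the exact weighting Miovision uses is an assumption on our part):

```python
# Reproducing the worked example: consolidated accuracy across
# classifications, weighted by the human-verified volumes.
cv_counts    = {"car": 1147, "articulated_truck": 10}
human_counts = {"car": 1159, "articulated_truck": 9}

total_error = sum(abs(cv_counts[c] - human_counts[c]) for c in cv_counts)
total_volume = sum(human_counts.values())
consolidated_accuracy = 1 - total_error / total_volume

print(f"{consolidated_accuracy:.0%}")  # 99%
```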
How We Account for Error
Computer vision doesn’t always produce perfect counts, and there are circumstances, such as blizzards, hurricanes, or intense sun glare, where the visual scene is outside of our algorithm’s scope. Every video is different, and weather, wind, and time of day can all affect the computer vision algorithm’s performance. That is why we need to account for detection errors in the algorithm.
When the algorithm detects that a video segment is outside of its scope and cannot be counted automatically, that segment is immediately distributed to a Data Services Technician to be manually reviewed and counted. Miovision has the ability to do large scale human correction of any video segment where the computer vision algorithm has a low confidence of detection and reporting.
When a human is required to intervene due to low computer vision confidence, the manually processed video segment is incorporated into the training process, driving the iteration and improvement of our algorithms.
Train in Order to Count, Count in Order to Train
The continual addition of new video segments to our training data ensures that similar video segments have a better chance of being counted automatically by future iterations of the algorithm. We employ a team of computer vision scientists and engineers with specialities in statistical modelling, machine learning, and image processing, whose work continues to evolve our detection algorithm. Algorithm release candidates are continually introduced into the product testing environment and are compared against the existing algorithm and manually observed ground truth data.
Prior to full release and algorithm update, several members of our Data Services Team rigorously validate randomly sampled video segments for true positive and true negative computer vision detections. Once our Data Services Team has observed that the release candidate achieves our quality standards, it is released to production and the product is seamlessly updated.
Our goal is to make computer vision handle as much video volume as possible to reduce turnaround times and manual interventions, while preserving accuracy.
Quality Data is our Brand
The accuracy and quality of our data is of paramount importance. It’s more than our product, it’s our brand. We work hard to ensure that Miovision customers are getting the best quality data on the market, and the customer service they need to have their operations up and running 24/7.