A new generation of video cameras is poised to boost capabilities dramatically at the edge of the IP network, bringing more powerful artificial intelligence (AI) and higher resolutions, and paving the way for new applications that would previously have been too expensive or complex.
At the heart of this coming generation of video cameras are Ambarella’s newest systems on chips (SoCs). Ambarella’s CV5S and CV52S product families are bringing a new level of on-camera AI performance and integration to multi-imager and single-imager IP cameras. Both SoCs are manufactured on a 5 nm process, bringing performance improvements and power savings compared with the previous generation of SoCs, which was manufactured at 10 nm.
CV5S and CV52S AI-powered SoCs
The CV5S, designed for multi-imager cameras, is able to process, encode and perform advanced AI on up to four imagers at 4Kp30 resolution, simultaneously and at less than 5 watts. This enables multi-headed camera designs with up to four 4K imagers looking at different portions of a scene, as well as very high-resolution, single-imager cameras of up to 32 MP resolution and beyond.
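The pixel budget behind these figures can be checked with simple arithmetic. The sketch below assumes that "4K" means 3840 x 2160 (UHD), which the article does not state; the function names are illustrative only. Under that assumption, four 4K imagers add up to roughly 33 MP per frame set, which is in the same range as the 32 MP single-imager figure quoted above.

```python
# Back-of-the-envelope pixel throughput for a 4x 4Kp30 multi-imager camera.
# Assumption: "4K" is 3840 x 2160 (UHD); the article does not specify this.
WIDTH, HEIGHT = 3840, 2160
FPS = 30
IMAGERS = 4


def pixels_per_frame(width: int, height: int) -> int:
    """Pixels in one frame from one imager."""
    return width * height


def aggregate_megapixels(imagers: int, width: int, height: int) -> float:
    """Total megapixels across all imagers for one frame from each."""
    return imagers * pixels_per_frame(width, height) / 1e6


def pixels_per_second(imagers: int, width: int, height: int, fps: int) -> int:
    """Total pixels the SoC must process per second across all imagers."""
    return imagers * pixels_per_frame(width, height) * fps


if __name__ == "__main__":
    print(f"{aggregate_megapixels(IMAGERS, WIDTH, HEIGHT):.1f} MP per frame set")
    print(f"{pixels_per_second(IMAGERS, WIDTH, HEIGHT, FPS):,} pixels per second")
```

This only counts raw pixels; it says nothing about encoding or AI workload, which depend on details the article does not give.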
The CV52S, designed for single-imager cameras with very powerful onboard AI, is the next generation of the company’s successful CV22S mainstream 4K camera AI chip. This new SoC family quadruples the AI processing performance, while keeping the same low power consumption of less than 3 watts for 4Kp60 encoding with advanced AI processing.
Faster and ubiquitous AI capabilities
“Security system designers desire higher resolutions, increasing channel counts, and ever faster and more ubiquitous AI capabilities,” explains John Lorenz, Senior Technology and Market Analyst, Computing, at Yole Développement (Yole), a French market research firm.
John Lorenz adds, “Ambarella’s newest AI vision SoCs for security, the CV5S and CV52S, are competitive solutions for meeting the growing demands of the security IC (integrated circuit) sector, which our latest report forecasts to exceed US$ 4 billion by 2025, with two-thirds of that being chips with AI capabilities.”
Edge AI vision processors
Ambarella’s new CV5S and CV52S edge AI vision processors enable new classes of cameras that would not have been possible in the past, with a single SoC architecture. For example, implementing a 4x 4K multi-imager with AI would have traditionally required at least two SoCs (at least one for encoding and one for AI), and the overall power consumption would have made those designs bulky and prohibitively expensive.
By reducing the number of required SoCs, the CV5S enables advanced camera designs such as AI-enabled 4x 4K imagers at price points much lower than would have previously been possible. “What we are usually trying to do with our SoCs is to keep the price points similar to the previous generations, given that camera retail prices tend to be fairly fixed,” said Jerome Gigot, Ambarella's Senior Director of Marketing.
4K multi-imager cameras
“However, higher-end 4K multi-imager cameras tend to retail for thousands of dollars, and so even though there will be a small premium on the SoC for the 2X improvement in performance, this will not make a significant impact to the final MSRP of the camera,” adds Jerome Gigot.
In addition, the overall system cost might go down, Gigot notes, compared to what could be built today because there is no longer a need for external chips to perform AI, or extra components for power dissipation.
The new chips will be available in the second half of 2021, and it typically takes about 12 to 18 months for Ambarella’s customers (camera manufacturers) to produce final cameras. Therefore, the first cameras, based on these new SoCs, should hit the market sometime in the second half of 2022.
Reference boards for camera manufacturers
As with Ambarella’s previous generations of edge AI vision SoCs for security, the company will make available reference boards to camera manufacturers soon, allowing them to develop their cameras based on the new CV5S and CV52S SoC families.
“The software on these new SoCs is an evolution of our unified Linux SDK that is already available on our previous generations SoCs, which makes the transition easy for our customers,” said Jerome Gigot.
Better crime detection
Detecting criminals in a crowd, using face recognition and/or licence plate recognition, has been a daunting challenge for security, and one the new chips will help to address.
“Actually, these applications are one of the main reasons why Ambarella is introducing these two new SoC families,” said Jerome Gigot.
Typically, resolutions of 4K and higher have been a smaller portion of the security market, given that they came at a premium price tag for the high-end optics, image sensor and SoC. Also, the cost and extra bandwidth of storing and streaming 4K video were not always worth it for the benefit of just viewing video at higher resolution.
4K AI processing on-camera
The advent of on-camera AI at 4K changes the paradigm. By enabling 4K AI processing on-camera, smaller objects at longer distances can now be detected and analysed without having to go to a server, and with much higher detail and accuracy compared to what can be done on 2 MP or 5 MP cameras.
This means that fewer false alarms will be generated, and each camera will now be able to cover a longer distance and wider area, offering more meaningful insights without necessarily having to stream and store that 4K video to a back-end server. “This is valuable, for example, for traffic cameras mounted on top of high poles, which need to be able to see very far out and identify cars and licence plates that are hundreds of meters away,” said Jerome Gigot.
Enhanced video analytics and wider coverage
“Ambarella’s new CV5S and CV52S SoCs truly allow the industry to take advantage of higher resolution on-camera for better analytics and wider coverage, but without all the costs typically incurred by having to stream high-quality 4K video out 24/7 to a remote server for offline analytics,” said Jerome Gigot.
He adds, “So, next-generation cameras will now be able to identify more criminals, faces and licence plates, at longer distances, for an overall lower cost and with faster response times by doing it all locally on-camera.”
Deployment in retail applications
Retail applications are another big selling point. Retail environments can be some of the toughest, as the cameras may be looking at hundreds of people at once (e.g., in a mall), to provide not only security features, but also other business analytics, such as foot traffic and occupancy maps that can be used later to improve product placement.
The higher resolution and higher AI performance, enabled by the new Ambarella SoCs, provide a leap forward in addressing those scenarios. In a store setup, a ceiling-mounted camera with four 4K imagers can simultaneously look at the cashier line on one side of the store, sending alerts when a line is getting too long and a new cashier needs to be deployed, while at the same time looking at the entrance on the other side of the store, to count the people coming in and out.
This leaves two additional 4K imagers for monitoring specific product aisles and generating real-time business analytics.
Use in cashier-less stores
Another retail application is a cashier-less store. Here, a CV5S or CV52S-based camera mounted on the ceiling will have enough resolution and AI performance to track goods, while the customer grabs them and puts them in their cart, as well as to automatically track which customer is purchasing which item.
In a warehouse scenario, items and boxes moving across the floor could also be followed locally, on a single ceiling-mounted camera that covers a wide area of the warehouse. Additionally, these items and boxes could be tracked across the different imagers in a multi-headed camera setup, without the video having to be sent to a server to perform the tracking.
Updating on-camera AI networks
Another feature of Ambarella’s SoCs is that their on-camera AI networks can be updated on-the-fly, without having to stop the video recording and without losing any video frames.
So, for example, in a search for a missing vehicle, the characteristics of that vehicle (make, model, colour, licence plate) can be sent to a cluster of cameras in the general area where the vehicle is thought to be, and all those cameras can be automatically updated to run a live search for that specific vehicle.
If any of the cameras gets a match, a remote operator can be notified and receive a picture, or even a live video feed of the scene.
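The matching step in such a search can be sketched as follows. This is a minimal illustration, not Ambarella's SDK: the `VehicleQuery` and `Detection` types, the field names and the `cameras_with_match` helper are all hypothetical, and it assumes the on-camera AI has already turned each vehicle into text attributes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VehicleQuery:
    """Search criteria pushed to the cameras; None means 'any value'."""
    make: Optional[str] = None
    model: Optional[str] = None
    colour: Optional[str] = None
    plate: Optional[str] = None


@dataclass(frozen=True)
class Detection:
    """Hypothetical attributes produced on-camera for one vehicle."""
    camera_id: str
    make: str
    model: str
    colour: str
    plate: str


def _norm(text: str) -> str:
    # Ignore case and spaces so "ab12 cde" and "AB12CDE" compare equal.
    return text.replace(" ", "").upper()


def matches(query: VehicleQuery, det: Detection) -> bool:
    """True if every criterion set in the query agrees with the detection."""
    for field in ("make", "model", "colour", "plate"):
        wanted = getattr(query, field)
        if wanted is not None and _norm(wanted) != _norm(getattr(det, field)):
            return False
    return True


def cameras_with_match(query: VehicleQuery, detections: List[Detection]) -> List[str]:
    """IDs of cameras that should notify the remote operator."""
    return [d.camera_id for d in detections if matches(query, d)]
```

A real deployment would likely tolerate partial or misread plates; exact matching is used here only to keep the sketch short.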
Efficient traffic management
Relating to traffic congestion, most big cities have thousands of intersections that they need to monitor and manage. Trying to do this from one central location is costly and difficult, as there is so much video data to process and analyse, in order to make those traffic decisions (to control the traffic lights, reverse lanes, etc.).
With the CV52S edge AI vision SoC, those decisions can be made locally at each intersection by the camera itself. The camera would then take actions autonomously (for example, adjust traffic-light timing) and only report a status update to the main traffic control centre. So now, instead of having one central location trying to manage 1,000 intersections, a city can have 1,000 smart AI cameras, each managing its own location and providing updates and metadata to a central server.
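The "decide locally, report only a status update" pattern can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the timing rule, the 10 to 60 second limits, the two seconds per vehicle and the JSON field names are hypothetical and do not come from Ambarella or any traffic standard.

```python
import json

# Hypothetical limits for a green phase, in seconds.
MIN_GREEN_S = 10
MAX_GREEN_S = 60


def green_time_seconds(queued_vehicles: int, seconds_per_vehicle: float = 2.0) -> float:
    """Pick a green duration from the queue length counted on-camera,
    clamped to the allowed range."""
    wanted = queued_vehicles * seconds_per_vehicle
    return max(MIN_GREEN_S, min(MAX_GREEN_S, wanted))


def status_update(intersection_id: str, queued_vehicles: int) -> str:
    """Small metadata message sent to the control centre instead of video."""
    return json.dumps({
        "intersection": intersection_id,
        "queued_vehicles": queued_vehicles,
        "green_s": green_time_seconds(queued_vehicles),
    })


if __name__ == "__main__":
    print(status_update("5th-and-main", 12))
```

The point of the design is that the central server receives a short message per intersection rather than a continuous video stream.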
Superior privacy
Privacy is always a concern with video. In this case, doing AI on-camera is inherently more private than streaming the video to a server for analysis. Less data transmission means fewer points of entry for a hacker trying to access the video.
On Ambarella’s CV5S and CV52S SoCs, the video can be analysed locally and then discarded, with just a signature or metadata of the face being used to find a match. No actual video needs to be stored or transmitted, which sharply limits how much personal footage is exposed.
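Matching on a signature rather than on video can be sketched as a comparison of numeric vectors. The article does not say how the signatures are computed or compared; cosine similarity and the 0.8 threshold below are assumptions chosen as a common, simple option, and the function names are illustrative.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two signature vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(signature: Sequence[float],
               watchlist: Dict[str, Sequence[float]],
               threshold: float = 0.8) -> Optional[str]:
    """Name of the closest watchlist entry at or above the threshold, else None."""
    best_name, best_score = None, threshold
    for name, reference in watchlist.items():
        score = cosine_similarity(signature, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

Only the short signature vector and the match result would need to leave the camera in this scheme; the frame it was computed from can be dropped.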
In addition, the chips contain a hardware cyber security block, including OTP memory, Arm TrustZone, DRAM scrambling and I/O virtualisation. This makes it very difficult for a hacker to replace the firmware on the camera, providing another level of security and privacy at the system level.
Privacy Masking
Another privacy feature is the concept of privacy masking. This feature enables portions of the video (say a door or a window) to be blocked out, before being encoded in the video stream. The blocked portions of the scene are not present in the recorded video, thus providing a privacy option for cameras that are facing private areas.
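The idea of masking before encoding can be shown on a toy frame. This is a plain-Python sketch in which a frame is a grid of grayscale values and the mask is a list of rectangles; the `apply_privacy_mask` name and the data layout are hypothetical, and a real camera would do this in hardware ahead of the encoder.

```python
from typing import List, Tuple

Frame = List[List[int]]           # grayscale pixels, one inner list per row
Rect = Tuple[int, int, int, int]  # x, y, width, height in pixels


def apply_privacy_mask(frame: Frame, rects: List[Rect], fill: int = 0) -> Frame:
    """Return a copy of the frame with each rectangle overwritten by `fill`.

    Rectangles are clipped to the frame, so a mask that extends past an
    edge is applied only where it overlaps the image.
    """
    masked = [row[:] for row in frame]
    height = len(masked)
    width = len(masked[0]) if height else 0
    for x, y, w, h in rects:
        for row in range(max(0, y), min(height, y + h)):
            for col in range(max(0, x), min(width, x + w)):
                masked[row][col] = fill
    return masked
```

Because the pixels are overwritten before the frame reaches the encoder, the original content of the masked area cannot be recovered from the recorded stream.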
“With on-camera AI, each device becomes its own smart endpoint, and can be reconfigured at will to serve the specific physical security needs of its installation,” said Jerome Gigot, adding “The possibilities are endless, and our mission as an SoC maker is really to provide a powerful and easy-to-use platform, complete with computer-vision tools, that enable our customers and their partners to easily deploy their own AI software on-camera.”
Physical security in parking lots
One example is physical security in a parking lot. A camera today might be used to just record part of the parking lot, so that an operator can go back and look at the video if a car were broken into or some other incident occurred.
With a CV5S or CV52S AI-enabled camera, first of all, the camera will be able to cover a much wider portion of the parking lot. Additionally, it will be able to detect the licence plates of all the cars going in and out, to automatically bill the owners.
If there is a special event, the camera can be reprogrammed to identify VIP vehicles and automatically redirect them to the VIP portion of the lot, while reporting to the entrance station or sign how many parking spots are available. It can even tell the cars approaching the lot where to go.
Advantages of using edge AI vision SoCs
Jerome Gigot said, “The possibilities are endless and they span across many verticals. The market is primed to embrace these new capabilities. Recent advances in edge AI vision SoCs have brought about a period of change in the physical security space. Companies that would have, historically, only provided security cameras, are now getting into adjacent verticals such as smart retail, smart cities and smart buildings.”
He adds, “These changes are providing a great opportunity for all the camera makers and software providers to really differentiate themselves by providing full systems that offer a new level of insights and efficiencies to, not only the physical security manager, but now also the store owner and the building manager.”
He adds, “All of these new applications are extremely healthy for the industry, as they are growing the available market for cameras, while also increasing their value and the economies of scale they can provide. Ambarella is looking forward to seeing all the innovative products that our customers will build with this new generation of SoCs.”