The future is here: Artificial intelligence to become standard for smart cities

A tipping point is defined as: “The point at which a series of changes becomes significant enough to cause a larger, more important change”. In the same way that IP video changed surveillance a decade ago, our industry is now feeling the impact of recent developments in Artificial Intelligence, Machine Learning, Deep Learning, Big Data, and Intelligent Video Analysis.

Keyword definitions

Let’s start with a few more definitions. Artificial Intelligence (AI) deals with the simulation of intelligent behaviour in computers. Machine Learning (ML) deals with developing computer algorithms that access data and use it to learn for themselves. Neural networks are computer systems that loosely mimic human brain operation.

Deep Learning is a subset of ML based on neural networks that has proven to provide breakthrough capabilities for many problems that were previously unsolvable. Big Data refers to huge amounts of structured and/or unstructured data -- in our case, the immense quantities of video information generated daily by security cameras deployed in cities around the world. Deep Learning is tipped to change Intelligent Video Analysis (IVA), the digital video technology integrated with analytical software that is a basic tool for our industry.

AI in surveillance

Traditionally, the main benefit of surveillance cameras has been the ability to collect evidence for debriefing or investigation, as well as the ability to view events remotely in real-time. A decade ago, video analytics technologies were introduced to solve the problem of human inattention -- computers don’t get tired, bored or distracted, and can monitor a camera continuously.

Then camera costs dropped, deployment skyrocketed, and video management systems began collecting reams of costly, unstructured data that went largely unused. AI technology seemed to answer the pressing new industry need: how to use this Big Data effectively and make a return on the investment in expensive storage, while maintaining (or even lowering) human capital costs.

Three limiting factors

All this was theoretical, however, as multiple technological barriers prevented the real-world use of AI solutions. Despite decades of research on how to make a computer accurately recognise different objects in a video stream, the quality of the results, especially in urban environments, was, to put it mildly, underwhelming.

AI was limited primarily by these three factors:

Lack of understanding -- The software must be able to distinguish between objects (person, vehicle, animal, etc.) under varying conditions (day, night, seasonal weather, etc.).
Inability to learn -- Traditional IVA applications relied on a rule-based approach that required a human operator to configure the software for each monitoring camera and each type of alert. Although effective in some scenarios, this approach was rendered impractical by the exponential growth in camera counts, given the amount of manual labour required to configure, reconfigure, and maintain rules.
High cost -- The hard truth is that budgets for security and safety will always be constrained. Until recently, implementing real-time AI was cost-prohibitive, sometimes requiring a 1:1 server-to-camera ratio.
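
To make the rule-based burden concrete, here is a minimal Python sketch of what per-camera, per-alert configuration might look like. The `Rule` fields, the camera IDs and the zone name are hypothetical and chosen only for illustration; they are not taken from any particular IVA product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One hand-written alert rule for one camera (hypothetical schema)."""
    camera_id: str
    object_type: str  # e.g. "person" or "vehicle"
    zone: str         # name of an operator-drawn region
    start_hour: int   # rule is active from this hour (inclusive)
    end_hour: int     # until this hour (exclusive)


def rule_matches(rule: Rule, camera_id: str, object_type: str,
                 zone: str, hour: int) -> bool:
    """Return True when a detection satisfies every condition of the rule."""
    return (
        rule.camera_id == camera_id
        and rule.object_type == object_type
        and rule.zone == zone
        and rule.start_hour <= hour < rule.end_hour
    )


def rules_to_maintain(num_cameras: int, alert_types_per_camera: int) -> int:
    """Rules an operator must write and keep current:
    one per camera per alert type."""
    return num_cameras * alert_types_per_camera


# A rule written for cam-001 does nothing for cam-002; each camera
# needs its own copy, adjusted to its own scene.
rule = Rule("cam-001", "person", "loading-dock", start_hour=0, end_hour=6)
print(rule_matches(rule, "cam-001", "person", "loading-dock", hour=3))  # True
print(rule_matches(rule, "cam-002", "person", "loading-dock", hour=3))  # False
print(rules_to_maintain(num_cameras=2000, alert_types_per_camera=5))    # 10000
```

Every rule is tied to one camera and one alert type, so the configuration workload grows in step with the deployment, which is the scaling problem described above.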

Meeting challenges

That was yesterday. Today, the use of AI in security applications has reached its tipping point, meeting the above-mentioned challenges.

Understanding -- Deep Learning has matured to the point where it can accurately detect and classify objects both in still images and in video. DL technology is fast becoming the basic building block for IVA.
Ability to learn -- As an AI solution collects and analyses data over time, it creates metadata that describes all objects in each video stream. Machine Learning techniques process this metadata to generate models of “normally observed” behaviour. These models are applied in real-time to detect behaviours that deviate from the norm. Only events flagged as suspicious require review by a human operator. This technique allows the solution to scale to very large numbers of cameras, with no need for a human to configure each new device.
Lower cost -- The rapid increase in GPU computational capacity, coupled with mass market adoption, has lowered server costs to a reasonable level. Today, with the correct implementation, a single server can be deployed across hundreds and even thousands of cameras.
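
The "ability to learn" idea above can be sketched in a few lines of Python. This is a deliberately simple illustration, not the method used by any specific product: it assumes the metadata has already been reduced to hypothetical `(camera_id, hour, object_count)` tuples, models "normal" as the mean and standard deviation of counts per camera and hour, and flags observations that lie more than an assumed threshold of standard deviations from that mean.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_baseline(history):
    """Learn a per-(camera, hour) model of normally observed activity.

    `history` is an iterable of (camera_id, hour, object_count) tuples,
    a hypothetical simplification of per-object video metadata.
    Returns {(camera_id, hour): (mean, standard deviation)}.
    """
    samples = defaultdict(list)
    for camera_id, hour, count in history:
        samples[(camera_id, hour)].append(count)

    baseline = {}
    for key, counts in samples.items():
        if len(counts) >= 2:  # stdev() needs at least two samples
            baseline[key] = (mean(counts), stdev(counts))
    return baseline


def is_anomalous(baseline, camera_id, hour, count, threshold=3.0):
    """Flag a count that deviates from the learned norm.

    `threshold` is the number of standard deviations tolerated
    (an assumed default, to be tuned per deployment).
    """
    key = (camera_id, hour)
    if key not in baseline:
        return False  # no model learned yet for this camera and hour
    mu, sigma = baseline[key]
    if sigma == 0:
        return count != mu  # any change from a constant norm is a deviation
    return abs(count - mu) / sigma > threshold


# A week of observations at 03:00 on one camera: almost nobody around.
history = [("cam-001", 3, c) for c in [0, 1, 0, 1, 0, 1, 0]]
baseline = build_baseline(history)

print(is_anomalous(baseline, "cam-001", 3, 12))  # True: far above the norm
print(is_anomalous(baseline, "cam-001", 3, 1))   # False: within the norm
```

Nothing in this sketch is configured per camera by hand. A new camera simply starts contributing history and acquires its own baseline, which is what lets the approach scale without an operator writing rules for each device.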

The convergence of Deep Learning for video analysis, advances in AI for fully automated event detection, and the significant reduction in the cost of implementing these techniques -- including cloud-based software-as-a-service (SaaS) models -- means that the fully automated video surveillance solution for cities is fast becoming a reality. We’ll see more of this type of solution being deployed over the coming months, and within the next few years, it will be standard in any Smart City deployment.

Author profile

Zvika Ashani, Chief Technology Officer, Agent Vi (Agent Video Intelligence)
