In most ways, Internet of Things analytics is like any other analytics. However, the need to distribute some IoT analytics to edge sites, and to use some technologies not commonly employed elsewhere, requires business intelligence and analytics leaders to adopt new best practices and software.
There are certain prominent challenges that Analytics Vendors are facing in venturing into building a capability. IoT analytics use most of the same algorithms and tools as other kinds of advanced analytics. However, a few techniques occur much more often in IoT analytics, and many analytics professionals have limited or no expertise in these. Analytics leaders are struggling to understand where to start with Internet of Things (IoT) analytics. They are not even sure what technologies are needed.
Also, the advent of IoT also leads to collection of raw data in a massive scale. IoT analytics that run in the cloud or in corporate data centers are the most similar to other analytics practices. Where major differences appear is at the “edge” — in factories, connected vehicles, connected homes and other distributed sites. The staple inputs for IoT analytics are streams of sensor data from machines, medical devices, environmental sensors and other physical entities. Processing this data in an efficient and timely manner sometimes requires event stream processing platforms, time series database management systems and specialized analytical algorithms. It also requires attention to security, communication, data storage, application integration, governance and other considerations beyond analytics. Hence it is imperative to evolve into edge analytics and distribute the data processing load all across.
Hence, some IoT analytics applications have to be distributed to “edge” sites, which makes them harder to deploy, manage and maintain. Many analytics and Data Science practitioners lack expertise in the streaming analytics, time series data management and other technologies used in IoT analytics.
Some visions of the IoT describe a simplistic scenario in which devices and gateways at the edge send all sensor data to the cloud, where the analytic processing is executed, and there are further indirect connections to traditional back-end enterprise applications. However, this describes only some IoT scenarios. In many others, analytical applications in servers, gateways, smart routers and devices process the sensor data near where it is generated — in factories, power plants, oil platforms, airplanes, ships, homes and so on. In these cases, only subsets of conditioned sensor data, or intermediate results (such as complex events) calculated from sensor data, are uploaded to the cloud or corporate data centers for processing by centralized analytics and other applications.
The design and development of IoT analytics — the model building — should generally be done in the cloud or in corporate data centers. However, analytics leaders need to distribute runtime analytics that serve local needs to edge sites. For certain IoT analytical applications, they will need to acquire, and learn how to use, new software tools that provide features not previously required by their analytics programs. These scenarios consequently give us the following best practices to be kept in mind:
Develop Most Analytical Models in the Cloud or at a Centralized Corporate Site
When analytics are applied to operational decision making, as in most IoT applications, they are usually implemented in a two-stage process – In the first stage, data scientists study the business problem and evaluate historical data to build analytical models, prepare data discovery applications or specify report templates. The work is interactive and iterative.
A second stage occurs after models are deployed into operational parts of the business. New data from sensors, business applications or other sources is fed into the models on a recurring basis. If it is a reporting application, a new report is generated, perhaps every night or every week (or every hour, month or quarter). If it is a data discovery application, the new data is made available to decision makers, along with formatted displays and predefined key performance indicators and measures. If it is a predictive or prescriptive analytic application, new data is run through a scoring service or other model to generate information for decision making.
The first stage is almost always implemented centrally, because Model building typically requires data from multiple locations for training and testing purposes. It is easier, and usually less expensive, to consolidate and store all this data centrally. Also, It is less expensive to provision advanced analytics and BI platforms in the cloud or at one or two central corporate sites than to license them for multiple distributed locations.
The second stage — calculating information for operational decision making — may run either at the edge or centrally in the cloud or a corporate data center. Analytics are run centrally if they support strategic, tactical or operational activities that will be carried out at corporate headquarters, at another edge location, or at a business partner’s or customer’s site.
Distribute the Runtime Portion of Locally Focused IoT Analytics to Edge Sites
Some IoT analytics applications need to be distributed, so that processing can take place in devices, control systems, servers or smart routers at the sites where sensor data is generated. This makes sure the edge location stays in operation even when the corporate cloud service is down. Also, wide-area communication is generally too slow for analytics that support time-sensitive industrial control systems.
Thirdly, transmitting all sensor data to a corporate or cloud data center may be impractical or impossible if the volume of data is high or if reliable, high-bandwidth networks are unavailable. It is more practical to filter, condition and do analytic processing partly or entirely at the site where the data is generated.
Train Analytics Staff and Acquire Software Tools to Address Gaps in IoT-Related Analytics Capabilities
Most IoT analytical applications use the same advanced analytics platforms, data discovery tools as other kinds of business application. The principles and algorithms are largely similar. Graphical dashboards, tabular reports, data discovery, regression, neural networks, optimization algorithms and many other techniques found in marketing, finance, customer relationship management and advanced analytics applications also provide most aspects of IoT analytics.
However, a few aspects of analytics occur much more often in the IoT than elsewhere, and many analytics professionals have limited or no expertise in these. For example, some IoT applications use event stream processing platforms to process sensor data in near real time. Event streams are time series data, so they are stored most efficiently in databases (typically column stores) that are designed especially for this purpose, in contrast to the relational databases that dominate traditional analytics. Some IoT analytics are also used to support decision automation scenarios in which an IoT application generates control signals that trigger actuators in physical devices — a concept outside the realm of traditional analytics.
In many cases, companies will need to acquire new software tools to handle these requirements. Business analytics teams need to monitor and manage their edge analytics to ensure they are running properly and determine when analytic models should be tuned or replaced.
Increased Growth, if not Competitive Advantage
The huge volume and velocity of data in IoT will undoubtedly put new levels of strain on networks. The increasing number of real-time IoT apps will create performance and latency issues. It is important to reduce the end-to-end latency among machine-to-machine interactions to single-digit milliseconds. Following the best practices of implementing IoT analytics will ensure judo strategy of increased effeciency output at reduced economy. It may not be suffecient to define a competitive strategy, but as more and more players adopt IoT as a mainstream, the race would be to scale and grow as quickly as possible.