Processing huge amounts of data requires maximum scalability of the infrastructure as well as of the applied technologies and algorithms, in order to handle data ingest, storage, analysis, and anomaly detection in real time.
Our research in the field of real-time analytics is driven by the need to process and analyze millions of metrics and events in real time. Why? Because processing such huge amounts of data demands high scalability not only of the infrastructure but also of the applied technologies and algorithms. Only then can data ingest and storage be handled, and large-scale data analysis and anomaly detection enabled, in real time.
Technologies: The ongoing emergence of new technologies requires continuous monitoring and evaluation of their relevance and applicability to real-time data analysis. Extensive feasibility studies, including performance tests of different technology stacks, are necessary to make the right design decisions for the next product generation. Selecting the most suitable architecture is crucial for future success.
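To make such performance tests concrete, the following is a minimal, illustrative sketch in Python of a best-of-N micro-benchmark harness; the candidate workloads (`per_item_ingest`, `batched_ingest`) and the function names are hypothetical stand-ins for real technology-stack comparisons, which would additionally track memory, percentile latencies, and sustained throughput.

```python
import time

def timed(fn):
    # Wall-clock duration of a single run, in seconds.
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

def benchmark(label, fn, runs=5):
    # Report the best of N runs to reduce noise from the OS scheduler.
    best = min(timed(fn) for _ in range(runs))
    print(f"{label}: {best * 1000:.2f} ms")
    return best

# Hypothetical comparison: ingesting 100k events one-by-one vs. in one batch.
events = list(range(100_000))

def per_item_ingest():
    store = []
    for e in events:
        store.append(e)

def batched_ingest():
    store = []
    store.extend(events)

benchmark("per-item ingest", per_item_ingest)
benchmark("batched ingest", batched_ingest)
```

A real feasibility study would run many such harnesses against each candidate stack under production-like load before committing to an architecture.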
Algorithms: The large data volumes involved make highly efficient algorithms indispensable. Sketching, sampling, and other data-compaction algorithms enable significant data reduction without losing important information. Fast hashing algorithms are essential for real-time indexing. Distributed agreement and load-balancing algorithms are needed to orchestrate the data flow in the cluster. And finally, algorithms for real-time analysis and anomaly detection must be identified that offer the best trade-off between accuracy and resource consumption.
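As one illustration of such data compaction, the following is a minimal sketch of a Count-Min sketch, a classic sketching structure that approximates per-key frequencies in a fixed amount of memory instead of one counter per key. The metric names and parameters are illustrative assumptions, not part of any specific product.

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counts for a high-volume event stream
    using a fixed-size depth x width table instead of per-key counters."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _indexes(self, key):
        # Derive `depth` hash positions from one SHA-256 digest
        # (a fast non-cryptographic hash would be used in practice).
        digest = hashlib.sha256(key.encode()).digest()
        for row in range(self.depth):
            chunk = int.from_bytes(digest[row * 4:row * 4 + 4], "big")
            yield row, chunk % self.width

    def add(self, key, count=1):
        for row, col in self._indexes(key):
            self.table[row][col] += count

    def estimate(self, key):
        # Never underestimates; overestimates only on hash collisions.
        return min(self.table[row][col] for row, col in self._indexes(key))

cms = CountMinSketch()
for _ in range(1000):
    cms.add("metric.cpu.load")   # hypothetical metric key
cms.add("metric.disk.io", 3)
print(cms.estimate("metric.cpu.load"))  # at least 1000, typically exactly 1000
```

The trade-off mentioned above is visible directly in the parameters: a wider and deeper table costs more memory but reduces collision-induced overestimation, which is exactly the accuracy-versus-resource-consumption balance that must be tuned per workload.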