Cybersecurity Machine Learning

Jordan Carabolla-vega

Overview

Deploying and maintaining a high-performance computing facility such as the NASA Center for Climate Simulation (NCCS) requires services that monitor and report hardware and software operational statistics in real time. Each day the NCCS receives 115 to 120 million log messages, an average of roughly 1,300 to 1,400 messages per second, from thousands of assets and a diverse range of services.

The aim of this project is to improve our ability to view, analyze, and monitor our system logs through unsupervised machine learning (ML) techniques and by building a security information and event management (SIEM) infrastructure. SIEM combines security information management (SIM) and security event management (SEM) so that logs can be analyzed from multiple perspectives in one place. SEM centralizes log storage and allows real-time analysis, while SIM collects data and provides automated trend analysis; together they produce a complete, centralized service report.

Project Details

We built and configured an ELK (Elasticsearch, Logstash, Kibana) stack with X-Pack. Multiple NCCS production systems provided log data, which were filtered and ingested by Logstash and Metricbeat and then analyzed by X-Pack ML models.
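The filtering step can be pictured with a small Python sketch. It is not the Logstash configuration used at the NCCS. It only illustrates the kind of structured fields a filter might extract from an SSHD "Failed password" syslog line before the event is indexed. The pattern, the function name parse_sshd_failure, the output field names, and the sample line in the test are all assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical pattern for an OpenSSH "Failed password" syslog line.
# A real deployment would express this as a Logstash filter instead.
SSHD_FAILED = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) sshd\[(?P<pid>\d+)\]: "
    r"Failed password for (?P<invalid>invalid user )?(?P<user>\S+) "
    r"from (?P<src_ip>\S+) port (?P<port>\d+)"
)


def parse_sshd_failure(line: str) -> Optional[dict]:
    """Return structured fields for a failed SSH login, or None if no match."""
    match = SSHD_FAILED.match(line)
    if match is None:
        return None
    return {
        "timestamp": match.group("timestamp"),
        "host": match.group("host"),
        "pid": int(match.group("pid")),
        "user": match.group("user"),
        # True when sshd reported the account as nonexistent.
        "invalid_user": match.group("invalid") is not None,
        "src_ip": match.group("src_ip"),
        "port": int(match.group("port")),
        "event": "ssh_failed_password",  # illustrative event label
    }
```

Lines that do not match return None, so a caller can pass them through unchanged or route them to a different parser. Fields such as src_ip and user are the kind of values that an anomaly detection job could later split on or use as influencers.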

Our system analyzed logs from services including, but not limited to, SLAPD, Apache HTTPD, and SSHD, as well as CPU utilization metrics. To detect anomalies within the logs, we created ML jobs, each configured with detectors, data types, period detection, influencers, and analysis functions. The job configurations presented in this work are examples of X-Pack’s advanced ML job option and are designed to monitor multiple events with a variety of detectors and influencers.
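To show roughly what such a job looks like, the sketch below builds an anomaly detection job body as a Python dictionary. The structure follows the Elasticsearch ML job format (analysis_config with bucket_span, detectors, and influencers, plus data_description), but it is not one of the NCCS jobs. The function name build_ssh_job_config, the description text, the 15-minute bucket span, and the field names src_ip, user, and host are assumptions chosen for illustration.

```python
import json


def build_ssh_job_config(bucket_span: str = "15m") -> dict:
    """Build a hypothetical multi-detector job body for failed SSH logins."""
    return {
        "description": "Illustrative job: unusual SSH login failures",
        "analysis_config": {
            # Width of the time buckets the model analyzes.
            "bucket_span": bucket_span,
            "detectors": [
                {
                    # Unusually many events, modeled per source address.
                    "detector_description": "high failure count per source",
                    "function": "high_count",
                    "partition_field_name": "src_ip",
                },
                {
                    # User names that rarely appear in the data.
                    "detector_description": "rarely seen user names",
                    "function": "rare",
                    "by_field_name": "user",
                },
            ],
            # Fields that may help explain why a bucket was anomalous.
            "influencers": ["src_ip", "user", "host"],
        },
        # Which document field carries the event time.
        "data_description": {"time_field": "@timestamp"},
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the job.
    print(json.dumps(build_ssh_job_config(), indent=2))
```

The body would be submitted to the Elasticsearch create-job API for anomaly detection. The endpoint path depends on the Elastic Stack version, so the request itself is left out here. Keeping several detectors in one job, as above, mirrors the multi-detector layout that the advanced job option allows.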

Results and Impact

ML techniques proved effective for analyzing both real-time and archived data. The implemented jobs detected sensitive anomalies that would have required a large investment of time and effort to find manually. Features such as the Beats tools for ingesting logs, the X-Pack security and alerting APIs for detecting threats, and Kibana’s dashboards were effective and convenient. ML served as an engine for SIEM without high CPU consumption, and we expect it to become a powerful analytics technique for log analysis.

Why HPC Matters

Adding ML to our log monitoring infrastructure lets us get more out of our data with greater efficiency and flexibility. As data increase in volume and velocity, it becomes nearly impossible for humans to uncover error causes or incoming threats manually, so an automated log infrastructure is crucial for an environment like the NCCS. By combining software and hardware resources, we are able to monitor systems in a centralized infrastructure while meeting storage and analysis needs.

What’s Next

Future work includes tracking Elasticsearch X-Pack ML releases for updates and improvements. To date, jobs are analyzed independently. In the near future, we will combine related jobs to detect incoming threats that may harm multiple services. The final result will be a centralized environment capable of analyzing, monitoring, and visualizing logs through ML models, with the capabilities of a SIEM infrastructure.