Predicting Cyber-Attacks Using Publicly Available Data
Cyber-attacks are often detected too late. According to reports on cyber-attack incidents, most victim organizations do not know that their systems have been breached until they are informed by organizations or individuals outside their own physical or logical network. This is a significant problem for cyber security professionals and organizations. To better understand this problem, I investigated the following questions in this study: How are external organizations able to detect cyber-attack incidents using only publicly available information? How can cyber-attacks be predicted based only on publicly available data? I collected data on indices representing mentions of one type of attack (brute-force/password-guessing attacks) from public data repositories, as well as ground truth data for a target organization. I extracted and stored the data daily and used it as training data for a machine learning algorithm. After limited training, the system was able to predict future attacks. The results suggest that it is possible to predict cyber-attacks based on publicly available data.
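The pipeline described above (daily mention indices paired with ground truth, then used to train a predictor) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the learning algorithm, so a one-feature logistic regression stands in for it; the one-day lag between mentions and attacks is hypothetical; and the numbers are synthetic, not data from the study.

```python
import math

def make_lagged_examples(mention_counts, attack_flags, lag=1):
    """Pair each day's mention count with the attack flag `lag` days later."""
    return [(mention_counts[i], attack_flags[i + lag])
            for i in range(len(mention_counts) - lag)]

def train_logistic(examples, epochs=5000, lr=0.5):
    """Fit p(attack) = sigmoid(w * x + b) with plain batch gradient descent."""
    scale = max(x for x, _ in examples) or 1  # rescale counts to roughly [0, 1]
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in examples:
            xs = x / scale
            p = 1.0 / (1.0 + math.exp(-(w * xs + b)))
            grad_w += (p - y) * xs
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b, scale

def predict(model, mention_count):
    """Estimated probability of an attack `lag` days after this mention count."""
    w, b, scale = model
    return 1.0 / (1.0 + math.exp(-(w * mention_count / scale + b)))

# Synthetic daily series (illustrative only): public mentions of brute-force
# attacks, and whether the target organization recorded an attack that day.
mentions = [2, 3, 40, 5, 38, 4, 45, 3, 2, 50]
attacks = [0, 0, 0, 1, 0, 1, 0, 1, 0, 0]

model = train_logistic(make_lagged_examples(mentions, attacks, lag=1))
print(f"P(attack tomorrow | 42 mentions today) = {predict(model, 42):.2f}")
print(f"P(attack tomorrow | 3 mentions today)  = {predict(model, 3):.2f}")
```

In this toy series a spike in mentions always precedes an attack by one day, so the fitted model assigns a high probability after a spike and a low one otherwise. Real mention indices would be noisier and would likely need more features and a longer history.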