Machine Learning and Data Mining for Computer Security: Methods and Applications (Advanced Information and Knowledge Processing)
Format: PDF / Kindle (mobi) / ePub
"Machine Learning and Data Mining for Computer Security" provides an overview of the current state of research in machine learning and data mining as it applies to problems in computer security. This book has a strong focus on information processing and combines and extends results from computer security.
The first part of the book surveys the data sources, the learning and mining methods, evaluation methodologies, and past work relevant for computer security. The second part of the book consists of articles written by the top researchers working in this area. These articles deal with host-based intrusion detection through the analysis of audit trails, command sequences, and system calls, as well as network intrusion detection through the analysis of TCP packets and the detection of malicious executables.
This book fills the great need for a book that collects and frames work on developing and applying methods from machine learning and data mining to problems in computer security.
traffic. We observed classification accuracies by protocol ranging from 85% to 100% for both the aggregate and host models. The peer-to-peer traffic was classified correctly for 100% of the unseen flows. This is an especially interesting result because Kazaa flows carry a port label that is user-defined. Thus, we are able to correctly classify peer-to-peer flows behaviorally – without the use of the port number. These results indicate that our classification method is effective for real network traffic. The
prevent future incidents by adding a locking device to the steering wheel or parking in a locked garage. If we find that the car was broken into and the alarm did not sound, we might choose also to improve the alarm system.
Fig. 2.2. The standard model of information assurance (labels: Confidentiality, Integrity, Availability; Processing, Storage, Transmission; Education, Policy & Practice, Technology)
2.3 Information Assurance
system calls. A system call sequence (SCS) $s$ is defined as a finite sequence of system calls and is represented as $(c_1 c_2 c_3 \cdots c_n)$, where $c_i \in \Sigma$, $1 \le i \le n$. After processing the audit data into process executions, system call sequences are obtained as finite length strings. Each system call is then mapped to a unique symbol using a translation table. Thereafter, they are ranked by utilizing prior knowledge as to how susceptible the system call is to malicious usage. A ranking scheme similar to
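The encoding and ranking steps described in this excerpt can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the 52-letter toy alphabet, and the numeric susceptibility ranks are all hypothetical, since the excerpt does not give the actual translation table or ranking scheme.

```python
import string
from typing import Dict, Iterable, List

# Toy alphabet (assumption): 52 one-character symbols, enough for a demo only.
SYMBOLS = string.ascii_letters


def build_translation_table(traces: Iterable[List[str]]) -> Dict[str, str]:
    """Assign each distinct system call a unique one-character symbol."""
    table: Dict[str, str] = {}
    for trace in traces:
        for call in trace:
            if call not in table:
                if len(table) >= len(SYMBOLS):
                    raise ValueError("toy alphabet exhausted")
                table[call] = SYMBOLS[len(table)]
    return table


def encode(trace: List[str], table: Dict[str, str]) -> str:
    """Turn one process execution into a finite-length string of symbols."""
    return "".join(table[call] for call in trace)


# Hypothetical prior knowledge: higher rank = more susceptible to misuse.
# These values are invented for illustration and are not from the book.
ILLUSTRATIVE_RANK: Dict[str, int] = {
    "execve": 3, "chmod": 3, "open": 2, "write": 2, "read": 1, "close": 1,
}


def rank_trace(trace: List[str], ranks: Dict[str, int], default: int = 1) -> List[int]:
    """Look up the susceptibility rank of every call in a trace."""
    return [ranks.get(call, default) for call in trace]


traces = [["open", "read", "close"], ["open", "chmod", "execve"]]
table = build_translation_table(traces)
encoded = [encode(t, table) for t in traces]
ranked = [rank_trace(t, ILLUSTRATIVE_RANK) for t in traces]
print(encoded, ranked)
```

Keeping the symbol mapping separate from the rank lookup means the prior-knowledge table can be revised without re-encoding the audit data.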
the computational complexity of various cost functions and gives us a rigorous way to trade off complexity against descriptiveness. To our knowledge, this is the first result on the formal complexity of decision-making in the intrusion detection task. We also discuss possible directions for addressing the intractability of these models. In particular, reducing the IDS problem to learning and planning in POMDPs opens it up to a wide variety of approximate, yet effective, methods
“What you have” describes some token that is carried by a person that the system expects only that person to have. This token can take many forms. In a physical system, a key could be considered an access token. Most people have some form of identification, which is a token that can be used to show that the issuer of the identification has some confidence in the carrier’s identity. For computer systems, there are a variety of authentication tokens. These commonly include devices that generate pass