Background on the use of intelligent agents


    PAID: A Probabilistic Agent-Based Intrusion Detection system

    Vaibhav Gowadia, Csilla Farkas*, Marco Valtorta

    Information Security Laboratory, Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA

    Received 9 August 2004; revised 16 May 2005; accepted 16 June 2005

    KEYWORDS: Intrusion detection; Network security; Computer security; Computer attack; Agents; Bayesian networks

    Abstract. In this paper we describe the architecture and implementation of a Probabilistic Agent-Based Intrusion Detection (PAID) system. The PAID system has a cooperative agent architecture. Autonomous agents can perform specific intrusion detection tasks (e.g., identify IP-spoofing attacks) and also collaborate with other agents. The main contributions of our work are the following: our model allows agents to share their beliefs, i.e., the probability distribution of an event occurrence. Agents are capable of performing soft-evidential update, thus providing a continuous scale for intrusion detection. We propose methods for modelling errors and resolving conflicts among beliefs. Finally, we have implemented a proof-of-concept prototype of PAID. © 2005 Elsevier Ltd. All rights reserved.

    Introduction

    As the complexity of computer systems increases and attacks against them become more and more sophisticated, high-assurance intrusion detection techniques need to be implemented. During the last two decades, many strategies and methods for intrusion detection have been developed (for a survey see Axelsson, 2000).

    The main goal of any IDS is to detect all intrusions and only intrusions in an efficient way. Correctness of an IDS is measured by the rate of false positives and false negatives over all events. A false positive warning occurs when a non-intrusive event is labelled intrusive. A false negative warning occurs when an intrusive activity is not detected. Negative effects of false positives include false accusations, reduced system availability, and subsequent disregard of IDS warnings. The negative effects of false negatives include reduced trust in IDS and damages caused by the intrusions. For effective intrusion detection, it is necessary that IDSs reduce the number of misclassifications and find an acceptable balance between false positive and false negative rates. Dacier (2002) found that most of the false positives are generated due to under-specified attack signatures, intent-guessing signatures, or lack of abstraction. Therefore, it is important to specify signatures precisely and develop IDSs that can process these signatures efficiently.

    * Corresponding author. E-mail address: [email protected] (C. Farkas).

    ARTICLE IN PRESS

    0167-4048/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.cose.2005.06.008

    Computers & Security (2005)

    www.elsevier.com/locate/cose

    Moreover, network-based distributed attacks are difficult to detect because their detection requires coordination among different intrusion detection components or systems (Snapp and Brentano, 1991; Neumann and Porras, 1999). Failure to recognize these attacks leads to false negatives. Therefore, the development of models and protocols for information sharing among intrusion detection components is critical. IDSs are often categorized as distributed or centralized. Spafford and Zamboni (2000) defined centralized IDSs as those where the analysis of data is performed at a fixed number of locations, independent of how many hosts are monitored. Distributed IDSs are defined as those where the analysis of the data is performed at a number of locations proportional to the number of hosts monitored. Centralized IDSs are able to use all available audit data to form a decision, but they create large communication overhead, require a powerful central processor, and represent a single point of failure. To overcome this problem, distributed IDSs process audit data at multiple locations. Distributed IDSs, like DIDS (Snapp and Brentano, 1991) and AAFID (Balasubramaniyan et al., 1998; Spafford and Zamboni, 2000), can share filtered raw data or binary (i.e., yes/no) decisions among their components. However, they cannot share probability distributions of intrusion beliefs. Moreover, existing distributed IDSs do not support selective sharing of published data among peers. In this work, we propose a middle ground between centralized and distributed IDSs, where each IDS component shares its data or results only with those agents that subscribe for these results.

    We believe that precise representation of attack signatures in a probabilistic intrusion detection model requires: (1) the ability to process observations (hard findings) and beliefs (probability distributions) about system parameters, and (2) flexibility in specifying the threshold value of probability above which an alarm is generated.

    In this paper, we focus on distributed IDSs based on Bayesian technology and multiagent technology. IDSs based on Bayesian technology may allow sharing of raw data and results (probability distributions) among IDS components. A Bayesian Network (BN) is a graphical representation of the joint probability distribution for a set of discrete variables. The representation consists of a directed acyclic graph (DAG). Nodes of the DAG represent variables and edges represent cause-effect relations. The strength of each effect is modelled as a probability. These probabilities are represented by a conditional probability table (CPT). CPTs specify the conditional probability of the variable given its parents. For variables without parents, this is an unconditional distribution. Inference in a Bayesian Network means computing the conditional probability for some variables given information (evidence) on other variables.

    The traditional Bayesian inference (Jensen, 2001; Pearl, 1988) can be performed only with hard findings or observations as input. However, in intrusion detection scenarios modelled with Bayesian Networks, we often find that the input variables cannot be measured directly. Only a belief (probability distribution) in the current state of these variables may be computed. A typical example of such input is a probabilistic result computed by another IDS component. To accept results of other IDS components, existing IDSs (DuMouchel, 1999; Valdes and Skinner, 2000; Sebyala et al., 2002; Cho and Cha, 2004) based on the traditional Bayesian inference technique have to coerce the result into a binary decision. Such coercions are performed by assuming occurrence (or non-occurrence) of the represented event if the input probability is greater (or smaller) than a threshold value. We believe that IDSs that utilize such binary decisions have limited flexibility and have difficulty in removing false positives and false negatives.

    We illustrate our above observation by a simple example given in Fig. 1. Fig. 1(a) shows a Bayesian Network that accepts two inputs: B (hard finding) and C (soft finding), and computes belief in A. Fig. 1(b) shows conditional probability tables for the example BN. Fig. 1(c) shows the likelihood of A being in abnormal state with respect to the likelihood of C being in abnormal state. Likelihood of C is computed with the two methods, with and without coercion. In both calculations we have assumed that B is observed to be in abnormal state. We observe that the likelihood graph of A is continuous when soft-evidential update is used. In this case the security officer has large flexibility in choosing a warning threshold for A. We also observe that the likelihood graph of A is not continuous with traditional probability update. Moreover, the likelihood of A has only discrete values that depend on the threshold set for C. We have developed a Bayesian Network-based technique that allows the IDS components to share results of their analysis in the form of beliefs. Such sharing enables our model to perform intrusion detection on a continuous scale.

    Agents are software systems that function autonomously to achieve desired objectives in their environment. Recent research (Spafford and Zamboni, 2000; Jansen et al., 1999; Carver et al., 2000; Helmer et al., 2003) shows that agent-based technology seems to be a promising direction for developing collaborative intrusion detection systems. We propose an agent-based, cooperative architecture where each IDS component is able to process its own data and to integrate local findings with the findings of other IDS components. Each agent acts as a wrapper for a Bayesian Network. Agents in our model utilize the communication protocols and languages (Bellifemine et al., 1999; FIPA, 2002a) developed by the multiagent research community. In addition, they use Bayesian inference with soft-evidential update to support integration of beliefs and observations. We refer to this multiagent architecture as Agent Encapsulated Bayesian Network (AEBN) (Bloemeke and Valtorta, 2002). Although Bayesian Network-based architectures have been considered for intrusion detection (DuMouchel, 1999; Valdes and Skinner, 2000; Barbara et al., 2001; Sebyala et al., 2002; Cho and Cha, 2004), these models use traditional probability-update methods (Jensen, 2001; Pearl, 1988). They can effectively utilize only those parameters that result from an actual measure. This limitation often results in under-specified signatures. To solve this problem, in our model agents are enabled to share their beliefs (soft findings) in addition to measured values (hard findings).

    More specifically, we propose an agent-based and cooperative architecture, called Probabilistic Agent-Based Intrusion Detection (PAID), to analyze system information and estimate intrusion probabilities. Agents in PAID accept facts and derived values as inputs. Agents may share their beliefs with, or request information (belief or data) from the other agents.

    Our model uses three types of agents: system-monitoring agents, intrusion-monitoring agents and registry agents. System-monitoring agents are responsible for collecting, transforming, and distributing intrusion specific data upon request and evoking information collecting procedures. Each intrusion-monitoring agent encapsulates a Bayesian Network and performs belief update as described in Valtorta et al. (2002) using both facts (observed values) and beliefs (derived values). Intrusion-monitoring agents generate probability distributions (beliefs) over intrusion variables that may be shared with other agents. Each belief is called a soft finding. Soft findings can indicate that a system is in an abnormal state. Even in the absence of hard findings, soft findings can affect the probability of intrusion occurrence or attack against the monitored system. A probabilistic representation of attacks, using hard and soft findings makes our model capable of identifying variations of known intrusions. Coordination between system-monitoring and intrusion-monitoring agents is provided by a registry agent. Within an IDS collaborative model, there may exist several registry agents that, upon failure, can compensate for each other. However, each intrusion-monitoring and system-monitoring agent is registered with a registry agent central to the monitoring agents.

    [Figure 1(a): graph over the nodes A, B, and C.]

    Figure 1(b), probability tables:

    P(A = normal)   P(A = abnormal)
    0.97            0.03

    B           P(A = normal)   P(A = abnormal)
    normal      0.98            0.02
    abnormal    0.01            0.99

    C           P(A = normal)   P(A = abnormal)
    normal      0.95            0.05
    abnormal    0.03            0.97

    [Figure 1(c): plot of the likelihood of A being abnormal (vertical axis, 0 to 100) against the likelihood of C being abnormal (horizontal axis, 0 to 100), with one curve for Soft Evidential Update and one for Traditional Bayesian Update.]

    Figure 1 (a) Example BN, (b) CPTs for the example BN, and (c) likelihood of A being abnormal calculated with soft-evidential update and traditional probability update (C threshold = 60%).

    Currently our model detects known intrusions by using well-documented patterns of attacks. Each intrusion-monitoring agent is looking for a particular intrusion pattern. If such a pattern is found, a possible intrusion is indicated. Distributed intrusion detection is achieved by enabling agents to share their beliefs. Depending on the level of collaborations and privacy concerns of the collaborating entities, each component may be able to build the full, global decision stage or only a partial one.

    The organization of this paper is as follows. The next section gives a brief introduction to Bayesian Networks and agent technology. Then, background information and related work are given, followed by the description of the proposed framework (PAID). Further, a methodology for building BNs for intrusion detection is presented, followed by the implementation of the proposed intrusion detection framework. Finally, we conclude and recommend future research in the last section.

    Background

    Bayesian Networks

    A Bayesian Network (BN) is a graphical representation of the joint probability distribution for a set of discrete variables. The representation consists of a directed acyclic graph (DAG), prior probability tables for the nodes in the DAG that have no parents and conditional probability tables (CPTs) for the nodes in the DAG given their parents. As an example, consider the network in Fig. 2.

    More formally, a Bayesian Network is a pair composed of: (1) a multivariate probability distribution over n random variables in the set $V = \{V_1, \ldots, V_n\}$, and (2) a directed acyclic graph (DAG) whose nodes are in one-to-one correspondence with $V_1, \ldots, V_n$. (Therefore, for the sake of convenience, we do not distinguish the nodes of a graph from variables of the distribution.)

    Bayesian Networks allow specification of the joint probability of a set of variables of interest in a way that emphasizes the qualitative aspects of the domain. The defining property of a Bayesian Network is that the conditional probability of any node, given any subset of non-descendants, is equal to the conditional probability of that same node given the parents alone. The chain rule for Bayesian Networks (Neapolitan, 1990) given below follows from the above definition.

    Let $P(V_i \mid \pi(V_i))$ be the conditional probability of $V_i$ given its parents. (If there are no parents for $V_i$, let this be $P(V_i)$.) If all the probabilities involved are nonzero, then

    $$P(V) = \prod_{v \in V} P(v \mid \pi(v)).$$

    Three features of Bayesian Networks are worth mentioning. First, the directed graph constrains the possible joint probability distributions represented by a Bayesian Network. For example, in any distribution consistent with the graph of Fig. 2, D is conditionally independent of A given B and C. Also, E is conditionally independent of any subset of the other variables given C.

    [Figure 2(a): DAG over the nodes A, B, C, D, and E. Consistent with the factorization given in the text, A is the parent of B and C, B and C are the parents of D, and C is the parent of E.]

    Figure 2(b), variable states:

    A    A1, A2
    B    B1, B2, B3
    C    C1, C2
    D    D1, D2, D3, D4
    E    E1, E2, E3, E4

    Figure 2(c), conditional probability table for B given A:

          A1     A2
    B1    0.2    0.1
    B2    0.6    0.6
    B3    0.2    0.3

    Figure 2 (a) An example Bayesian Network, (b) variable states, and (c) conditional probability table for B given A.

    Second, the explicit representation of constraints about conditional independence allows a substantial reduction in the number of parameters to be estimated. In the example, assume that the possible values of the five variables are as shown in Fig. 2(b). Then, the joint probability table P(A, B, C, D, E) has 2 × 3 × 2 × 4 × 4 = 192 entries. It would be very difficult to assess 191 independent parameters. However, the independence constraints encoded in the graph permit the factorization P(A, B, C, D, E) = P(A) × P(B | A) × P(C | A) × P(D | B, C) × P(E | C), which reduces the number of parameters to be estimated to 1 + 4 + 2 + 18 + 6 = 31. The second term in the sum is the table for the conditional probability of B given A. This probability is shown in Fig. 2(c); note that there are only four independent parameters to be estimated since the sum of values by column is one.
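    The parameter counts quoted above can be checked mechanically. The following Python sketch takes the state counts from Fig. 2(b) and the parent sets from the factorization in the text; the function and variable names are hypothetical. Each conditional probability table contributes (number of states − 1) free parameters per configuration of its parents, because every column must sum to one.

```python
from math import prod

# State counts from Fig. 2(b); parent sets read off the factorization in the text.
STATES = {"A": 2, "B": 3, "C": 2, "D": 4, "E": 4}
PARENTS = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["C"]}


def joint_table_entries(states):
    """Number of cells in the full joint probability table."""
    return prod(states.values())


def cpt_free_parameters(node, states, parents):
    """Free parameters in one CPT: (states - 1) per parent configuration."""
    parent_configs = prod(states[p] for p in parents[node])  # empty product is 1
    return (states[node] - 1) * parent_configs


def network_free_parameters(states, parents):
    """Total free parameters across all CPTs of the network."""
    return sum(cpt_free_parameters(n, states, parents) for n in states)


if __name__ == "__main__":
    print("joint entries:", joint_table_entries(STATES))
    print("per-node parameters:",
          {n: cpt_free_parameters(n, STATES, PARENTS) for n in STATES})
    print("network parameters:", network_free_parameters(STATES, PARENTS))
```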

    Third, the Bayesian Network representation allows a substantial (usually, dramatic) reduction in the time needed to compute marginals for each variable in the domain. The explicit representation of constraints on independence relations is exploited to avoid the computation of the full joint probability table in the computation of marginals, both prior and conditioned on observations. Limitation of space prevents the description of the relevant algorithms; see Jensen (2001) for a discussion of the junction tree algorithm.

    The most common operation on a Bayesian Network is the computation of marginal probabilities, both unconditional and conditional upon evidence. Marginal probabilities are also referred to as beliefs in the literature (Pearl, 1988). This operation is called probability updating, belief updating, or belief assignment.

    We define evidence as a collection of findings. A (hard) finding specifies which value a variable is in. A soft finding specifies the probability distribution of a variable. These definitions of finding and of evidence may be generalized, for example, by allowing specifications of impossible configurations of pairs of variables (Cowell et al., 1999; Lauritzen and Spiegelhalter, 1988; Valtorta et al., 2002). However, applications rarely need the power of the more general definitions, and most Bayesian Network software tools support only the definition of (hard) evidence as a collection of (hard) findings given here.

    Agent Encapsulated Bayesian Networks

    Although there is no universally accepted definition of agent, most authors agree that agents share the following properties: each agent is autonomous, has a set of goals, has a local model of the part of the world that affects the achievement of its goals, and has a way of communicating with other agents. In an Agent Encapsulated Bayesian Network (AEBN) (Bloemeke and Valtorta, 2002), each agent uses a single Bayesian Network (which is also called an AEBN) as its model of the world. The agents communicate via passing messages that are distributions on variables shared between the individual networks.

    The variables of each AEBN are divided into three groups: those about which other agents have better knowledge (input set), those that are used only within the agent (local set), and those for which the agent has the best knowledge and which the other agents may want to use (output set). The variables in the input set and the output set are shared with other agents. The variables in the local set are not. An agent subscribes to zero or more variables in the input set and publishes zero or more variables in the output set.

    The mechanism for integrating the view of the other agents on a shared variable is to replace the agent's current belief (which is a probability distribution) in that variable with that of the communicating agent. The update of a probability distribution represented by a Bayesian Network upon receipt of a belief is called a soft-evidential update and is explained in detail by Valtorta et al. (2002). In this work, we have used the Big Clique algorithm for soft-evidential update, implemented in the BC-Hugin system (Kim et al., 2004).

    When a publisher makes a new observation, it sends a message to its subscribers. The subscribers in turn adjust their internal view of the world and send their published values to their subscribers. Assuming that the graph of agent communication (which we simply call agent graph) is a directed acyclic graph (DAG), equilibrium is reached, and a kind of global consistency is assured because the belief in each shared variable is the same in agents that subscribe to that variable.
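    The publish and subscribe propagation described above can be sketched as follows. This is a minimal illustration, not the AEBN implementation: each agent's encapsulated Bayesian Network is replaced by a hypothetical two-state stand-in that maps the received belief to a published belief, and the agent names and probabilities are invented for the example. Agents are updated in topological order of the agent graph, so each subscriber sees the latest belief of every publisher it depends on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Agent:
    """An agent that subscribes to publishers and publishes one belief."""

    def __init__(self, name, inputs, update):
        self.name = name
        self.inputs = inputs    # names of the agents this agent subscribes to
        self.update = update    # maps {publisher: belief} to the published belief
        self.belief = None      # probability that the published variable is abnormal


def propagate(agents):
    """Update every agent once, publishers before subscribers."""
    graph = {a.name: set(a.inputs) for a in agents.values()}
    # static_order() raises graphlib.CycleError if the agent graph is not a DAG.
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        agent = agents[name]
        received = {src: agents[src].belief for src in agent.inputs}
        agent.belief = agent.update(received)
    return order


def sensor(_received):
    return 0.8  # hypothetical belief derived from a local observation


def make_monitor(p_if_abnormal, p_if_normal, source):
    """Stand-in for an encapsulated BN with one binary input variable."""
    def update(received):
        q = received[source]
        return q * p_if_abnormal + (1.0 - q) * p_if_normal
    return update


agents = {
    "sensor": Agent("sensor", [], sensor),
    "spoof_monitor": Agent("spoof_monitor", ["sensor"],
                           make_monitor(0.9, 0.1, "sensor")),
    "site_monitor": Agent("site_monitor", ["spoof_monitor"],
                          make_monitor(0.7, 0.05, "spoof_monitor")),
}
order = propagate(agents)
```

    Because the agent graph is acyclic, a single pass in topological order is enough to reach the equilibrium mentioned in the text; a cyclic graph is rejected before any update runs.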

    The restriction that an agent has correct and complete knowledge of the variables it publishes forces unidirectional communication, and it may seem excessive. However, there is a good reason to insist on this requirement. The alternative (i.e., to allow bidirectional communication between agents) requires that the agent graph be a tree, as shown in Xiang (2002). Most agent-based systems exhibit an acyclic graph communication model. For example, it is possible to have multiple views of the same parameter. That is, two agents may publish variables that correspond to their measurement (or belief) of the same parameter. Moreover, nothing prevents another agent from integrating the published values of these two agents, thus obtaining a new (and possibly more accurate) view of the parameter.

    Table 1 summarizes some features of AEBNs and other related representation formalisms. AEBNs have very good scalability and shared variables are independent of variables in descendant BNs. Therefore, we chose to use the AEBN organization for the work described in this paper.

    We now briefly overview related research work on intrusion detection with the help of Bayesian networks or agent technology.

    Bayesian Networks based intrusion detection

    IDSs using Bayesian Networks have been proposed by many researchers (DuMouchel, 1999; Valdes and Skinner, 2000; Barbara et al., 2001; Sebyala et al., 2002; Cho and Cha, 2004). However, these IDS models use only hard findings in their Bayesian models. We now briefly overview their IDS architectures.

    DuMouchel (1999) proposed an anomaly detection technique using the Bayes classifier. They keep a profile of commands issued by each user and compute command transition probabilities. Their IDS detects abnormal behavior based on the observed command transitions.

    Valdes and Skinner (2000) proposed an adaptive model that detects attacks using probability theory. Their architecture analyzes the traffic from a given client's TCP sessions. This analysis is done by Bayesian inference at periodic intervals in a session, and the interval is measured in number of events or elapsed time. Between inference intervals, the system state is propagated according to a Markov model. After each inference, the system may give alerts for suspicious sessions.

    Sebyala et al. (2002) have incorporated a Bayesian Network in their IDS as an anomaly detector. They keep a profile of CPU and memory utilization by proxylets in active networks. They use a Bayesian Network to compute the state (good or bad) of a proxylet. A proxylet is in a bad state if the CPU and memory utilization is anomalous.

    Table 1 Agent Encapsulated Bayesian Networks and related representation formalisms

    Name | Granularity | Topological restrictions | Constraints on independence relations | Purpose | Scalability
    ---- | ----------- | ------------------------ | ------------------------------------- | ------- | -----------
    Bayesian Network (Jensen, 2001) | Individual variable | DAG of variables | Local Markov condition (d-separation) | Efficient representation of multivariate probability distribution | Poor
    Multiply Sectioned Bayesian Network (MSBN) (Xiang, 2002) | Bayesian Network (BN) | Tree (of BNs) | d-Separation on composition of BNs; | Efficient distribution of computation among processors | Good: distributed computation, if tree decomposition is possible
    Multiple Entity Bayesian Networks (MEBN) (Laskey et al., 2001) | Bayesian Network Fragments (BNFrags) | DAG (of BNFrags) | d-Separation on composition of BNs; encapsulation | Distributed representation of Bayesian Networks | Mediocre: representation decomposed, computation centralized
    Agent Encapsulated Bayesian Networks (AEBN) (Bloemeke and Valtorta, 2002) | Bayesian Network (BN) | DAG (of BNs) | Shared variables independent of variables in descendent BNs given parent BNs; encapsulation | Construction of interpretation models by collaborating agents | Very good: distributed computation, distributed representation
    Decentralized Sensing Networks (DSN) (Utete, 1998) | Sensor | Undirected graph (of sensors) | None: non-probabilistic approach | Distributed sensing and data fusion | Poor: rumor problem is unsolvable in DSNs

    Cho and Cha (2004) proposed a technique to detect anomalies in web sessions. A web session consists of a sequence of page requests. An anomalous request in a given web session may correspond to a request for secured pages without accessing the login page, or repeated access to the same page. Their model utilizes a Bayesian parameter estimation technique (Friedman and Singer, 1998) to compute the probability that a user may request certain pages in a given sequence. A web session may consist of multiple sub-sessions. To combine the anomaly scores of these sub-sessions, they suggest use of either the maximum value (for high sensitivity) or the average value (for low sensitivity).

    In the above models, Bayesian inference is performed when hard evidence is received, such as a command sequence, TCP parameters, CPU or memory utilization, or a page request. None of the above models describes a Bayesian model for attacks when the input observation is a probability distribution over the states of a system parameter, for example, that there is an 80% chance of a DOS attack on the file server, or a 70% chance that a command sequence is anomalous. Unlike existing models, our model accepts beliefs (probability distributions) as input to the Bayesian models.

    Agent-Based Intrusion Detection systems

    Agent-based systems require a communication infrastructure. Agents in our system communicate with each other by sending messages in the Agent Communication Language specified by FIPA (FIPA, 2002a,b). JADE (Bellifemine et al., 1999) is a software framework to aid the development of agent applications in compliance with the FIPA specifications for inter-operable intelligent multiagent systems. The purpose of JADE is to simplify development while ensuring standard compliance through a comprehensive set of system services and agents. To achieve such a goal, JADE offers a distributed agent platform, directory facilitator (DF), and library of interaction protocols. The agent platform includes an agent management system that allows monitoring and logging of agent activities and performs life-cycle operations (start, suspend, and terminate) on agents. Interaction protocols (e.g., request, query, subscribe, etc.) are used to design agents' interaction, providing a sequence of acceptable messages and semantics for those messages.

    Agent communications can be divided into two categories: communication among agents at the same host and communication among agents at different hosts. Balasubramaniyan et al. (1998) examine these methods in the context of intrusion detection.

    Spafford and Zamboni (2000) and Balasubramaniyan et al. (1998) presented a framework called AAFID in which autonomous agents report their findings to entities called transceivers. Each host has a unique transceiver that collects information from all other agents on its host machine. Agents also perform data reduction and send data to monitors that oversee the operation of several transceivers. Monitors have the capability to detect events that may be unnoticed by the transceivers. In mobile agent-based systems, like the ones presented by Helmer et al. (2003) and Asaka et al. (1999), mobile agents collect, integrate, and analyze data from different components of a distributed system. The agents' findings are recorded in a database and/or reported to the users.

    System design goals

    One of the design goals of our IDS is to enable it to function as a stand-alone system or to support existing IDSs. Our main goal is to improve upon existing IDS technologies by allowing flexible information sharing among system components in a way that the shared data are easily incorporated in the analysis of the components. Our model supports the calculation of intrusion probabilities on a continuous scale of [0, 1]. A probability of zero means it is certain no intrusion has occurred, and one means that an intrusion has definitely occurred. For each intrusion type there is an associated variable that represents the probability of that intrusion. Each Bayesian network is able to modify its own belief (probability distribution over an intrusion variable) and to import or export beliefs from or to other Bayesian networks. These input variables are accepted during all states of processing.

    Analysis of distributed attacks on a large network may require monitoring of numerous hosts and large volumes of network traffic. Thus a large amount of data is generated that must be analyzed. Our model supports local analysis of collected data and sharing of results (and partial results). We also allow agents to share probability distributions (beliefs) of intrusion occurrences and system states. This belief sharing carries more information than sharing a binary decision and also has a lower overhead than raw data sharing.

    Each intrusion-monitoring site or network may have different sensitivity and selectivity requirements. Our model allows security officers to customize these parameters according to the local requirements. This customization does not affect the probability distribution values shared among the agents.
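    One way to picture this separation is that the shared belief is never altered, and each site only applies its own alarm threshold when reading it. The Python sketch below is purely illustrative: the site names, threshold values, and the shared belief are hypothetical and do not come from the paper.

```python
def raise_alarm(belief_abnormal, threshold):
    """Local decision only: the shared belief itself is left unchanged."""
    return belief_abnormal >= threshold


# Hypothetical per-site thresholds chosen by local security officers.
SITE_THRESHOLDS = {"file_server": 0.50, "public_web": 0.90}

shared_belief = 0.74  # hypothetical belief received from an intrusion-monitoring agent

alarms = {site: raise_alarm(shared_belief, t) for site, t in SITE_THRESHOLDS.items()}
```

    With these invented numbers the more sensitive site raises an alarm and the more selective site does not, even though both read exactly the same shared probability.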

    Finally, we address some of the issues related to reliability and ease of maintenance. Based on the distributed nature of our model and the possibility of replicated Bayesian Networks for monitoring intrusion, our model remains functional even if some of the IDS network nodes are unavailable. Since the detection of an intrusion is based on several parameters, including local findings and findings from several other agents, the effect of misleading data from compromised agents is small if the number of non-compromised agents is large and the number of compromised agents is small. Each Bayesian Network is responsible for monitoring a particular intrusion; therefore the modification of an intrusion pattern will affect only those networks that monitor the intrusion. Similarly, protection against new types of attacks can be added easily to the model.

    Probabilistic Agent-Based Intrusion Detection

    In our model, we use agent graphs to represent intrusion scenarios. Each agent is associated with a set of input variables, a set of output variables, and a set of local variables. The agent at each node of the graph encapsulates a Bayesian Network. Nodes of the Bayesian Network are variables that represent suspicious events, intrusions, or system and network parameter values. A variable can have any number of states, and the belief in the variable is the distribution on its states. The encapsulated Bayesian Network is used to model intrusion scenarios. It is also able to incorporate measurement errors and handle multiple beliefs on input variables.

    PAID architecture

    The PAID architecture uses agent technology to collect and analyze data and to distribute information among the PAID components. PAID supports three types of agents: (1) system-monitoring agents, (2) intrusion-monitoring agents, and (3) registry agents.

    (1) System-monitoring agents: The system-monitoring agents perform either online or offline processing of log data, communicate with the operating system, and monitor system resources. These agents publish their output variables (facts and beliefs derived from observations) that can be utilized by other agents.

    (2) Intrusion-monitoring agents: Each intrusion-monitoring agent computes the probability for a specific intrusion type. These agents subscribe to variables and/or beliefs published by the system-monitoring agents and other intrusion-monitoring agents. The probability values for each agent are updated according to the values of its input variables and beliefs.

    (3) Registry agent: The registry agent maintains information about the published variables and monitored intrusions for each system-monitoring or intrusion-monitoring agent. All agents of PAID must register with the registry agent. The registry agent also maintains the location and current status of all the registered agents. Agent status is a combination of two parameters: alive and reachable. The status of a communication link between any two agents is determined by attempting to achieve a reliable UDP communication between them. The registry agent is used to find information (e.g., name and location) about agents who may supply required data. The PAID architecture can also support multiple registry agents, as described later in the section "Scalability and complexity analysis". For simplicity, we describe the examples with a single registry agent.

    Agent communication

    The interactions among the components of PAID are shown in Fig. 3. The messages are sent in XML syntax (Bray et al., 2001) among the agents. These messages correspond to registration requests, information requests and other agent actions. A brief overview of agent actions and the corresponding messages is given below.

    Figure 3 Probabilistic Agent-Based Intrusion Detection (PAID). [Diagram labels: Log Files, System Monitoring Agents, Registry Agent, Agent Search, Intrusion Monitoring Agents, Bayesian Networks, Agent Communication, Intrusion Probability, 1. Register.]

    1. Registration of an agent with the registry agent: Each agent in PAID must register with the registry agent. A registration message includes the registering agent's agent-id, IP address, list of published variables and their possible states, digital signature, and digital certificate. The registry agent issues an acknowledgment message upon successfully entering the new agent in its database.

    2. Information request by an agent about other agents: Each intrusion-monitoring agent has a set of input variables (determined from the encapsulated Bayesian Network). To find agents capable of providing required input data, the intrusion-monitoring agent sends a search request to the agent registry. The search request includes the requester's agent-id, IP address, and the required input variables. The message is digitally signed by the requester.

    3. Registry agent's reply to an information request: Upon receiving a search request, the registry agent verifies that the request is legitimate before searching its database to determine which agents can supply the requested variables and the status of these agents. The message from the registry agent to the requester includes the requested variable name, the agent-id of the agent publishing the variable, its IP address, and status. The message is digitally signed by the registry agent.

    4. Request for belief subscription: Upon receiving the list of agents capable of providing the required input from the registry, the subscribing agent sends requests directly to these agents. A subscription request consists of the requester's agent-id, requester's IP address, requested input variable name, the duration of subscription time, the desired time interval between subsequent updates, a request-id, and the timestamp of the request. The message is digitally signed by the requester.

    5. Belief-update messages: Upon receiving a belief subscription request, the publishing agent sends regular updates at the agreed intervals for the duration of the subscription. The message contains the request-id, the sender's id, and the probability distribution of the requested variable. The message is signed by the publisher.
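    The subscription and belief-update messages described in items 4 and 5 can be sketched as follows. This is only an illustration: the paper lists the fields carried by each message but not the XML vocabulary, so every element and attribute name below (subscriptionRequest, beliefUpdate, and so on) is a hypothetical choice, and digital signing is omitted.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_subscription_request(agent_id, ip, variable, duration_s, interval_s, request_id):
    """Build an unsigned belief-subscription request as an XML string.

    Element names are hypothetical; the paper specifies the fields only.
    """
    root = ET.Element("subscriptionRequest", {"requestId": request_id})
    ET.SubElement(root, "agentId").text = agent_id
    ET.SubElement(root, "ipAddress").text = ip
    ET.SubElement(root, "variable").text = variable
    ET.SubElement(root, "duration").text = str(duration_s)
    ET.SubElement(root, "updateInterval").text = str(interval_s)
    ET.SubElement(root, "timestamp").text = datetime.now(timezone.utc).isoformat()
    # A real message would also carry the requester's digital signature.
    return ET.tostring(root, encoding="unicode")


def build_belief_update(request_id, sender_id, belief):
    """Build a belief-update message carrying a probability distribution."""
    if abs(sum(belief.values()) - 1.0) > 1e-9:
        raise ValueError("belief must sum to 1")
    root = ET.Element("beliefUpdate", {"requestId": request_id})
    ET.SubElement(root, "senderId").text = sender_id
    dist = ET.SubElement(root, "distribution")
    for state, p in belief.items():
        # One child element per state of the published variable.
        ET.SubElement(dist, "state", {"name": state, "p": repr(p)})
    return ET.tostring(root, encoding="unicode")
```

    The request-id ties each belief update back to the subscription that asked for it, which is why it appears in both messages.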

    Communication security and reliability

    Communication security and authentication are achieved by using commercially available encryption techniques. Reliability is supported by periodic status updates of the active agents. We use secret-key encryption for message content to reduce encryption overhead. Message and agent authentication is guaranteed by a public-key cryptosystem and the use of digital certificates. Each message is digitally signed by the sending agent. In addition, we require that agents authenticate themselves to the registry by their digital certificates.

    Status probes of registered system-monitoring agents and network links are periodically performed by the registry agent. Responses to the probing messages carry information about the state of the system-monitoring and intrusion-monitoring agents. The status of a communication link between any two agents is determined by attempting to achieve a reliable UDP communication between them. Compromised agents can be identified by periodically launching attacks over the monitored network and verifying that the expected results are generated. This approach was proposed by Dacier (2002).

    Scalability and complexity analysis

    The factors affecting the scalability of our model are the costs of data transfer, belief updates and registry operations (i.e., register, deregister, and query). During normal operation, agents share their beliefs; thus, PAID has a low bandwidth requirement. Sharing of data or partial data is required only to analyze suspicious events.

    Pearl (1988) has shown that belief update can be performed in linear time in trees and (more generally) singly connected networks. Unfortunately, belief update in general Bayesian Networks is NP-hard (Cooper, 1990). This negative result holds even for some notions of approximation and for many restrictions on the structure of the Bayesian Network. Despite these negative theoretical results, update in most Bayesian Networks using the junction tree algorithm (Lauritzen and Spiegelhalter, 1988) is very fast, because most practical Bayesian Networks compile (after an intermediate step that converts them into an undirected graph) into a junction tree in which the largest clique is small. The process is described in detail in the literature, for example in Neapolitan (1990). More precisely, the computational complexity of the junction tree algorithm, which is widely found to be the fastest algorithm in practice, is exponential in a graphical parameter called the treewidth of the Bayesian Network.

    PAID can provide scalability by supporting multiple registries. Each subnet may have its own agent registry. The agent registries can forward requests and replies to neighboring registries based on the IP address of the receiving agent. Dynamic routing algorithms for IP networks (Moy, 1997; Perlman, 1992) are applicable for this purpose.
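    To make the idea of per-subnet registries concrete, the sketch below selects a registry from the IP address of the receiving agent. It is a deliberately simplified stand-in: it uses a static table with longest-prefix matching rather than a dynamic routing protocol, and the class name, registry names, and subnets are hypothetical.

```python
import ipaddress


class RegistryRouter:
    """Maps subnets to registry names and picks the most specific match."""

    def __init__(self):
        self._routes = []  # list of (network, registry_name) pairs

    def add_registry(self, cidr, registry_name):
        self._routes.append((ipaddress.ip_network(cidr), registry_name))

    def registry_for(self, agent_ip):
        """Return the registry responsible for agent_ip, or None if unknown."""
        addr = ipaddress.ip_address(agent_ip)
        matches = [(net, name) for net, name in self._routes if addr in net]
        if not matches:
            return None
        # Longest-prefix match: the most specific subnet wins.
        return max(matches, key=lambda m: m[0].prefixlen)[1]
```

    A registry that receives a request for an agent outside its own subnet would look up the responsible registry in this way and forward the message to it.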

    Modelling Bayesian Networks for PAID

    To assess the probability of an intrusion, we use Bayesian Networks. Modelling a domain with Bayesian Networks involves two major steps. First, a domain expert needs to specify the qualitative structure of the network, which depends solely on the independence relations among the variables of the domain of interest. Second, the numerical parameters need to be assessed; these parameters are the prior probabilities of variables that have no parents, and the conditional probabilities of every other variable given its parents. The graph and the probabilities uniquely and completely specify the joint probability of the variables in the domain of interest.
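    As a small illustration of how the graph and the probabilities specify the joint distribution, the sketch below multiplies a prior by a conditional table for a hypothetical two-variable network in which DOS is the parent of Dropped. The variable names, states, and numbers are invented for illustration and do not come from PAID.

```python
from itertools import product

# Prior for the parentless variable (hypothetical numbers).
prior_dos = {"yes": 0.1, "no": 0.9}

# Conditional table P(Dropped | DOS) for the child variable.
cpt_dropped = {
    "yes": {"many": 0.8, "few": 0.2},
    "no": {"many": 0.05, "few": 0.95},
}


def joint(dos, dropped):
    """Chain rule for this network: P(DOS, Dropped) = P(DOS) * P(Dropped | DOS)."""
    return prior_dos[dos] * cpt_dropped[dos][dropped]


# Full joint table over both variables; its entries sum to 1.
table = {(d, x): joint(d, x) for d, x in product(prior_dos, ["many", "few"])}
```

    For larger networks the same product runs over every variable, each conditioned on its parents in the graph.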

    We now present methodologies for modelling Bayesian Networks for attack patterns, system parameters, incorporating errors, and resolving conflicts.

    Bayesian Network building methodology

    There are two methods of building Bayesian Networks for a particular application domain. The first method consists of asking the domain expert to construct the network (DAG) and assess the prior and conditional probabilities manually. This is how we build our networks. The second method builds the network from data. Several algorithms are available to accomplish this learning task, including BIFROST (Lauritzen and Spiegelhalter, 1988), K2 (Cooper and Herskovits, 1992), and CB (Singh and Valtorta, 1993, 1995). The prior and conditional probabilities can also be computed from data. The models are validated by comparison with the performance of an expert (Spiegelhalter et al., 1993; Neapolitan, 2004). We plan to extend our model to incorporate these algorithms to build Bayesian Networks.

    We now illustrate our method of building a Bayesian Network model for attack patterns with an example of a Mitnick attack.

    Modelling computer attacks with Bayesian Networks: an example

    The Mitnick attack (see Fig. 4) is difficult to identify due to its distributed nature. In a network vulnerable to a Mitnick attack, the victim host authenticates a trusted host using an IP address only. The trust relationship between the victim and trusted hosts implies that users logged in on the trusted host, or applications running on the trusted host, can access resources on the victim host without secure authentication. The Mitnick attack exploits the weakness of IP-based authentication systems and a flaw in the TCP packet sequence number generation algorithm. An attacker launches a distributed denial of service attack on the trusted host, making it temporarily unavailable. The attacker is then able to gain access to the victim host by pretending to be a user from the trusted host. Hence, the identification of a Mitnick attack requires evidence of both IP spoofing and DOS attacks on different machines in the victim network. Soft and hard findings detected on the victim's network can be used to identify the attack. We now examine the Mitnick attack in detail and model possible findings and their dependencies with a Bayesian Network model.

    In preparation for the attack, the intruder installs malicious programs (zombies) on many vulnerable computers over the Internet. Meanwhile, the intruder gathers information about the real victim. This information will allow the attacker to successfully guess TCP sequence numbers of the victim host. At a specific time, the attacker activates the zombies to launch a denial of service (DOS) attack against the host trusted by the victim. As a result, the trusted host is unable to reply to packets sent by the victim host. Under such a situation, the intruder tries to open a TCP connection with the victim host by spoofing the IP address of the trusted host. The victim host sends

    Figure 4 Mitnick attack. [Diagram labels: Vulnerable systems on Internet, Install Trojans, Victim Network, Trusted Host, Multiple connection requests (Denial of Service Attack), Packets Sent are Ignored, Attack Target, Connection Request with Spoofed IP of Trusted Host.]


    Based on the value of S, the belief about a DOS attack is computed. We distinguish between three states: low, medium and high probability. If all connection requests are handled, S is equal to zero or has a very small value; thus the probability of the attack is either zero or low. When the system is under attack, with the result that many connection requests cannot be handled, the probability of the attack is high. Actual values that differentiate between high and low states can be determined by applying data mining techniques on log data.

    Belief calculation by system-monitoring agents

    System-monitoring agents perform simple processing and querying on log files and compute beliefs on variables they publish. These agents use the method of counts (Jensen, 2001; Cowell et al., 1999) to estimate the prior marginal probability of a variable being in a certain state by dividing the number of cases in which the variable is in that state by the total number of cases. The method of counts estimates the prior conditional probability of a variable being in a certain state, given that its parents are in a certain configuration, by dividing the number of cases in which the child variable is in that state and the parent variables are in that configuration by the number of cases in which the parent variables are in that configuration.
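    The two ratios used by the method of counts can be written down directly. The sketch below assumes that each case is a complete record stored as a dictionary from variable name to observed state; the function names and this data layout are illustrative assumptions, not part of the PAID implementation.

```python
from collections import Counter


def prior_from_counts(cases, var):
    """Estimate P(var = state) as (cases in that state) / (all cases)."""
    counts = Counter(case[var] for case in cases)
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


def conditional_from_counts(cases, child, parents):
    """Estimate P(child = state | parents = config) as a ratio of counts.

    Returns a dict keyed by (parent_config_tuple, child_state).
    """
    joint_counts = Counter()
    parent_counts = Counter()
    for case in cases:
        config = tuple(case[p] for p in parents)
        joint_counts[(config, case[child])] += 1
        parent_counts[config] += 1
    return {
        key: n / parent_counts[key[0]]
        for key, n in joint_counts.items()
    }
```

    Parent configurations that never occur in the data simply have no entry in the result; in practice they would need a default or a smoothed estimate.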

    Modelling errors in measurement

    The calculation of a belief depends on factors such as accuracy of measurement and conflicts among beliefs reported by various agents. In our model, if an agent is not able to accurately determine the state of a published variable, the agent publishes a probability distribution (belief) over the possible states of the variable. The publishing agent determines this distribution by incorporating measurement errors. Errors in the measurement of a variable state are modelled within an agent with the help of the Bayesian Network shown in Fig. 7. This is achieved by representing the state of a variable with a belief or soft finding. The parent node S represents the actual value of interest. The prior distribution of the actual values is P(S). The measured value is represented by variable Sobs. The measurement error is modelled by the conditional probability P(Sobs | S). In the absence of error, this is a diagonal (identity) matrix. The magnitude of non-diagonal entries is directly proportional to the measurement errors. In the special case of a 2 × 2 matrix, the two entries on the main diagonal quantify the specificity and sensitivity of the measurement, and the other entries quantify the false positive and false negative ratios (Vomlel, 2004). When the measured value is propagated to the parent node S, we get a probability distribution over the different states of the variable. The agent can publish this distribution as its belief on the state of the measured variable.
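    For the two-node network of Fig. 7, propagating a measured value back to S amounts to a single application of Bayes' rule. The sketch below shows that computation under the assumption that the prior P(S) and the error model P(Sobs | S) are given as dictionaries; the function name and the example state names used with it are hypothetical.

```python
def posterior_given_observation(prior, likelihood, observed):
    """Return P(S | Sobs = observed) for a two-node network S -> Sobs.

    prior:      {s: P(S = s)}
    likelihood: {s: {o: P(Sobs = o | S = s)}}  (the measurement-error model)
    """
    # Unnormalised posterior: P(S = s) * P(Sobs = observed | S = s).
    unnorm = {s: prior[s] * likelihood[s][observed] for s in prior}
    z = sum(unnorm.values())
    if z == 0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / z for s, p in unnorm.items()}
```

    With an error-free (identity) model the result puts all probability on the observed state; with non-zero off-diagonal entries the agent publishes a genuine distribution instead of a hard finding.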

    Conflict resolution

    Conflicts among beliefs on a state of a variable, due to information provided by multiple agents on the same underlying quantity, can be resolved using soft-evidential update. For example, let A1 and A2 be two agents that measure a variable v. The values measured by them are B1 and B2, respectively. We design a Bayesian Network as shown in Fig. 8. The computed posterior probability of v effectively fuses the information provided by the two agents in the context specified by variable CR.

    This approach requires estimating the prior probabilities of B and CR. In most practical uses of the Bayesian Network, the value of CR is known, so the assessment of the prior probability of CR does not need to be accurate. The prior probability of B needs to be more accurate than that of CR. It is normally possible to estimate B by using counts of the values of B in past cases. A similar technique (based on counts) can be used for the conditional probability tables P(B1 | v, CR) and P(B2 | v, CR). See Jensen (2001) and Cowell et al. (1999) for a discussion of the technique in general and Valdes and Skinner (2000) for an application of the technique in an intrusion detection scenario. For the situation involving complete cases, i.e., cases in which all variables are observed, the technique consists simply of replacing the (prior or conditional) probability of interest with the corresponding observed frequency (in the case of prior probabilities) or with a ratio of frequencies (in the case of conditional probabilities). In the more interesting and realistic case in which some variables are not observed, a similar approach, called fractional updating (or one of its improved variants), is used; see Jensen (2001) and Cowell et al. (1999) for details. To apply the technique in an intrusion detection situation, we require that cases be labelled by attack type.

    Figure 7 Incorporating error in measurement of variable. [Diagram: node S is the parent of node Sobs.]

    Figure 8 Conflict resolution. [Diagram: nodes v and CR are the parents of nodes B1 and B2.]

    In special cases, B1 and B2 are statements that v has a particular value. In general, they are probability distributions representing each agent's belief that the variable v has a particular value. The unique feature of the AEBN approach is to allow such general situations, whereas other approaches require the beliefs of the two agents to be hard findings. The process of updating v in the presence of the probability distributions on B1 and B2 is called soft-evidential update. In this work, we have used the Big Clique algorithm for soft-evidential update, implemented in the BC-Hugin system (Kim et al., 2004).
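    To give a feel for what a soft finding does, the sketch below updates v from a single soft finding on one report variable B using Jeffrey's rule, P'(v) = sum over b of Q(b) P(v | B = b), which is a common way to define the effect of soft evidence. This is an assumption made for illustration: it is not the Big Clique algorithm, it handles only one finding, it ignores the context variable CR, and the function name and numbers used with it are hypothetical.

```python
def jeffrey_update(prior_v, cpt_b_given_v, soft_b):
    """Update the belief in v given a soft finding Q(B) = soft_b.

    prior_v:        {v: P(v)}
    cpt_b_given_v:  {v: {b: P(B = b | v)}}
    soft_b:         {b: Q(B = b)}, the distribution reported by an agent
    """
    # Marginal P(B = b) under the current model.
    p_b = {
        b: sum(prior_v[v] * cpt_b_given_v[v][b] for v in prior_v)
        for b in soft_b
    }
    for b, q in soft_b.items():
        if q > 0 and p_b[b] == 0:
            raise ValueError(f"soft finding gives weight to impossible state {b!r}")
    updated = {}
    for v in prior_v:
        # Jeffrey's rule: mix the conditionals P(v | B = b) with weights Q(b).
        updated[v] = sum(
            soft_b[b] * prior_v[v] * cpt_b_given_v[v][b] / p_b[b]
            for b in soft_b
            if p_b[b] > 0
        )
    return updated
```

    A hard finding is the special case in which Q puts probability 1 on a single state of B, and a finding equal to the model's own marginal of B leaves the belief in v unchanged.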

    Implementation

    In this section, we describe our implementation of the proposed architecture from two different perspectives. First, we explain the developer's view of the system, and then we describe how the user (System Administrator) can interact with the IDS.

    Developer's perspective

    The PAID system uses a behavior-based agent model. In this model, agents are characterized by certain behaviors. A behavior class describes the action that an agent will perform during its lifetime. Domain-specific behaviors are developed by extending the class Behavior defined in the JADE API. These behaviors may be either one-shot behaviors or cyclic behaviors. Once a behavior completes its task, it may change its state to inactive by setting the instance variable done to true. The underlying agent management system in the agent platform (JADE in this case) invokes the agents' active behaviors in each simulation cycle. We now describe the constituent modules of the PAID system:

    Main IDS agent (IDS): A singleton agent that supervises the working of the entire system and provides results. The IDS agent provides the administrative interface. It also controls other tasks in the PAID system, including creation and termination of system-monitoring and intrusion-monitoring agents. The IDS agent exhibits StartAgentsBehavior and StopAgentsBehavior.

    System-Monitoring Agents (SMAgent): A class representing the system-monitoring agents in the IDS. This class is responsible for registering itself with the JADE DF, and for executing PublishingBehavior and a custom behavior to query log files or measure system performance. The name of the custom behavior class is determined by the main IDS agent from the system-audit configuration file and is invoked at runtime with the help of the Java Reflection API.

    Intrusion-Monitoring Agents (IMAgent): A class representing the intrusion-monitoring agents responsible for detecting intrusions. This class is responsible for registering itself with the JADE DF. A Bayesian Network model of the intrusion to be monitored by this agent is provided as an argument to the agent on startup. From the input intrusion model, the agent determines the required input beliefs and queries the directory facilitator to locate agents publishing those beliefs. The agent then subscribes to beliefs of other agents and updates its belief on the intrusion periodically. In other words, intrusion-monitoring agents exhibit SubscriptionBehavior, BeliefUpdateBehavior, and PublishingBehavior.
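    The behavior-based model described above can be mimicked in a few lines. The sketch below is not the JADE API (JADE is a Java framework); it is a hypothetical Python stand-in that only reproduces the idea of one-shot and cyclic behaviors, a done flag, and a scheduler that invokes the active behaviors on every cycle.

```python
class Behavior:
    """Base class for agent behaviors (stand-in, not the JADE class)."""

    def __init__(self):
        self.done = False  # set to True once the behavior has finished

    def action(self):
        raise NotImplementedError


class OneShotBehavior(Behavior):
    """Runs its task once and then marks itself as done."""

    def __init__(self, task):
        super().__init__()
        self._task = task

    def action(self):
        self._task()
        self.done = True


class CyclicBehavior(Behavior):
    """Runs its task on every cycle and never marks itself as done."""

    def __init__(self, task):
        super().__init__()
        self._task = task

    def action(self):
        self._task()


class Agent:
    def __init__(self, name):
        self.name = name
        self.behaviors = []

    def add_behavior(self, behavior):
        self.behaviors.append(behavior)

    def step(self):
        """One scheduler cycle: run active behaviors, then drop finished ones."""
        for behavior in list(self.behaviors):
            if not behavior.done:
                behavior.action()
        self.behaviors = [b for b in self.behaviors if not b.done]
```

    In this picture, registration with the registry would be a one-shot behavior, while publishing and belief update would be cyclic behaviors.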

    Administrator's perspective

    The administrative interface provided by our implementation is shown in Fig. 9. As described earlier in the section "PAID architecture", the PAID system contains several system-monitoring agents and intrusion-monitoring agents that utilize the directory facilitator (DF) provided by JADE as the registry agent. The interface allows the administrator to choose the communication frequency among these agents. To start the IDS, the administrator must specify a system-audit configuration file and the directory where Bayesian network models for various intrusions are stored. The system-audit configuration file provides bootstrap information (name of the Behavior class and input log files) for system-monitoring agents. Multiple intrusion-monitoring agents are started, each with a different Bayesian network model as input. The output of the PAID system is the overall probability for the host computer to be under attack. This value is graphically shown in the user interface. When the IDS identifies a probable attack, it brings up detailed probability information of that attack. For example, Fig. 10 shows detailed information for a Mitnick attack.

    The PAID system goes through the following four phases:

    (1) Initialization phase: All the agents register themselves with the agent registry on boot-up. JADE provides APIs for enabling the agents to register themselves with the AMS and DF agents for the system. This provides every agent with a globally unique identifier, the Agent-ID (AID), through which the other agents can interact by taking advantage of the white page services provided by the JADE AMS. In addition, each agent has to provide its service description during registration, which the JADE DF uses to provide yellow page services to other agents.

    (2) Analysis phase: After initialization, agents enter the analysis phase. In this phase agents execute SubscriptionBehavior, BeliefUpdateBehavior, and PublishingBehavior. These behaviors are cyclic, i.e., they are repeated indefinitely at an interval of a few seconds, determined by the communication frequency set for the session.

    (3) Resetting phase: The administrative interface allows the user to reset all the agents by stopping all agents with the Stop button and starting the system again. When the system is stopped, all agents are deregistered and terminated. The administrator may then start all the agents again by pressing the Start button, or exit the system. This feature can be useful if some agents terminate abnormally during execution and need to be restarted.

    (4) Termination phase: When the administrator exits the system, all the agents are deregistered and the IDS shuts down.

    Conclusions

    In this paper, we demonstrated the feasibility of a probabilistic intrusion detection technique using soft-evidential updates. We developed and implemented an intrusion detection architecture called Probabilistic Agent-Based Intrusion Detection (PAID). The advantages of our framework over existing models follow.

    Figure 9 Administrative interface.


    PAID requires a low volume of data sharing over the network, in contrast to centralized data analysis. Although the communication overhead is higher than in IDSs that allow only binary decision sharing, the improved processing power makes PAID more suitable for sophisticated intrusion detection. PAID also provides a continuous scale to represent event probabilities. This feature allows easy exploration of the trade-off between sensitivity and selectivity that affects the rate of false positive and false negative decisions.

    The current version of PAID was illustrated in misuse detection mode, but the same principles can be applied for anomaly-based intrusion detection. Distributed intrusion detection is achieved by allowing each agent to cooperate with others and to build full or partial global intrusion graphs. Distributed processing not only increases efficiency but also eliminates a single point of failure.

    A proof-of-concept prototype of our model has been developed using agents written in Java and C alone. At present we are migrating the complete agent model to the JADE framework. We are planning to improve and fine-tune our current model to address agent trust management and dynamic agent-activation protocols.

    Acknowledgments

    This material is based upon work supported by the National Science Foundation under Grant No. IIS-0237782. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the U.S. Government.

    References

    Asaka M, Taguchi A, Goto S. The implementation of IDA: an intrusion detection agent system. In: Proceedings of 11th annual FIRST conference on computer security incident handling and response. Brisbane, Australia; 1999.

    Axelsson S. Intrusion detection systems: a taxonomy and survey. Tech. Rep. 99-15. Goteborg, Sweden: Department of Computer Engineering, Chalmers University of Technology; 2000.

    Balasubramaniyan J, Garcia-Fernandez JO, Isacoff D, Spafford EH, Zamboni DM. An architecture for intrusion detection using autonomous agents. Coast 98-05. West Lafayette, IN: Department of Computer Science, Purdue University; 1998.

    Barbara D, Wu N, Jajodia S. Detecting novel network intrusions using bayes estimator. In: Proceedings of first SIAM conference on data mining; 2001.

    Figure 10 Calculation of attack probability using Big Clique algorithm.


    Bellifemine F, Poggi A, Rimassa G. JADE – a FIPA compliant agent framework. In: Proceedings of the fourth international conference and exhibition on the practical application of intelligent agents and multi-agents. London; 1999.

    Bloemeke M, Valtorta M. The rumor problem in multiagent systems. Tech. Rep. 2002-006. Columbia, SC: Department of Computer Science and Engineering, University of South Carolina; 2002.

    Bray T, Paoli J, Sperberg-McQueen CM, Maler E. Extensible Markup Language (XML) 1.0 specification. W3C Recommendation, Retrieved October 16, 2002 from http://www.w3.org/TR/2000/REC-xml-20001006; 2001.

    Carver CA, Hill JM, Surdu JR, Pooch UW. A methodology for using intelligent agents to provide automated intrusion response. In: Proceedings of the IEEE systems, man, and cybernetics information assurance and security workshop. West Point, NY; 2000.

    Cho S, Cha S. SAD: web session anomaly detection based on parameter estimation. Computers and Security 2004;23(4):312–19.

    Cooper GF. The computational complexity of probabilistic inference using Bayesian networks. Artificial Intelligence 1990;42(2–3):393–405.

    Cooper GF, Herskovits E. A Bayesian method for the induction of probabilistic networks from data. Machine Learning 1992;9(4):309–47.

    Cowell RG, Lauritzen SL, Dawid AP, Spiegelhalter DJ, Nair V, Lawless J, et al. Probabilistic networks and expert systems. New York: Springer-Verlag; 1999.

    Dacier M. Design of an intrusion-tolerant intrusion detection system. Tech. Rep. D10. IBM Zurich Research Laboratory; 2002.

    DuMouchel W. Computer intrusion detection based on Bayes factors for comparing command transition probabilities. Tech. Rep. 91. National Institute of Statistical Sciences; 1999.

    FIPA. FIPA ACL Message structure specifications. Retrieved October 16, 2002 from http://www.fipa.org/specs/fipa00061; 2002a.

    FIPA. FIPA-OS Developers guide. Retrieved October 16, 2002 from http://fipa-os.sourceforge.net/docs/Developers-Guide.pdf; 2002b.

    Friedman N, Singer Y. Efficient Bayesian parameter estimation in large discrete domains. In: Proceedings of neural information processing systems (NIPS 98). MIT Press; 1998.

    Helmer G, Wong JSK, Honavar V, Miller L, Wang Y. Lightweight agents for intrusion detection. Journal of Systems and Software 2003;67(2):109–22.

    Jansen W, Mell P, Karygiannis T, Marks D. Applying mobile agents to intrusion detection and response. Tech. Rep. 6416. Gaithersburg, MD: Computer Security Response Center, National Institute of Standards and Technology; 1999.

    Jensen FV. Bayesian networks and decision graphs. Springer Verlag; 2001.

    Kim YG, Valtorta M, Vomlel J. A prototypical system for soft evidential update. Applied Intelligence 2004;21(1):81–97.

    Laskey KB, Mahoney SM, Wright Ed. Hypothesis management in situation-specific network construction. In: Proceedings of the 17th annual conference on uncertainty in artificial intelligence (UAI-01). Seattle, WA; August 2001. p. 301–9.

    Lauritzen SL, Spiegelhalter DJ. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society, Series B (Statistical Methodology) 1988;50(2):157–224.

    Moy J. OSPF Version 2. Internet draft, RFC-2178; 1997.

    Neapolitan RE. Probabilistic reasoning in expert systems: theory and algorithms. New York, NY: John Wiley and Sons; 1990.

    Neapolitan RE. Learning Bayesian networks. Upper Saddle River, NJ: Pearson Prentice Hall; 2004.

    Neumann PG, Porras PA. Experiences with EMERALD to date. In: Proceedings of the first USENIX workshop on intrusion detection and network monitoring. Santa Clara, CA; 1999.

    Pearl J. Probabilistic reasoning in intelligent systems: networks of plausible inference. San Mateo, CA: Morgan-Kaufmann Publishers; 1988.

    Perlman R. Interconnections: bridges and routers. Reading, MA: Addison-Wesley Professional; 1992.

    Sebyala AA, Olukemi T, Sacks L. Active platform security through intrusion detection using Naïve Bayesian network for anomaly detection. In: Proceedings of London communications symposium; 2002.

    Singh M, Valtorta M. A new algorithm for the construction of Bayesian network structures from data. In: Proceedings of the ninth annual conference on uncertainty in artificial intelligence (UAI-93). Washington, DC; July 1993. p. 259–64.

    Singh M, Valtorta M. Construction of Bayesian belief networks from data: a brief survey and an efficient algorithm. International Journal of Approximate Reasoning February 1995;12(2):111–31.

    Snapp S, Brentano J. DIDS (Distributed Intrusion Detection System) – motivation, architecture, and an early prototype. In: Proceedings of the 1991 national computer security conference; 1991.

    Spafford EH, Zamboni D. Intrusion detection using autonomous agents. Computer Networks 2000;34(4):547–70.

    Spiegelhalter DJ, Dawid AP, Lauritzen SL, Cowell RG. Bayesian analysis in expert systems. Statistical Science 1993;8(3):219–83.

    Utete SW. Local information processing for decision making in decentralized sensing networks. In: Proceedings of the 11th international conference on industrial and engineering applications of artificial intelligence and expert systems (IEA/AIE-98). Benicassim, Castellon, Spain; 1998. p. 667–76.

    Valdes A, Skinner K. Adaptive, model-based monitoring for cyber attack detection. In: Proceedings of the third international workshop on recent advances in intrusion detection. Springer-Verlag; 2000. p. 80–92.

    Valtorta M, Kim Y-G, Vomlel J. Soft evidential update for probabilistic multiagent systems. International Journal of Approximate Reasoning January 2002;29(1):71–106.

    Vomlel J. Probabilistic reasoning with uncertain evidence. Neural Network World. International Journal on Neural and Mass-Parallel Computing and Information Systems 2004;14(5):453–6.

    Xiang Y. Probabilistic reasoning in multiagent systems: a graphical models approach. Cambridge: Cambridge University Press; 2002.

    Vaibhav Gowadia is a doctoral student at the Department of Computer Science and Engineering and Research Assistant in the Information Security Laboratory at the University of South Carolina, Columbia, SC. His current research interests include information systems security, intrusion detection, security protocols, threshold cryptography, and Semantic Web security. He obtained his Master of Science degree in Computer Engineering at the University of South Carolina in 2003, and Bachelor of Technology degree in Instrumentation and Control Engineering at the National Institute of Technology, Jalandhar, India in 2000.

    Csilla Farkas is an Assistant Professor at the Department of Computer Science and Engineering, and Director of the Information Security Laboratory at the University of South Carolina. She received a B.S. degree in Geological Sciences from the Eötvös Loránd University (1985), Hungary, a B.S. in Computer Science from SZAMALK (1989), Hungary, and a B.S. in Computer Science (1993) and Ph.D. in Information Technology (2000) from George Mason University, VA. Dr. Farkas' research interests include information security, the data inference problem, economic and legal analysis of cyber crime, and security and privacy on the Semantic Web. She is a recipient of the National Science Foundation Career award. The topic of her award is "Semantic Web: Interoperation vs. Security - A New Paradigm of Confidentiality Threats". Dr. Farkas actively participates in international scientific communities as a program committee member and reviewer.

    Marco Valtorta is an Associate Professor at the Department of Computer Science and Engineering at the University of South Carolina. He obtained a Laurea in Electrical Engineering from the Politecnico di Milano in 1980, an M.A. in Computer Science from Duke University in 1984, and a Ph.D. in Computer Science from Duke University in 1987. Between 1985 and 1988, he was a project officer for ESPRIT at the Commission of the European Communities in Brussels, where he supervised projects in the Advanced Information Processing area. As a faculty member at the University of South Carolina since 1988, he has conducted research funded by ARDA, SPAWAR, DARPA, the Office of Naval Research (ONR), the U.S. Department of Agriculture (DOA), CISE (an Italian laboratory controlled by ENEL, the state electricity company), and the South Carolina Law Enforcement Division. He is the author of over 30 refereed publications, an associate editor of the International Journal of Approximate Reasoning, and a member of the editorial boards of Applied Intelligence and of the International Journal of Applied Management and Technology. His research interests are in the areas of normative reasoning under uncertainty (especially Bayesian Networks, influence diagrams, and their use in stand-alone and multiagent systems), heuristics for problem solving, and computational complexity in artificial intelligence.
