How to make intelligent agents coordinated and work for us? Prof. Von-Wun Soo

How to make intelligent
agents coordinated and
work for us?
Prof. Von-Wun Soo
Department of Computer Science
National Tsing Hua University
Outline of the talk
Š Introduction of intelligent agents
Š Agent Coordination techniques
Š Game theoretical
Negotiation
Š Research topics on agents
The advent of ubiquitous agents
Š The computing and communication technologies
on wireless LAN, PDA, mobile phones, GPS,
and Bluetooth, WAP, GPRS and so on have
made the advent of a new pervasive computing
era. *any time, any place, any platform, peer-topeer
Š Personal assistants or software agents on the
hand-held devices for many different applications
and services have been conceived and becomes
more and more feasible…
Characteristics of Intelligent
Agents
– Autonomous, Proactive, and Rational:
Intelligent agents have their own goals, and
execute their tasks that optimize certain
performance measures
– Interactive/communicative: Interacting with
environment and other agents
What’s software agent
technology?
Š A paradigm shift of information utilization from
direct manipulation to indirect access and
delegation
Š A kind of middleware between information
demand (client) and information supply (server)
Š A software that has autonomous, personalized,
adaptive, mobile, communicative, social abilities
What is software agent
technology?
Š Agent theories:
– cognitive representation (belief, goal, plan,
desire, intention, commitment)
– reasoning and problem solving (planning)
Š Agent architecture:
– components, layers, models, multi-agent teams,
brokers, community, society
– control, communication and coordination
mechanisms among multi-agents
What is software agent
technology?
Š Agent Language:
– agent communication language KQML
– agent construction language AgentBuilder;
WebL; Agent0
– domain ontology KIF
Š Agent Environment and Supporting Platform
– creation, registration, deletion, mobility,
communication, authorization, authentication,
security, etc.
Why is multi-agent coordination
important?
Š Real world problems are quite complex and
human problem solving are multi-agent in nature
Š resolve conflicts among multi-agents
Š satisfy global constraints and by working as a
team, enhance global welfare or performance
Š sharing information, synchronizing actions, avoid
redundancy, avoid deadlock, avoid livelock.
Š prevent an anarchy or chaos
What is coordination?
Š Coordination is a coherent task assignment and
execution.
Š Coordination can be achieved by means of
Planning + Control + Communication
(Negotiation) + Conflict resolving
+ Resource/information sharing
Š Coordination can result in
–
–
–
efficiency enhancing
avoiding both deadlock & livelock
reducing resource contention
A taxonomy of coordination
coordination
Cooperation
Competition
Planning
Negotiation
Distributed
Planning
Centralized
Planning
Techniques of coordination
Š Coordination without communication:
–
–
–
–
–
Compiled of social laws and conventions
Reason by focal points
Inferring other agents via observation
Partial/global planning distributed/centralized
Organization-structure
Š Coordination with communication
– Knowledge-transfer protocol -- blackboard
– Contract net protocol
– Negotiation approaches (game-theoretic)
– Market mechanisms
Distributed planning
[Edmond Durfee]
Š Centralized planning for distributed plans
Š Distributed planning for centralized plans
Š Distributed planning for distributed plans
Blackboard– A cooperative
problem solving model model
blackboard
Executing
Activated
KS
events
Control
components
Pending
KS
Activations
Library
of
KSs
Coordination by organization
structure
Š Pre-defined roles, responsibilities and
preferences of agents
Š Pre-defined control and communication
protocols among agents
Š Prioritizing tasks over agents (allow
overlapping responsibilities)
Š Organizational agents are not necessary
cooperative; they can be competitive
Coordination by a market
mechanism
Š Coordination with a large number of unknown agents
Š Coordination with minimal number of direct
communication among agents
Š The market reach competitive equilibrium when
– 1) consumer bids to maximize their utility, subjected to budget
constraints
– 2) provider bids to maximize their profit, subjected to technology
capacity
– 3) net demand of good is zero
Competitive equilibrium = Pareto efficient solution
Coordination of tasks
Š Decomposition of tasks
Š Distribution of tasks
Š Control or coordination
–
–
–
–
Determine shared goal
Determine common tasks
Avoid unnecessary conflicts
Pool knowledge and evidence
Task decomposition
Š Divide and conquer
– AND/OR tree
– Spatial decomposition vs functional
decomposition
Š Depends on designer’s choice
Š Must consider resources and capabilities of
agents
Distribution of tasks
Š Market mechanisms: generalized
agreement and mutual selection
Š Contract net
Š Multi-agent planning
Š Organizational structure
Š Recursive allocations
Š Agent-mediated matchmaking/brokerage
Contract net protocol
Š Manager announces tasks via (possible
selective) multicast
Contract net protocol
Š Agents evaluate the announcement, Some
of the agents submit bids
Contract net protocol
Š The manager awards a contract to the most
appropriate agent
Why game theory?
Š Previous work [Rosenschein and
Genesereth, 1985; Rosenschein, 1994,
Haynes and Sen, 1996, Wu and Soo, 1998a,
1999]
Š Provide fundamental and theoretical
explanation on how multi-agent might
behave under at various conflicting
situations.
Underlying assumptions
Š Rational agent assumption
– maximize its own expected utility (selfish)
– mutual rationality
Š Each agent has as set of strategies to choose and
each agent’s payoff is a utility function of the
strategy combination (the outcome) of all agents’
choices.
Š The question is how can we predict the outcome
of different game situations based on rational
agent’s decision making?
Nash equilibrium
Š A strategy combination is a Nash equilibrium
Š Nash equilibrium is a stable solution concept
– if none agent will gain more payoff by leaving from
the strategy combination alone.
– Namely, no single agent will deviate from the solution
based on individual rationality
Pareto-optimality
Š A strategy combination is Pareto optimal if no
other strategy combination could increase the
payoff of one agent without decreasing the
payoff of any other agent.
– Namely, a Pareto-optimality is a kind of quality of
measure on a solution that is a solution that is not
Pareo-dominated by any other possible solution
What are difficult games?
ƒ Prisoner’s dilemma [A Nash equilibrium that is
not Pareto-optimal]
ƒ Games with no Nash equilibrium
ƒ Games with multiple Nash equilibria
The need for a trusted third
party
Š Traditional game theory cannot easily
resolve the difficulty games
Š We propose to use a trusted third party to
to enforce the commitments and contracts
negotiated by both agents
Roles of a trusted third party
Š an intermediary agent who
– temporary holds the deposit of guarantee or
compensation from one agent
– forfeits or returns the guarantee or compensation
deposited if the other side does not obey the
agreement
– Ensures that rules or principles of negotiation be
obeyed by each side of agents
Š Examples: bank; government; court; referee
Create an Equilibrium
by TTP Negotiation
Š We propose two communication actions in
the TTP negotiation process
– Ask guarantee; offer guarantee
– Ask compensation; offer compensation
The guarantee communication action
in TTP negotiation
1. Ask for guarantee
Agent P
Agent Q
4. Play the game
2. Deposit guarantee
5. Return Guarantee
3. Notify P
Trusted third
party
The compensation communication action
In TTP negotiation
Agent P
1. Offering compensation
2.Agree
Agent Q
4. Pay the game
3. Deposit compensation
Trusted third
party
5. Send compensation
A prisoner’s dilemma game
P
Q
Cooperate
Fink
-1
Cooperate
-1
-1
Fink
-1.1
-10
-10
-8
-8
(a)
Cooperate
-1
Cooperate
-10
0
Q
0
-10
Fink
P
Fink
-1.1
-9.1
-9.1
(b)
Prisoner's Dilemma game matrix (a) A special case of
a PD game matrix. (b) A dilemma-free game matrix.
(Both P&Q promised to play cooperate and deposit
Guarantee 1.1)
Note: (-1,-1) Pareto-dominates all three other strategy
combinations
Battle of Sexes Game
with Multiple Nash Equilibria
Woman
Man
Wrestle
Ballet
1
Wrestle
2
-1
-1
-5
Ballet
-5
2
1
After Negotiation with
communication actions
Man offering compensation 0.5 to Woman to play “Wrestle”
& Woman offering 0.5 compensation to Man to play "Ballet".
Woman
Wrestle
Man
Ballet
1.5
Wrestle
-1
-4.5
-5.5
Ballet
1
-1
Wrestle
-1
1.5
Ballet
Woman
Wrestle
Man
2
-5.5
2
1
-1
Ballet
-4.5
1.5
1.5
A Welfare Game without Nash
Equilibrium
G
P
Work
Idle
2
Aid
3
Idle
3
1
-1
1
0
0
(a)
Work
2
Aid
-1
-1
P
3
1
No Aid
G
No Aid
-1
-2
0
(b)
A Welfare Game (a) The original game matrix. (b) The
game matrix after negotiation.
Some assumptions and
properties in TTP negotiation
Š Proper quantum principle
Š Equal concession principle
Š Order independent property
Š Existence of Nash and Pareto-optimality
property
Š Convergence of negotiation property
Negotiation without knowing
other agent’s payoff
Š Previous game theoretic reasoning assume
a complete payoff matrix
Š In real situations, it is often difficult to
obtain other agent’s payoff information.
Š How can agents reason and negotiate
without knowing other agent’s payoffs?
QP
P1
P2
-2
2
P’s view Q1
?
Q
P
P1
P2
2
Q
Q2
-2
1
-1
-1
1
-1
?
?
-2
2
Q1
Q2
?
1
A complete payoff
matrix
QP
Q’s view Q1
Q2
P1
P2
?
?
2
-2
?
?
-1
1
Min-max strategies under
incomplete information
QP
P1
P2
?
Q2
-2
2
Q1
?
?
P1
P2
Q1
Q2
?
?
2
1
-1
?
QP
-2
?
?
-1
1
Q will choose Q2
P will choose P1
Without communication, P and Q using min-max
strategies will end up in the state (Q2, P1) which is not a
Pareto-optimal state
Q asks P to pay guarantee
QP
P1
Q1
?
2
Q2
? -4
-2 +4
?
-1
P2
? -4
1+4
Q asks P to pay guarantee 4 to play P1
P asks Q to pay guarantee
QP
P1
-2
2
Q1
?
Q2
P2
?
-1+3
1+3
?-3
?-3
P asks Q to pay guarantee 3 to play Q1
Final results
Q
P
P1
P2
Q1
2
Q2
-2-4
2
-2+4
-1+3
-1-3
1-4+3
1+4-3
Agreed at the created NFD (P1,Q1),
P pays guarantee 4
Q pays guarantee 3
A Prisoner’s dilemma game
Q
P
Q1
Q2
P1
-1
-1
1
0
P2
1
0
Created Nash (P1,Q1)
-8
Q asks P to deposit guarantee 1
to play P1
-10
-10
-8
Complete information:
P asks Q to deposit guarantee 1
to play Q1
TTP negotiation under incomplete
payoff information
Q
P
P1
P2
-1
0
Q1
-1 99 -10
-8
-10
Q2
0
-8
Reach created NFD equilibrium
P asks Q to pay guarantee
9 to play Q1
Q asks P to pay guarantee
9 to play P1
Summary
Prisoner’s dilemma
games
Battle of Sexes
Games
Social Welfare Games
Traditional
game theory
without
communicatio
n
Trap into suboptimal Nash
Mini-max doesn’t
help
Cannot predict result
with pure strategy
Mini-max strategy
may lead to
suboptimal solution
No Nash eqilibrium
with pure strategy
Mini-max strategy may
lead to suboptimal
solution
Allows TTP
negotiation
Complete
Payoff
information
Created Paretooptimal Nash
Q pays guarantee 1
P pays guarantee 1
Created Paretooptial Nash
P pays compensation
1
Created Pareto-optimal
Nash
P pays compensation
0.5
Q pays guarantee 1
Allows TTP
negotiation
incomplete
Payoff
information
Created Pareto
optimal NFD
Q pays guarantee 9
P pays guarantee 9
Created Paretooptimal NFD
P pays compensation
1 and pays guarantee
6
Q pays guarantee 3
Created Pareto-optimal
NFD
P pays compensation
0.5
and pays guarantee 3
Q pays guarantee 3
Comments and current research
Š Only two agents are involved
Š When more than 2 agents are involved in the
negotiation, there are collusion problems that any
subgroup of agents might deviate a negotiated
solution simultaneously by sacrificing the other
agents.
Š The TTP negotiation protocol needs to be
designed in a more general way to take the
collusion into consideration.
Rosenschein’s Work on
Rules of Encounter
Š Negotiation on different domains
– Task oriented domain (postmen, database, fax)
– State oriented domain (block world)
– Worth oriented domain (agents rank the worth
on different states)
Deception in negotiation–
postmen domain
Post office
h
a
True task Assignment:
b
Must return to office
Agent 1
g
f
Agent 2
c
e
d
{b,f}
{e}
Flip a coin to decide who
Deliver all the mails
Hiding task
h
Post
office
a
g
f
Task claim during
negotiation
b
Agent 1
Agent 2
c
e
d
{f}
{e}
{b}
They decide agent 2
Delivers all the mails
Phantom task
True task assignment
Agent 1
Agent 2
a
c
b
{b,c}
{b,c}
Phantom letter
{b,c,d}
{b,c}
d
They agree agent 1
Goes to c
Negotiation over mixed deals
Š Divide the tasks {D1,D2}
Š Agent 1 has probability p to take D1 and 1p to take D2
Š Agent 2 has probability 1-p to take D1 and
p to take D2
Š Change discrete deals to continuous deals
Hiding letters with mixed all-ornothing deals
h
Post
office
a
g
f
Task claim during
negotiation
b
Agent 1
Agent 2
c
e
d
{f}
{e}
{b}
They will agree on the mixed deal where agent 1 has
3/7 chance of delivering to f and e
Mixed deal prevents hiding
tasks
• The expected utility of agent 1 with honest bid is
½*8= 4
• Agent 1 might still have a chance to delivery all
letters even if he hide the delivery task b
– The expected utility of agent 1 with deception is
(3/7)*8 + 2 = 5.43,
– there is no reason to hide the task under the mixed allor-nothing deal
Phantom letters with
Mixed deals
Agent 1
a
2
Agent 2
1
c
{b,c}
{b,c}
b
Phantom letter
{b,c,d}
{b,c}
They agree on a mixed deal that d
Agent 1 has 5/8 to deliver all letters
2
Mixed deal prevents phantom
tasks
Š The agent 1 with honest bid has expected
utility ½*6= 3
Š With phantom letter, the agent 1 has the
expected utility
5/8* 6 = 3.75
Š no reason to propose a phantom letter
Incentive compatibility
mechanism
Š Theorem
– For all encounters in all sub-additive TODs,
when using a PMM(product-maximizing
mechanism) over all-or-nothing deals, no
agent has an incentive to hide a task.
Recommending a Trip
Plan by Negotiation
with a Software
Travel Agent
Introduction
Traveler
Recommendation
By
Negotiation
Time Constraints
Budget Constraints
Preference
Travel
Agent
Spatial, Temporal,
Physical Constraints
Transportation
Accommodation
Cost and Profit
User Preference and Constraints
Š an origin and a destination of the trip,
Š a travel time and a return time the of trip,
Š an upper bound budget for the trip,
Š a preferred hotel class and hotel price,
Š a set of spots that the user prefer to visit.
Constraints on a valid trip plan
Š Time constraint: any activities scheduled in the plan
should be before the return time of the trip and after the
travel time.
Š Budget constraint: the total expense of a trip plan should
not exceed the budget specified by the user.
Š Necessary component constraint: a valid and minimal set
of constraints in trip plan.
Š Preference satisfaction constraint: the travel agent must
also try his best to satisfy users’ preferences and personal
constraints while building a trip plan.
Components in a travel agent
Travel
Database
Trip
Planner
Context
User Model
Service
Behaviours
Domain
Constraints
Inconsistency
Manager
Travel Agent
Communication
Toolkits
User Interface
Agent
Trip planner – stage 1: skeletal
plan formation
Š Prepares a round-trip transportation tickets,
hotels for the nights during the trip, and a
number of day slots
Š Check consistency with local domain
constraints
Trip planner – stage 2: visiting
spots routing and scheduling
1) Searching a route with the shortest
traveling time,
2) Determining the duration to stay at each
visiting spot, and
3) Allocating the visiting spots to a particular
period of a day.
Trip planner-- stage 3: plan
refinement
1) Recommending more visiting spots to the
trip plan,
2) Upgrade or downgrade the
recommendation of travel components
3) Elongation or reduction on the staying
period of a visiting spot.
The Context User Model
Š It models the likes and the dislikes of a
user in the current dialogue context.
Š The model is initialized and updated
during the negotiation dialogue.
Š Consistency maintenance
– By timestamps
The Inconsistency Manager
Š Derives several
candidate solutions
and persuade the user
to relax some
constraints.
Š Carried out by a
negotiation protocol.
• user constraint 1
• user constraint 2
•…
• user constraint N
• budget constraint
• time constraint
• minimal staying period
• necessary component
Trip Planner
Inconsistency
Manager
Resolving the violation of time
constraints
Š Vacant time problem
– Adding a deprecated spot
– Elongating the staying period
– Advancing the return time of the trip
Š Not-enough-time problem
– Shortening the staying periods
– Postponing the return time of the trip
Resolving the violation of
budget constraint
Š Strategies for resolving the violation of the
budget constraint
– Raising the budget
– Removing expensive spots
– Choosing a cheaper hotel
Resolving infeasible user
constraints
Š The ones inconsistent with the application
content, or do not comply with the limitations.
– Specifying a non-existing hotel
– Specifying a staying period that is less than the
minimal one
Š The resolution
– complying with the constraints and application content
– similar to the user’s original specification
Scenario 1(1): User Request on
Planning a Trip
Build an initial trip plan that satisfies a user’s request
Origin:
Destination:
Travel time:
Return time:
Budgets:
Hotel:
Visiting
spots:
HsinChu
Taipei
5/1 12:00PM
5/1 19:00PM
Below 3350 NTD
Normal class, below 1000 NTD
SKM skyscraper,
The President’s Office Building,
CKS Memorial Hall,
LoongSha Temple
National History Museum
Scenario 1 (2) Simple Trip Plan
May 1
Arrival Location
Cost
Remark
12:10 HsinChu station
0
Origin of the trip
13:20 Taipei station
180
13:49 LoongSha Temple
20
S1,*
14:54 SKM Skyscraper
200
S2,*
15:56 The President’s O.B.
20
S3,*
16:58 CKS Memorial. Hall
20
S4,*
17:59 N. History Museum
40
S5,*
18:53 Taipei station
40
19:58 HsinChu station
139
Time
Total traffic cost: 459
Total hotel cost: 0
Total entrance fee: 180
Estimated meals cost: 400
Total cost: 1059
Scenario 2(1): User Request by
Adding the Botanical Garden
Scenario 2 (2) time constraint
violation & recommend to
remove a spot
Scenario 2 (3) time constraint
violation & recommend to
postpone the return time
Scenario 3 infeasible hotel &
recommend a feasible hotel
A final trip plan
May 2nd
May 1st
Arrival Location
Cost
Remark
Arrival Location
Cost
Remark
Time
Time
12:10 HsinChu station
0
08:00 The Prince Hotel
0
13:20 Taipei station
180
08:41 History Museum
60
S5,*
13:46 The Prince Hotel
1520
11:51
15
S7
Science Museum
14:58 LoongSha Temple 40
S1,*
13:01 Botanical Garden
0
S8,*
16:08 ZhuShe Temple
0
S6
14:42 Simen Street
20
S9
17:21 SKM skyscraper
200
S2,*
16:52 Haunted House
150
S10
18:53 President’s O.B.
20
S3,*
18:40 Taipei station
20
19:54 CKS M. Hall
20
S4,*
19:58 HsinChu station
139
22:00 The Prince Hotel
20
Total traffic cost: 539
Total hotel cost: 1500
Total entrance fee: 365
Estimated meals cost: 900
Total cost: 3304
Comments
Š Previous methods of recommendation
– Collaborative filtering
– Content-based
Š Recommendation by negotiation can personalize
the recommendation by complying user’s
preference and domain constraints
Š Recommendation can be conducted at many
stages of trip planning in different forms
– Remove or adding a spots, raise budget, prolong a stay,
change hotels, and so on
Conclusions
Š Coordinate with sharable ontology
– For agents to share knowledge and
communicate with the same language and
content is also important for coordinate among
agents
– Semantic Web concepts provide such an
opportunity to share domain knowledge
among agents
Conclusions
Š Automated agent negotiation needs still further
investigation including
– Learning to negotiate (both tactics and opponent
model)
– Negotiation under non-cooperative situations or
hostile opponents (were threatening and cheating
might be unavoidable)
– Negotiation with different risk and trust attitudes and
under time constraints
– Negotiation and combining with other coordination
techniques such as agents in organization
– Ontology for negotiation and coordination
`