Project 3: Inference in Hunt the Wumpus

Introduction

Hunt the Wumpus is a classic text-based hide-and-seek game, originally written in BASIC (yes, BASIC) in the early 1970s. Subsequent versions introduced many variations on the original theme, but in each adaptation the player assumes the role of an Agent who moves through a maze hunting a deadly Wumpus.

Little is known about the elusive Wumpus, as all those who stumble upon it are immediately eaten. There does, however, seem to be a fairly universal consensus that it has suckers, which allow it to navigate the deadly Pits scattered throughout its labyrinthine home. The only hope the Agent has of killing the Wumpus without being eaten is to fire an arrow at it from afar, taking the Wumpus by surprise.

Gameplay

World

The Wumpus world is composed of R rooms, with W Wumpii, P Pits, and B Bats scattered uniformly at random throughout the rooms. A Wumpus, a Pit, and a Bat can all coexist in the same room, but two hazards of the same type cannot be in the same room at once.

Each room has C portals on average, but may have as many as 8 or as few as 1. The only guarantee is that the rooms and portals form a connected graph. Be careful: the magic portals connecting rooms don't necessarily respect the normal concept of adjacency; a portal on the North side of one room may connect to the West side of a room at the other end of the cave!

The Agent begins in a random room, which is guaranteed not to contain any hazards. At the start of the game, the Agent has A arrows.

Hazards

On entering a room containing a Wumpus or a Pit, the Agent is immediately killed. On entering a room containing a Bat, there is a probability M that the Bat carries the Agent to another room (possibly the original room) chosen uniformly at random. If the Agent is carried to a room containing a Wumpus or a Pit, the Agent is killed as usual. Otherwise, if the Agent is carried to a room containing another Bat, that Bat carries the Agent to another random room with probability 1. This repeats until the Agent is either killed or carried to a currently safe room.
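The hazard rules above can be sketched as a small simulation. This is only an illustration, not the stencil's implementation: the function name resolve_entry, the set-based hazard layout, and the 0..num_rooms-1 room numbering are all assumptions, and the sketch treats being dropped back into a Bat room as another link in the chain.

```python
import random

def resolve_entry(room, wumpus, pits, bats, num_rooms, m, rng):
    """Return ("dead", room) or ("alive", room) after entering `room`.

    wumpus, pits, bats are sets of room numbers (hypothetical layout);
    rooms are assumed to be numbered 0..num_rooms-1.
    """
    first_bat = True
    while True:
        # A Wumpus or a Pit kills immediately, even if a Bat is also present.
        if room in wumpus or room in pits:
            return ("dead", room)
        if room not in bats:
            return ("alive", room)
        # The first Bat carries with probability m; later Bats in the
        # chain carry with probability 1.
        carry_prob = m if first_bat else 1.0
        first_bat = False
        if rng.random() >= carry_prob:
            return ("alive", room)
        # The drop-off room is uniform over all rooms, including this one.
        room = rng.randrange(num_rooms)

outcome = resolve_entry(2, {0}, set(), {2}, 3, 1.0, random.Random(141))
```

With m=1.0 and a single Bat in room 2, the Agent in this toy layout always ends up either dead in room 0 or alive in room 1.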

Senses

In rooms adjacent to or containing a Pit, the Agent perceives a Breeze with probability 1. In rooms containing (but not adjacent to) a Bat, the Agent perceives Chittering with probability 1.

In rooms adjacent to or containing a Wumpus, the Agent perceives a Stench with probability S1 if there is no Breeze and probability S2 if there is a Breeze. In rooms not containing or adjacent to a Wumpus, the Agent perceives a Stench with probability S3 if there is no Breeze, and probability S4 if there is a Breeze. The locations of stenches are fixed at the beginning of the game, so re-entering a room or killing a Wumpus won't change the Stench of a room.
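The four Stench parameters amount to a small lookup table. A minimal sketch follows; the function names are hypothetical and not part of the stencil. stench_likelihood gives the probability of the observation actually made in a room, which is the quantity an inference method would multiply into the weight of a candidate world.

```python
def stench_prob(near_wumpus, breeze, s1, s2, s3, s4):
    """Probability of perceiving a Stench in one room.

    near_wumpus: True if the room contains or is adjacent to a Wumpus.
    breeze: True if a Breeze is perceived in the room.
    """
    if near_wumpus:
        return s2 if breeze else s1
    return s4 if breeze else s3

def stench_likelihood(stench, near_wumpus, breeze, s1, s2, s3, s4):
    """Probability of the observed value (Stench or no Stench)."""
    p = stench_prob(near_wumpus, breeze, s1, s2, s3, s4)
    return p if stench else 1.0 - p

# Parameters of the first grading case: S1=0.90, S2=0.80, S3=0.10, S4=0.05.
params = (0.90, 0.80, 0.10, 0.05)
false_alarm = stench_likelihood(True, False, False, *params)
missed = stench_likelihood(False, True, True, *params)
```

Here false_alarm is S3 (a Stench with no Wumpus nearby and no Breeze) and missed is 1-S2 (no Stench despite a nearby Wumpus, in a breezy room).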

Actions

During each turn, the Agent has the option of either moving or shooting in any direction. Moving or shooting in a direction where there is no portal causes the Agent to perceive a Bump. Moving in a valid direction sends the Agent through the appropriate portal to another room. An arrow fired through a portal will move into the adjacent room, striking any Wumpus or Agent in its path.

Striking a Wumpus with an arrow kills it, and causes the Agent to perceive a Scream. Striking the Agent with an arrow kills them, and ends the game. If the Agent runs out of arrows without killing every Wumpus in the maze, the remaining Wumpii attack and eat the now helpless Agent. The only way for the Agent to win is to kill every Wumpus in the maze before running out of arrows or being killed by a hazard.

Running the Code

The stencil code is in /course/cs141/pub/src/wumpus. Try playing a round of the game yourself with the command python wumpus.py (this begins a game using the default settings, using the HumanAgent solver which asks for instructions via the command line).

To play the game, simply enter an action ("M" to move or "S" to fire an arrow) and a direction (one of the 8 compass directions, e.g. "N" or "SW"), separated by a space. The HumanAgent will let you know what room you're in, which room it connects to in each direction, and any sensory information you detect in the current room.

Parameters

In this default version of the game, the maze has 20 rooms (R=20), 1 Wumpus (W=1), 3 Pits (P=3), and no Bats (B=0). The Agent begins with 1 arrow (A=1). The Stench probabilities are deterministic and noiseless by default, with S1=S2=1 and S3=S4=0, meaning that the Agent perceives a Stench if and only if the current room contains or is adjacent to a Wumpus. Although there are no Bats by default, the default setting for M (the probability that a Bat will move the Agent) is 0.5. The default Agent implementation is HumanAgent.

In general, the command to run the game is python wumpus.py [<agent>] [<flag> <value>]*, where <agent> is the name of an Agent class (e.g. "Human"), and <flag> is a hyphen followed by a variable as named in the Gameplay section (e.g. "-W" for the number of Wumpii or "-S2" for the probability of perceiving a real stench in the presence of a breeze).
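If you want to script repeated runs, the command line can be assembled from a dictionary of parameters. The helper below is a hypothetical convenience, not part of the stencil; it assumes only the command form given above (an optional agent name followed by flag/value pairs).

```python
def build_command(agent=None, flags=None):
    """Return the argument list for one run of wumpus.py.

    agent: name of an Agent class such as "Human", or None for the default.
    flags: dict mapping variable names (e.g. "W", "S2") to values.
    """
    argv = ["python", "wumpus.py"]
    if agent is not None:
        argv.append(agent)
    # Sort the flags so the same settings always give the same command.
    for name in sorted(flags or {}):
        argv.extend(["-" + name, str(flags[name])])
    return argv

command = build_command("Human", {"W": 2, "S2": 0.75})
print(" ".join(command))
```

The resulting list can be passed as-is to subprocess.call from the directory containing the stencil code.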

Testing

In addition, you can use the "-T" flag to specify a test file, from which the game will load seed data if the file exists and into which the game will write seed data if it doesn't. This will hopefully make testing much simpler, as you can isolate any bugs that occur in your code by repeatedly running with the same pseudorandom numbers. A test file contains two random number generator seeds. One is for internal use by the game, to generate the maze and to determine when bats move. The other is used to seed the random number generator passed to your Agent at each round. If your code uses randomness to decide on an action, we highly recommend that you use this provided RNG in order to allow for consistency in testing.
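To see why drawing from a seeded generator matters, consider the sketch below. It assumes the RNG handed to your Agent behaves like Python's random.Random, which is not confirmed here; choose_direction is a hypothetical helper.

```python
import random

def choose_direction(directions, rng):
    """Pick a direction using the supplied RNG, never the global one."""
    # Sorting first makes the choice independent of container ordering.
    return rng.choice(sorted(directions))

directions = ["N", "SW", "E"]

# Two generators built from the same seed stand in for two runs of the
# game loaded from the same test file.
run_a = random.Random(141)
run_b = random.Random(141)
picks_a = [choose_direction(directions, run_a) for _ in range(5)]
picks_b = [choose_direction(directions, run_b) for _ in range(5)]
```

Both runs make identical choices, so a bug seen once can be reproduced. Calls to the module-level random.choice would draw from a generator the test file does not control.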

Finally, you can use the "-V" flag to specify verbose output (this is only on by default when using HumanAgent), and the "-Y" flag to set a delay in milliseconds, to watch your Agent in action.

Inference

Your primary task for this project will be to write code that infers the probability of various hazards. A small portion of your total grade (15 points) will be based on an Agent you build yourself, which should go beyond the naive Agents we provide in trying to maximize the overall probability of winning the game. Every class you will need to modify is located in wumpus_agents.py.

KnownWorld

Every method you are responsible for filling in includes as one of its parameters a KnownWorld object, which encodes what you've learned so far about the Wumpus world. The KnownWorld can be queried for things like the rooms you've visited and their contents, the neighbors of a room, and game parameters like Stench probabilities and the number of Pits. The complete specification for this class is in wumpus_world.py. You should familiarize yourself with this class, as the information it provides will be crucial for making inferences about the world.

RationalAgent

In order to successfully navigate the Wumpus world, an Agent absolutely must be able to infer the probability that a room contains a certain hazard. The RationalAgent class provides this capability. Any Agent implementation which works by formally reasoning about the probability of encountering a hazard extends this class (a HumanAgent is, of course, not a RationalAgent).

You are responsible for filling in the following methods in the RationalAgent class. You will probably want to do them in this order:

bat_prob(known_world), returns a map from known room numbers to the probability that each room contains a Bat, given the information encoded in known_world (15 points)
pit_prob(known_world), returns a map from known room numbers to the probability that each room contains a Pit, given the information encoded in known_world (25 points)
wumpus_prob(known_world), returns a map from known room numbers to the probability that each room contains a Wumpus, given the information encoded in known_world (35 points)
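One way to think about these methods is inference by enumeration: list every hazard placement consistent with what has been observed, and report the fraction of placements that put a hazard in each room. The sketch below does this for Pits on a toy map. It is an illustration under strong assumptions, not a solution: it does not use the KnownWorld API, it assumes the entire map is known and portals are symmetric, and the names pit_posterior, neighbors, and breeze are hypothetical.

```python
from itertools import combinations

def pit_posterior(neighbors, breeze, num_pits):
    """Posterior probability of a Pit in each room of a fully known toy map.

    neighbors: dict mapping each room to the set of adjacent rooms.
    breeze: dict mapping each visited room to whether a Breeze was felt.
    num_pits: number of Pits, placed uniformly in distinct rooms.
    """
    rooms = sorted(neighbors)
    # The Agent survived every visited room, so none of them holds a Pit.
    candidates = [r for r in rooms if r not in breeze]
    counts = dict((r, 0) for r in rooms)
    total = 0
    for config in combinations(candidates, num_pits):
        pits = set(config)
        # A Breeze is felt exactly when some adjacent room holds a Pit.
        consistent = all(
            bool(pits & neighbors[v]) == observed
            for v, observed in breeze.items()
        )
        if consistent:
            total += 1
            for r in pits:
                counts[r] += 1
    if total == 0:
        raise ValueError("observations are inconsistent with the toy model")
    return dict((r, counts[r] / float(total)) for r in rooms)

# Room 0 is connected to rooms 1, 2 and 3, and a Breeze was felt there.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
posterior = pit_posterior(star, {0: True}, num_pits=1)
```

With one Pit and a Breeze in room 0, each of the three neighbors gets probability 1/3. In the real game the Agent knows only part of the map, and enumeration grows quickly with the number of rooms, so treat this purely as a statement of what the posterior means.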

HybridAgent

This is a HumanAgent augmented with the power of rationality! It still leaves its actions up to a human, but it prints out the probability of each hazard before asking what to do. Use this class to test out your RationalAgent inference methods, and to show those Wumpii who's boss!

NaiveSafeAgent

This Agent extends RationalAgent. It first makes all moves which are known to be safe, then chooses the move with the lowest probability of encountering a hazard. The Agent will fire an arrow only if it will strike a Wumpus with probability at least 1-S3 (one minus the probability of perceiving a false stench).

You are responsible for filling in the following method in the NaiveSafeAgent class:

danger_prob(known_world), returns a map from known room numbers to the probability that each room contains any hazard, given the information encoded in known_world. The union bound is unacceptable; this method must compute the exact probability. (10 points)
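The difference between the union bound and the exact probability is easiest to see on a joint distribution for a single room. In the sketch below, joint is a hypothetical dictionary mapping (wumpus, pit, bat) triples of booleans to probabilities; how you obtain such a distribution from known_world is up to your inference code.

```python
def any_hazard_prob(joint):
    """Exact P(at least one hazard) = 1 - P(no hazard at all)."""
    return 1.0 - joint.get((False, False, False), 0.0)

def union_bound(joint):
    """Upper bound P(W) + P(P) + P(B), which double-counts overlaps."""
    p_wumpus = sum(p for (w, _, _), p in joint.items() if w)
    p_pit = sum(p for (_, q, _), p in joint.items() if q)
    p_bat = sum(p for (_, _, b), p in joint.items() if b)
    return min(1.0, p_wumpus + p_pit + p_bat)

# Toy room: Wumpus with probability 0.2, Pit with probability 0.3, no Bat.
joint = {
    (False, False, False): 0.56,
    (True, False, False): 0.14,
    (False, True, False): 0.24,
    (True, True, False): 0.06,
}
exact = any_hazard_prob(joint)
bound = union_bound(joint)
```

Here the exact value is about 0.44 while the union bound gives 0.5, because the outcome with both a Wumpus and a Pit is counted twice.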

BatSafeAgent

This Agent behaves like NaiveSafeAgent, except that it makes the move with the smallest probability of death, as opposed to the smallest probability of encountering a hazard. This means that being dropped by a Bat in a safe room is acceptable, whereas in NaiveSafeAgent simply encountering a bat is to be avoided.

For up to 5 points of extra credit, you may fill in the following method in the BatSafeAgent class:

lethal_prob(known_world), returns a map from known room numbers to the probability for each room that entering will cause immediate death, given the information encoded in known_world. Immediate death is defined as landing on a Wumpus or a Pit, either because there is one in the room or because a Bat (or chain of Bats) will cause the Agent to land on one.
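For intuition about Bat chains, suppose the hazard layout were fully known. Let D be the number of rooms holding a Wumpus or a Pit and B the number of rooms holding only a Bat. Once a Bat has picked the Agent up, the probability q of eventually dying satisfies q = D/R + (B/R)q, since later Bats always carry, so q = D/(R-B). The sketch below encodes this; the function names are hypothetical, and the real lethal_prob must also account for uncertainty about where the hazards are.

```python
def carried_death_prob(num_rooms, lethal_rooms, bat_only_rooms):
    """P(death) once a Bat has picked the Agent up, for a known layout.

    lethal_rooms: rooms containing a Wumpus or a Pit (D).
    bat_only_rooms: rooms containing a Bat but no Wumpus or Pit (B).
    """
    landing_spots = num_rooms - bat_only_rooms
    if landing_spots <= 0:
        raise ValueError("no room where a Bat chain could end")
    return lethal_rooms / float(landing_spots)

def entry_death_prob(room_is_lethal, room_has_bat, m, q):
    """P(immediate death) on entering one room of a known layout."""
    if room_is_lethal:
        return 1.0
    if room_has_bat:
        # Only the first Bat hesitates: it carries with probability m.
        return m * q
    return 0.0

# Toy layout: R=20, a Wumpus and 3 Pits in 4 distinct rooms, 1 Bat, M=0.5.
q = carried_death_prob(20, 4, 1)
risk = entry_death_prob(False, True, 0.5, q)
```

In this layout q = 4/19, so entering the Bat room is lethal with probability 2/19, roughly 0.105.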

Building Your Own Agent

To give you a chance to play around in the Wumpus world, you'll also be implementing your own CleverAgent, which should perform somewhat better than BatSafeAgent. The skeleton of this class is located in wumpus_agents.py. As inference is the primary focus of this project, the performance of your CleverAgent is only weighted at 15 points, as described in the Grading section.

You are responsible for filling in the following method in the CleverAgent class:

action(known_world), returns the action to be performed given the information encoded by known_world, as a tuple of (action, direction)

RationalAgent provides a number of useful utility methods for implementing your CleverAgent:

safe_rooms(known_world), returns a list of all known rooms with wumpus_prob=pit_prob=bat_prob=0, as computed by your inference methods
safe_paths(room, known_world), returns directions to move safely from room to every room it can reach, as a dictionary keyed on room number. Intermediate moves are guaranteed to be through safe_rooms, but it need not be safe to enter the final room of each path (fringe nodes will have entries in the result).
reachable(room, known_world), returns a set of rooms reachable through safe_paths from room
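To make the behavior of safe_paths concrete, here is a breadth-first search with the same contract on a toy map: paths pass only through safe rooms, but fringe rooms still get an entry. This is a sketch of the idea rather than the provided implementation; the portals layout (room to direction to neighbor) and the function name are assumptions.

```python
from collections import deque

def safe_paths_sketch(start, portals, safe):
    """Map each reachable room to a list of directions leading there.

    portals: dict mapping room -> dict of direction -> neighboring room.
    safe: set of rooms that are safe to pass through.
    """
    paths = {start: []}
    queue = deque([start])
    while queue:
        room = queue.popleft()
        # Fringe rooms receive a path but are never expanded further.
        if room != start and room not in safe:
            continue
        for direction, nxt in sorted(portals[room].items()):
            if nxt not in paths:
                paths[nxt] = paths[room] + [direction]
                queue.append(nxt)
    return paths

portals = {
    0: {"N": 1},
    1: {"S": 0, "E": 2},
    2: {"W": 1, "N": 3},
    3: {"S": 2},
}
paths = safe_paths_sketch(0, portals, safe={0, 1})
```

Room 2 is a fringe room here: it has a path (N, then E), but because it is not known to be safe the search does not continue through it, so room 3 has no entry.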

In addition, the KnownWorld class can serve as a Python dictionary, allowing you to cache and retrieve arbitrary data using the normal dictionary syntax known_world[key].
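A compute-once pattern on top of this dictionary syntax might look like the sketch below. A plain dict stands in for the KnownWorld here, and the sketch assumes that looking up a missing key raises KeyError as it does for a dict; check wumpus_world.py before relying on that. The helper name cached is hypothetical.

```python
def cached(known_world, key, compute):
    """Return known_world[key], computing and storing it on first use."""
    try:
        return known_world[key]
    except KeyError:
        value = compute()
        known_world[key] = value
        return value

calls = []

def expensive():
    # Stand-in for a costly inference call; records each invocation.
    calls.append(1)
    return {"answer": 42}

world = {}  # a plain dict standing in for a KnownWorld
first = cached(world, "demo", expensive)
second = cached(world, "demo", expensive)
```

The second lookup returns the stored object without calling expensive again. If a cached value depends on what the Agent has observed, include that state in the key (for example, the number of rooms visited so far) so stale results are not reused.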

Grading

You'll be graded on two things in this assignment.

You'll be graded on whether or not your inference methods compute the correct probabilities.
You'll be graded on how often your CleverAgent wins the game, relative to our reference implementation of NaiveSafeAgent. Full credit will be given for a slight improvement in success rate, and some extra credit may be available for especially good performance.

Each case below is worth 5 points. We'll be running each case 100 times, with the same seeds for the whole class, and basing your score on your CleverAgent's average performance.

R=20, W=1, P=3, B=1, A=1, M=0.5, S1=0.90, S2=0.80, S3=0.10, S4=0.05
    Time for 100 runs on TA NaiveSafeAgent: ~2 seconds
    Success rate for TA NaiveSafeAgent: 3763/10000
    CleverAgent target success rate: 40/100

R=40, W=2, P=5, B=4, A=3, M=0.5, S1=0.85, S2=0.75, S3=0.15, S4=0.10
    Time for 100 runs on TA NaiveSafeAgent: ~30 seconds
    Success rate for TA NaiveSafeAgent: 84/1000
    CleverAgent target success rate: 10/100

R=60, W=3, P=7, B=7, A=5, M=0.5, S1=0.80, S2=0.70, S3=0.20, S4=0.15
    Time for 100 runs on TA NaiveSafeAgent: ~15 minutes
    Success rate for TA NaiveSafeAgent: 2/200
    CleverAgent target success rate: 2/100

We'll allow up to 30 minutes in total to run all three 100-run test cases while grading. If you find your code is running too slowly, try caching frequently-computed information in the known_world dictionary. In addition, make sure not to query for danger probabilities unless absolutely necessary, as this is an expensive operation. Instead, take a hint from the NaiveSafeAgent and decide on a destination and a path, then move there without making additional queries.

Please note that one thing you absolutely must not do to improve your CleverAgent's performance is make your RationalAgent inference methods less exact. If you want to speed up your CleverAgent by writing separate approximate versions of these methods, that's okay, but don't approximate in RationalAgent or you'll lose points for accuracy.

When testing your code, we will be running every submission using the same 100 seeds for each test case, and we guarantee that these seeds will elicit as good or better performance than above from our NaiveSafeAgent implementation. In other words, don't worry about us randomly testing the whole class on a set of impossible mazes, or about testing one person with a good set but another with a bad set.

Handing in

The command is cs141_handin wumpus. You only need to turn in wumpus_agents.py, along with a readme declaring anything we should know, such as a requirement for Python 2.7.