CMPS140, Winter 2012, Section 01: RPS-Safari Rules

Optional Article on Why is RPS Important? http://www.bloomberg.com/news/2012-02-07/a-bar-may-be-best-place-to-understand-markets-commentary-by-mark-buchanan.html

4.0 P4:

CMPS240 (in addition to below:)  (100 points)

                        Report on how your P3 agent performed in the actual tournament. Update your paper to include

your P4 agent's design, strategy and expected performance.   Also add a paragraph about what you learned about AI

in the project and a note about whether you are interested in doing further individual or group work to publish a paper.

 

EVERYBODY  (Cmps140 and Cmps 240):

You may now use at most 300 lines for your code and 3-semilocals that may grow to 300.

 You may now select an agent to bid on (e.g. (3 7) is a bid of 3 on agent7), instead of R, P or S. In order to do so, use a 0-based index to select the agent. When bidding on an agent, you can bid  an integer between 1 and 10. You cannot bid a negative number. If your score is less than or equal to 0, you can only bid 1. Any other bid is illegal. Any other number (higher than the max number of agents or negative) will also mark an agent as being illegal.

  For playing R, P or S, you have to pay the same penalty as in P3 rules. That penalty is the total return.   For playing an agent, you have to pay a penalty based on the avg return of that agent (and the sum of any agents that agent calls).   Penalties are additive, whereas bids are multiplicative.   Your "return" becomes: (Product(Bids))*Sign(result)-(Sum(penalties)) Where penalties is the total return for R, P or S, and the Average Return for agents.   In the case of more than one agent, each penalty is summed and the bids are multiplied.  

Agent 1's avg return is 50

Agent 2's avg return is 200

Agent 3's avg return is -300

R's total return is 200  

Agent 1 bids 3, selects Agent2

Agent 2 bids 5, selects Agent3

Agent 3 bids 100, selects 'R  

Agent 1's effective bid becomes 3*5*100=1500 on R. 

Agent 2's effective bid becomes 5*100=500 on R

Agent 3's bid is still 100 on R.  

so Agent 1's score is: (3*5*100 * Sign (# of S  - # of P) - (200+-300+200) = 1500 - 100 = 1400

Agent 2's score is 5 * 100 - (-300+200) = 600

Agent 3's score is 100 - 200 = -100.  

 

If a cycle occurs, e.g. some agent is called which has already been called before (including selecting your agent's own number), then an effective 0 nullifies the bid, but all penalties/bonuses remain, except the final penalty/bonus (the one calculated using total-returns).

In the above example, if agent 3 had instead bid 4 and select Agent 1, then the scores would be as follows:  

 

Agent 1's effective bid is 0, and the score changes by:-(200-300+50)=+50, with no entries (R, P or S) into the tournament.

Agent 2's effective bid is 0, and the score changes by: -(-300+50+200)=+50, with no entries (R, P or S) into the tournament.

Agent 3's effective bid is likewise.  

3.1 P3

Rules for P3:

1.   Due: Tuesday March 6   (70 pts. for compiling, 0-35 points for your rank versus other agents).  Out of 100 pts.

                                          (P2 agents will be included, your score will be based on whichever of your P2  or P3

agents does better).

         CMPS240 (in addition to above:)

                        Report on how your P2 agent performed in the actual tournament. Update your paper to include

your P3 agent's design, strategy and expected performance.   (70 pts.).

                             


2.   All the rules are the same as P2 except:
    a. You may use 3 semi-local variables:

          These may be lists provided that the original program code is not more than 150 lines

     and each semi-local variable does not grow to more than 150 lines.
    b. Up to 150 80-character lines of code
    c. Up to 4 seconds per move.
    d. Each round, the total past return of the play (R,P, or S) you make, times the sign of your bet size

       will be subtracted from your score.
         (if negative, subtracting will produce a bonus :) for playing it!).

         For example, suppose the total return of P  (the second entry) is 4.
         Suppose you bet (7 P) and there are 4 net rocks and 6 net scissors
         then your return is 7*[sign(4-6)]  -1*4 = -11
   
 and the total  return of P will be then updated at the end of the round using -1)...

 

net = bet * sign(victims(x)-killers(x)) - sign(bet) * *total-returns*(x).

2.1 -- 2/9/2012

Each agent starts with 1 point. There are 1000 rounds per tournament.

  1. Each agent receives 3 inputs
    •   h - a list of lists(one per round) of net rock, paper, and scissors played in that round. where the rounds are ordered most recent to oldest.

      ex: (after two rounds: ((1 7 1) (2 3 -3)))

    •  s - a list giving the total score for each agent

      ex: if there are 7 agents the list starts out as (1 1 1 1 1 1 1)

    • The current agent's  total points.    ex: 1  at first.
  2. A legal play must be a list of an integer bid followed by R, P or S.
    If the agent has n (more than 0) points, he may bid -n,.....,-1,1,....,n (note that 0 is omitted  and we WILL NOT allow nil) else (n<=0) he/she must bid -1 or 1.
    For example, on the first move   (-3 R) is a legal play.
  3. A player's net on each round = bid*sign(#victims-#killers) where sign is either -1 if #victims-#killers < 0, 0 if #victims-killers==0 and 1 if #victims-killers>0. The net is added to the player's total for the next round. (R has victims S and killers P, e.g.) This means you can lose as much as you bet each round, depending on the total outcome.
  4. Players are ranked by their total points after 1000 rounds.
  5. Agents are limited to 800 ASCII charcaters and .002 seconds per move.

P2 continued:

Player's score will be based on their average rank after 1000 tournaments.

[specifically, in P2, if your agent plays legally you will get 70 + YourRank/2].

For P2, you will be expected to do the following to the monitor provided below: (50 pts total)

  1. Add to the legalp function. There are cases not handled. (15)
  2. Write a 1-2 line comment explaining what each defun function does. (20)
  3. Create a global list *avg-returns* that contains the average return of (1 R), (1 P), (1 S) and each agent from the beginning of the game. (15). For example, if there are 4 agents, *avg-returns* will be a list of length 7  of numbers,   with the first 3 slots for R, P and S returns respectively followed by the 4 agent returns.

This global list will be used in P3. However, you may also access it in P2 (but may access no other globals).

        If in CMPS140, your total score will be out of 150 points.

        If In CMPS240, you must also (60 pts):

              4. Write a 1-2 page paper describing the game, your strategy, and why and

                  how you expect your agent to perform. Use of figures or diagrams is highly desirable.

                  [Your paper will be further expanded for P3 and P4.].

You will be submitting two files:

  • agent.lisp: your agent, which contains only your agent lisp function (agent program). You may give your agent a one character name, or name it "agent." 
  • monitor.lisp: the revised monitor file. The revised monitor file should contain the "fixed" legalp function, which may involve changing how that function is called. It will also contain the additional code such that the *global-list* variable, after running monitor, contains the average returns as described above (3). You may modify any function or add additional functions to the file to accomplish this.


To test your agent, you can load your agent file, then load the monitor file, and then run the monitor the usual way from the interpreter: (monitor '(simple-agent agent random-agent) 1000), where 1000 is the number of times to run tournaments and agent is the name of the function.

Any other questions, feel free to email me or see me during section. The latest submission instructions will be posted to the course website,   http://courses.soe.ucsc.edu/courses/cmps140/Winter12/01

NOTE: To compile the monitor and agents, execute the sbcl command (compile-file "monitor-given.lisp")

and then (load "monitor-given.fasl").

Attachments