## An Adaptive Proportional Value-per-Click Agent for Bidding in Ad Auctions
**Kyriakos C. Chatzidimitriou**
**Lampros C. Stavrogiannis**
Centre for Research and Technology Hellas

**Andreas L. Symeonidis** and **Pericles A. Mitkas**
Aristotle University of Thessaloniki and Centre for Research and Technology Hellas

**Abstract**
Sponsored search auctions constitute the most important source of revenue for search engine companies, offering new opportunities for advertisers. The Trading Agent Competition (TAC) Ad Auctions tournament is one of the first attempts to study the competition among advertisers for their placement in sponsored positions along with organic search engine results. In this paper, we describe agent Mertacor, a simulation-based game theoretic agent coupled with on-line learning techniques to optimize its behavior, which successfully competed in the 2010 tournament. In addition, we evaluate different facets of our agent to draw conclusions about [...]

**Introduction**
The advent of the Internet has radically altered current business [...] on-line advertising in search engine results, known as sponsored search, where paid advertisements are shown along with regular results (called impressions). Sponsored search is the highest source of revenue for on-line advertisement, yielding a profit of approximately $10.67 billion for 2009 in the U.S. alone [PwC, 2010].

In the sponsored search setting, whenever a user enters a query in the search engine (publisher), an auction is run among interested advertisers, who must select the amount of their bids, as well as the advertisements that they deem [...] positions (slots) for placement, but higher slots are more desirable, given that they generally yield higher levels of Click-Through-Rate (CTR). This field started in 1998 with GoTo.com, where slots were allocated via a Generalized First Price (GFP) auction, but received its current form in 2002, when GFP was replaced by the well-known Generalized Second Price (GSP) auction [Jansen and Mullen, 2008]. According to this auction, advertisers are ranked by bid (usually multiplied by an advertiser-specific quality factor), and the winner of a slot pays the minimum bid needed to get this position, which is slightly higher than the next bidder’s offer and independent of her own bid. What makes this type of auction different is the fact that payment is made on a per-click (cost-per-click or CPC) rather than a per-impression (cost-per-mille or CPM) basis.

Against this background, we present the strategy of agent *Mertacor*, our entrant that participated in the TAC Ad Auctions 2010 competition [Jordan *et al.*, 2010] and was placed 3rd in the finals. At a high level, Mertacor’s strategy can be decomposed into two parts: (a) estimating the Value-per-Click (VPC) for each query and (b) choosing a proportion of VPC for bidding in each auction based on the state of the game (the adaptive proportional part). The approach is similar to the QuakTAC agent [Vorobeychik, 2011], which participated in the 2009 competition, with two extensions: (a) a k-nearest-neighbors algorithm to help in the estimation of VPC and (b) an n-armed bandit formulation, associative to the state of the game, of the problem of selecting the proportion [...]

The remainder of this paper is organized as follows: Section 2 provides a brief description of the game. Section 3 presents strategies of agents that participated in the previous competition. Section 4 builds the background upon which our agent was based and gives a detailed description of the extension points. A discussion of the conducted experiments is given in Section 5. Finally, Section 6 concludes the paper and provides our future research directions.

**The TAC Ad Auctions Game**
Sponsored search auctions are open, highly complex mechanisms that are non-dominant-strategy-solvable, hence bidding strategies are a topic of active research. To investigate their behavior, a realistic agent-based simulator seems essential [Feldman and Muthukrishnan, 2008]. The Ad Auctions (AA) platform in the international Trading Agent Competition (TAC) is such a system. The TAC AA game specifications are defined in detail in [Jordan *et al.*, 2010]. To familiarize the reader with the game, we will provide some basic information about the entities involved and the interactions [...]

In the TAC AA tournament, there are three main types of entities: the publisher, a population of 90000 users, and eight advertiser entrants represented by autonomous software agents. The advertisers compete against each other for advertisement (ad) placement across search pages. Each one of the search pages contains search engine results for one

of the queries of 16 different keyword sets. In order to promote their products, the agents participate in ad auctions by submitting a bid and an ad to the publisher for the query (set of keywords) they are interested in. Ads are ranked on each search page, based on a generalized method that interpolates between rank-by-bid and rank-by-revenue schemes. Each day, users, according to their preferences and state, remain idle, search, click on ads and make purchases (conversions) from the advertisers’ websites. The products being traded are combinations of three brands and three types of components from the domain of home entertainment. The small number of products enables competing teams to focus only on a small set of predefined keywords, abstracting away from the problems of keyword selection. The three manufacturers (namely, Lioneer, PG and Flat) and the three types of devices (TV, Audio and DVD) constitute a total of nine products. The simulation runs over 60 virtual days, with each day lasting 10 seconds. A schematic of the interactions between game entities is found in Figure 1.

Figure 1: The entities participating in a TAC AA game along with their actions. The majority of the users submit queries, fewer click on ads and an even smaller percentage makes [...]

**Publishers**
As mentioned above, the publisher runs a GSP auction to determine the rank of bids and the payment per click. The ad placement algorithm takes into account predefined reserve scores. There is one reserve score below which an ad will not be posted, and one above which an ad will be promoted. If the spending limit set by an agent is passed, the rankings are recalculated. The auction implemented is a GSP, where the ranking takes into account the quality of the advertisements, weighted by a squashing parameter that is disclosed to the entrants at the beginning of the game.

**Users**
Each user has a unique product preference and can be in different states representing his or her searching and buying behavior (i.e. non-searching, informational searching, shopping, with distinct levels of shopping focus, and transacted). The product preference distribution is even for all products. Users submit three kinds of queries, defined by their focus level, for a total of 16 queries. There is one (1) F0 query, where no manufacturer or component preference is revealed, six (6) F1 queries, where only the manufacturer or the product type is included in the query, and nine (9) F2 queries, where the full product definition (manufacturer and type) is exposed. Users’ daily state transition is modeled as a Markov chain. Non-searching and transacted users do not submit queries. Informational users submit one of the three queries by selecting any one of them uniformly, and focused users submit a query depending on their focus level. While both information seeking and focused users could click on an ad, only focused users make purchases and go to the transacted state. Users click on the advertisements based on an extended version of the cascade model [Das *et al.*, 2008]. After clicking on an ad, whether a conversion will be made or not depends on user’s state, advertisers’ specialty and remaining [...]

**Advertisers**
Each advertiser is a retailer of home entertainment products and can supply the user with any of the nine products available. Upon initialization of each simulation, advertisers are given a component and a manufacturer specialty, yielding an increase in conversion rates for the former and an increase in profit per unit sold for the latter. Additionally, entrants are assigned a weekly maximum stock capacity $C^{cap} \in \{C^{LOW}, C^{MED}, C^{HIGH}\}$, so conversions above this threshold are less likely to happen during this week (5 working days). During its daily cycle of activities, the advertiser [...]

• Send the bids for ad placement per query for day d + 1.

• Select an ad for ad placement per query for day d + 1. The ad can be either generic (i.e. referring to a general electronics shop) or targeted (i.e. stating a specific manufacturer-component combination). A targeted ad that is a match to the user preferences increases the prob[...]

• Set spending limits for each query and across all queries [...]

• Receive reports about the market and its campaign for [...]

**Related Work**
Given that the tournament started two years ago, relevant published work on TAC AA is limited. The majority of strategies focus on two target metrics, namely the

*Return on Investment* (ROI), i.e. the ratio of profit to cost, and the *Value per Click* (VPC), i.e. the expected profit from a conversion given a click, and combined with multiple choice knapsack (MCKP) models to deal with the distribution constraint effect.

*TacTex* [Pardoe *et al.*, 2010], the winner in the previous competitions, implements a two-stage strategy of estimation and optimization. The former incorporates self and opponent related predictions of desired variables as well as user state estimation. More specifically, this agent tries to extract the range of impressions, ranking and amount of bids, as well as the type of ads shown. Then, it estimates the proportion of users in each state and utilizes this information to predict the most profitable number of conversions per query type. The optimization takes into consideration the distribution constraint effect, hence both short-term and long-term optimizers are incorporated in the strategy. Finally, there is an estimator for the unknown game parameters, i.e. continuation probability and quality factor.

Another competitor for 2009, *Schlemazl*, employs rule-based as well as dynamic programming techniques. According to the latter, the bidding process is modeled as a penalized MCKP, where the value of each query is the profit made and the weight is the number of sales [Berg *et al.*, 2010]. For the rule-based method, the agent’s strategy targets the same ROI in all queries. A similar principle is implemented in *EPFLAgent 2010* [Cigler and Mirza, 2010], which targets the same profit per conversion over all queries and distributes the number of sales uniformly over the five most recent days. If the number of sales exceeds this threshold, the bid for the most profitable keyword is increased, otherwise the bid on the least [...]

*Tau* [Mansour and Schain, 2010] follows a reinforcement learning (soft greedy) approach, based on regret minimization, where the agent selects a value from a distribution on the space of bids, so that the regret, which is the difference between actual profit and maximum profit, is minimized. On the other hand, *DNAgent* [Munsey *et al.*, 2010] follows an evolutionary computation technique, by creating a population of agents (referred to as finite state machines) that are genetically evolved until the fittest is determined. Each agent can be in seven different states, based on its previous position, which is combined with the matching of the advertisement to the user to determine bid adjustments for the next day.

Finally, *AstonTAC* [Chang *et al.*, 2010] and *QuakTAC* [Vorobeychik, 2011] follow VPC-based strategies. The former estimates the market VPC, which is the VPC minus relevant cost, and then bids a proportion of this value based on the critical capacity (i.e. capacity beyond which the expected cost is higher than the corresponding revenue) and the estimated quality factor for the advertiser. Priority is also given to the most profitable queries. On the other hand, QuakTAC follows a simulation based Bayes-Nash equilibrium estimation technique to find the optimum percentage of the VPC to bid. One important advantage of the VPC strategy is that it does not require any opponent modeling techniques, which, in turn, demand historical information about opponents’ bidding behavior that is difficult to obtain in reality.

**Agent Mertacor**
**Background**
The baseline strategy of agent Mertacor is a modified version of the aforementioned QuakTAC strategy for the 2009 tournament [Vorobeychik, 2011]. This is one of the few reported strategies in TAC that employs simulation based game theoretic analysis, and it proved quite successful in that tournament, as QuakTAC was placed 4th in the finals. It is a sound, elegant and yet simple strategy. For the bidding part, Vorobeychik considers a simple strategy space, with bids that are linear in the valuation of the player. For the AA scenario, this valuation is the advertiser’s VPC, υ, and the agent bids a proportion (shading), α, of this value. For each query q, the bid [...]

Given that GSP auctions are not incentive compatible [Lahaie, 2006], we expect that α < 1. An estimate of the VPC for a keyword can be expressed as the product of the probability of converting after a click on the ad and the expected revenue from such a conversion. So for each query q, the agent’s estimated VPC value can be calculated as:

$$\upsilon_q = \Pr\{\text{conversion} \mid \text{click}\} \cdot E[\text{revenue}_q \mid \text{conversion}]$$

The expected revenue for a query q given a conversion, E[revenue_q | conversion], solely depends on the advertiser’s Manufacturer Specialty (MS) and can be calculated with no additional information for the three possible cases [...]

where USP is the unit sales profit ($10 in TAC 2010) and MSB the manufacturer specialty bonus (0.4 in TAC 2010). However, for calculating the probability of conversion, we need two additional estimations, the proportion of focused users and past and future conversions that have an impact on the conversion probability due to the distribution constraint [...]

Pr{conversion | click} = focusedPercentage · [...]

To calculate a value for the focusedPercentage estimate, we used the following procedure. If we fix advertisers’ policies, then the proportion of focused users is equal to the ratio of the clicks that result in conversions divided by their individual probability of conversion (incorporating the distribution constraint effect). The user population model in TAC AA is a Markov chain and, despite the incorporation of the burst transitions, it will typically converge to a stationary distribution. Hence, instead of using a more sophisticated and accurate particle filter, we have used Monte Carlo simulations, running 100 games with symmetric, fixed strategy profiles, making sure that their bids are always higher than

those requested by reserve scores, and have then recorded the mean values for our desired ratios (every day results for individual queries are aggregated for each focus level). Historical proportions are kept in log files and are updated incrementally. In our experiments, we have used last day’s probability of conversion as a close estimate of the requested value for [...]

Our first differentiation compared to the original strategy lies in the estimation of the conversion probability, the distribution constraint effect, $I_d$. As Jordan has noticed in his thesis [Jordan, 2010], the distribution constraint effect, $I_d$, is the second most influential factor in an advertiser’s performance after the manufacturer specialty in terms of the coefficient of determination values. This effect radically affects the probability of purchase. Based on the specifications of the game for the calculation of the distribution constraint effect for day d + 1, we need, but don’t have, the conversions, c, [...]

where g is a function defined in the specification that gives [...]

Given that an entrant must calculate a bid with a 2-day lag in available information, QuakTAC estimates tomorrow’s $I_{d+1}$ value using a two-step bootstrapping technique. More specifically, it estimates conversions on day d as the product of the current game’s historical click-through rate, recorded historical impressions and (optimistically) estimated conversion probability. Then it uses this information to estimate day d+1’s conversion rate and corresponding conversions and uses [...]

Having implemented that strategy, we realized that it performs poorly on the 2010 server specifications. [...] that it underestimates the VPC, systematically bidding much lower than what is allowed by the publisher’s reserve scores. It is important to note that these scores for 2010 are much higher than in 2009. Jordan describes the effect of the reserve score on publisher’s revenue as well as players’ equilibrium [...] showing that AstonTAC and Schlemazl perform much better than TacTex for these higher reserve scores. To rectify this shortcoming, we have implemented a simpler but more reasonable expectation: we aggregate the last three days’ conversions and set 1/[...] of this value as our number of conversions for day d as well as day d+1. This is slightly lower than their corresponding mean value so as to intentionally be more conservative. However, when our capacity is $C^{LOW}$, we have used instead the mean value, which was experimentally shown to [...]

Finally, we had to select the percentage of our VPC to bid. Following the methodology of Vorobeychik, we restrict our interest to simple symmetric equilibria and discrete values ranging from 0 to 1 with a step of 0.1. This means that all but one of the advertisers follow the same strategy (bid = αυ) and a single player bids another percentage, $\alpha_{single}$, which varies among games. The values for $\alpha_{single}$ are determined via an iterative best response method, where we start from a truthful homogeneous strategy profile (α = 1, $\alpha_{single}$ = 1). This is a reasonable value to start, as GSP is considered a generalization of the Vickrey auction. Then we find the best $\alpha_{single}$ deviation from this profile and use this last value as the new homogeneous profile. This process is repeated until self-response is a Bayes-Nash equilibrium. We have validated the speed of this method, being able to get the best value of $\alpha_{single}$ = 0.3 in three iterations (1, 0.4 and 0.3).

The alpha value of 0.3 and the new method for calculating $I_d$ are the differentiations we made to the agent playing in the qualifications round of TAC AA 2010 and further defined as [...]

**Ad Selection Strategy**
It is also important to describe our ad selection strategy. This task is straightforward for F0 (no keywords) and F2 (containing both brand and product keywords) type queries. For the first case, there is a 1/[...] probability of matching a user’s preference, so a generic ad seems most appropriate. On the other hand, users reveal truthfully their preferences, hence a targeted F2 ad will always match them. For the F1 keywords (containing either brand or product keywords), we have also used a targeted advertisement, where our agent’s respective specialty is shown when the manufacturer or the component is not disclosed, hoping that our increased profit or conversion probability benefits will outweigh our higher probabilities of mismatch. This strategy also proved to be effective in the original QuakTAC agent [Vorobeychik, 2011].

**Extensions**
For the final rounds, we tried to improve our agent in the two critical calculations of: a) estimating the distribution constraint effect and b) finding an appropriate α value by adapting it on-line according to the current market conditions.

In the first case, we used the k-Nearest-Neighbors (k-NN) algorithm to estimate the capacity to be committed for the current day, $c_d$, and then used that estimate to further predict the capacity to be purchased on day d+1, $c_{d+1}$. We chose k = 5 and stored 5-day tuples for the last 10 games (600 samples). We adopted this kind of modeling because the agent executes a clear 5-day cycle in its bidding and conversion behavior. This behavior is derived from the strategy of the agent since, when having enough remaining capacity, it will produce high bids due to high VPC values and get ranked in the first positions. That way it will receive many clicks and a high volume of conversions will take place. This behavior will continue for one more day, until the agent’s store is depleted. Then the bids will be low and the agent will maintain a no conversions [...]

In order to choose the proportion of the VPC to bid in a real antagonistic environment, we have formulated the problem of choosing α as an associative n-armed bandit problem [Sutton and Barto, 1998]. Under this formulation, we switched bidding from b(υ) = α · υ to b(α, υ) = α · υ.

Based on the experience gained from the TAC SCM domain, we tried to maintain a low dimensionality in the state and action spaces for sampling efficiency. We chose VPC and query types as state variables. The choice of VPC was made because its calculation incorporates many parameters that characterize the state of the agent and the market, such as specialties, distribution capacity constraint, and proportion of focused users. The query type was added as a state variable in order to provide additional information about the current state from another dimension. VPC was quantized over 11 states,
10 for VPC values between $0 and $10 and 1 for VPC values above $10. The 16 query types were mapped into 8 states as presented in Table 1. There were 6 discrete actions picking values of α between equally spaced points from 0.275 to 0.325 inclusive. So a total of 528 Q(s, a) values need to be stored and updated. For exploration purposes an ε-greedy policy was used with ε = 0.1. The goal is to learn a policy for selecting one of the 6 actions that would maximize the daily profit given the state of the game. Updates to Q values were made [...]

$$Q(s, a)_{k+1} = Q(s, a)_k + 0.1 \cdot (r_k - Q(s, a)_k)$$

Reward is the daily profit, calculated as the revenue minus the cost for the corresponding state and action. On day d we receive the report for day d − 1, from which we can calculate the reward for day d − 1. That reward concerns the bids made on day d − 2 for day d − 1.

Table 1: Mapping query types to states. One state for the F0 query, 3 states for F1 queries and 4 states for F2 queries.

**Analysis**
To validate the effectiveness of this iterative best response technique, after the tournament we have also repeated the experiments for all possible combinations of homogeneous and single α values and have plotted the best $\alpha_{single}$ as a function of the symmetric opponent profile. Each experiment was repeated 5 times for a total of 500 log files. Results are given in Figure 2. As can be seen, there are only two close values for $\alpha_{single}$ (0.3, 0.4), although 0.3 is the best response for more reasonable strategies (α < 0.5). Hence, this technique yields robust results in only 3% of the total time required for extensive experiments. Moreover, we should note that the values of 0.1 and 0.2 used by QuakTAC last year are not profitable at all in this year’s platform due to the aforementioned reserve score [...]

The final version of agent Mertacor got the third place in the TAC AA 2010 competition. The standings are shown in Table 2.

Table 2: TAC AA 2010 tournament results.

To evaluate the effectiveness of the agent and the extensions incorporated, two tournaments were conducted. Four versions of agent Mertacor were constructed. *Mertacor-Quals* is the agent that participated in the qualifications of the 2010 tournament. *Mertacor-kNN* has only the k-NN capability, while it continues to bid with α = 0.3. *Mertacor-RL* has the ability to adapt α, but is not equipped with k-NN. Last but not least, *Mertacor-Finals* combines all the modifications and extensions described in this paper and is the agent that participated in the 2010 finals. In the first tournament, two agents from each one of the four versions of the agent competed, all having the same storage capacity, equal to $C^{MED}$. In the second tournament the storage capacities were selected in competition terms (2 agents with $C^{LOW}$, 4 with $C^{MED}$ [...]

Table 3: Average scores over 88 games of different versions of agent Mertacor with equal distribution capacity constraint [...]

Results in Table 3 indicate that the extensions added to the Mertacor-Finals version of the agent are giving the agent a small boost. Both versions were able to get the first two positions in this tournament. Even though the differences are not large and not statistically significant under paired t-testing, in tightly played competitions they could make the difference over ranking positions, like in 2010. Moreover, we believe that by optimizing these techniques, now implemented rather crudely, more profit is possible. It is also evident that including both extensions has more benefits than [...]

When dealing with different capacities between games, the domain becomes more challenging, especially for on-line learning methods. In Table 4, one can find the results of the second tournament. Again the differences are not large, but there are small deviations between versions. The k-NN version is able to estimate better the capacity to be used by the agent and this benefits the final scoring of the agent. In Figure 3, one can observe the quality of the k-NN predictions. It is possible that by optimizing this algorithm, extremely accurate results could be obtained. As a general comment, all versions of the agent, in both tournaments, were able to maintain their bank accounts at the level agent Mertacor scored in the finals. This is evidence of a robust strategy against both similar and different opponents.
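The k-NN conversions estimate discussed here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the agent's actual code: the paper only says that k = 5 and that 5-day tuples are stored, so the reading of a tuple as "four preceding days of conversions, then the fifth day as the target", the Euclidean distance, and the helper names `make_samples`, `knn_predict` and `predict_two_days` are all hypothetical.

```python
import math

def make_samples(daily_conversions, window=4):
    """Slide over one game's daily conversions: `window` days -> next day.

    Assumption: a stored "5-day tuple" is four feature days plus one target day.
    """
    return [
        (tuple(daily_conversions[i:i + window]), daily_conversions[i + window])
        for i in range(len(daily_conversions) - window)
    ]

def knn_predict(samples, query, k=5):
    """Average the targets of the k stored samples closest to `query`."""
    ranked = sorted(samples, key=lambda s: math.dist(s[0], query))
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)

def predict_two_days(samples, recent, k=5):
    """Estimate c_d first, then feed it back in to estimate c_{d+1}."""
    c_d = knn_predict(samples, tuple(recent), k)
    shifted = tuple(recent[1:]) + (c_d,)
    c_d1 = knn_predict(samples, shifted, k)
    return c_d, c_d1

# Toy history with a 5-day cycle (made-up numbers, for illustration only).
history = [80, 60, 5, 5, 5] * 12
samples = make_samples(history)
c_d, c_d1 = predict_two_days(samples, [60, 5, 5, 5])
```

The two-step call mirrors the text: the estimate for the current day is appended to the recent history before the next day is predicted, so any error in the first step carries over into the second.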

Table 4: Average scores over 88 games of different versions of agent Mertacor in competition mode with respect to distribution capacity constraints.
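The iterative best-response search that this section validates can be sketched as a fixed-point loop. The sketch below assumes a hypothetical callable `best_response(alpha)` that returns the most profitable single-player deviation against a homogeneous profile `alpha`; in the paper that quantity comes from simulated games, whereas here it is a made-up lookup table chosen only to echo the reported sequence 1, 0.4, 0.3.

```python
def iterative_best_response(best_response, start=1.0, max_iter=20):
    """Follow best responses until a profile is its own best response."""
    profile = start
    path = [profile]
    for _ in range(max_iter):
        deviation = best_response(profile)
        if deviation == profile:
            # Self-response: no single player gains by deviating.
            return profile, path
        profile = deviation
        path.append(profile)
    raise RuntimeError("no fixed point found within max_iter iterations")

# Illustrative table only; real values would be estimated by simulation.
toy_table = {1.0: 0.4, 0.4: 0.3, 0.3: 0.3}
alpha_star, visited = iterative_best_response(toy_table.__getitem__)
```

Only the profiles on the path are ever simulated, which is why this search needs a small fraction of the games required to evaluate every combination of homogeneous and single α values.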

**Conclusions and Future work**
In this report we have described our agent, *Mertacor*, for the TAC AA 2010 tournament. We have seen how empirical game theory can provide robust and effective results in a restricted real-world setting, when operationalized appropriately using simple abstractions. We were also able to discuss the importance of the distribution constraint effect and the reserve score in our results, which significantly influenced our agent’s performance before and during the tournament. Last but not least, we have elaborated on two extensions, k-nearest neighbors and reinforcement learning, that differentiate our agent from related work and provide added value to the agent with respect to making better predictions and adapting to the environment and the competition.
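The reinforcement learning extension mentioned here, an associative n-armed bandit over α, can be sketched as follows. The sizes (11 VPC states, 8 query states, 6 actions between 0.275 and 0.325), ε = 0.1 and the step size 0.1 come from the paper; the class name `AlphaBandit`, the zero initial values, the bin boundaries in `vpc_state`, and treating the query state as a ready-made integer 0..7 (the Table 1 mapping is not reproduced) are assumptions of this sketch.

```python
import random

ALPHAS = [0.275 + 0.01 * i for i in range(6)]  # six equally spaced actions
N_VPC_STATES, N_QUERY_STATES = 11, 8
STEP, EPSILON = 0.1, 0.1

def vpc_state(vpc):
    """Ten unit-wide bins for $0-$10 and one bin above (boundaries assumed)."""
    return min(int(max(vpc, 0.0)), 10)

class AlphaBandit:
    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.q = {}  # (vpc_state, query_state, action) -> estimated daily profit

    def select(self, state):
        """Epsilon-greedy choice of an action index for the given state."""
        if self.rng.random() < EPSILON:
            return self.rng.randrange(len(ALPHAS))
        values = [self.q.get(state + (a,), 0.0) for a in range(len(ALPHAS))]
        return values.index(max(values))

    def update(self, state, action, reward):
        """Q <- Q + 0.1 * (r - Q), the constant-step update from the paper."""
        key = state + (action,)
        old = self.q.get(key, 0.0)
        self.q[key] = old + STEP * (reward - old)

bandit = AlphaBandit(random.Random(0))
state = (vpc_state(3.7), 0)          # VPC of $3.70, query state 0 (hypothetical)
action = bandit.select(state)
bid = ALPHAS[action] * 3.7           # b(alpha, vpc) = alpha * vpc
bandit.update(state, 2, 100.0)       # reward: daily profit for that state/action
```

Because the reward for a bid only arrives with the report two days later, a real agent would have to remember which state and action produced each bid before calling `update`; that bookkeeping is left out here.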

As future work, we would first need to use a more accurate user state predictor, such as the one implemented by TacTex. Moreover, we would like to extend our strategy to quadratic functions of the VPC, to incorporate the decrease in α for smaller corresponding valuations [Vorobeychik, 2009], as was also implemented in QuakTAC during the finals of 2009. Other non-linear function approximators could also be tested, under off-line parameter learning schemes for reducing the time needed to converge to an optimal policy. Finally, it would be desirable to identify the key parameters that influence, to a greater or lesser degree, the optimal bidding percentage.
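As a point of reference for the quadratic extension mentioned above, the linear rule used throughout the paper (bid = α · υ, with υ the product of conversion probability and expected revenue) can be sketched as below. USP = 10 and MSB = 0.4 are taken from the text; the three revenue cases in `expected_revenue` are an assumption of this sketch, since the paper's own case equation is not recoverable here: it supposes that the bonus applies multiplicatively on a manufacturer match and that the three manufacturers are equally likely when the query names none.

```python
USP = 10.0  # unit sales profit in TAC 2010 (from the text)
MSB = 0.4   # manufacturer specialty bonus in TAC 2010 (from the text)

def expected_revenue(query_manufacturer, specialty):
    """Assumed cases: manufacturer not in query, match, mismatch."""
    if query_manufacturer is None:
        # Assumption: each of the three manufacturers is equally likely.
        return USP * (1 + MSB / 3)
    if query_manufacturer == specialty:
        return USP * (1 + MSB)
    return USP

def value_per_click(p_conversion, revenue):
    """VPC = Pr{conversion | click} * E[revenue | conversion]."""
    return p_conversion * revenue

def linear_bid(alpha, vpc):
    """Bid a fixed proportion (shading) alpha of the value per click."""
    return alpha * vpc

vpc = value_per_click(0.2, expected_revenue("PG", "PG"))  # 0.2 is illustrative
bid = linear_bid(0.3, vpc)
```

A quadratic variant would replace `linear_bid` with a function whose effective α shrinks for small valuations; its exact form is not specified in this paper, so it is not sketched.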

Figure 3: k-NN enabled predictions of conversions for the future day (dashed line). The solid line indicates the actual [...]

**References**
[Berg *et al.*, 2010] Jordan Berg, Amy Greenwald, Victor Naroditskiy, and Eric Sodomka. A knapsack-based approach to bidding in ad auctions. In *Proceeding of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence*, pages 1013–1014, Amsterdam, The Netherlands, 2010. IOS Press.

[Chang *et al.*, 2010] Meng Chang, Minghua He, and Xudong Luo. Designing a successful adaptive agent for TAC ad auction. In *Proceeding of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence*, pages 587–592, Amsterdam, The Netherlands, [...]

[Cigler and Mirza, 2010] Ludek Cigler and Elias Mirza. EPFLAgent: A Bidding Strategy for TAC/AA. Presentation in Workshop on Trading Agent Design and Analysis (TADA), ACM EC 2010, July 2010.

[Das *et al.*, 2008] Aparna Das, Ioannis Goitis, Anna R. Karlin, and Claire Mathieu. On the effects of competing advertisements in keyword auctions. Working Paper, 2008.

[Feldman and Muthukrishnan, 2008] [...] S. Muthukrishnan. Algorithmic methods for sponsored search advertising. In Zhen Liu and Cathy H. Xia, editors, *Performance Modeling and Engineering*, 2008.

[Jansen and Mullen, 2008] Bernard J. Jansen and Tracy Mullen. Sponsored search: an overview of the concept, history, and technology. *International Journal of Electronic Business*, 6(2):114–131, 2008.

[Jordan *et al.*, 2010] Patrick R. Jordan, Ben Cassell, Lee F. Callender, Akshat Kaul, and Michael P. Wellman. The ad auctions game for the 2010 trading agent competition. Technical report, University of Michigan, Ann Arbor, MI 48109-2121 USA, 2010.

[Jordan, 2010] Patrick R. Jordan. *Practical Strategic Reasoning with Applications in Market Games*. PhD thesis, University of Michigan, Ann Arbor, MI, 2010.

[Lahaie, 2006] Sébastien Lahaie. An analysis of alternative slot auction designs for sponsored search. In *EC ’06: Proceedings of the 7th ACM conference on Electronic commerce*, pages 218–227, New York, NY, USA, 2006. ACM.

[Mansour and Schain, 2010] Yishay Mansour and Mariano Schain. TAU Agent. Presentation in Workshop on Trading Agent Design and Analysis (TADA), ACM EC 2010, July 2010.

[Munsey *et al.*, 2010] M. Munsey, J. Veilleux, S. Bikkani, A. Teredesai, and M. De Cock. Born to trade: A genetically evolved keyword bidder for sponsored search. In *Evolutionary Computation (CEC), 2010 IEEE Congress on*, pages 1–8, 2010.

[Pardoe *et al.*, 2010] David Pardoe, Doran Chakraborty, and Peter Stone. TacTex09: A champion bidding agent for ad auctions. In *Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010)*, May 2010.

[PwC, 2010] PwC. IAB U.S. Internet Advertising Revenue [...]

[Sutton and Barto, 1998] Richard S. Sutton and Andrew G. Barto. *Reinforcement Learning: An Introduction*. MIT Press, Cambridge, MA, 1998.

[Vorobeychik, 2009] Yevgeniy Vorobeychik. Simulation-based game theoretic analysis of keyword auctions with low-dimensional bidding strategies. In *Twenty-Fifth Conference on Uncertainty in Artificial Intelligence*, 2009.

[Vorobeychik, 2011] Yevgeniy Vorobeychik. A Game Theoretic Bidding Agent for the Ad Auction Game. In *Third International Conference on Agents and Artificial Intelligence (ICAART 2011)*, January 2011.

Source: http://kyrcha.info/wp-content/uploads/2010/03/tada2011.pdf
