
An Adaptive Proportional Value-per-Click Agent for Bidding in Ad Auctions

Kyriakos C. Chatzidimitriou, Lampros C. Stavrogiannis, Andreas L. Symeonidis and Pericles A. Mitkas
Aristotle University of Thessaloniki and Centre for Research and Technology Hellas

Abstract
Sponsored search auctions constitute the most important source of revenue for search engine companies, offering new opportunities for advertisers. The Trading Agent Competition (TAC) Ad Auctions tournament is one of the first attempts to study the competition among advertisers for their placement in sponsored positions along with organic search engine results. In this paper, we describe agent Mertacor, a simulation-based game theoretic agent coupled with on-line learning techniques to optimize its behavior, which successfully competed in the 2010 tournament. In addition, we evaluate different facets of our agent to draw conclusions about the effectiveness of its extensions.

Introduction

The advent of the Internet has radically altered current business models, a prominent example being on-line advertising in search engine results, known as sponsored search, where paid advertisements are shown along with regular results (called impressions). Sponsored search is the highest source of revenue for on-line advertisement, yielding a profit of approximately $10.67 billion for 2009 in the U.S. alone [PwC, 2010].
In the sponsored search setting, whenever a user enters a query in the search engine (publisher), an auction is run among interested advertisers, who must select the amount of their bids, as well as the advertisements that they deem appropriate. Ads are allocated to several positions (slots) for placement, but higher slots are more desirable, given that they generally yield higher levels of Click-Through-Rate (CTR). This field was started in 1998 by GoTo.com, where slots were allocated via a Generalized First Price (GFP) auction, but received its current form in 2002, when GFP was replaced by the well-known Generalized Second Price (GSP) auction [Jansen and Mullen, 2008]. According to this auction, ads are sorted by bid (usually multiplied by an advertiser-specific quality factor), and the winner of a slot pays the minimum bid needed to get this position, which is slightly higher than the next bidder's offer and independent of her own bid. What makes this type of auction different is the fact that payment is made on a per-click (cost-per-click or CPC) rather than a per-impression (cost-per-mille or CPM) basis.

Against this background, we present the strategy of agent Mertacor, our entrant that participated in the TAC Ad Auctions 2010 competition [Jordan et al., 2010] and was placed 3rd in the finals. At a high level, Mertacor's strategy can be decomposed into two parts: (a) estimating the Value-per-Click (VPC) for each query and (b) choosing a proportion of the VPC for bidding in each auction based on the state of the game (the adaptive proportional part). The approach is similar to that of the QuakTAC agent [Vorobeychik, 2011], which participated in the 2009 competition, with two extensions: (a) a k-nearest-neighbors algorithm to help in the estimation of the VPC and (b) a formulation of the problem of selecting the proportion as an n-armed bandit problem associated with the state of the game.

The remainder of this paper is organized as follows: Section 2 provides a brief description of the game. Section 3 presents the strategies of agents that participated in the previous competition. Section 4 builds the background upon which our agent was based and gives a detailed description of the extension points. A discussion of the conducted experiments is given in Section 5. Finally, Section 6 concludes the paper and presents our future research directions.

The TAC Ad Auctions Game

Sponsored search auctions are open, highly complex mechanisms that are not dominant-strategy-solvable, hence bidding strategies are a topic of active research. To investigate their behavior, a realistic agent-based simulator seems essential [Feldman and Muthukrishnan, 2008]. The Ad Auctions (AA) platform in the international Trading Agent Competition (TAC) is such a system. The TAC AA game specifications are defined in detail in [Jordan et al., 2010]. To familiarize the reader with the game, we provide some basic information about the entities involved and their interactions.

In the TAC AA tournament, there are three main types of entities: the publisher, a population of 90,000 users, and eight advertiser entrants represented by autonomous software agents. The advertisers compete against each other for advertisement (ad) placement across search pages. Each one of the search pages contains search engine results for one of the queries of 16 different keyword sets. In order to promote their products, the agents participate in ad auctions by submitting a bid and an ad to the publisher for the query (set of keywords) they are interested in. Ads are ranked on each search page, based on a generalized method that interpolates between rank-by-bid and rank-by-revenue schemes.
Each day, users, according to their preferences and state, remain idle, search, click on ads and make purchases (conversions) from the advertisers' websites. The products being traded are combinations of three brands and three types of components from the domain of home entertainment. The small number of products enables competing teams to focus only on a small set of predefined keywords, abstracting away from the problems of keyword selection. The three manufacturers (namely, Lioneer, PG and Flat) and the three types of devices (TV, Audio and DVD) constitute a total of nine products. The simulation runs over 60 virtual days, with each day lasting 10 seconds. A schematic of the interactions between the game entities is given in Figure 1.

During its daily cycle of activities on day d, the advertiser must:
• Send the bids for ad placement per query for day d + 1.
• Select an ad for placement per query for day d + 1. The ad can be either generic (i.e. referring to a general electronics shop) or targeted (i.e. stating a specific manufacturer-component combination). A targeted ad that is a match to the user preferences increases the probability of a click.
• Set spending limits for each query and across all queries.
• Receive reports about the market and its campaign for the previous day.

Publishers
As mentioned above, the publisher runs a GSP auction to determine the ranking of bids and the payment per click. The ad placement algorithm takes into account predefined reserve scores: there is one reserve score below which an ad will not be posted, and one above which an ad will be promoted. If the spending limit set by an agent is exceeded, the rankings are recalculated. The auction implemented is a GSP where the ranking takes into account the quality of the advertisements, weighted by a squashing parameter that is disclosed to the entrants at the beginning of the game.
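As an illustration, the squashed ranking and next-price payment rule described above can be sketched as follows. The exact squashing formula, reserve handling and quality-factor dynamics are defined in the game specification [Jordan et al., 2010], so this is a simplified model rather than the server's implementation; all names here are ours.

```python
# Simplified squashed GSP: rank by quality^chi * bid, and charge each
# advertiser the minimum bid that preserves its slot, i.e. the next
# competitor's score (or the reserve) divided by its own squashed quality.
def run_auction(bids, qualities, chi, reserve):
    """bids/qualities: dicts advertiser -> value; chi: squashing parameter."""
    scores = {a: (qualities[a] ** chi) * bids[a] for a in bids}
    ranked = sorted((a for a in bids if scores[a] >= reserve),
                    key=lambda a: scores[a], reverse=True)
    cpc = {}
    for i, a in enumerate(ranked):
        # Score to beat: the next advertiser's, or the reserve for the last slot.
        next_score = scores[ranked[i + 1]] if i + 1 < len(ranked) else reserve
        cpc[a] = next_score / (qualities[a] ** chi)
    return ranked, cpc
```

Note that with chi = 0 this degenerates to rank-by-bid, while chi = 1 gives pure rank-by-revenue, matching the interpolation mentioned above.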
Users

Each user has a unique product preference and can be in different states representing his or her searching and buying behavior (i.e. non-searching, informational searching, shopping with distinct levels of shopping focus, and transacted). The product preference distribution is even across all products. Users submit three kinds of queries, defined by their focus level, for a total of 16 queries. There is one (1) F0 query, where no manufacturer or component preference is revealed, six (6) F1 queries, where only the manufacturer or the product type is included in the query, and nine (9) F2 queries, where the full product definition (manufacturer and type) is exposed. Users' daily state transitions are modeled as a Markov chain. Non-searching and transacted users do not submit queries. Informational users select one of their three possible queries uniformly, and focused users submit a query depending on their focus level. While both information-seeking and focused users can click on an ad, only focused users make purchases and move to the transacted state. Users click on the advertisements based on an extended version of the cascade model [Das et al., 2008]. After clicking on an ad, whether a conversion will be made or not depends on the user's state, the advertiser's specialty and the advertiser's remaining capacity.

Figure 1: The entities participating in a TAC AA game along with their actions. The majority of the users submit queries, fewer click on ads and an even smaller percentage makes purchases.

Advertisers

Each advertiser is a retailer of home entertainment products and can supply the user with any of the nine products available. Upon initialization of each simulation, advertisers are given a component and a manufacturer specialty, yielding an increase in conversion rates for the former and an increase in profit per unit sold for the latter. Additionally, entrants are assigned a weekly maximum stock capacity Ccap ∈ {CLOW, CMED, CHIGH}, so conversions above this threshold are less likely to happen during that week (5 working days).

Related Work

Given that the tournament started two years ago, relevant published work on TAC AA is limited. The majority of strategies focuses on two target metrics, namely the Return on Investment (ROI), i.e. the ratio of profit to cost, and the Value per Click (VPC), i.e. the expected profit from a conversion given a click, combined with multiple choice knapsack (MCKP) models to deal with the distribution constraint effect.

TacTex [Pardoe et al., 2010], the winner of the previous competitions, implements a two-stage strategy of estimation and optimization. The former incorporates self- and opponent-related predictions of desired variables as well as user state estimation. More specifically, this agent tries to extract the range of impressions, rankings and amounts of bids, as well as the type of ads shown. Then, it estimates the proportion of users in each state and utilizes this information to predict the most profitable number of conversions per query type. The optimization takes into consideration the distribution constraint effect, hence both short-term and long-term optimizers are incorporated in the strategy. Finally, there is an estimator for the unknown game parameters, i.e. the continuation probability and the quality factor.

Another competitor from 2009, Schlemazl, employs rule-based as well as dynamic programming techniques. According to the latter, the bidding process is modeled as a penalized MCKP, where the value of each query is the profit made and the weight is the number of sales [Berg et al., 2010]. For the rule-based method, the agent's strategy targets the same ROI in all queries. A similar principle is implemented in EPFLAgent 2010 [Cigler and Mirza, 2010], which targets the same profit per conversion over all queries and distributes the number of sales uniformly over the five most recent days. If the number of sales exceeds this threshold, the bid for the most profitable keyword is increased, otherwise the bid on the least profitable one is decreased.

Tau [Mansour and Schain, 2010] follows a reinforcement learning (soft greedy) approach, based on regret minimization, where the agent selects a value from a distribution over the space of bids, so that the regret, which is the difference between the actual profit and the maximum profit, is minimized. On the other hand, DNAgent [Munsey et al., 2010] follows an evolutionary computation technique, creating a population of agents (represented as finite state machines) that are genetically evolved until the fittest is determined. Each agent can be in seven different states, based on its previous position, which is combined with the matching of the advertisement to the user to determine bid adjustments for the next day.

Finally, AstonTAC [Chang et al., 2010] and QuakTAC [Vorobeychik, 2011] follow VPC-based strategies. The former estimates the market VPC, which is the VPC minus the relevant cost, and then bids a proportion of this value based on the critical capacity (i.e. the capacity beyond which the expected cost is higher than the corresponding revenue) and the estimated quality factor for the advertiser. Priority is also given to the most profitable queries. On the other hand, QuakTAC follows a simulation-based Bayes-Nash equilibrium estimation technique to find the optimal percentage of the VPC to bid. One important advantage of the VPC strategy is that it does not require any opponent modeling techniques, which, in turn, demand historical information about opponents' bidding behavior, difficult to obtain in reality.

Agent Mertacor

Background

The baseline strategy of agent Mertacor is a modified version of the aforementioned QuakTAC strategy for the 2009 tournament [Vorobeychik, 2011]. This is one of the few reported strategies in TAC that employs simulation-based game theoretic analysis, and it proved quite successful in that tournament, as QuakTAC was placed 4th in the finals. It is a sound, elegant and yet simple strategy. For the bidding part, Vorobeychik considers a simple strategy space, with bids that are linear in the valuation of the player. For the AA scenario, this valuation is the advertiser's VPC, υ, and the agent bids a proportion (shading), α, of this value, i.e. for each query q the bid is bq = α · υq.

Given that GSP auctions are not incentive compatible [Lahaie, 2006], we expect that α < 1. An estimate of the VPC for a keyword can be expressed as the product of the probability of converting after a click on the ad and the expected revenue from such a conversion. So for each query q, the agent's estimated VPC can be calculated as:

υq = Pr{conversion|click} · E[revenueq|conversion]

The expected revenue for a query q given a conversion, E[revenueq|conversion], solely depends on the advertiser's Manufacturer Specialty (MS) and can be calculated with no additional information for the three possible cases (the query's manufacturer matching the specialty, differing from it, or not being specified), where USP is the unit sales profit ($10 in TAC 2010) and MSB the manufacturer specialty bonus (0.4 in TAC 2010).

However, for calculating the probability of conversion, we need two additional estimates: the proportion of focused users, and the past and future conversions that affect the conversion probability through the distribution constraint effect:

Pr{conversion|click} = focusedPercentage · Prf{conversion|click}

where Prf denotes the conversion probability of a focused user, incorporating the distribution constraint effect.

To calculate a value for the focusedPercentage estimate, we used the following procedure. If we fix the advertisers' policies, then the proportion of focused users is equal to the ratio of the clicks that result in conversions divided by their individual probability of conversion (incorporating the distribution constraint effect). The user population model in TAC AA is a Markov chain and, despite the incorporation of the burst transitions, it will typically converge to a stationary distribution. Hence, instead of using a more sophisticated and accurate particle filter, we have used Monte Carlo simulations, running 100 games with symmetric, fixed strategy profiles, making sure that their bids are always higher than those required by the reserve scores, and have then recorded the mean values for our desired ratios (the daily results for individual queries are aggregated per focus level). Historical proportions are kept in log files and are updated incrementally. In our experiments, we have used the last day's probability of conversion as a close estimate of the requested value.
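To make the VPC estimate concrete, the following sketch computes υq from the quantities above. The case analysis for E[revenueq|conversion] (full bonus when the query's manufacturer matches the specialty, no bonus when it differs, and an expected one third of the bonus when the manufacturer is unspecified, assuming uniform brand preferences) is our reading of the three cases mentioned in the text; the helper names are ours.

```python
USP = 10.0   # unit sales profit (TAC 2010)
MSB = 0.4    # manufacturer specialty bonus (TAC 2010)

def expected_revenue(query_manufacturer, specialty):
    """E[revenue_q | conversion] for the three possible cases."""
    if query_manufacturer is None:          # F0, or F1 revealing only a component
        return USP * (1 + MSB / 3)          # bonus applies to 1 of the 3 brands
    if query_manufacturer == specialty:     # query matches our specialty
        return USP * (1 + MSB)
    return USP                              # different manufacturer: no bonus

def estimated_vpc(p_conv_focused, focused_percentage,
                  query_manufacturer, specialty):
    """υ_q = Pr{conversion|click} · E[revenue_q|conversion]."""
    p_conv = focused_percentage * p_conv_focused
    return p_conv * expected_revenue(query_manufacturer, specialty)
```

With the fixed proportion of the qualification rounds, the submitted bid for a query would then simply be 0.3 · estimated_vpc(...).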
Our first differentiation compared to the original strategy lies in the estimation of the conversion probability and, in particular, of the distribution constraint effect, Id. As Jordan notes in his thesis [Jordan, 2010], the distribution constraint effect is the second most influential factor in an advertiser's performance after the manufacturer specialty, in terms of coefficient-of-determination values, and it radically affects the probability of purchase. Based on the specifications of the game, the distribution constraint effect for day d + 1 is given by a function g, defined in the specification, of the conversions c over the capacity window; these conversions are not yet known at bidding time.

Given that an entrant must calculate a bid with a 2-day lag in the available information, QuakTAC estimates the day d + 1 value using a two-step bootstrapping technique. More specifically, it estimates the conversions on day d as the product of the current game's historical click-through rate, the recorded historical impressions and the (optimistically) estimated conversion probability. Then it uses this information to estimate the day d + 1 conversion rate and the corresponding conversions.

Having implemented that strategy, we realized that it performs poorly under the 2010 server specifications. We observed that it underestimates the VPC, systematically bidding much lower than what is allowed by the publisher's reserve scores. It is important to note that these scores for 2010 are much higher than in 2009. Jordan describes the effect of the reserve score on the publisher's revenue as well as on the players' equilibrium, showing that AstonTAC and Schlemazl perform much better than TacTex for these higher reserve scores. To rectify this shortcoming, we have implemented a simpler but more reasonable expectation: we aggregate the last three days' conversions and set a fixed fraction of this value as our number of conversions for day d as well as day d + 1. This fraction is slightly lower than the corresponding mean value, so as to intentionally be more conservative. However, when our capacity is CLOW, we have instead used the mean value, which was experimentally shown to perform better.

Finally, we had to select the percentage of our VPC to bid. Following the methodology of Vorobeychik, we restrict our interest to simple symmetric equilibria and discrete values ranging from 0 to 1 with a step of 0.1. This means that all but one of the advertisers follow the same strategy (bid = αυ) and a single player bids another percentage, αsingle, which varies among games. The values for αsingle are determined via an iterative best response method, where we start from a truthful homogeneous strategy profile (α = 1, αsingle = 1). This is a reasonable starting value, as GSP is considered a generalization of the Vickrey auction. Then we find the best αsingle deviation from this profile and use this last value as the new homogeneous profile. This process is repeated until the self-response is a Bayes-Nash equilibrium. We have validated the speed of this method, being able to get the best value of αsingle = 0.3 in three iterations (1, 0.4 and 0.3). The alpha value of 0.3 and the new method for calculating Id are the differentiations we made to the agent that played in the qualification rounds of TAC AA 2010.

Ad Selection Strategy

It is also important to describe our ad selection strategy. This task is straightforward for F0 (no keywords) and F2 (containing both brand and product keywords) type queries. In the first case, any single product has only a 1/9 probability of matching a user's preference (one of nine products), so a generic ad seems most appropriate. In the second case, users truthfully reveal their preferences, hence a targeted F2 ad will always match them. For the F1 keywords (containing either the brand or the product keyword), we have also used a targeted advertisement, where our agent's respective specialty is shown when the manufacturer or the component is not disclosed, hoping that the benefits of our increased profit or conversion probability will outweigh the higher probability of a mismatch. This strategy also proved to be effective in the original QuakTAC agent [Vorobeychik, 2011].

Extensions

For the final rounds, we tried to improve our agent in two critical calculations: (a) estimating the distribution constraint effect and (b) finding an appropriate α value by adapting it on-line according to the current market conditions.

In the first case, we used the k-Nearest-Neighbors (k-NN) algorithm to estimate the capacity to be committed for the current day, cd, and then used that estimate to further predict the capacity to be purchased on day d + 1, cd+1. We chose k = 5 and stored 5-day tuples for the last 10 games (600 samples). We adopted this kind of modeling because the agent executes a clear 5-day cycle in its bidding and conversion behavior. This behavior derives from the strategy of the agent: when it has enough remaining capacity, it will produce high bids due to high VPC values and get ranked in the first positions. That way it will receive many clicks and a high volume of conversions will take place. This behavior will continue for one more day, until the agent's stock is depleted. Then the bids will be low and the agent will maintain a no-conversion regime until its capacity is replenished.

In order to choose the proportion of the VPC to bid in a real antagonistic environment, we have formulated the problem of choosing α as an associative n-armed bandit problem [Sutton and Barto, 1998]. Under this formulation, we switched bidding from b(υ) = α · υ with a fixed α to b(υ) = α(s) · υ, where the proportion α(s) is selected according to the state of the game s.

Based on the experience gained from the TAC SCM domain, we tried to maintain a low dimensionality in the state and action spaces for sampling efficiency. We chose the VPC and the query type as state variables. The choice of the VPC was made because its calculation incorporates many parameters that characterize the state of the agent and the market, such as the specialties, the distribution capacity constraint, and the proportion of focused users. The query type was added as a state variable in order to provide additional information about the current state from another dimension. The VPC was quantized over 11 states,
10 for VPC values between $0 and $10 and 1 for VPC values above $10. The 16 query types were mapped into 8 states, as presented in Table 1. There were 6 discrete actions, picking values of α at equally spaced points from 0.275 to 0.325 inclusive. So a total of 11 · 8 · 6 = 528 Q(s, a) values needs to be stored and updated. For exploration purposes, an ε-greedy policy was used with ε = 0.1. The goal is to learn a policy for selecting one of the 6 actions that maximizes the daily profit given the state of the game. Updates to the Q values were made as:

Q(s, a)k+1 = Q(s, a)k + 0.1 · (rk − Q(s, a)k)

The reward rk is the daily profit, calculated as the revenue minus the cost for the corresponding state and action.
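A minimal sketch of this associative ε-greedy bandit is given below. The discretization and constants (11 VPC bins × 8 query states, 6 actions, step size 0.1, ε = 0.1) follow the text; the class layout and method names are our own illustration.

```python
import random

ALPHAS = [0.275 + i * 0.01 for i in range(6)]  # 6 equally spaced actions

class AlphaBandit:
    """Associative n-armed bandit over (VPC bin, query-type state)."""
    def __init__(self, epsilon=0.1, step=0.1):
        self.epsilon, self.step = epsilon, step
        # 11 VPC bins x 8 query states x 6 actions = 528 Q-values
        self.q = {(v, s, a): 0.0
                  for v in range(11) for s in range(8) for a in range(6)}

    @staticmethod
    def vpc_bin(vpc):
        return min(int(vpc), 10)   # $1 bins for $0-$10, one bin above $10

    def choose(self, vpc, query_state):
        v = self.vpc_bin(vpc)
        if random.random() < self.epsilon:      # explore
            a = random.randrange(6)
        else:                                   # exploit the best known action
            a = max(range(6), key=lambda a: self.q[(v, query_state, a)])
        return a, ALPHAS[a]

    def update(self, vpc, query_state, action, reward):
        # Q(s,a) <- Q(s,a) + 0.1 * (r - Q(s,a))
        key = (self.vpc_bin(vpc), query_state, action)
        self.q[key] += self.step * (reward - self.q[key])
```

The constant step size (rather than a 1/k average) matches the non-stationary setting here, where the profitability of each α drifts as opponents adapt.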
Because reports arrive with a lag, on day d we receive the report for day d − 1, from which we can calculate the reward for day d − 1. That reward concerns the bids made on day d − 2 for day d − 1.

Table 1: Mapping query types to states. One state for the F0 query, 3 states for the F1 queries and 4 states for the F2 queries.
Analysis

To validate the effectiveness of the iterative best response technique, after the tournament we also repeated the experiments for all possible combinations of homogeneous and single α values and plotted the best αsingle as a function of the symmetric opponent profile. Each experiment was repeated 5 times, for a total of 500 log files. The results are given in Figure 2. As can be seen, there are only two close values for αsingle (0.3 and 0.4), although 0.3 is the best response against the more reasonable strategies (α < 0.5). Hence, this technique yields robust results in only 3% of the time required for exhaustive experiments. Moreover, we should note that the values of 0.1 and 0.2 used by QuakTAC the previous year are not profitable at all on this year's platform, due to the aforementioned reserve score increase.

The final version of agent Mertacor took third place in the TAC AA 2010 competition. The standings are shown in Table 2.

Table 2: TAC AA 2010 tournament results.

To evaluate the effectiveness of the agent and the extensions incorporated, two tournaments were conducted, with four versions of agent Mertacor. Quals is the agent that participated in the qualifications of the 2010 tournament. Mertacor-kNN has only the k-NN capability, while it continues to bid with α = 0.3. Mertacor-RL has the ability to adapt α, but is not equipped with k-NN. Last but not least, Mertacor-Finals combines all the modifications and extensions described in this paper and is the agent that participated in the 2010 finals. In the first tournament, two agents from each of the four versions competed, all having the same storage capacity, equal to CMED. In the second tournament, the storage capacities were selected in competition terms (2 agents with CLOW, 4 with CMED and 2 with CHIGH).

The results in Table 3 indicate that the extensions added to the Mertacor-Finals version give the agent a small boost. Both versions were able to take the first two positions in this tournament. Even though the differences are not large and not statistically significant under paired t-testing, in tightly played competitions they could make the difference in ranking positions, as in 2010. Moreover, we believe that by optimizing these techniques, now implemented rather crudely, more profit is possible. It is also evident that including both extensions has more benefits than including either one alone.

Table 3: Average scores over 88 games of different versions of agent Mertacor with an equal distribution capacity constraint.

When dealing with different capacities between games, the domain becomes more challenging, especially for on-line learning methods. Table 4 presents the results of the second tournament. Again, the differences are not large, but there are small deviations between the versions. The k-NN version is able to better estimate the capacity to be used by the agent, and this benefits its final score. In Figure 3, one can observe the quality of the k-NN predictions. It is possible that by optimizing this algorithm, extremely accurate predictions could be achieved. As a general comment, all versions of the agent, in both tournaments, were able to maintain their bank accounts at the level agent Mertacor scored in the finals. This is evidence of a robust strategy against both similar and different opponents.

Table 4: Average scores over 88 games of different versions of agent Mertacor in competition mode with respect to distribution capacity constraints.
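The capacity predictor discussed above can be sketched as a plain k-NN regressor over recent daily-conversion windows. The window length (5 days) and k = 5 follow the text; the squared-Euclidean distance and the data layout are our assumptions.

```python
def knn_predict(history, recent, k=5):
    """Predict the next day's committed capacity.

    history: list of (window, next_value) pairs harvested from past games,
             where window is a 5-day tuple of daily conversions.
    recent:  the current game's last 5 days of conversions.
    """
    def dist(window):
        # Squared Euclidean distance between 5-day windows.
        return sum((a - b) ** 2 for a, b in zip(window, recent))
    neighbors = sorted(history, key=lambda pair: dist(pair[0]))[:k]
    # Average the successors of the k most similar windows.
    return sum(next_value for _, next_value in neighbors) / len(neighbors)
```

With 10 stored games of 60 days each, history holds roughly the 600 samples mentioned in the text, so a brute-force neighbor search is entirely adequate here.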
Conclusions and Future work
In this report we have described our agent, Mertacor, for the TAC AA 2010 tournament. We have seen how empirical game theory can provide robust and effective results in a restricted real-world setting, when operationalized appropriately using simple abstractions. We have also discussed the importance of the distribution constraint effect and of the reserve score, which significantly influenced our agent's performance before and during the tournament.
Last but not least, we have elaborated on two extensions, k-nearest neighbors and reinforcement learning, that differentiate our agent from related work and provide added value with respect to making better predictions and adapting to the environment and the competition.
As future work, we would first like to use a more accurate user state predictor, such as the one implemented by TacTex. Moreover, we would like to extend our strategy to quadratic functions of the VPC, to incorporate the decrease in α for smaller valuations [Vorobeychik, 2009], as was also implemented in QuakTAC during the finals of 2009. Other non-linear function approximators could also be tested, under off-line parameter learning schemes, to shorten the time needed to converge to an optimal policy. Finally, it would be desirable to identify the key parameters that most influence the optimal bidding percentage.
References
Figure 3: k-NN enabled predictions of conversions for the future day (dashed line). The solid line indicates the actual conversions.

[Berg et al., 2010] Jordan Berg, Amy Greenwald, Victor Naroditskiy, and Eric Sodomka. A knapsack-based approach to bidding in ad auctions. In Proceedings of ECAI 2010: 19th European Conference on Artificial Intelligence, pages 1013–1014, Amsterdam, The Netherlands, 2010. IOS Press.
[Chang et al., 2010] Meng Chang, Minghua He, and Xudong Luo. Designing a successful adaptive agent for TAC ad auction. In Proceedings of ECAI 2010: 19th European Conference on Artificial Intelligence, pages 587–592, Amsterdam, The Netherlands, 2010. IOS Press.

[Vorobeychik, 2011] Yevgeniy Vorobeychik. A game theoretic bidding agent for the ad auction game. In Third International Conference on Agents and Artificial Intelligence (ICAART 2011), January 2011.
[Cigler and Mirza, 2010] Ludek Cigler and Elias Mirza. EPFLAgent: A bidding strategy for TAC/AA. Presentation in Workshop on Trading Agent Design and Analysis (TADA), ACM EC 2010, July 2010.
[Das et al., 2008] Aparna Das, Ioannis Giotis, Anna R. Karlin, and Claire Mathieu. On the effects of competing advertisements in keyword auctions. Working paper, 2008.
[Feldman and Muthukrishnan, 2008] Jon Feldman and S. Muthukrishnan. Algorithmic methods for sponsored search advertising. In Zhen Liu and Cathy H. Xia, editors, Performance Modeling and Engineering. Springer, 2008.
[Jansen and Mullen, 2008] Bernard J. Jansen and Tracy Mullen. Sponsored search: an overview of the concept, history, and technology. International Journal of Electronic Business, 6(2):114–131, 2008.
[Jordan et al., 2010] Patrick R. Jordan, Ben Cassell, Lee F. Callender, Akshat Kaul, and Michael P. Wellman. The ad auctions game for the 2010 trading agent competition. Technical report, University of Michigan, Ann Arbor, MI 48109-2121, USA, 2010.
[Jordan, 2010] Patrick R. Jordan. Practical Strategic Reasoning with Applications in Market Games. PhD thesis, University of Michigan, Ann Arbor, MI, 2010.
[Lahaie, 2006] Sébastien Lahaie. An analysis of alternative slot auction designs for sponsored search. In EC '06: Proceedings of the 7th ACM Conference on Electronic Commerce, pages 218–227, New York, NY, USA, 2006. ACM.
[Mansour and Schain, 2010] Yishay Mansour and Mariano Schain. TAU agent. Presentation in Workshop on Trading Agent Design and Analysis (TADA), ACM EC 2010, July 2010.
[Munsey et al., 2010] M. Munsey, J. Veilleux, S. Bikkani, A. Teredesai, and M. De Cock. Born to trade: A genetically evolved keyword bidder for sponsored search. In Evolutionary Computation (CEC), 2010 IEEE Congress on, pages 1–8, 2010.
[Pardoe et al., 2010] David Pardoe, Doran Chakraborty, and Peter Stone. TacTex09: A champion bidding agent for ad auctions. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), May 2010.
[PwC, 2010] PwC. IAB U.S. Internet Advertising Revenue Report, 2010.

[Sutton and Barto, 1998] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998.
[Vorobeychik, 2009] Yevgeniy Vorobeychik. Simulation-based game theoretic analysis of keyword auctions with low-dimensional bidding strategies. In Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, 2009.

