Auctions in Do-Not-Track Compliant Internet Advertising
easy to link tracking information with PII [16]. Concern Online tracking of users in support of behavioral advertis- over tracking continues to grow, and has led for instance to ing is widespread. Several researchers have proposed non- the Do-Not-Track initiative launched by the Federal Trace tracking online advertising systems that go well beyond the Commission (FTC), and proposals for a do-not-track reg- requirements of the Do-Not-Track initiative launched by the istry [23]. Do-not-track, however, requires a public that is US Federal Trace Commission (FTC). The primary goal of largely unaware of tracking to opt-out. It also forces a trade- these systems is to allow for behaviorally targeted advertis- off between tracking and behavioral targeting: a compromise ing without revealing user behavior (clickstreams) or user that industry is not prepared to make. Not surprisingly, do- profiles to the ad network. Although these designs purport not-track has not gained a lot of traction with the public, to be practical solutions, none of them adequately consider and ad networks are actively fighting it.
the role of the ad auctions, which today are central to the Several recent research projects take a different tack (Ad- operation of online advertising systems. This paper looks at nostic [24], Nurikabe [18], and Privad [13]). Rather than the problem of running auctions that leverage user profiles simply opting out of tracking, or pushing for data-protection for ad ranking while keeping the user profile private. We after the fact, they propose alternative advertising technolo- define the problem, broadly explore the solution space, and gies that allow for behavioral targeting without requiring discuss the pros and cons of these solutions. We analyze tracking. These are meant to be practical alternatives that the performance of our solutions using data from Microsoft industry finds attractive. None of these systems, however, Bing advertising auctions. We conclude that, while none of adequately explore how to operate the auctions that are our auctions are ideal in all respects, they are adequate and critical to current advertising systems. Without this com- ponent, these systems leave unanswered what revenue thebroker (i.e. an ad network like Google) can earn, therebyreducing the likelihood that a non-tracking1 advertising sys- Categories and Subject Descriptors
tem will be of commercial interest. In this paper, we look at the problem of running auctions that leverage a user profilefor ad ranking while keeping the user profile private.
General Terms
Although the non-tracking advertising systems cited above differ in significant ways, they all share several key de- sign components. All systems propose a software agent thatruns at the client and generates a user profile. All systems at Keywords
least share the privacy goal that this profile not be revealed.
All these designs propose that the broker transmit multiple Auctions, Online Advertising, Privacy, Targeting ads to the client, not all of which match the user profile. Forinstance, the ads may all be within a given interest category.
The client then locally selects from among these ads which Third-party tracking of users is widespread and increas- best match the user profile, and display these to the user.
ing [17]. One of the primary purposes of user tracking is The client reports the result of this selection anonymously, behavioral advertising. Although many companies that par- and without letting the broker link together different com- ticipate in tracking and behavioral profiling claim to not ponents of the user profile. The key privacy mechanisms are gather Personally Identifying Information (PII), it is often therefore anonymity and unlinkability.
common pricing model for online advertising systems to-day is Pay Per Click (PPC): the advertiser does not pay Permission to make digital or hard copies of all or part of this work for the broker for showing an ad to a user, rather it pays personal or classroom use is granted without fee provided that copies are only if the user clicks on an ad. The broker selects which not made or distributed for profit or commercial advantage and that copies ads to show through an auction whereby advertisers bid bear this notice and the full citation on the first page. To copy otherwise, to against each other. In a PPC system, the broker maxi- republish, to post on servers or to redistribute to lists, requires prior specific mizes revenue by ranking the competing ads according to permission and/or a fee.
CCS’11, October 17–21, 2011, Chicago, Illinois, USA.
Copyright 2011 ACM 978-1-4503-0948-6/11/10 .$10.00.
By non-tracking, we mean no third-party tracking.
the Bid × ClickP robability product, and transmitting the ability. Denoting bid as B, the ranking then is in order of highest ranking ads to the client where they are displayed in rank order. Of course, the broker doesn’t know the precise When a user clicks on an ad, the ad ID is transmitted to click probability for every ad. Rather, the broker tries to the broker. The broker computes Cost per Click (CPC), that predict the click probability as best it can. This prediction is, the price that the advertiser must pay, using a second- is based on a number of inter-related factors such as the ad price auction [8]. In this approach, the price paid is pegged keywords, the landing page keywords, the user search terms to the bid of the ad that is ranked immediately below the or keywords associated with the web-page being browsed, clicked ad. To give a simple example, suppose that adver- stored user characteristics, and so on.
tiser A is willing to pay as much as $5 for a click, and ad- The user profile has a strong effect on click probability. To vertiser B is willing to pay $10. In a second-price auction, give a simple example, say a user searches for “running shoe”.
A could go ahead and bid $5 and B could bid $10. B would Whether the user is a man or a woman, or prefers brand- win, but would only pay incrementally more than A’s (2nd- name products or discount products, plays an important role in which running shoe ad he or she is more likely to click on.
Second-price auctions allow bidders to bid the maximum In a do-not-track compliant advertising system, the broker that they are willing to pay, rather than frequently modify does not know the user profile: if the auction takes place their bid in search of the value incrementally higher than the at the broker in the same way that it does today, then the next lower bidder. Specifically, the CPC is computed as: user profile will not be factored into the result. Therefore the highest Bid × ClickP robability ads won’t be selected, leading to less revenue than should otherwise be possible.
This paper characterizes the problem, and proposes three where Bn and Qn are the bid and quality score of the next basic solutions, one of which is a variant of a method pro- lower ranked ad, and Qc is the quality score of clicked ad.
posed in [12]. Following the lead of the initial work on non- This CPC formula captures the minimum amount the ad- tracking advertising, this paper takes a pragmatic approach vertiser would have had to bid to beat the next-ranked ad to the problem. It looks for a good trade-off between strict in a first-price auction. Note that it prevents the advertiser privacy guarantees and practical business and deployment from paying more than it bid, even when the next-ranked in concerns. As such, this paper explores the pros and cons of the three approaches in terms of not just privacy (both userand advertiser), but also revenue, overhead, and vulnerabil- Abstract Non-tracking Advertising
ity to attack. It uses around 2TB of auction traces from In this section, we describe an abstract non-tracking ad- Microsoft Bing to guide and validate the design choices.
vertising system that captures key aspects of the three ex- Altogether, this paper makes the following contributions: isting non-tracking advertising designs.
1) It proposes two new non-tracking auction designs, and The principle components of the abstract non-tracking ad- a third based on the previous work but substantially im- vertising system include those of today’s tracking systems, proved. 2) It analyzes the trade-offs between these three the broker, the client, and the advertiser. A user profile is designs in terms of privacy properties, auction properties, stored at the client (i.e. the user’s computer, or a device and fraud resistance. 3) It analyzes the effect of bid churn trusted by the user: the distinction is not important for our and auction timing on revenue and ad ranking using a trace purposes). Each ad is associated with targeting information.
of Bing search advertising auctions, and uses this analysis The user profile is defined as that information needed to de- to argue for the feasibility of the solutions.
termine how well an ad’s targeting matches the user. Toproduce the user profile, the client monitors user behavior BACKGROUND
(i.e. the user’s searching, browsing, purchases, and so on).
The privacy goals of the abstract non-tracking system are: This section describes how current online advertising sys- tems such as Google and Microsoft work, and then describes • Anonymity: the broker cannot associate any unit of an abstract alternative advertising model patterned after ex- learned information with any user PII (including net- isting proposals for non-tracking advertising. In the process, we establish terminology and define the basic components.
• Unlinkability: the broker cannot associate separate Current 2nd Price Ad Systems
units of learned information with a single (anonymous)client. This prevents a broker from building up a user In current ad systems [7,11,20], advertisers submit ads to profile, and then associating it with a known user using a broker. Associated with each ad is a bid, one or more keywords, and optionally some targeting information likedemographics (location, age, gender) or interests. When The broker is assumed to be honest-but-curious. We be- a client computer does a search or receives a web page lieve that this is close to reality (brokers like Google can with adboxes (space to place an ad), the broker identifies generally be trusted to do what they claim they are doing).
the ads that match the search terms or keywords associ- Nevertheless, we believe it is wise to avoid making it pos- ated with the web page, and runs an auction. The auction sible for brokers to obtain high-value information through ranks the selected ads in order of highest expected revenue hard-to-detect cheating, and our designs reflect this belief.
(Bid × ClickP robability), and transmits some number of Figure 1 illustrates the basic architecture and message ex- ads to the client. As already discussed, many factors are change of an abstract non-tracking advertising systems. The considered in estimating click probability. In this paper, we network layer address of all messages is anonymized, which refer to all of these factors taken together as a quality score we represent as an anonymizing proxy. Messages are en- Q, where a higher value means higher expected click prob- crypted to prevent viewing by the anonymizing proxy. The less than 1 that raises or lowers the click probability propor-tionally to its effect on the click probability defined by G.
Section 7 briefly discusses how U may be computed.
In current tracking advertising systems, the click normally takes place almost immediately after the view, and so CPCis normally computed shortly after the ranking. As a result,the parameters that go into determining the ranking (B and Q) do not change much between ranking and CPC. In non-tracking advertising systems, as explained later, some time Figure 1: Abstract non-tracking advertising system.
may pass between when B is set by the advertiser and when B is the broker, C is the client. Communications is the ad is ranked, or between when an ad is ranked and when through an anonymizing proxy. [x] denotes encryp- CPC is calculated. Therefore, we set the following goals with • The B, G, and U used for CPC calculation are the client requests a set of ads of a given type (i.e. for a given same as the B, G, and U used for ranking. Note in product or service). The request must be generic enough particular that if they are not the same, then it is that a substantial set of clients can have legitimately made possible for instance for the CPC to be higher than the request (i.e. K-anonymity and L-diversity). A set of ads the submitted bid of the clicked ad.
matching the type, each with identifier A and associated tar-geting information, are transmitted to the client and stored.
• The delay between ranking and CPC calculation is When an adbox is presented to the client, for instance on a small enough that the churn in B, G, or U does not web page, the client selects among the stored ads those that have a significant impact on rankings, CPC values, and best match the user profile, and puts them in the ad box for viewing by the user. The client reports the view and click of What exactly comprises user score U depends on the client the ad A on a webpage with the given URL. There is noth- profiler and can vary from system to system. We can, how- ing in the messages that allows the broker to link different ever, classify user information into three time frames. At messages as coming from the same user.
the time frame of months or even years are user demograph- Our model has two necessary channels of communications ics like gender, location, language, age, salary, and so on.
between client and broker, ad delivery, and view and click User interests can also last years (e.g. coin collecting), but reporting. Both channels present opportunities for the bro- more typically last weeks (a new car), days (a new pair of ker to learn information contained in user profiles, and cur- shoes), or minutes (a pizza). If we assume that matching rent non-tracking advertising designs protect these channels.
ads to the content of a web page or search page increases Any new information that must be conveyed for the purpose click probability, then user score can change in seconds or of the auction must also be protected.
less. For instance, a user might be interested in tennis andmusic, but the user score for tennis ads may increase while AUCTION GOALS
the user is looking at a tennis website, and vice versa formusic ads.
The privacy goals for the auction component of a non- We do not make any assumptions about the relative im- tracking advertising system are the same as described in portance of B, G, or U . An ideal auction design however Section 2: anonymity and unlinkability. This section de- scribes the goals of the auction itself.
Besides the basic goal of running an auction that lever- The primary goal of the auction in a non-tracking adver- ages the user profile while prohibiting the broker from re- tising system is to provide a second-price auction mechanism constructing it, there are a few additional related goals that that achieves close-to-ideal ranking of ads (i.e., in order of Bid × ClickP robability). For today’s tracking advertisingsystems, leveraging the user profile is straightforward, since • to maintain the privacy offered to the advertisers them- the broker itself accumulates and maintains this informa- selves. In particular, to prevent advertisers from learn- tion. In a non-tracking system, the broker does not have ing each other’s bids and budgets.
user profile information, but does have other information • to maintain the level of click-fraud defense in current that goes into the quality score Q. In other words, part of the information used to produce Q is in the broker, and partis in the client. Therefore, we define a user score U which • to minimize the overhead of the auction.
directly reflects the effect of the user profile, when matchedagainst an ad’s targeting information, on click probability.
As will become apparent in subsequent sections, our de- We define a second quality score G, that reflects the remain- signs do not perfectly achieve all of these goals. Rather, our ing “global” information known to the broker.
designs offer trade-offs between these goals.
Specifically, this results in an ideal ranking and CPC of: Note finally that it is possible that the value of U may cor- relate with the probability that the user will buy the prod-uct or service being advertised assuming that the user has clicked. If this is the case, then the advertiser would want to express multiple bids as a function of U , since different U ’s would produce different revenues for the advertiser. We do not address this capability in this paper, but rather leave For example, U could be a positive real value greater or it for future work should it turn out to be important.
an ad, then the client computes the following values and Rank at Client (RaC)
(Bn × Gn) product of the next-ranked ad times the ratio (Un/Uc) of the user score of the next-ranked adUn and the user score of the clicked ad Uc.
[Bc,Gc]: The encrypted B and G for the clicked ad as re- Upon reception of this message, the broker decrypts [Bc,Gc]. It uses Gc and ((Bn × Gn) × (Un/Uc)) to com- Rank at Broker (RaB)
pute the CPC as shown in Equation 2. The broker alsocompares the resulting CPC with the decrypted Bc.
CP C > Bc, then the broker knows that the client is en- gaged in click fraud, and the broker can ignore the message.
If CP C ≤ Bc, then the broker can accept the message, al- though this doesn’t mean that the client is not engaged in click-fraud. Other mechanisms, such as statistical analysis, must be used to detect it as is done today.
Variation: It may not be necessary to transmit the en- crypted values [Bc,Gc]. This is because Bc and Gc can be looked-up using the ad ID Ac. The danger here is that the looked-up values may be different from the Bc and Gc val-ues used to rank the ads. How different depends on the Rank at 3rd Party (Ra3)
level of churn in B and G values, which we found to beminimal in the Bing auction trace (Section 6). Therefore, Figure 2: Three basic auction schemes. A is an ad it may well suffice to use looked-up values rather than val- ID unique to the ad, while ID is an Ad ID unique to ues stored along with the ad at the client. Note that this the combination of ad and client. The subscripts ‘c’ variation applies to RaB and Ra3 as well.
and ‘n’ refer to the clicked ad and the next ranked adrespectively. [x] denotes encryption of x. Messages Rank-at-Broker (RaB)
between client and broker pass through anonymiz- One concern with RaC is that the value (BxG) exposes ing proxies. Messages through solid-line proxies are information about the advertiser (see Section 5.1.3). This encrypted. Messages though dotted-line proxies are can be avoided if the ranking is done at the broker. The RaB scheme presented here is similar to the approach proposedby Privad in [12]. We present it here for completeness.
Along with the ad, the broker transmits the following to DETAILED DESIGNS
To run the auction specified in Section 3, the system doing the ranking must have access to the bid B, the broker quality score G, and the user score U . This means that either 1) B ID: An identifier unique to this specific delivery of this ad and G are sent to the client, 2) U is sent to the broker, or (among all other deliveries). In other words, the same 3) B, G, and U are all sent to a 3rd party (Figure 2). These ad delivered to other clients would have a different basic approaches are explored in the following sections.
Rank-at-Client (RaC)
[B × G]: A single value which is the product of (B × G), In this approach, the following information is transmitted encrypted with a symmetric key known only to the with the ad to the client along with everything else required by the advertising system (e.g. targeting information, not [B, G]: The values B and G, encrypted with a symmetric The client computes a user score U for each ad (in the (B × G): A single value which is the product of (B × G).
absense of knowledge of what web page the ad may be shownon). In order to obscure the user profile, the client assigns [B, G]: The individual values B and G, encrypted with a a random value for U for those ads for which the client has symmetric key known only to the broker.
a very low user score (for instance because the demographicdoesn’t match that of the user).
When a collection of ads arrive at the client, it ranks all Clients transmit the ID,U tuples to the broker via a proxy.
ads using (B × G × U ). Note that in this case U is current, These messages are not encrypted. The proxy also remem- while (B × G) is more-or-less older. If the user clicks on bers which IDs were received from which clients.
For each received ID, the broker looks up the current val- in Sections 5.1.1 and 5.3.3. When a user clicks on an ad, the ues of B and G. It then uses B, G, and U to rank each client encrypts (Un/Uc) with the broker’s public key. In the received ad among a large number of recently received ads case of RaC, it also encrypts (Bn × Gn) with the broker’s (say, those received over the last hour), and associates a rank public key. In the case of RaB and Ra3, the broker provides number Rank to each ad. Closely ranked ads may have the the encrypted [Bn × Gn], but using its public key instead of same ranking number. The broker transmits the ID,Rank a symmetric key. For all three schemes, the broker provides [1/Gc], again encrypted with the broker’s public key. Using The proxy looks up which client is associated with each homomorphic property of the encryption, the client is able ID, and forwards the message on to the client. The client disregards the ads related to the low user scores, and uses the remaining ranking for selecting ads to put in ad boxes.
When a user clicks on an ad, it transmits the following and transmit the resulting value in the click report. Uponreceiving a click report, broker decrypts the value to obtain the CPC. Although homomorphic encryption is relatively expensive, there is no need to do the operation in real-time.
n × Gn]: The encrypted (B × G) for the next-ranked ad Rather, the client can do the operation when it has spareCPU cycles before transmitting, and the broker can likewise (Un/Uc): A single value which is the ratio of the user score run the operations later on as batch processing.
of the next-ranked ad Un and the user score of theclicked ad Uc.
This section analyzes the three types of auctions in terms c ,Gc ]: The encrypted B and G for the clicked ad received of privacy, auction quality, and attacks on the auction sys-tems.
Upon reception of this message, the broker decrypts [Bn × Privacy Properties
compute the CPC as shown in Equation 2. As with RaC, the In this section, we look at the information that is conveyed broker also compares the resulting CPC with the decrypted for the sake of the auction between honest-but-curious play- ers, and determine whether it constitutes a privacy threat.
The proxy prevents the broker from learning the identity In Section 5.3 we relax this assumption.
of the client whose ads are being ranked. The per-client-per-ad unique ID prevents the proxy, which does know the Broker analyzes (Un/Uc) (RaB and Ra3) client identity (network address), from learning which ad, In order to exploit this value to gather more information and therefore what targeting information, is being referred about the user profile, the broker would have to first tease apart the values of Un and Uc, then use the value combined Even with the noise added to low user scores, we are con- with the ad targeting to reverse engineer the user profile, and cerned that the non-noise user scores can be interpreted at then use the user profile knowledge to link together multiple the broker as a kind of fingerprint over the set of ads (i.e., reports. The first step may be made difficult by making ads targeted to men should have uniformly higher scores for (Un/Uc) relatively coarse-grained, thus making it harder to men, and lower scores for women). In this way, the broker uniquely factor out its components. The second step is made could potentially tease out the profile of users.
difficult simply by the sheer number of clients that are likelyto have similar user scores. Thus, we conclude that exposure Rank-at-3rd-Party (Ra3)
of (Un/Uc) does not constitute a serious threat.
This approach is similar to RaB, but prevents the finger- printing mentioned above. The main difference between Ra3 Broker analyzes ((Bn × Gn) × (Un/Uc)) (RaC) and RaB is that in Ra3, the broker additionally sends the This value is more difficult to reverse engineer than unique ad IDs and (B × G) products to a 3rd party system (Un/Uc), and is therefore also not a threat.
which is trusted not to collude with the broker. This in-formation must be delayed long enough that the 3rd party Client analyzes (B × G) (RaC) system cannot use a timing attack to correlate the values An advertiser can use this value to determine the broker associated with a single user. This 3rd party also receives quality score G assigned by the broker to its own ads. This the user scores from the clients, and based on this informa- can be done by the advertiser simply creating a client that tion, ranks ads in the same way the broker does in the RaB receives its own ads, and using the known value of B to factor approach. Since, unlike the broker, it does not know which out G. Whether this is a problem needs to be decided by ads were transmitted to the same client, it cannot fingerprint the broker, though we point out that today Google reveals a coarse-grained quality score to its advertising customers.
The product (B × G) to the client also reveals the overall Homomorphic Encryption Variant (RaC,
ranking of an ad to anyone running a client, including the RaB, and Ra3)
advertiser’s competitors. From this, they can also roughly A variation on all three auction designs is to use homomor- estimate the advertiser’s bids. It is not clear that this is a phic encryption (e.g., ElGamal [9]), which allows for multi- problem, for two reasons. First, in today’s advertising sys- plication operations on encrypted data. This may be used tems, an advertiser can see how its competitors rank relative to defend against certain attacks by the broker as described to itself simply by observing how ads are displayed. RaC makes it easier and cheaper to obtain this ranking infor- the user score based on the number of clients. If the broker mation, but does not fundamentally change an advertiser’s predicts too high, then it can lower the minimum user score ability to do so. Second, historically in traditional advertis- and send this to clients, thus causing more clients to show ing (print, TV, radio), advertisers can monitor how much advertising their competitors do, and can generally knowthe cost of that advertising. While certainly all things being equal advertisers would like to keep this information secret, An important aspect of the auction is the scope of the historically the inability to do so has not, for the most part, auction, by which we mean the set of ads that compete in prevented companies from advertising.
any given auction. As a general rule, the more ads that If exposing the product (B × G) to the client is an accept- compete, the higher the CPC. This is simply because the able privacy loss, then RaC should be the preferred auction more ads there are, the more probabilistically likely the next- method for its overall simplicity and lower overhead. If it is ranked ad will have a (B ×G×U ) closer to that of the clicked not acceptable, then RaB and Ra3, which both avoid expos- ad. On the other hand, the larger the auction scope, the less ing this information, may be preferred.
fair it is in the sense that very different types of ads mustcompete. A local pizza store may not wish to compete with Auction Properties
In this section, we discuss the various shortcomings of each The auction scope for search or contextual systems like of the approaches with respect to the auction properties, Bing and Google is the set of ads whose keywords match that especially ranking results and revenue.
of the search or web page. Today’s ad exchanges, where ad-vertisers bid in real time, typically for adboxes on premium publishers, have a potentially much broader scope because There are several potential delays in the non-tracking ad- any advertiser can bid. The auction scope in a non-tracking vertising systems that can change both the rankings and the advertising system is tunable. It may be all ads in a client, or computed CPC. With RaC, there is a delay between when all ads within an ad type (i.e. an interest). What’s more, in- the ad was transmitted and adbox time when the ranking terests may be hierarchical (sports/tennis/clothing/shoes), takes place. With RaB and Ra3, there is a delay between and may be more general or more specific, thus allowing for when the ranking occurs and adbox time when the ranking substantial flexibility in auction scope.
is actually used. In either case, the bid B or the brokerquality score G used for ranking may no longer be correct, and an out-of-date ranking takes place.
In this section, we relax our assumption of honest-but- During the design of the auction approaches, these delays curious players, and consider a number of malicious attacks were a major concern. As it turns out, at least for the auc- tion data from Bing search advertising auctions (Section 6),the delays have only a minor impact on both broker rev- enue and advertiser costs, even when the delay is several The client can commit a form of click fraud by lying about hours or a day. Nevertheless, this may not be the case for the value of ((Bn × Gn) × (Un/Uc)) (RaC) or (Un/Uc) (RaB other systems or future systems, and so it remains important or Ra3). By inflating or deflating these values, the client that these delays are engineered to be minimal. This could can cause advertisers to pay more or less, and cause pub- be done, for instance, by having clients frequently request lishers to earn more or less. At a high level, this is very similar to normal forms of click fraud that occur today, andin this sense our auctions do not allow fundamentally new forms of click fraud. Privad describes how to defend against A problem encountered by non-tracking advertising is that click-fraud even with anonymizing brokers [12]. The same the broker does not know which clients are the best clients method may be used here. The basic idea is that the proxy to send an ad to. For instance, suppose that some number tags reports with a per-report unique identifier. If the bro- of clients M have requests ads for watches. The broker does ker suspects click fraud, it informs the proxy of the report not know which clients may be interested in cheap watches, ID of the suspicious report. If a given client is suspected and which in expensive. The advertiser, however, might not more times than some threshold, its reports can be tagged have enough budget to pay for all the clicks that would result by the proxy as coming from a suspected client. In some if all watch ads are sent to all interested users.
cases the broker may suspect click fraud simply because the Lets assume that the broker knows the clicks per 2nd price is impossibly high (i.e. higher than the 1st price delivered-ad rate. From this, it can determine the number of bid). In most cases, however, the broker may suspect click clients N that should receive the ad without exhausting the fraud through statistical analysis of the reports for given advertisers budget. If it randomly chooses N clients among the interested clients, then it will not be sending all ads tothe most interested clients.
Proxy fingerprints client user scores and re- One way to solve this problem is for the broker to go ahead and send the ad to all interested clients, but to also send It is difficult but conceivable in RaB and Ra3 that the a parameter giving the minimum user score that a client proxy could determine user profiles through observation of must have in order to show the ad. This way, only the the client user scores and rankings. For instance, the proxy best matching clients will show the ad. The broker may could establish a number of fake clients that pretend to have be able to establish the expected click per delivered-ad rate various profile attributes, and establish fingerprints of the re- for various user scores, and therefore predict the setting of sulting user scores and rankings. One way to do this might be to determine (B × G) given user scores and correspond- can manually update the bid; second, the ad network can ing ad ranks, and use these values as the fingerprint. The automatically update the bid (as directed by the advertiser); proxy could then compare these fingerprints with the cor- third, a 3rd-party may update the bid on the advertiser’s responding fingerprints of real clients. It could be that the behalf. Each of these has different churn characteristics: signal-to-noise ratio is high enough to successfully pull off Advertiser: Manual updates, we believe, cause very little this attack. One way to prevent this would be to encrypt churn since they are reactive over a long feedback cycle.
user scores and rankings. The user scores could be encrypted Advertisers receive updated campaign information (i.e. how using the brokers (RaB) or 3rd-party’s (Ra3) public key, and many clicks, actual amount charged, budget left) at fairly the rankings could be encrypted using symmetric keys cre- coarse intervals (few times a day). This limits the number ated by the clients and conveyed securely to the broker or of informed changes to their campaigns.
3rd-party. These symmetric keys would be frequently mod- Ad Network: The advertiser can invoke functionality pro- ified to prevent the broker or 3rd-party from linking user vided by the ad network to optimize his bidding strategy.
scores with the same client, and possibly launching a simi- For example, the ad network may allow the advertiser to set a preferred rank (e.g. position 4), and the ad networkautomatically lowers or raises the bid to satisfy the request Broker manipulates [Bn × Gn] (RaB, Ra3) based on the market. Other examples may include auto- A malicious broker could launch an attack on a non- matically modifying bids to meet a target number of im- tracking advertising system to identify clients by inserting pressions per day (while still being charged only for clicks), unique IDs into the encrypted fields [B, G] or [B × G]. Once or modifying bids based on time of day etc. Some of this a client is identified in this way, unlinkability is lost, and the functionality (e.g. modify bids based on time-of-day) can be broker can build up client profiles. The broker can then po- implemented in the client and would therefore not result in tentially identify the client through external means. There any added churn. Other functionality (e.g. preferred rank) is some cost to this approach, as the broker must “waste” ads tends to be implemented today as a periodic update (once to do the tracking2. The homomorphic encryption variant described in Section 4.4 defends against this attack. Because 3rd Party: Search Engine Optimization (SEO) companies the client multiplies the received encrypted fields with other optimize their client’s bidding strategy in real-time [6] e.g.
fields, the values generated by the broker are obscured.
based on trending terms, real-time click-through rates, etc.
This could potentially result in high bid churn, however, due Discussion
to the premium nature of these SEO services, only a small RaB, which was proposed in [12], is a weak system be- cause it opens up a fingerprinting attack at the broker. Ra3 Aside from changes in B, G can also change. Recall G solves this problem, though at the expense of requiring yet in our model is a function of what the broker knows: G is another administratively distinct and non-colluding system.
computed based on the ad (past CTR, landing page quality, Nevertheless, we consider it to be better than RaB.
etc.). G is largely a property of the ad itself, which we If exposing the (B × G) product to the client is not a don’t expect to change quickly or dramatically. In any event, problem for the broker and advertisers, then RaC is better our Bing auction trace unfortunately does not allow us to than Ra3 because it is simpler, incurs less overhead and validate our assumption since it does not isolate user-derived latency, and does not require the 3rd party. In addition, it components of G from other components.
has no issues here with respect to user score churn, becauseranking takes place at ad view time. If exposing (B × G) How Does Churn Affect Auctions?
is a problem, however, then Ra3 appears to be a reasonable Today auctions take place at the time when an ad is dis- played to the user; ranking and CPC calculations can im-mediately reflect any changes in B or G. Privacy preserving EFFECT OF CHURN
auctions described in Section 4 are limited in terms of how Section 5.2 describes how various delays in all three auc- fast new B and G information can be incorporated. Since G tion systems may distort rankings and CPC computation.
does not rapidly change over time or can be engineered to How detrimental this delay is depends on how much churn remain relatively stable (e.g., using U to reflect short-term there is. Churn may affect rank, CPC value, and ultimately changes in click probability), the main source of churn is revenue. RaC is affected by churn in B and G, while RaB the changes in B. To understand the effects of churn in B and Ra3 are additionally affected by churn in U . In this values, we simulate auctions that use stale B information section, we use trace data from Microsoft’s Bing advertising for ranking and CPC computation, and then compare the platform to study in depth the effect of auction delays on B resulting ranking and CPC computation with auctions that and G from both the advertiser and broker perspective. We find that, while churn exists, it has only a negligible impacton broker revenue and advertiser costs.
For our trace driven simulations, we sampled around 2TB What Causes Churn?
of log data from Bing’s auction engine spanning a 48 hour B × G for an ad changes when either B or G changes.
period starting September 1, 2010. The data covers over B can change in one of three ways: first, the advertiser 150M auctions for over 18M unique ads shown to North 2One might argue that the same attack can be launched American Bing search users across all search topics. The simply be creating unique Ad IDs transmitted in the clear.
trace record for an auction lists all the ads that participated However, this attack can at least be detected by third par- in it (whether the ad was ultimately shown or not), the bids ties, for instance running honey-farms of clients.
corresponding to each ad, the corresponding quality scores, and which if any of the ads were ultimately clicked by the Methodology
We re-compute auction rankings and the CPC for each auction in our dataset using stale bid information; we varystaleness from 1 minute to 2 days.
Auction rankings are re-computed using bid and quality data from the trace. Since our trace does not show when the advertiser updated the bid, we infer the time based on multiple auctions that a given ad participates in. If the bid for the ad is the same for two consecutive auctions, we inferthat the bid did not change during that interval. If the bid is different, we infer that the bid changed sometime between the two auctions; we use the mid-point as the time of change.
To simulate an auction at time T with stale information fromd minutes ago, we simply use the bids current as of time T − d in our trace. The quality score in the trace is based onuser features (e.g. search query), which correspond to U inprivate auctions; since the client always has the current value advertisers or SEOs, may for instance, attempt to predict of U we use the same quality score for simulated auctions as what bid they might want to make 1hr hence, and enter it into the system well in advance. Advance bidding would re- CPCs are re-computed based on the re-computed auction duce the effective staleness of information. For our purposes, rankings (using the second-price formula of Equation 2). In we assume the bidding strategy does not change.
other words, for an adbox at time T in the trace, we computethe ranking based on bid values recorded at time T − d Simulation Results
and populate the adbox using resulting ranking. If the user Overall our simulations show that there is no appreciable clicked on an ad in this adbox, the bid of the next lower change in broker revenue for using stale bid information; ranked ad Bn that we use in the CPC computation is the even in the most conservative cases, the revenue is within ±0.1% of today. For advertisers, while stale bids affect their One limitation we face is that we cannot predict the auction rankings, they do so in a balanced manner with cases change in user behavior when auction rankings change. Con- of higher-than-today rank canceling out cases of lower-than- sider, for example, two ads A1 and A2 where in the trace today rank resulting in zero net change.
they are ranked 1 and 2, while in the simulated stale auction Figure 3 plots the change in broker revenue compared to they are ranked 2 and 1 respectively. If the user clicked A2 today as a function of the staleness of information used and in the trace, what might we expect the user to click in our the user model. The x-axis varies the stateless of bids from simulation? One option is to model the user as clicking the 1 minute to 2 days. The box-and-whisker plot varies the same ad he clicked in the trace; thus in this case the user user model with the top whisker showing the outcome where clicks A2 in the simulation. Another option is to model the the user clicks the same position, and the bottom whisker user as clicking the same position he did in the trace; in this showing when the user clicks the same ad; the top edge of case the user clicks position 2 (A1 in the simulation). In the box shows 75% same position and 25% same ad, and reality, the user model is neither of these two extremes — vice versa for the bottom edge of the box; the line in the it is well-known that both ad content and rank effect click- through rates (CTR) [10]. To account for this, we simulate The first observation we make from Figure 3 is that under 5 user models: 1) same position, 2) 75% same position and a 50-50 user model, change in revenue is practically 0% even 25% same ad, 3) 50%-50%, 4) 25% same position and 75% with bid information as stale as up to 12 hours. Under the same ad, and 5) same ad. Thus we establish an envelope of 75-25 and 25-75 models, the change is almost always between possible user behavior to get a sense of the upper- and lower- ±0.05%, and only in the extreme cases 100-0 or 0-100 does bounds of our simulation results. Note that always clicking it pass ±0.1%. More importantly, the change increases very on the same ad is a strictly conservative estimate. This is gradually. This is good news since it means a private adver- because an ad that was clicked in the trace but is not shown tising system would not have a hard delay deadline beyond to the user in our simulation (due to being ranked too low) which there would be disproportionate change in revenue.
would not get clicked; at the same time, under the same-ad Instead the system can strive to do the best it can, and re- model, an ad that was not shown in the trace (due to being duce revenue change proportionally. The extremely gradual ranked too low) and was therefore not clicked would have rate of change also means that system design trade-offs can no chance of getting clicked even if it were to be shown to be biased towards scalability and other engineering goals the user in the simulation. This asymmetry biases the sim- without much concern to revenue since it changes very little ulation towards fewer clicks (and therefore lower revenues).
The only user model immune to this limitation is the same- At first blush the effect of the “same-ad” user-model ap- pears to be to reduce the revenue, but this is deceptive. As A second limitation we face is that we cannot predict how mentioned earlier, the more the user clicks on the same-ad advertisers would change their bidding strategy in response (going from 0% to 100% from the top whisker to bottom), to auctions being based on stale information. Enterprising the more biased the simulation is towards fewer clicks and COMPUTING USER SCORE
So far, we have assumed the existence of a user score U that, when multiplied with the quality score G produces the expected click probability at the client for a given ad. Be- cause clicks are relatively rare, it may be difficult to estimate U at the client based purely on the click history of the client.
Therefore, we require that the broker anonymously and un-linkably gathers detailed click statistics from clients in order to improve click probability estimates at individual clients.
In what follows, we sketch out an approach.
There are a number of measurable attributes X = {x1, x2, .xL} at the client that may help in prediction ofclick probability. For instance, the level of interest (high Average fraction of auctions affected for advertiser (%) or low) in the ad’s product or service, the quality of the match between the targeting and the user, the context ofthe webpage, as well as the user’s historic CTR. The idea Figure 4: Fraction of auctions with modified rank- is that each client reports this information anonymously to the broker for each ad that it views and clicks. These re-ports contain: {Ad-ID, X, click}, where X is the values ofthe attributes, and ‘click’ indicates whether or not the ad therefore less revenue. Recall that only the top-whisker is was clicked. Given this information from many clients, the broker can determine the effect of the attributes on click The second observation we make from Figure 3 is the probability, and convey this information to the client as a slight upwards trend of the top-whisker signifying higher function f of the attributes such that U = f (X), along with revenues as more stale information is used. This suggests the ad. This allows the client to compute U by measuring a consistent trend of advertisers (as a whole) reducing their the attributes and plugging them into the provided function.
bids over time. We don’t know the cause of this trend.
As mentioned, U in RaC can be computed at viewing time Next we turn to the advertiser perspective. We compute with the latest set of attributes without churn issues since for each ad the fraction of auctions where the user-visible that’s when the ranking takes place. The function f for a simulated ranking increased or decreased compared to the new ad can be initially set to that of similar existing ads trace, and whether the ad became visible or invisible due until enough data for the new ad is gathered. The details of to being ranked high-enough or too-low as compared to the trace. Figure 4 plots the average of these numbers across One concern is that the set of attribute values X is unique all ads as a function of the staleness of bid information cern. First, the attributes may be fairly coarse-grained, thus We first observe that both increased ranks and decreased broadening the set of users to which they apply. Second, ranks are roughly equal, and so average to nearly zero. The some of the attributes may be hard to correlate using ex- same is true for ads becoming visible or invisible. While this ternal knowledge, such as the user’s CTR. Third, attributes is consistent with the revenue change in Figure 3 averaging like level of interest change from interest to interest, and out to zero, we note that there are other ways the revenue even within an interest over time, and therefore are hard to could average out to zero while being unfair to advertisers.
link to the same user. Fourth, some attributes are not spe- For instance, fewer increased ranks could have been com- cific to the user, for instance webpage context. Finally, the pensated by more cases where the ad became visible thus only information beyond the attribute values that is leaked still resulting in zero revenue change while being unfair to is the ad viewed. In particular, the user’s click-stream is not the advertiser; luckily, this is not the case.
exposed. We believe that it is reasonable to establish public We observe next that there is a very small impact of stal- policies that determine the nature of the attributes in such eness on change in ranks; it begins with around 12% of a way that meaningful privacy is preserved.
auctions for 1 minute stale data, and quickly converges toaround 16%. The reason this number is high is because ofthe cascade effect — if a single ad jumps from a low rank RELATED WORK
to a high rank, it causes all the ads in between to register a There is a substantial body of work on cryptographic pro- “change” in rank; thus a single change in bid can affect up tocols for privacy preserving auctions. Depending on the un- to ten ads. The impact, however, is very little; the ad jump- derlying security model these proposals can be classified into ing from low to high might register a change of 10 ranks, the following three categories. In the first category, there are however, the other 10 ads would register a change of only 1 protocols that rely on computation that is distributed among rank each (not captured in the graph). Overall we found a auctioneers who jointly determine the outcome of an auction median net change of 1 rank for every 820 auctions the ad using threshold multi-party computation (e.g., [14, 15, 22]).
The second category of protocols introduces a semi-trusted To summarize, based on extensive simulations across vary- third party, aka an “auction issuer” or “auction authority”, ing degrees of staleness and different user-models, there is in addition to the auctioneer, and uses asymmetric multi- little impact on broker revenue as compared to today, and party computation technique, such as Yao’s garbled circuit little impact on advertiser fairness as compared to today.
(e.g., [1, 2, 5, 19, 21]). Finally, protocols in the third categoryallow bidders to cooperatively compute the auction outcome without relying on any trusted third party (e.g., [3, 4]). The on Financial cryptography, pages 223–238. Springer-Verlag, primary goal of all these proposals, and many others not cited, is to keep bids and selling price secret from the auc- [4] F. Brandt and T. Sandholm. Efficient privacy-preserving tioneer and other auction participants. The problem that protocols for multi-unit auctions. In Proceedings of the 9thinternational conference on Financial cryptography and we address in this paper is different. This paper is primarily Data Security, pages 298–312. Springer-Verlag, 2005.
concerned with protecting the user, not the bidder (i.e. ad- [5] C. Cachin. Efficient private bidding and auctions with an vertiser). Indeed the user does not exist in the prior work.
oblivious third party. In Proceedings of the 6th ACM In any event, the high computational and communication conference on Computer and communications security, complexity imposed by aforementioned secure auction pro- tocols make them impractical for our problem.
[6] S. Clifford. Instant Ads Set the Pace on the Web. The New York Times, 2010.
[7] DoubleClick. DART for Advertisers. http: //, [8] B. Edelman, M. Benjamin, and M. Schwarz. Internet This paper addresses the challenge of designing an on- Advertising and the Generalized Second-Price Auction: line advertising auction for a non-tracking advertising sys- Selling Billions of Dollars Worth of Keywords. American tem that leverages the user profile information while keep- Economic Review, 97(1):242–259, Mar. 2007.
[9] T. ElGamal. A public key cryptosystem and a signature ing the user profile private. We broadly explore the design scheme based on discrete logarithms. Information Theory, space, proposing three types of auctions, and analyzing their IEEE Transactions on, 31(4):469–472, 2002.
properties with respect to privacy, auction quality and vul- [10] J. Feng, H. Bhargava, and D. Pennock. Implementing nerability to attack. Overall, we find that two of the sys- sponsored search in web search engines: Computational tems, Rank-at-Client (RaC) and Rank-at-3rd-Party (Ra3), evaluation of alternative mechanisms. INFORMS Journal are very acceptable designs. RaC is simpler and more effi- [11] Google, Inc. AdWords: Advertise Your Business on Google.
cient, but has the drawback that information about ad qual- ity and bid is leaked. On the other hand, this is information [12] S. Guha, B. Cheng, and P. Francis. Privad: Practical that can today be determined by placing ads and monitoring Privacy in Online Advertising. In Proceedings of the 8th the resulting ranking. Finally, noting that our auction de- Symposium on Networked Systems Design and signs suffer delays that cause out-of-date bid information to Implementation (NSDI), Boston, MA, Mar 2011.
be used in rankings, we use Bing advertising system auction [13] S. Guha, A. Reznichenko, K. Tang, H. Haddadi, and P. Francis. Serving Ads from localhost for Performance, trace to determine the effect of these delays. We find the Privacy, and Profit. In Proceedings of HotNets ’09.
effect to be very minimal, and so conclude that our auction [14] H. Kikuchi. (M+ 1) st-Price Auction Protocol. In Proceedings of the 5th International Conference on As future work, we plan to implement the auction sys- Financial Cryptography, pages 351–363. Springer-Verlag, tem to operate with a medium-scale deployment of Privad planned for next year (several 10’s of thousands of users).
[15] H. Kikuchi, S. Hotta, K. Abe, and S. Nakanishi.
Distributed auction servers resolving winner and winning Along with this, we plan to design and deploy mechanisms to bid without revealing privacy of bids. In Parallel and compute U , and measure their effectiveness. We also plan to Distributed Systems: Workshops, Seventh International do a measurement study of ads served by Bing to determine Conference on, 2000, pages 307–312. IEEE, 2000.
to what extent advertisers can reverse-engineer each other’s [16] B. Krishnamurthy and C. Wills. On the Leakage of bids in today’s systems. This will quantify how much adver- Personally Identifiable Information Via Online Social tiser privacy loss is incurred by the Rank-at-Client scheme.
Networks. In Proceedings of WOSN ’09.
[17] B. Krishnamurthy and C. Wills. Privacy diffusion on the Each of the non-tracking advertising schemes so far proposed web: A longitudinal perspective. In Proceedings of the 18th makes the tacit assumption that there is only a single bro- International Conference on World Wide Web (WWW ker, and a single profiler operating at each client. We are interested in exploring what happens if there are multiple [18] D. Levin, B. Bhattacharjee, J. R. Douceur, J. R. Lorch, brokers with competing profilers in each client. In particu- J. Mickens, and T. Moscibroda. Nurikabe: Private yet lar, the profilers should be able to dynamically compete for Accountable Targeted Advertising. Under submission.
Contact [email protected] for copy, 2009.
ad boxes in real time, thus adding a new element to the auc- [19] H. Lipmaa, N. Asokan, and V. Niemi. Secure Vickrey tion that is not unlike the way ad exchanges operate today.
auctions without threshold trust. In Proceedings of the 6th What’s more, these clients may also need to compete with international conference on Financial cryptography, pages existing tracking advertising systems in real-time auctions run by existing ad exchanges. We plan on designing and [20] Microsoft, Inc. Start advertising on Yahoo! Search and Bing.
[21] M. Naor, B. Pinkas, and R. Sumner. Privacy preserving auctions and mechanism design. In Proceedings of the 1st REFERENCES
ACM conference on Electronic commerce, pages 129–139.
[1] M. Abe and K. Suzuki. M+ 1-st Price Auction Using Homomorphic Encryption. In Proceedings of the 5th [22] K. Sako. An Auction Protocol Which Hides Bids of Losers.
International Workshop on Practice and Theory in Public In Proceedings of the Third International Workshop on Key Cryptosystems, pages 115–124. Springer-Verlag, 2002.
Practice and Theory in Public Key Cryptography, pages [2] O. Baudron and J. Stern. Non-interactive Private Auctions. In Proceedings of the 5th International [23] Stanford. Do Not Track Universal Web Tracking Opt-Out.
Conference on Financial Cryptography, pages 364–378.
[24] V. Toubiana, A. Narayanan, D. Boneh, H. Nissenbaum, [3] F. Brandt. Fully private auctions in a constant number of and S. Barocas. Adnostic: Privacy Preserving Targeted rounds. In Proceedings of the 7th international conference Advertising. In Proceedings of NDSS ’10.


• Highly adaptable to meet ever-changing A broad range of data col ection and platform components including the latest Windows Mobile 6.1, and Cisco® Compatible Extensions (CCX) certification, ensuring wireless networks, to provide the robust • Latest generation of imaging technology performance metrics, simplify IT support and control operating costs. extend enterprise applica

Microsoft word - primaryhyperparathyroid

Primary Hyperparathyroidism By Associate Professor Terry Diamond, Department of Endocrinology, University of New South Wales What is Primary Hyperparathyroidism? This is a syndrome comprising of an overproduction of parathyroid hormone (PTH) by one or more abnormal parathyroid glands resulting in hypercalcaemia (an elevated serum calcium level). Normal Parathyroid Anatomy: The para

Copyright © 2010-2014 Drug Shortages pdf