RTB Serving Speed

October 18th, 2009

One of my readers posted the following comment on my first post on RTB:

In your second diagram you show the interaction between the publisher adserver and multiple networks. Does this potentially multiple source back and forth not slow down the adserving in the same way a series of dumb redirects would? Especially when you consider that presumably if Network 1 came back with the best price out of 3 or four networks, once the publisher ad server knew that it would need to go back to it and request the actual ad again. It would be interesting to see some realistic HTTP traces for this stuff.

This is indeed a great question. On paper, RTB involves the same number of requests going back and forth as a traditional ad call. Even so, RTB is going to be significantly faster… and here’s why.

Technically, a browser downloading content from an adserver is a five-step process:
* DNS lookup of the adserver domain name
* Establishment of a TCP connection
* Requesting content
* Acknowledgement of request & sending back content
* Terminating the TCP connection

Assume for this case that a DNS lookup takes about 100ms. Each of these steps requires a number of packets to go from the local computer up to the adserver and a series of response packets. Here’s the # required for each step:

* TCP Connection: Two packets up, and one packet down (SYN, SYN-ACK, ACK)
* Requesting content: One packet up (minimum)
* Request acknowledgement and content: One packet down (minimum) & one packet up
* Terminating the connection: One packet

So the minimum number of packets sent back and forth is 7. If the latency from an end user to the adserver is 50ms, and each packet costs one 50ms trip, the 7 packets add up to 350ms. This means it will take *at least* 450ms (100ms DNS + 350ms ad-request) to request the ad.
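The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope model under the same assumptions as the text (every packet costs one 50ms trip, DNS costs a flat 100ms); the names `PACKETS` and `min_request_time_ms` are made up for illustration.

```python
# Back-of-the-envelope model of a browser ad request.
# Assumption (from the text): every packet costs one latency interval.
DNS_LOOKUP_MS = 100
END_USER_LATENCY_MS = 50

# Minimum packets per step, as listed above.
PACKETS = {
    "tcp_handshake": 3,      # SYN, SYN-ACK, ACK
    "request": 1,            # one packet up
    "response_and_ack": 2,   # one packet down, one packet up
    "teardown": 1,           # one packet
}


def min_request_time_ms(packets, latency_ms, dns_ms=0):
    """Lower bound on request time: DNS plus one latency interval per packet."""
    return dns_ms + sum(packets.values()) * latency_ms


total_packets = sum(PACKETS.values())
browser_ms = min_request_time_ms(PACKETS, END_USER_LATENCY_MS, DNS_LOOKUP_MS)
print(total_packets, "packets,", browser_ms, "ms minimum")
```

Real connections overlap some of these packets and often need more than the minimum, so treat the result as a rough floor rather than a measurement.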

Now you’d think this would be the same for real-time, but it’s not! There are three reasons a request between two serving systems is much faster:
* Better connectivity — Adservers are hosted in datacenters that generally have much better internet connectivity than the average end-user. This means lower latency between the two adserving systems.
* No DNS lookup — The RTB system can cache DNS lookups for all RTB partners, effectively removing this 100ms.
* Persistent TCP connections — Any intelligent RTB integration would use persistent TCP sessions between the sell and buy side systems. This means a connection is established once and reused thousands of times after that.
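To make the persistent-connection point concrete, here is a minimal sketch using only Python's standard library. It assumes bid requests travel over HTTP/1.1 with keep-alive, which the text does not specify. The `BidHandler` endpoint, the `/bid` path, and the JSON payload are all hypothetical stand-ins for a buy-side bidder, running on localhost purely for the demo.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class BidHandler(BaseHTTPRequestHandler):
    """Hypothetical buy-side endpoint that answers every request with a fixed bid."""

    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps the TCP connection open by default
    client_sockets = set()         # distinct client (host, port) pairs seen by the server

    def do_GET(self):
        BidHandler.client_sockets.add(self.client_address)
        body = b'{"bid_cpm": 1.25}'  # made-up payload
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def run_demo(request_count=5):
    """Send several bid requests over one connection; return bodies and connection count."""
    BidHandler.client_sockets.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), BidHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # One connection object, reused for every request (no new handshake each time).
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        bodies = []
        for _ in range(request_count):
            conn.request("GET", "/bid")
            bodies.append(conn.getresponse().read())
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return bodies, len(BidHandler.client_sockets)


bodies, tcp_connections = run_demo(5)
print(len(bodies), "responses over", tcp_connections, "TCP connection(s)")
```

Because the client reuses one `HTTPConnection`, the server should see a single client socket for all five requests; the handshake and teardown are paid once instead of once per bid.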

With the above three, here’s how requesting a “bid” looks from sell to buy side:
* Requesting content: One packet
* Acknowledgement of request & sending back content: One packet

So assume 25ms latency between systems (rather than 50), and the minimum time for an RTB request between systems is only 50ms, compared to the 450ms it would take for an actual end user. That is 9 times faster. The slower the end user’s connection, the bigger RTB’s advantage.
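As a rough illustration of that last point, the sketch below applies the same simple model (one latency interval per packet) across a few end-user latencies. The function names are made up, and the latencies other than 50ms are arbitrary sample values, not measurements.

```python
def min_time_ms(packets, latency_ms, dns_ms=0):
    """Lower bound: every packet costs one latency interval, plus optional DNS."""
    return dns_ms + packets * latency_ms


# Server to server: 2 packets, 25ms latency, DNS cached, connection already open.
RTB_MS = min_time_ms(packets=2, latency_ms=25)


def speedup(end_user_latency_ms):
    """How many times faster the RTB hop is than a full browser request."""
    browser_ms = min_time_ms(packets=7, latency_ms=end_user_latency_ms, dns_ms=100)
    return browser_ms / RTB_MS


for latency in (25, 50, 100, 200):  # arbitrary sample latencies in ms
    print(f"{latency:>3}ms end-user latency -> {speedup(latency):.1f}x")
```

At 50ms the ratio matches the 9x figure above, and it grows as the end user's latency grows.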

Conclusion: yes, serving each individual ad request becomes a little bit slower, but the removal of redirects makes the overall process significantly faster.

For the technically curious, here are tcpdumps that back this up.

Browser to adserver:

15:45:04.380042 IP > Flags [S], seq 50484529, win 65535, [...]
15:45:04.397395 IP > Flags [S.], seq 661028066, ack 50484530 [...]
15:45:04.397529 IP > Flags [.], ack 1, win 65535, length 0
15:45:04.397831 IP > Flags [P.], seq 1:1288, ack 1, win 65535, length 1287
15:45:04.424466 IP > Flags [.], seq 1:1461, ack 1288, win 62780, length 1460
15:45:04.424472 IP > Flags [P.], seq 1461:1543, ack 1288, win 62780, length 82
15:45:04.424546 IP > Flags [.], ack 1543, win 65535, length 0

Adserver to adserver with persistent connections:

20:00:10.709152 IP > . ack 1023 win 7154
20:00:10.754844 IP > P 1023:2045(1022) ack 501 win 62780
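For anyone who wants to read the numbers off these traces, here is a small sketch that counts the packets in each capture and measures the time between the first and last one. The trace text is copied as-is from above; the helper name `summarize` is made up. The two captures come from different machines and networks, so the packet counts are the meaningful comparison, not the elapsed times.

```python
from datetime import datetime

BROWSER_TRACE = """\
15:45:04.380042 IP > Flags [S], seq 50484529, win 65535, [...]
15:45:04.397395 IP > Flags [S.], seq 661028066, ack 50484530 [...]
15:45:04.397529 IP > Flags [.], ack 1, win 65535, length 0
15:45:04.397831 IP > Flags [P.], seq 1:1288, ack 1, win 65535, length 1287
15:45:04.424466 IP > Flags [.], seq 1:1461, ack 1288, win 62780, length 1460
15:45:04.424472 IP > Flags [P.], seq 1461:1543, ack 1288, win 62780, length 82
15:45:04.424546 IP > Flags [.], ack 1543, win 65535, length 0
"""

PERSISTENT_TRACE = """\
20:00:10.709152 IP > . ack 1023 win 7154
20:00:10.754844 IP > P 1023:2045(1022) ack 501 win 62780
"""


def summarize(trace):
    """Return (packet_count, ms between first and last captured packet)."""
    lines = [line for line in trace.splitlines() if line.strip()]
    # The first whitespace-separated token on each tcpdump line is the timestamp.
    stamps = [datetime.strptime(line.split()[0], "%H:%M:%S.%f") for line in lines]
    span_ms = (stamps[-1] - stamps[0]).total_seconds() * 1000
    return len(lines), span_ms


for name, trace in (("browser", BROWSER_TRACE), ("persistent", PERSISTENT_TRACE)):
    count, span_ms = summarize(trace)
    print(f"{name}: {count} packets over {span_ms:.1f}ms")
```

The browser capture shows the 7 packets counted earlier, while the persistent connection needs only 2.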

RTB Part I Followup

September 19th, 2009

Ok, so in the last post I explained what Real Time Bidding (RTB) was. Greg Yardley wrote a response on his blog which made me realize that I didn’t spend quite enough time on the technical implications of RTB. He wrote:

[...] If you’re using real-time bidding, and you’re buying one impression out of a hundred, it’s got to pay the cost of the ninety-nine other decisions not to buy. [...] I think you’ve got to have as close to real-time reporting as possible. [...] Finally, your models have got to account for the data that can’t arrive in real-time, or you’re going to put your efficiency gains at risk. [...]

Add all of the above up, and you may be wishing you stuck with your predefined rule sets.

So I guess I should clarify — building a real time bidder is technically almost exactly as complex as building a performance aware ad-server — it needs all the same pieces:
* Decisioning algorithms
* Targeting & frequency tracking
* Server-side cookie store
* Log Collection & Aggregation Pipelines
* Reporting & Analytics
* Budgeting & Campaign Pacing
* A Trafficking interface

This is certainly not an undertaking to take lightly. Many developer years go into building a scalable serving platform — so yes, hiring the talent and building out the infrastructure to build a scalable bidder is certainly an expensive proposition. But also, note that if you already have an adserver, it likely has a lot of the features listed above.

Put another way: Real Time Bidding allows you to plug your intelligence into a massive pool of inventory. If you don’t have any intelligence yet, obviously that will be a significant investment! I’ll talk more about building “bidders” in future posts.

Now to continue the intro on RTB.