Notice: This blog is no longer updated. You may find a broken link or two

You can follow my new adventures @mikeonwine


How do behavioral networks work?

February 28th, 2007

I really wanted to write a post about the difficulties in monetizing small websites, but I realized that to have that discussion properly I would have to explain behavioral networks first! So, here goes.

A few months ago I was trying to explain to my girlfriend how advertising works. After a couple of discussions she said a rather interesting thing: “How come the travel industry seems to be the biggest advertiser out there? All I see are travel ads, no matter where I go!” Well, it turns out my girlfriend is a travel fanatic, and behavioral networks have picked up on that fact. No matter what the content of the site she’s visiting, they manage to find a relevant travel-related ad to show her. Try it: go to travel.nytimes.com and click through to a couple of articles. Then go to a random section of the site, say ‘health’, and chances are you’ll see a travel ad! (See the screenshot below.) Sounds like voodoo or Big Brother, right? Wrong!

NYT Travel Behavioral
Second behavioral ad on the New York Times.

And peeking into my cookies, there is indeed some travel-related fun in there (hard to tell what it all means, since most cookies are encrypted):

NYTimes Cookie Data

There are many different behavioral players out there, each with varying degrees of success. Perhaps the most successful behavioral company on the internet is this small company based out of California called Yahoo. There are also a couple of ad networks that dabble in this space; one of the most popular is Tribal Fusion. Each approaches the challenges in a different way, but the core of how it works is the same for all.

So how does it work? It’s actually an amazingly simple concept. Let’s say we have four websites:

  1. gizmodo.com
  2. engadget.com
  3. slashdot.org
  4. myspace.com

I am a user, and I visit each of these websites. Each of these websites also works with a brand new (not real, of course) behavioral network called “BigBrother2.0”. On every page of these sites there is a little thing known as an ad tag that points to BigBrother’s adserver and requests an ad. Just for fun, let’s assume that gizmodo also knows your age & gender and passes that information to BigBrother. So you visit the first page, and an ad request goes to BigBrother as follows:

http://ad.bigbrother.com/request?site=gizmodo.com&age=25&gender=m

When your browser requests the URL above, the following sequence of events happens:

  1. Browser requests content from ad.bigbrother.com
  2. Browser attaches any cookies it holds for that domain to the request
  3. Server reads the cookies (if any)
  4. Server does some crunching, picks an ad
  5. Server returns new cookie data & URL to get the ad
  6. Browser shows ad

Now the magic here is in the cookies. Even though you are visiting the site gizmodo.com, the cookie is under the domain ‘ad.bigbrother.com’. In step 5 where the server returns new cookie data, they’ll put in there when you last visited gizmodo.com and how many times you’ve been there today.
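To make the cookie bookkeeping concrete, here is a minimal sketch in Python of what one ad call might do on BigBrother’s side. Everything in it is an assumption for illustration: the cookie is modeled as an already-decoded dictionary (a real adserver would parse and decrypt it from the request), the field names `visits`, `last_visit` and `count` are made up, `today` is passed in by hand, and the ad URL is a placeholder because ad selection is left out.

```python
from urllib.parse import urlparse, parse_qs


def handle_ad_request(url, cookie, today):
    """One ad call: read the cookie, update it, return (ad_url, new_cookie).

    `cookie` stands in for the decoded ad.bigbrother.com cookie.
    """
    params = parse_qs(urlparse(url).query)
    site = params["site"][0]

    # Copy the old cookie so the caller's data is not modified in place.
    new_cookie = {
        "age": cookie.get("age"),
        "gender": cookie.get("gender"),
        "visits": {s: dict(v) for s, v in cookie.get("visits", {}).items()},
    }

    # Remember any demographics the publisher passed along in the ad tag.
    if "age" in params:
        new_cookie["age"] = int(params["age"][0])
    if "gender" in params:
        new_cookie["gender"] = params["gender"][0]

    # Record when this site was last seen and how many times today.
    previous = new_cookie["visits"].get(site, {})
    count = previous.get("count", 0) if previous.get("last_visit") == today else 0
    new_cookie["visits"][site] = {"last_visit": today, "count": count + 1}

    ad_url = "http://ad.bigbrother.com/creative/placeholder"  # selection omitted
    return ad_url, new_cookie


cookie = {}
for _ in range(3):
    _, cookie = handle_ad_request(
        "http://ad.bigbrother.com/request?site=gizmodo.com&age=25&gender=m",
        cookie, today="2007-02-28")
_, cookie = handle_ad_request(
    "http://ad.bigbrother.com/request?site=slashdot.org",
    cookie, today="2007-02-28")
print(cookie)
```

Note that the slashdot.org call carries no age or gender, yet the cookie still holds both afterwards. That carry-over is the whole trick.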

Now let’s say I spend the morning checking out my tech news, all the while receiving ads from ad.bigbrother.com. Each page I view gives them a little bit more information about me, and even though I’m on three separate sites, because the cookie spans all three they get a holistic view of what I’m doing.

So now I go to myspace.com. This is a big site. Normally it’d be difficult to pick a good ad to show you. You’re one of the 18 billion MySpace users and all you really do is upload photos and post silly comments about your friends on their blogs. BUT, when MySpace decides to let BigBrother2.0 show an ad, something interesting happens. Let’s walk through the steps of this new ad call in more detail:

http://ad.bigbrother.com/request?site=myspace.com

  1. Browser requests content from ad.bigbrother.com
  2. Browser attaches any cookies it holds for that domain to the request
  3. Server reads the cookies (if any)
    1. Cookie contains your age & gender (25 & male) (encrypted)
    2. Cookie shows you visited gizmodo.com 12 times this morning
    3. Cookie shows you visited slashdot.org 3 times this morning
    4. Cookie shows you visited engadget.com once this morning
  4. Server does some crunching, picks an ad
    1. Server plugs user data (25, male, 16 visits to tech sites) into a ‘profile engine’
    2. Profile engine spits back ‘categories’, let’s say ‘male, technology’
    3. Server looks for ad campaigns targeted to ‘male & technology’
    4. Server picks highest paying ‘male & technology’ targeted campaign
  5. Server returns new cookie data & URL to get the ad
  6. Browser shows ad
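The “crunching” in step 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the site-to-category table, the campaign list, the CPM prices and the `min_visits` threshold are all invented. The matching rule (a campaign is eligible when every segment it targets is in the user’s categories) is one simple way to do it, not necessarily how any real network works.

```python
# Hypothetical mapping engine: which category each site belongs to.
SITE_CATEGORIES = {
    "gizmodo.com": "technology",
    "engadget.com": "technology",
    "slashdot.org": "technology",
}

# Hypothetical campaigns: the segments they target and what they pay (CPM).
CAMPAIGNS = [
    {"name": "punch-the-monkey", "targets": set(), "cpm": 0.10},
    {"name": "razor-brand", "targets": {"male"}, "cpm": 1.50},
    {"name": "new-gadget", "targets": {"male", "technology"}, "cpm": 4.00},
    {"name": "travel-deals", "targets": {"travel"}, "cpm": 6.00},
]


def profile_engine(gender, visits, min_visits=5):
    """Turn raw cookie data into a set of categories."""
    categories = set()
    if gender == "m":
        categories.add("male")
    elif gender == "f":
        categories.add("female")

    # Add up visits per category across all the sites in the cookie.
    totals = {}
    for site, count in visits.items():
        category = SITE_CATEGORIES.get(site)
        if category:
            totals[category] = totals.get(category, 0) + count

    categories.update(c for c, n in totals.items() if n >= min_visits)
    return categories


def pick_campaign(categories):
    """Highest-paying campaign whose targets all match the user."""
    eligible = [c for c in CAMPAIGNS if c["targets"] <= categories]
    return max(eligible, key=lambda c: c["cpm"])


visits = {"gizmodo.com": 12, "slashdot.org": 3, "engadget.com": 1}
categories = profile_engine("m", visits)
print(sorted(categories))                 # ['male', 'technology']
print(pick_campaign(categories)["name"])  # new-gadget
```

The untargeted ‘punch-the-monkey’ campaign matches everyone, so it acts as the fallback when nothing is known about the user.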

Instead of showing you a random ‘punch-the-monkey’ ad, BigBrother is able to show you a highly relevant and targeted advertisement that you are far more likely to click on. The end result? Better ads for you and more money for the websites that you visit.

You can see from the above that building a behavioral adserver really isn’t the most difficult thing in the world. All you need to do is track history & have some sort of ‘mapping engine’ that says that slashdot.org is a tech site. The real challenge for behavioral companies is the classic chicken & egg problem. Without a rich user-base, the ad-network won’t be able to sell deals. Without deals the network won’t be able to convince publishers to put their tags on pages. This makes it very difficult for new behavioral players to enter the marketplace, although emerging marketplaces such as the RMX will make it easier in the future.

So what’s my opinion on all this? As long as the data that is being stored is encrypted and no Personally Identifiable Information (PII) is stored I’m all for behavioral targeting. As I mentioned before, the more relevant the ad is to my interests, the more pleasant my browsing experience is going to be.

So, time and time again you see people rant and rave about how your privacy is being seriously compromised by the use of cookies. I must say, if you just read around on Google, advertiser cookies are one of the most misunderstood beasts out there. Nowadays, practically every spyware removal program flags advertiser cookies as ‘SPYWARE/ADWARE’. Try it: google “tribalfusion cookie” or “yieldmanager cookie”, etc.

Take this SpywareNuker page:

Some of the Cookie.Advertising.com components are listed below. The list is compiled as a reference. The list might not be complete and it doesn’t represent instructions for manual removal. We DO NOT recommend manual removal. Incorrect removal of certain software might make your computer unstable or even unusable.
Removal of adware component might affect the related ad-supported software.

Sorry to say, but this is total bullshit. Ad.com’s cookie doesn’t infect a thing and can be removed easily and safely by yourself. So what are adserver cookies and what’s stored in them? The most basic information that most adservers will want to know is:

  • which ads you’ve seen, and how many times you’ve seen them
  • which ads you’ve clicked on
  • which ads you’ve converted on

Yes — that’s right, adservers know when you buy things after clicking on an ad — but that’s another post. Now more advanced behavioral companies will also want to track which sites you’ve been to and what behaviors (or segments) you belong to. Allowing adservers to track you using cookies means you won’t repeatedly see the same ad, and you should see more relevant ads to your interests as you continue your path along the internet.
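Not seeing the same ad over and over is usually called frequency capping, and it follows directly from the per-ad counters listed above. A rough Python sketch, assuming a made-up history layout (`views`, `clicks`, `conversions` per ad ID) and an arbitrary cap:

```python
def record_impression(history, ad_id):
    """Return a copy of the ad history with one more view of ad_id."""
    updated = {k: dict(v) for k, v in history.items()}
    entry = updated.setdefault(ad_id, {"views": 0, "clicks": 0, "conversions": 0})
    entry["views"] += 1
    return updated


def under_cap(history, ad_id, max_views=3):
    """True while the user has seen this ad fewer than max_views times."""
    return history.get(ad_id, {}).get("views", 0) < max_views


def choose_ad(history, candidates, max_views=3):
    """First candidate (assumed ordered best-first) that is still under its cap."""
    for ad_id in candidates:
        if under_cap(history, ad_id, max_views):
            return ad_id
    return None


history = {}
shown = []
for _ in range(5):
    ad = choose_ad(history, ["new-gadget", "razor-brand"], max_views=2)
    shown.append(ad)
    if ad is not None:
        history = record_impression(history, ad)
print(shown)
```

With a cap of two, the best ad is shown twice, then the runner-up twice, and after that there is nothing left to show from this list.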

So how do companies store this information? There are two ways: client-side or server-side. Client-side involves storing all the information about you in your cookie, whereas with server-side storage the cookie simply holds an ID and all information about you is stored in a database somewhere. Doubleclick, as seen in the screenshot below, seems to have gone the server-side route and simply stores my “id”, which maps to some data they have on me in their databases. The server-side route is great if the advertiser wants to store a lot of data, since there are size limitations on cookies and you don’t necessarily want to transfer 10 KB back and forth on every ad call. The downside is that it requires some serious database infrastructure to handle 10-100k reads/writes per second against a single database.

Doubleclick Cookie

Burst seems to have gone the client-side route and stores all the ads I’ve seen from them directly in my cookie files. It’s interesting to note that the cookie data from Burst is not encrypted (big no-no!).

Burst Cookie Files
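The trade-off between the two storage routes is easy to see in a toy Python comparison. The profile contents, the in-memory `PROFILE_DB` dictionary standing in for a real database, and the use of a random UUID as the user ID are all assumptions for the sake of the example; neither Doubleclick’s nor Burst’s actual format is known here.

```python
import json
import uuid

PROFILE_DB = {}  # stands in for the server-side database


def client_side_cookie(profile):
    """Client-side: the whole profile travels in the cookie value."""
    return json.dumps(profile, separators=(",", ":"))


def server_side_cookie(profile, user_id=None):
    """Server-side: the cookie holds only an ID; the profile stays in the DB."""
    user_id = user_id or uuid.uuid4().hex
    PROFILE_DB[user_id] = profile
    return user_id


# A made-up profile: view counts for 200 different ads.
profile = {"ads_seen": {f"ad-{i}": i % 5 + 1 for i in range(200)}}

fat = client_side_cookie(profile)
thin = server_side_cookie(profile)
print(len(fat), "characters sent on every ad call (client-side)")
print(len(thin), "characters sent on every ad call (server-side)")
```

The client-side cookie grows with every ad you see, while the server-side one stays a fixed-size ID. The price is a database lookup on every single ad call.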

Ok, so what does this all mean for the end-user? Well, indeed, there are companies out there that are tracking most of everything that you do online. Ad companies like Tribal Fusion know what you’re interested in and what you like to browse for. Hell, if Google succeeds in pushing Checkout to the world they’ll know everything you’ve bought, how much you paid for it and much more! All in all this sounds scary but it really shouldn’t be. Why?

First off, none of the information _should_ be personally identifiable. Cookies are tied to your computer, not your name. So if you switch machines often, or clear your cookies, all history is immediately erased. Now, some argue that if your browsing history is stored it should be possible to figure out exactly who you are. Let’s be honest, why go through all the trouble if there are much easier ways of figuring out what you’re doing online? If you’re worried, go download an anonymous browsing tool (there are plenty), or visit the little-known ‘opt-out’ pages that almost all online advertising companies have. Some examples: Advertising.com Opt-out, Yieldmanager Opt-out, Doubleclick Opt-out. If you do go the opt-out route, don’t clear your cookies afterwards: the opt-out itself is stored as a cookie, and that’s the only way they’ll know you don’t want them to track you!
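Why clearing cookies undoes an opt-out becomes obvious once you picture the check an adserver might run. A minimal Python sketch, assuming a hypothetical cookie named `optout` with the value `"1"` (real networks each use their own names and values):

```python
OPT_OUT_COOKIE = "optout"  # hypothetical cookie name


def should_track(cookies):
    """Track unless the browser presents the opt-out cookie."""
    return cookies.get(OPT_OUT_COOKIE) != "1"


def handle_opt_out(cookies):
    """Opting out replaces all tracking data with just the opt-out flag."""
    return {OPT_OUT_COOKIE: "1"}


cookies = {"profile": "male,technology"}
print(should_track(cookies))   # True

cookies = handle_opt_out(cookies)
print(should_track(cookies))   # False

cookies = {}                   # user clears all cookies
print(should_track(cookies))   # True: the opt-out was forgotten too
```

To the adserver, a browser with no cookies at all looks exactly like a brand new visitor.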

So whichever route you take, don’t forget that the websites you visit can provide you with free content because of the money they receive from advertisers. Cookies are one tool that advertisers use to help track revenue and regularly clearing them can cost your favorite sites money. My personal choice? I clear my cookies every month or so but am perfectly happy to let companies track my behavior to show me more relevant ads. Hell, I’d much rather look at an ad for some new tech gadget than ‘punch the monkey’!