How do digital video ads work? Oh my sweet summer child, if only that could be answered simply.

Digital video advertising involves a multitude of layered technologies, companies and concepts that all come together to deliver you the sweetest gift of capitalism: moving images and sound that educate you about shit to buy (video ads). If you are asking yourself this question from a technical (rather than a business) perspective, then you need to start somewhere, and that somewhere is VAST.

For the sake of simplicity, I will be leaving most of the business aspects of an ad transaction out of this post. There will be other resources available on this site to explain that. This article is intended to simply explain and answer the question: What is VAST?

VAST is used to actually deliver what is going to play back in a video player and into a potential consumer’s eyeballs, and to let everyone who cares know that it happened and how it happened. VAST, or “Video Ad Serving Template”, is the standard template used by video players to deliver and play back video ads on desktop, mobile and connected TV. It is formatted as an XML document with a simple structure, and it contains the metadata needed not only to play back the ad itself but also to satisfy all required elements of an ad transaction.

Sound nerdy? We are just getting started, Champ.

You might be all worried you don’t know what XML is. That’s ok. Just think about it as a language used to communicate data between computer programs. If you know what HTML is, then it’s kind of like that. It’s a way to structure and deliver data.

A video ad player can request VAST from an ad server by calling an “ad tag” or “VAST tag”. This is a URL provided by an ad server that looks something like this:

http://adtechexplained.com/vast/1234?page_url=examplepublisher.com&ip=192.168.0.1&height=1080&width=1920

Let’s break this tag down a bit to understand what is happening here.

Domain / Server

If you were mailing a letter, you would need an address to put on it. Think of the domain name within a VAST tag like the city and street name on your letter. An ad player makes a call over the internet to a domain name, which then points to an ad server that will ultimately return a VAST “response”. But much like a city and street name, the domain only provides a general location. Our example VAST tag is pointing to the AdTech Explained ad server (which doesn’t actually exist).

ID

Typically there will be some kind of publisher or partner identifier within the tag that will tell an ad server exactly which of their business or publishing partners is making this VAST request. If the domain points to the city and street, the ID points to the exact house.

Parameters

This is how an ad server collects information about the environment a potential advertisement will run on. Think of the parameters as being the actual contents of the letter being mailed. The ad server uses this information to select which ad it should return to this partner.

In our example, we are telling the ad server that the ad is displaying on the world renowned “examplepublisher.com” website, in a player that is 1920 pixels wide and 1080 pixels tall, and that the user’s IP is 192.168.0.1. This enables the ad server to target a specific website. In this example, all ads targeted to “examplepublisher.com” will be eligible to serve.
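If it helps to see the letter analogy in code, here is a minimal Python sketch that pulls the example tag apart into its domain, ID and parameters. The function name `break_down_tag` is made up for this post, and treating the last path segment as the ID is an assumption based on our example tag only. Real ad servers lay out their tags however they like.

```python
from urllib.parse import urlsplit, parse_qs

# The example VAST tag from this post.
VAST_TAG = (
    "http://adtechexplained.com/vast/1234"
    "?page_url=examplepublisher.com&ip=192.168.0.1&height=1080&width=1920"
)


def break_down_tag(tag: str) -> dict:
    """Split a VAST tag into its domain, ID and parameters."""
    parts = urlsplit(tag)
    # Assumption: the ID is the last segment of the path, as in the example tag.
    partner_id = parts.path.rstrip("/").split("/")[-1]
    # parse_qs returns a list of values per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"domain": parts.netloc, "id": partner_id, "params": params}


tag_parts = break_down_tag(VAST_TAG)
print(tag_parts["domain"])  # the city and street
print(tag_parts["id"])      # the exact house
print(tag_parts["params"])  # the contents of the letter
```

Note that every parameter value comes back as a string, so a real player or server would still need to convert things like width and height to numbers.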

Once the ad server selects an ad, it will return an XML VAST “response” to the video player. Think of calling a URL in your web browser: when you go to a web page, you are requesting data from a server (actually many servers) that will return everything your browser needs to display that page, including HTML, text, images and more. An ad server is simply sending data back in the raw markup language of XML, which a video player then uses to run an ad.

So essentially, the video player is like “Yo what up ad server. I need an ad, here’s what's up” and the ad server is all like, “Hold on man I got you, lemme find something. Ok here you go take this.” The “this” is the XML VAST response.

Once the XML is received, the video player reads metadata contained in the VAST document and extracts the values it requires in order to play back an ad and notify the various ad platforms that may be involved in the ad transaction. The XML document consists of items called “elements” and “attributes”. Let’s take a look at a sample VAST response to understand each of these items and how they make up a final digital video advertisement. Follow the link below to view the full example VAST in your browser:

https://github.com/InteractiveAdvertisingBureau/VAST_Samples/blob/master/VAST%204.0%20Samples/Inline_Simple.xml

There are many items within a VAST response, but we are going to focus on some of the most important. If you are interested in learning about all VAST fields and masochism is your thing, download the latest 4.2 specification put out by the IAB at their website.

For now, let’s focus on possibly the most important value in the VAST response: the actual video file, or what is commonly referred to as the “creative”. This is the ad itself and what the user views in the video player. If we comb through our VAST document, we will see a media file element, which sounds like what we are trying to find. Anything wrapped in less-than and greater-than characters (“<” and “>”) marks an element, so let’s look at the “<MediaFiles>” element:

<MediaFiles>
    <MediaFile id="5241" delivery="progressive" type="video/mp4" bitrate="2000" width="1280" height="720" minBitrate="1500" maxBitrate="2500" scalable="1" maintainAspectRatio="1" codec="0"><![CDATA[https://iabtechlab.com/wp-content/uploads/2016/07/VAST-4.0-Short-Intro.mp4]]>
    </MediaFile>
    <MediaFile id="5244" delivery="progressive" type="video/mp4" bitrate="1000" width="854" height="480" minBitrate="700" maxBitrate="1500" scalable="1" maintainAspectRatio="1" codec="0"><![CDATA[https://iabtechlab.com/wp-content/uploads/2017/12/VAST-4.0-Short-Intro-mid-resolution.mp4]]>
    </MediaFile>
    <MediaFile id="5246" delivery="progressive" type="video/mp4" bitrate="600" width="640" height="360" minBitrate="500" maxBitrate="700" scalable="1" maintainAspectRatio="1" codec="0"><![CDATA[https://iabtechlab.com/wp-content/uploads/2017/12/VAST-4.0-Short-Intro-low-resolution.mp4]]>
    </MediaFile>
</MediaFiles>

There’s a lot going on here but let’s break it down. An element starts with an opening tag containing the element name (ex. “<MediaFiles>”) and ends with a closing tag that puts a slash in front of that name (ex. “</MediaFiles>”). So we can see that there are multiple “<MediaFile>” elements within the main “<MediaFiles>” element. This means that there are several different video files to choose from. Why is this?

The answer lies within the attributes of the various “<MediaFile>” elements. You will see attribute names with associated values, like “width”, “height”, “bitrate” and more, wrapped within the “<” and “>” symbols along with the “MediaFile” element name. These attributes provide information about each of the creative files, which a video player can read to make sure it selects the video file best suited to the environment it intends to play it back in.
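To make the difference between elements and attributes a bit more concrete, here is a rough Python sketch that reads them with the standard library’s `xml.etree.ElementTree`. The snippet is a trimmed copy of the sample above (two files, fewer attributes) to keep things short, and the variable names are just for illustration. This is not how any particular video player does it.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the <MediaFiles> element, with only a few attributes kept.
SNIPPET = """
<MediaFiles>
    <MediaFile id="5241" type="video/mp4" bitrate="2000" width="1280" height="720"><![CDATA[https://iabtechlab.com/wp-content/uploads/2016/07/VAST-4.0-Short-Intro.mp4]]></MediaFile>
    <MediaFile id="5246" type="video/mp4" bitrate="600" width="640" height="360"><![CDATA[https://iabtechlab.com/wp-content/uploads/2017/12/VAST-4.0-Short-Intro-low-resolution.mp4]]></MediaFile>
</MediaFiles>
"""

root = ET.fromstring(SNIPPET.strip())

# Elements: the outer <MediaFiles> element and the <MediaFile> elements inside it.
names = [child.tag for child in root]
print(root.tag, names)

# Attributes: the name="value" pairs inside an element's opening tag.
first = root[0]
print(first.attrib["width"], first.attrib["height"], first.attrib["bitrate"])

# The element's value: whatever sits between the opening and closing tags.
print(first.text.strip())
```

Attribute values always come back as strings, so a player has to convert “1280” to a number before comparing sizes.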

For example, if the ad was being played on a mobile device, the player should probably select the smallest video file so it does not eat up your monthly data cap. However, if this was being played back on connected TV in your living room on that sick new UHDOLED curved screen, the video player would want to select the highest quality file available.

To select the right creative, the video player can look at the height, width and bitrate to immediately understand more about the file itself. That’s why attributes are important: they provide extra information about a particular element.

Let’s go ahead and pretend that the user was on desktop since the original request was made from a website, “examplepublisher.com”. If it was made from an app or connected TV you would instead see a bundle ID parameter like “bundle_id=com.myapp.wordswithgiraffes”.

Bundle IDs are like domain names for apps, but let’s stay focused here. Our ad request was on desktop and you are watching the video in HD quality, so the video player selects the highest quality video available. The next step is to look between “<MediaFile>” and “</MediaFile>” to find the URL that leads directly to the video itself:

https://iabtechlab.com/wp-content/uploads/2016/07/VAST-4.0-Short-Intro.mp4
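The selection just described can be sketched in a few lines of Python. Fair warning: the rule used here (take the highest-bitrate file that fits inside the player, otherwise fall back to the smallest one) is an assumption made up for this post, and `pick_media_file` is a hypothetical helper. Real players weigh more than this, like connection speed and supported codecs. The XML is the sample from above with some attributes trimmed.

```python
import xml.etree.ElementTree as ET

# The sample <MediaFiles> element, with some attributes trimmed for brevity.
MEDIA_FILES_XML = """<MediaFiles>
    <MediaFile id="5241" delivery="progressive" type="video/mp4" bitrate="2000" width="1280" height="720"><![CDATA[https://iabtechlab.com/wp-content/uploads/2016/07/VAST-4.0-Short-Intro.mp4]]></MediaFile>
    <MediaFile id="5244" delivery="progressive" type="video/mp4" bitrate="1000" width="854" height="480"><![CDATA[https://iabtechlab.com/wp-content/uploads/2017/12/VAST-4.0-Short-Intro-mid-resolution.mp4]]></MediaFile>
    <MediaFile id="5246" delivery="progressive" type="video/mp4" bitrate="600" width="640" height="360"><![CDATA[https://iabtechlab.com/wp-content/uploads/2017/12/VAST-4.0-Short-Intro-low-resolution.mp4]]></MediaFile>
</MediaFiles>"""


def pick_media_file(xml_text: str, player_width: int, player_height: int) -> str:
    """Return the URL of the media file that best suits the player size."""
    files = ET.fromstring(xml_text).findall("MediaFile")
    # Keep only the files that are no larger than the player itself.
    fits = [
        f for f in files
        if int(f.get("width")) <= player_width and int(f.get("height")) <= player_height
    ]
    if fits:
        # Assumption: among files that fit, the highest bitrate is the best quality.
        chosen = max(fits, key=lambda f: int(f.get("bitrate")))
    else:
        # Nothing fits, so fall back to the lightest file available.
        chosen = min(files, key=lambda f: int(f.get("bitrate")))
    # The URL is the element's value, between <MediaFile> and </MediaFile>.
    return chosen.text.strip()


desktop_url = pick_media_file(MEDIA_FILES_XML, 1920, 1080)
phone_url = pick_media_file(MEDIA_FILES_XML, 640, 360)
print(desktop_url)
print(phone_url)
```

With the 1920 by 1080 player from our example tag, all three files fit, so the 2000 bitrate file wins. A 640 by 360 player only has room for the low-resolution file.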

The value of the “<MediaFile>” element is a URL that points to an MP4 video file hosted on an external server. If you copy this path and paste it in your web browser’s address bar, your browser will play back the video file in its native video player and you can view the ad, but no one will know, SAD! So now that the player has the creative to play back, how does anyone know it was actually delivered? That’s where tracking events come in!

<TrackingEvents>
    <Tracking event="start">http://example.com/tracking/start</Tracking>
    <Tracking event="firstQuartile">http://example.com/tracking/firstQuartile</Tracking>
    <Tracking event="midpoint">http://example.com/tracking/midpoint</Tracking>
    <Tracking event="thirdQuartile">http://example.com/tracking/thirdQuartile</Tracking>
    <Tracking event="complete">http://example.com/tracking/complete</Tracking>
</TrackingEvents>

Ad servers and other partners can be notified when a specific event has occurred. More specifically, there are events for when the ad starts, when it has reached 25% (firstQuartile), 50% (midpoint) and 75% (thirdQuartile), and when it is complete.

When any of these events is reached during playback, the video player is required to call the URL provided in that specific tracking event. This sends a signal to an ad server or platform that the event has occurred, and they can now increment that metric for this specific ad by 1.

Normally in VAST there will be multiple URLs for the same event from different ad tech vendors. There may be many parties involved in an ad transaction, and each one is required to count the events for a particular ad if they want to prove it ran and collect that cheddar.
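Here is a rough Python sketch of how a player might decide which tracking URLs are due as playback moves along. A few things are assumptions for illustration only: the helper names, the idea of expressing progress as a fraction from 0.0 to 1.0, and the second “start” URL on `other-vendor.example`, which stands in for a second vendor and does not come from the sample. The sketch only collects the URLs. A real player would then make an HTTP request to each one.

```python
import xml.etree.ElementTree as ET

TRACKING_XML = """<TrackingEvents>
    <Tracking event="start">http://example.com/tracking/start</Tracking>
    <Tracking event="start">http://other-vendor.example/start</Tracking>
    <Tracking event="firstQuartile">http://example.com/tracking/firstQuartile</Tracking>
    <Tracking event="midpoint">http://example.com/tracking/midpoint</Tracking>
    <Tracking event="thirdQuartile">http://example.com/tracking/thirdQuartile</Tracking>
    <Tracking event="complete">http://example.com/tracking/complete</Tracking>
</TrackingEvents>"""

# Assumption: each event is due once playback reaches this fraction of the ad.
THRESHOLDS = {
    "start": 0.0,
    "firstQuartile": 0.25,
    "midpoint": 0.5,
    "thirdQuartile": 0.75,
    "complete": 1.0,
}


def urls_by_event(xml_text: str) -> dict:
    """Group tracking URLs by event name; one event may have several URLs."""
    grouped = {}
    for tracking in ET.fromstring(xml_text).findall("Tracking"):
        grouped.setdefault(tracking.get("event"), []).append(tracking.text.strip())
    return grouped


def urls_to_call(grouped: dict, progress: float, already_fired: set) -> list:
    """Return URLs for events reached at this progress that have not fired yet."""
    due = []
    for event, threshold in THRESHOLDS.items():
        if progress >= threshold and event not in already_fired:
            already_fired.add(event)  # each event is counted once per playback
            due.extend(grouped.get(event, []))
    return due


grouped = urls_by_event(TRACKING_XML)
fired = set()
at_start = urls_to_call(grouped, 0.0, fired)
at_sixty_percent = urls_to_call(grouped, 0.6, fired)
at_end = urls_to_call(grouped, 1.0, fired)
print(at_start)
print(at_sixty_percent)
print(at_end)
```

Keeping a set of events that already fired matters because progress gets checked over and over during playback, and nobody should get to count the same start twice.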

That’s it! You now have the building blocks to form a basic understanding of how VAST works. To summarize, VAST is an XML template that is returned by an ad server when a video player calls a VAST tag. The video player then reads specific parts of the XML document in the server’s response and extracts the metadata required to facilitate a video ad transaction. The creative is played back, and the platforms involved in the transaction are notified when specific events occur. Easy peasy!