Understanding Cross Device Measurement and the User-ID

One of the fundamental new features of Universal Analytics is user-centric measurement. This includes measurement across multiple devices – computers, smart phones, tablets, kiosks, etc. But this change introduces a number of new challenges for analysts and marketers. In order to do cross device measurement you need to understand some of the challenges and limitation. […]

One of the fundamental new features of Universal Analytics is user-centric measurement. This includes measurement across multiple devices – computers, smart phones, tablets, kiosks, etc.

The User-ID feature let's you measure the user journey across multiple devices - and even in stores.

But this change introduces a number of new challenges for analysts and marketers. In order to do cross device measurement you need to understand some of the challenges and limitation. Let’s begin our exploration of cross device data with a discussion about how it works.

How Cross Device Measurement Works

You’ll recall that most analytics tools set an anonymous identifier to measure users. On websites the JavaScript code creates the identifier and stores it in a cookie. On mobile apps the SDK creates the identifier and stores it in a database on the device. (We call this default ID the Client-ID).

We actually discussed this concept in our Google Analytics Platform Principles course. Skip to 21 seconds for details – but the whole video is helpful :)

The User-ID feature lets you override this default behavior. So rather than letting the tracking code create the a Client-ID, YOU can create and use your own identifier. How do you do that?

Well, your business needs to have some way of identifying users. Don’t worry, most businesses do. A CRM system or customer database usually has a User-ID that you can use.

The important thing is that you can create the technology that moves the ID from your database into your website, app or other digital experience where your users interact with your content.

The User-ID value must originate from your systems. It must eventually appear in the tracking code on your site or in your app. The User-ID will then be sent to Google Analytics  with each data hit.

The User-ID value must originate from your systems. It must eventually appear in the tracking code on your site or in your app. The User-ID will then be sent to Google Analytics with each data hit.

In the above diagram the company would need to create code that pulls the User-ID from the database, then passes it through the web servers, and finally places it in the Google Analytics Tracking code that appears on the website.

I know – that seems like a lot of work! But a good tech person can make this happen.

After you add all the necessary code, and set up the User-ID feature in Google Analytics, then the actual id value that you supply is sent to Google Analytics with each hit (see my post on hits, sessions and users to learn about all the different hit types in Google Analytics.)

Then, as Google Analytics processes the data, it groups hits with the same User-ID together. It does not matter if the hits come from a website, mobile app or some other device.

Hits from the same user can be grouped together as long as each hit has the same User-ID.

Hits from the same user can be grouped together as long as each hit has the same User-ID.

In the above image there would be three unique users – one user with a User-ID=1, one user with a User-ID=2 and one user with a User-ID=5. Again – it doesn’t matter where the hits come from (mobile, web, kiosk, etc.).

But what about instances when the User-ID is not present? For example, what if the user is not logged in and we can not retrieve the User-ID? Good question.

In this case, Google Analytics will go back to its default behavior and generate it’s own User-ID (again, this is called a Client-ID, because the ID is specific to the client or device). Obviously this ID can not be used to measure across devices as it will only exist on the device where it is set.

But now we have a scenario where a user might have two different User-ID numbers for a single user? Isn’t this going to have an impact on the data? Aren’t we trying to avoid that? This sucks!

Well – it’s not that simple. Let’s talk about something called Session Unification.

Session Unification

The above scenario is very common. You will not always be able to set the User-ID in Google Analytics – even for known users. The result is some hits and sessions will have a User-ID value, and some with an automatically generated User-ID value.

It's not always possible to send your own User-ID to Google Analytics.

It’s not always possible to send your own User-ID to Google Analytics.

Google Analytics has a feature called Session Unification. When activated, it will unify, or group, hits with the manually set User-ID and hits with an auto-generated User-ID together. This means that Google Analytics can associate some hits that were received prior to setting the User-ID.

Here’s the definition of Session Unification from the Google Analytics Help center (I think it’s pretty good!):

Session unification allows hits collected before the User-ID is assigned to be associated with the ID, as long as those hits happen within the same session in which a specific ID value is assigned for the first time.

This means that Google Analytics will only associate hits collected in the same session AND it must be in the first session where the User-ID is set.

This functionality is sometimes called “stitching” – and it differs from one tool to another.

Google Analytics will not go back in time and stitch every single session from a given user together. I can hear the groans now – and the comparisons to other tools. I hope to write about session unification in a later post. But I think a lot of people that are going to complain about this are missing the point. It’s not “can we stitch data together” it’s “should we stitch data together?”. Rant over.

So how does Session Unification impact the data? Well, to understand that we need to talk about another topic the User-ID View and reports.

The User-ID View

We all know the basic hierarchy of a Google Analytics account. Within your account you can create a number of properties. Under a property you can create a number of views – which were formerly called profiles.

Now you can designate certain views as User-ID views. This means that the view will be filtered and the only data in this view is data that contains hits where you set the User-ID value.

Only a Google Analytics view with User-ID enabled will display information about cross-device users.

Only a Google Analytics view with User-ID enabled will display information about cross-device users.

Obviously this view will have less data than a standard view where the User-ID is not enabled. But the idea is that this view will be able to provide deeper insights into how users who are logged in – a VERY valuable segment of your users – interact with your business across multiple devices.

To view all of the data, hits with and without a User-ID, you would use a standard view.

Let’s also consider the scenario from above – when a manually set User-ID is present in some sessions, but not others. The result is that the data from sessions without your User-ID will not be in the User-ID view.

Only hits that contain a manually set User-ID will be included in a User-ID view.

Only hits that contain a manually set User-ID will be included in a User-ID view.

There are also some other significant differences between views that do, and do not, have User-ID feature enabled.

1. Certain metrics are calculated differently. Obviously if a User-ID view contains different data, then certain metrics will be calculated differently.

For example, the number of Users is calculated based on the number of unique User-ID values. This will provide a fairly accurate view of the number of users. It will probably be less than the number of users in a standard view because that that view will also include users where you do not set a User-ID.

2. Cross device reports. This is the HUGE benefit of a User-ID view. These reports provide some awesome insights into how users access your content from multiple devices. More info about the reports below.

3. Limited date range. When working with a User-ID view you can only change the date range to the past 90 days. This is consistent with the standard 90-day user look-back window in other features, like Multi-Channel Funnels and user segments.

Implementing Cross Device Measurement

Implementing the User-ID feature can be involved, depending on your specific infrastructure. Here’s a brief overview of the process.

1. First, you need to “turn on” the user ID feature for a given property.

2. Second, you need to add the actual user ID value to the data collection. For website, this means you need to modify the JavaScript tracking code. For mobile apps you need to change the SDK.

3. Third, create a User-ID view. This is a special data view that includes new reports.

Let’s get into a bit more detail.

1. Enable User-ID Feature

This is really important to read and understand the terms. For example, it’s important to note that you can not use personally identifiable information as the User-ID. This includes an email address, name, etc.

To use the User-ID feature you must read the User-ID policy and agree to the terms.

To use the User-ID feature you must read the User-ID policy and agree to the terms.

You also really need to take a look at your own privacy policy and make sure it complies with the following:

You will give your end users proper notice about the implementations and features of Google Analytics you use (e.g. notice about what data you will collect via Google Analytics, and whether this data can be connected to other data you have about the end user). You will either get consent from your end users, or provide them with the opportunity to opt-out from the implementations and features you use.

2. Implement User-ID in your Tracking Code

Now for the coding! Go get your nerds and Red Bull! Just kidding.

As I mentioned above, the User-ID value must come from you. You must generate the ID from one of your systems. Once you do that you must place it inside the tracking code. The hard part is writing the code that moves the User-ID from your systems and puts it in {{ USER_ID }} in the code snippets below..

Let’s look at some of the most common code formats.

Adding the User-ID to website tracking

Adding a User-ID to the JavaScript code is fairly easy – it’s a single line.

ga('create', 'UA-XXXX-Y', 'auto');
ga('set', '&uid', {{ USER_ID }});
ga('send', 'pageview');

Remember, the User-ID needs to be set before any hits are sent to Google Analytics. So make sure you call the set command before a pageview, event, transaction, etc. is sent.

It’s also recommended that you include the set method on ALL of the pages, not just one page.

See the developer docs for more about JavaScript information.

Adding the User-ID to the Android SDK

t.set("&uid", {{ USER_ID }});

See the developer docs for more Android information.

Adding the User-ID to the iOS SDK

[tracker set:@"&uid" value:{{ USER_ID }}];

See the developer docs for more iOS information.

For both the Android code and the iOS code, you only need to set the User-ID once. Once it is set once the User-ID will be sent with all subsequent hits. But try to set it before any hit is sent to Google Analytics.

Adding the User-ID to the Measurement Protocol

Adding the User-ID to a measurement protocol hit is actually really easy. All you need to do is add the uid parameter in each hit. So a hit might look something like this:

http://www.google-analytics.com/collect?v=1&_v=j16&a=164718749&t=pageview&_s=1&dl=http%3A%2F%2Fcutroni.com%2F&ul=en-us&uid=hsjfy4782jduyth6k4

Adding the User-ID via Google Tag Manager

A quick note that you can also set the User-ID with Google Tag Manager. You’ll find the setting in the ‘More Settings > Fields to set’. You’ll also need to create a macro to pull the actual User-ID value from a cookie or the data layer.

You can set the User-ID value with Google Tag Manager.

You can set the User-ID value with Google Tag Manager.

In addition to adding the User-ID to your data collection code, you must also choose if you want to use Session Unification. See above for more information on session unification.

The second step in setting up the User-ID is to add the actual identifier to the tracking code for your site or app.

The second step in setting up the User-ID is to add the actual identifier to the tracking code for your site or app AND configuring the session unification setting.

Now it’s time to add a User-ID view.

3. Create a User ID View

As mentioned above, a User-ID view is a filtered view of your data. It only includes hits in which you have set the User-ID value. This view also contains reports that show cross device usage and other user-centric metrics.

Please note, this view is in ADDITION to the other views that you have for a property. This means that you will need to configure things like goals, filters custom reports, dashboards, etc. on this new view.

You must create a new User-ID view to see the Google Analytics cross device reports.

You must create a new User-ID view to see the Google Analytics cross device reports.

That’s really it. I don’t want to oversimplify the implementation. But most of the work is really creating the code that pulls your User-ID from your systems and then places it in the correct tracking code.

Data and Reports

We finally made it, let’s look at some data and figure out how we can use this.

Remember, in all of these reports we’re trying to understand the behavior of our users. And we’re not just looking at the behavior of everyone – we’re looking at the behavior of those users that have self identified. This is really important as this group is naturally very valuable.

User-ID Coverage

Let’s start by understanding what percentage of our users are actually logged in.

Remember, you’ll have two different profiles with data. One profile will just be all of the data, the other will be a User-ID profile that only contains information about logged in users.

The Coverage Report identifies the data that has a User-ID associate with it vs. the data that does not have a User-ID.

The User-ID Coverage report shows what percentage of your sessions have a User-ID associated with them.

The User-ID Coverage report shows what percentage of your sessions have a User-ID associated with them.

Remember, you’re probably not going to get 100% User-ID coverage – unless your online experience requires authentication. But this depends on your specific business and your specific implementation.

You can use this data to get a better understanding of how big your data pool is – will you be making decisions on 1% of sessions or 50%?

Device Overlap

Ok this is where we start to get into the really interesting data. Here’s a visualization that shows the device overlap. That means the percentage of users that use various combinations of devices.

Device overlap shows the number of users and the value of users based on combinations of devices.

Device overlap shows the number of users and the value of users based on combinations of devices.

Rather than just looking at how many people use certain combinations of devices, let’s look at the associated revenue from those combinations. Notice that you can change the display using the selector at the top of the chart?

The Device overlap report also includes detailed information about how combinations of devices drive value.

The Device overlap report also includes detailed information about how combinations of devices drive value.

So what’s the actionability here?

Do people who use a certain combination of devices behave differently than others? Are people who use tablet and desktop more valuable than those that use tablet and phone? If so – how do we encourage more of that behavior? Is it via marketing? Changes to the platform?

And don’t forget, you need to add a LOT of context to this data. You need to keep in mind your marketing initiatives along with the user experience that you offer your users on each device.

Device Pathing

Now let’s move on to device pathing. This report shows the device used for a sequence of sessions.

Device Pathing shows the user sequence of devices.

Device Pathing shows the user sequence of devices.

You can look at a specific path prior to, or after, a user action. The action could be a goal conversion, a pageview, a transaction or an event.

You can view the device path prior to, or after, specific user actions.

You can view the device path prior to, or after, specific user actions.

What’s the actionability here? Let’s look at a use-case.

If you have a SaaS business you may offer your users a free trial. In this case the user would create an account for the trial and then use your service. At the end of the trial they would need to upgrade to a real account.

The first thing you could do is look at the user’s device behavior after creating their free trial account. Did they perform any specific tasks on a specific device? Was one device more popular than another? If so maybe you can simplify the workflow on that device.

You could also look at the device path at the end of their trial, when they upgraded to a paying account. Did they perform the upgrade on a certain device? Or, more importantly, was the conversion rate higher on on a certain type of device? If so, you might want to simplify or optimize the process on that type of device.

Don’t forget to look back at the Device Overlap report to understand if a certain combination of devices yielded a more valuable customer.

Also notice that there are a lot of ways that you can configure this report to view different paths. One of the most important things to note is that you can not choose specific instances of each item. For example, if a user generates multiple transactions you can not choose a specific transaction.

You can choose to view the device path before or after various user actions.

You can choose to view the device path before or after various user actions.

My suggestion is to create a goal, page or unique event for the most important user actions – the ones that represent the transition from one phase of the customer lifecycle to another. That way you will always be able to see the device path before and after the action.

Device Acquisition Report

Finally we have the Acquisition Device report. This is similar to the Device Overlap report in that it helps you understand the value of users on a certain device. But the difference is that it shows the value based on the first device type.

Use the Device Acquisition report to understand if users acquired on a certain device generate revenue on the same device or different devices.

Use the Device Acquisition report to understand if users acquired on a certain device generate revenue on the same device or different devices.

What’s the actionability here? Do users acquired on a specific type of device generate more value on the same device or on other devices? If so how can you drive more of that behavior?

Segmentation

One final thing to mention. You may have noticed that you can apply segmentation to all of these reports. Segmentation will work the same in these reports as it does in all other GA reports.

If you create a session based segment then Google Analytics will show all the paths that include a session that meets your criteria.

If you create a user based segment then Google Analytics will show all of the paths generated from users that match your criteria.

Did you make it to the end? I hope this post gave you some insights into how Cross Device Measurement works. There’s going to be a lot of chatter about User-ID and cross device measurement – some positive, some negative. And I have a lot more to say – but this post is long enough!

Universal Analytics: Now out of beta!

We’ve been talking about Universal Analytics for a long time – over a year. In that time Universal has always been in beta because it was not 100% compatible with the existing version of GA. Sure, various parts of the Universal platform have rolled out, like the Measurement Protocol and Dimension Widening, but we were […]

We’ve been talking about Universal Analytics for a long time – over a year. In that time Universal has always been in beta because it was not 100% compatible with the existing version of GA. Sure, various parts of the Universal platform have rolled out, like the Measurement Protocol and Dimension Widening, but we were missing things like Remarketing and Audience data. But no more :)

I’m excited to say that as of today, April 2, 2014, Universal Analytics is out of beta!

Universal Analytics: The next generation of Google Analytics

Let’s run through everything you need to know about the announcement.

100% Feature Compatibility

Universal Analytics now supports all standard Google Analytics features. This includes:

Remarketing with Google Analytics. This is one of my favorite analytics features – and it made me very sad that Universal Analytics did not support it. But that’s in the past – You can now use the remarketing feature with Universal Analytics.

Audience reporting. The audience reports are an awesome way to understand who is using your site. They include data like gender and interest categories. This can be incredible helpful when trying to understand if the correct audience is using your site. Now you can use this feature with Universal Analytics.

Premium SLA Support. For all of those using Google Analytics Premium, all of your standard SLAs now apply to Universal Analytics. This includes data collection, data processing, etc.

Full Google Tag Manager support. Google Tag Manager now fully supports all Universal Analytics features, this includes audience data and the new User ID feature (discussed below).

I’ve said it many, many times – I’m a big fan of tag management. If you are going to migrate to Universal Analytics you might as well migrate to Tag Manager (or any tag management solution) now!

Universal Analytics is Google Analytics – and vice versa. Everything that Google Analytics can do, Universal Analytics can do – and more :)

Cross Device Measurement

In addition to complete feature compatibility, cross device measurement, via the User-ID feature, is now available.

The User-ID feature let's you measure the user journey across multiple devices - and even in stores.

The User-ID feature let’s you measure the user journey across multiple devices – and even in stores.

As you recall, this feature lets businesses use their own User-ID to measure customers across multiple devices. This feature includes some awesome reports to help businesses understand which devices and behaviors generate value. Here’s a quick overview:

Device Overlap: This report can help you identify what types of devices your users use to access your service or content.

The Device Overlap report shows what percentage of users access your content from multiple devices.

The Device Overlap report

Device Paths: This report will show the last five devices that were used prior to a conversion. It’s a bit like the Multi-Channel Funnels report – but for devices.

The Device Path report shows the last five devices that were used prior to a conversion.

The Device Path report

Acquisition Device: This report shows revenue based on the device that generated the first conversion. It’s can help you understand if users on a certain device have a larger impact on revenue.

The Acquisition Device Report.

The Acquisition Device Report

Understanding cross device measurement, and implementing it correct, is a huge topic – way more than I can cover in one post. I’ll be publishing a few other articles that explain cross device measurement in Google Analytics ASAP.

Time-zone Based Processing

In addition to the above features, there’s one more piece that is rolling out today. Google Analytics users can now specify the time-zone where their data is processed. In the past all data was processed in the Pacific Timezone (because that’s there Google is).

But now data processing will occur in the time zone of each data view.

The time zone setting in a view now controls when your data is processed.

The time zone setting for a view now controls when your data is processed.

While most people will not notice a big difference, this is a HUGE improvement for many users in Australia, Japan and other parts of Asia.

This also means that, for some users, automated daily reports will arrive on the correct day!

Do you need to migrate?

Ok, so that’s a brief overview of what’s happening today. But the big question that everyone will ask is, “do I need to migrate to Universal Analytics?”

No, you do not need to migrate to Universal Analytics – at least not now.

However, you need to start planning to migrate.

Universal Analytics is the new platform – all new features will be developed for UA. So if you want to use the new shiny things in the future you need to be on UA.

But migrating t can be a lot of work depending on your specific measurement plan. I’ll address that in another post.

Ok, that’s it for this post. But there is a lot more on Universal Analytics coming.

Hits, Sessions & Users: Understanding Digital Analytics Data

We talk about data every day – sessions, visits, conversions, pages, hits, etc. etc. etc. But sometimes we fail to understand how all of these metrics fit together and where they come from. Let’s take a look at how digital analytics tools organize data. All digital analytics data is organized into a general hierarchy of […]

We talk about data every day – sessions, visits, conversions, pages, hits, etc. etc. etc. But sometimes we fail to understand how all of these metrics fit together and where they come from. Let’s take a look at how digital analytics tools organize data.

All digital analytics data is organized into a general hierarchy of users, sessions and hits. It doesn’t matter where the data comes from, it could be a website or a mobile app or a kiosk. This model works for web, apps or anything else.

Digital analytics data is organized into a hierarchy of hits, sessions and users.

Digital analytics data is organized into a hierarchy of hits, sessions and users.

Sometimes we use the terms visitors instead of users and visits instead of sessions – they’re analogous. The onset of mobile devices (and other devices, like set top boxes) have prompted us to introduce new terms into our vocabulary.

It’s important to understand each piece of the hierarchy and how it builds on the other to create a view of our customers and potential customers. Because, at the end of the day, we need to use this data to evaluate our decisions and look for new business opportunities.

Let’s start at the bottom, with hits, and work our way up to users.

Hits

A hit is the most granular piece of data in an analytics tool. It’s how most analytics tools send data to a collection server. In reality, a hit is a request for a small image file. This image request is how the data is transmitted from a website or app to the data collection server.

All data is sent using a hit. Most hits are actually the request for an invisible image file.

All data is sent using a hit. Most hits are actually the request for an invisible image file.

There are many different kinds of hits depending on your analytics tool. Here are some of the most common hits in Google Analytics:

Pageviews/Screenviews: A pageview (for web, or screenview for mobile) is usually automatically generated and measures a user viewing a piece of content. A pageview is one of the fundamental metrics in digital analytics. It is used to calculate many other metrics, like Pageviews per Visit and Avg. Time on Page.

Events: An event is like a counter. It’s used to measure how often a user takes action on a piece of content. Unlike a pageview which is automatically generated, an event must be manually implemented. You usually trigger an event when the user takes some kind of action. The action may be clicking on a button, clicking on a link, swiping a screen, etc. The key is that the user is interacting with content that is on a page or a screen.

Transactions: A transaction is sent when a user completes an ecommerce transaction. You must manually implement ecommerce tracking to collect transactions. You can send all sorts of data related to the transaction including product information (ID, color, sku, etc.) and transactional information (shipping, tax, payment type, etc.)

Social interaction hit: A social interaction is whenever a user clicks on a ReTweet button, +1 button, or Like button. If you want to know if people are clicking on social buttons then use this feature! Social interaction tracking must be manually implemented.

Customized user timings:User timings provide a simple way to measure the actual time between two activities. For example, you can measure the time between when a page loads and when the user clicks a button. Custom timings must be implemented with additional code.

That’s a lot of hit types!

All hit types are sent to Google Analytics via a tracking code. The tracking code variation depends on what you are tracking. If you are tracing a website then JavaScript code, named analytics.js, generates the hits. If you are tracking a mobile app then an SDK (either Android or iOS) generates the hits. If you are tracking a kiosk, then YOU generate the hits with the measurement protocol.

Regardless of the hit type, the hits are all formatted in a similar manner. They are a request for an invisible image and contain data in query string parameters.

http://www.google-analytics.com/collect?v=1&_v=j16&a=164718749&t=pageview&_s=1&dl=http%3A%2F%2Fcutroni.com%2F&ul=en-us&de=UTF-8&dt=Analytics%20Talk%20-%20Digital%20Analytics%20for%20Business&sd=24-bit&sr=1920x1080&vp=1308x417&je=1&fl=12.0%20r0&
_utma=32856364.1751219558.1391525474.1391525475.1391525475.1&
_utmz=32856364.1391525475.1.1.utmcsr%3D(direct)
%7Cutmccn%3D(direct)%7Cutmcmd%3D(none)&_utmht=1391525534970&
_u=cACC~&cid=1751219558.1391525474&tid=UA-91817-11&z=378275262

For all the nerds out there, the data hits can be sent via a GET request or a POST request. This is really important to know, because the amount of data can change. With a GET request you can only send 2048 characters of data. Technically a post can be any length (it’s a setting on most servers), but it’s around 8000 characters when sending data to Google Analytics.

The information in a hit is transformed into dimensions during processing. Every report is just a single dimension, and the corresponding metrics for each value. that you see in your reports.

Each report in Google Analytics shows all of the values for a single dimension, and the corresponding metrics for each value.

Each report in Google Analytics shows all of the values for a single dimension, and the corresponding metrics for each value.

A quick note on mobile…

The mobile SDKs do not send data in real time. They actually store the hits locally and them send them in bursts. This is called dispatching and it’s used for a couple of reasons. First, mobile devices are not always connected to a network. So analytics must store the hits until it detects a connection and then it sends the hits. Second, sending hits in a bunches can help conserve battery life. Don’t worry, dispatching does not impact session calculations – which we’ll talk about right now :)

Session

A session is simply a collection of hits, from the same user, grouped together. By default, most analytics tools, including Google Analytics, will group hits together based on activity. When the analytics tool detects that the user is no longer active it will terminate the session and start a new one when the user becomes active.

Most analytics tools use 30 minutes of inactivity to separate sessions. This 30-minute period is called the timeout.

A session is a collection of hits. It ends when there has been 30 minutes of inactivity.

A session is a collection of hits. It ends when there has been 30 minutes of inactivity.

Google Analytics, and most tools, use the time between the first hit and the last hit to calculate the time on site. The time between hits is also used to calculate other metrics, like time on page. You can read more in my overview of how Google Analytics performs time calculations.

Most tools let you change the default timeout to better suit your needs. For example, if you have a lot of video on your site you might want to change the timeout – especially if your video last more than 30 minutes.

Why?

If a user is watching a 60 minute video (and by watching I mean that no other hits are sent to analytics) their session will end 30 minutes after the first hit. To insure that the session lasts until the end of the video you could change the timeout to match the longest video length.

OR, a better way to extend the session, would be to send additional hits while the user is watching the video. Think about it – more hits create more data points that can be used to calculate time. Trust me, take 12 minutes to read more about how Google Analytics performs time calculations.

Now that we know that hits are grouped together into sessions, let’s look at how sessions are grouped based on users.

Users

Here’s where things start to get interesting. A user is the tools best-guess of an anonymous person. Users are identified using an anonymous number or a string of characters. The analytics tool normally creates the identifier the first time a user is detected. Then that identifier persists until it expires or is deleted.

The identifier is sent to the analytics tool with every hit of data. Then the analytics tools can group hits (and thus sessions) together using the identifier in the hits.

Make sense?

Sessions from the same user can be grouped together as long as each hit has the same user ID.

Sessions from the same user can be grouped together as long as each hit has the same user ID.

Here’s how users are detected on some of today’s most common digital platforms.

Website Users

To measure a user on a website almost all analytics tools use a cookie. A cookie is a small text file. The cookie contains the anonymous identifier. Every time a hit is sent from the browser back to the analytics server identifier stored in the cookie is sent along with the data.

When measuring a website, the analytics tool usually uses a first party cookie to store an anonymous ID.

When measuring a website, the analytics tool usually uses a first party cookie to store an anonymous ID.

Now let’s have the cookie talk.

Google Analytics uses a first party cookie. A first party cookie is connected to the domain that creates it. A first-party can only be used by the domain that sets it. So on this site, the cookie has a domain of cutroni.com and can only be used by this website.

In Universal Analytics the cookie is named _ga and lasts for two years. In the previous version of Google Analytics the cookie was named __utma.

The good thing about a first party cookie is that almost all browsers will allow a first party cookie. It’s a very reliable piece of technology.

First party cookies are challenging when your site spans multiple domains. When a user leaves your site, and traverses to another site that you own, they do not take their first party cookies. In most situations, unless you configure analytics correctly, analytics will set another cookie when the user lands on the second domain.

Analytics uses a first party cookie to maintain a user ID.

Analytics uses a first party cookie to maintain a user identifier.

Now you have one user with two cookies. That could lead to double counting of users. Plus, if we want to create really cool metrics, like Revenue per user, it becomes very, very hard because we don’t know the true number of users.

The other type of cookie, a third-party cookie, can be set and accessed by domains other than the domain that creates it. Some analytics tools will let you use a third party cookie.

The value of a third party cookie is that the analytics tool can use a third party cookie to identify a user as they move from one domain to another.

A third party cookie can be used by multiple domains.

A third party cookie can be used by multiple domains.

However, third-party cookies are not permitted by most browsers – that means no data.

Google Analytics does not use a third party cookie. You can read all about the Google Analytics cookies in the developer documentation.

So what’s the solution here? How do you correctly identify a user if your website spans multiple domains? In the Google Analytics world we use a feature called Cross Domain Tracking. I’m not going to talk about it in this post, but you can read about it in our support documentation.

Mobile Users

Now let’s move on to mobile platforms – something that is very popular :)

Mobile tracking is similar to web tracking. There is an anonymous identifier stored on the device. The identifier is generated every time the app is installed. So if a user deletes the app the identifier will also be deleted. But if a user updates the app the identifier will not change.

The big difference between mobile and web is that the identifier is not stored in a cookie. It’s stored in a database on the mobile device – but it basically functions the same way as a cookie. The identifier is sent on every hit back to the analytics server. The analytics server then uses the identifier to create metrics like unique users.

Here’s one challenge with user measurement on an app. Many apps are not just an app. They’re a hybrid app/website. They use a browser within the app to “frame” content from a website. This can mess up the data collection.

In this situation we have two technologies with two different user identifiers. The app will measure a user based on the ID stored on the device and the website will use a cookie when a page loads in the app.

Mobile apps that "frame in" content from a website, might be sending duplicate hits to the analytics tool.

Mobile apps that “frame in” content from a website, might be sending duplicate hits to the analytics tool.

There are some ways around this, but it’s a long solution that need it’s own blog post. But just be aware of this potential data issue.

Ok, so now we know about website users and mobile users. But what about other digital touch-points, like a kiosk?

Other Digital Touch-points

In today’s world a user can interact with your digital content on lots of different devices (computers, mobile, kiosks, set top boxes, etc.). And that can cause a lot of issues as tools try to de-duplicate users and get an accurate count of users.

One of the key features of Universal Analytics is the ability to track users on devices other than websites and mobile devices, things like a point-of-sale system or a kiosk. It does this using a technology called the measurement protocol.

But how does it actually work?

The measurement protocol works by – wait for it – collecting hits :) These are the same hits that I described above. The big difference is that you must manually build the hits. So if you want to implement analytics on a kiosk, you must create MORE code to build the hits that are sent to Google Analytics.

But what about measuring users when you use the measurement protocol?

When you create the hit you must insert a user identifier into the hit. Google Analytics will then use this identifier as the unique identifier when it processes the data.

To measure users when tracking other devices, like a kiosk, you must insert your own identifier and generate your own data hits.

To measure users when tracking other devices, like a kiosk, you must insert your own identifier and generate your own data hits.

Unlike websites and mobile apps, there is no cookie or database to store the identifier. So the ID does not persist from one hit to another, or from one session to another. You must manually insert the identifier into every hit in every session.

Your code must control the generation of the identifier and the storage of the identifier.

Let’s end it there. That’s a pretty good overview of digital analytics data.

I know this was a really geeky post, but it’s an important subject and will become more and more important.

Now it’s your turn. Thoughts? Please feel free to leave a comment.