Guide to Data Export—Walkthrough Building a Feature Usage Dashboard
From raw data to your first dashboard with FullStory Data Export
FullStory's Data Export is a robust CX data solution.
For analysts (or data scientists) who want more detail on how it works, this supplemental case study walks through how to access FullStory data, make sense of the information, transform it into inputs for data visualization and statistical analysis, and build a "Feature Usage Dashboard."
On completing this walkthrough, you'll have a solid understanding of what's possible using FullStory Data Export.
Not an analyst or data scientist? Consider sharing the Walkthrough with someone who is! Otherwise, buckle up as we explore how the nuanced information provided by FullStory Data Export can power analysis about feature usage.
☞ How to use this walkthrough
Take advantage of the Table of Contents to jump ahead or bookmark different sections, as required.
Get the Data Exported
To begin our walkthrough, we have to get our data out of FullStory. As we mentioned in the Guide to Data Export, there are two methods for getting your data exported out of FullStory:
- Directly download the data (We explain this method immediately below), or
- Utilize our APIs and open sourced tool, Hauser, to connect Data Export directly to your analysis tool (Jump ahead to this method)
We'll walk you through both options—or you can skip ahead to the one you expect to use. As a reminder, Data Export data comes in JSON format, in which a single event looks something like the image below. A single download will contain tens of thousands of these events.
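To make the shape of the data concrete, here is an illustrative single event as a Python dictionary. The field names match the export schema used throughout this walkthrough; the values are made up for the example.

```python
import json

# An illustrative Data Export event "blurb" (values are made up; the field
# names match the export schema used throughout this walkthrough)
sample_event = {
    "IndvId": 123456789,
    "UserId": 123456789,
    "SessionId": 987654321,
    "PageId": 1,
    "EventStart": "2017-11-01T14:03:22.000Z",
    "EventType": "click",
    "EventTargetText": "Add to Cart",
    "EventTargetSelectorTok": "button.add-to-cart",
    "PageUrl": "https://www.example.com/products/cherries",
    "PageDuration": 45000,
    "PageActiveDuration": 30000,
}

# A downloaded file is a JSON array of events like this one
raw = json.dumps([sample_event])
events = json.loads(raw)
print(len(events), events[0]["EventType"])
```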
Directly download your data as a JSON file
Once you’ve added the Data Export Pack to your FullStory account, anyone with an "Owner" seat on the account will be able to download the JSON files from within FullStory. Simply log in and navigate to the Data Exports link in the left sidebar:
Note that you can change the frequency of these exports via your Settings. For large customers (i.e. 15M+ sessions a month), we recommend setting your extract frequency to 30 minutes. That will keep the file sizes manageable on your computer. For customers with fewer sessions, you can make the time between exports longer—2 hours or so. The app keeps your last 100 downloaded files, giving you plenty of time to pull them into your preferred working format.
Once the data—in the form of a JSON file—has been downloaded directly from FullStory, we'll need to convert it. For that, we suggest one of two methods:
- Use a JSON to CSV converter
- Use Python
We'll walk through each method.
Method 1. Use a JSON to CSV converter on your direct download
There are many JSON to CSV converters on the internet. You can upload a file (usually up to 1MB) or paste the data directly into a web app and it will return data in a tabular format. This method is very quick to use. However, it has drawbacks.
First and most importantly, before you upload any data, make sure the conversion happens in your browser. You don’t want the website or tool to send your customer data to a server.
Second, this process is quick, but it has to be repeated every time you download a file. Unless you only ever do one-off analyses, this can get very tedious.
But if you just bought the Data Export Pack and you want to dive right in, this is a great way to get started with some data and begin building your analytical models.
Method 2. Use Python to convert JSON
Python offers another option for converting your downloaded JSON files to a more readable format. This is how we converted data at FullStory before we began using the API. Below is a simple program that iterates over each JSON blurb in a file and writes it to a CSV file.
import csv
import json

# Open the file the JSON data is stored in (make sure you are running this
# program in the same folder as the .json file you just downloaded from FullStory)
j = open('NAME_OF_YOUR_DATA_EXPORT_DOWNLOAD.json')

# Load the JSON array into the variable js
js = json.load(j)

# Number of JSON objects in the array (input to the for loop that iterates
# through each "blurb")
end = len(js)
# See how many JSON objects are in a file (leave commented out except for tests)
# print(end)

# Create a csv called 'DataExport_FullStory.csv' and store the writer in f
# (in Python 3, open the file in text mode with newline='')
f = csv.writer(open("DataExport_FullStory.csv", "w", newline=""))

# Write a header row on the CSV file you just created. If you don't need a
# header in your csv file, remove this line. Also note: you may have custom
# variables from other apps that you are importing into FullStory. If so, add
# them to the list of headers here—things like "LifeTimeSpend" and "ABTestGroup".
f.writerow(["IndvId", "UserId", "SessionId", "PageId", "EventStart",
            "EventType", "EventTargetText", "EventTargetSelectorTok",
            "PageDuration", "PageActiveDuration", "PageUrl", "PageRefererUrl",
            "PageIp", "PageLatLong", "PageBrowser", "PageDevice",
            "PageOperatingSystem", "PageNumInfos", "PageNumWarnings",
            "PageNumErrors", "UserAppKey", "UserDisplayName", "UserEmail"])

# Loop through each JSON object and write a line for each blurb
for x in range(0, end):
    f.writerow([js[x]["IndvId"], js[x]["UserId"], js[x]["SessionId"],
                js[x]["PageId"], js[x]["EventStart"], js[x]["EventType"],
                js[x]["EventTargetText"], js[x]["EventTargetSelectorTok"],
                js[x]["PageDuration"], js[x]["PageActiveDuration"],
                js[x]["PageUrl"], js[x]["PageRefererUrl"], js[x]["PageIp"],
                js[x]["PageLatLong"], js[x]["PageBrowser"], js[x]["PageDevice"],
                js[x]["PageOperatingSystem"], js[x]["PageNumInfos"],
                js[x]["PageNumWarnings"], js[x]["PageNumErrors"],
                js[x]["UserAppKey"], js[x]["UserDisplayName"],
                js[x]["UserEmail"]])
    # Be sure to add your custom vars at the end!
If you choose Python for conversion, we have some recommendations and tips to customize it to your needs:
- Convert multiple files at a time
- Use conditional statements to only grab specific data (specific sessions/users)
- Use our REST API to “build your own Hauser” and automatically grab and convert files as they are posted to your FullStory Account
- Convert the files to a .txt instead of .csv if you are working with large data sets
- Automatically upload or append these files to the place in your data lake that makes sense
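The first two recommendations above can be sketched together: batch-convert several files and keep only the events you care about. This is a sketch, not a definitive implementation—the glob pattern, the column subset, and the click-only filter are placeholders you would adapt to your own downloads.

```python
import csv
import glob
import json

# Columns to keep in the output (a placeholder subset of the full schema)
COLUMNS = ["UserId", "SessionId", "EventStart", "EventType", "PageUrl"]

def convert_files(paths, out_path, keep=lambda e: True):
    """Convert several Data Export JSON files into one CSV, filtering
    events with the `keep` predicate. Returns the number of rows written."""
    rows = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(COLUMNS)
        for path in paths:
            with open(path) as src:
                for event in json.load(src):
                    if keep(event):
                        writer.writerow([event.get(c) for c in COLUMNS])
                        rows += 1
    return rows

if __name__ == "__main__":
    # The filename pattern is a placeholder—match your own download names
    files = glob.glob("NAME_OF_YOUR_EXPORT_*.json")
    n = convert_files(files, "DataExport_FullStory.csv",
                      keep=lambda e: e.get("EventType") == "click")
    print(f"wrote {n} click events")
```

The `keep` predicate is where you would add conditional logic for specific sessions or users, per the recommendations above.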
Now that you have converted the JSON file, you can jump to Make Sense of Your Data. Or read on for how you can connect the Data Export Pack to RedShift or BigQuery.
Connect Data Export to RedShift/BigQuery with Hauser
Most of our Data Export customers use RedShift or BigQuery to process their JSON data. That is why we created an open source tool called Hauser (available on GitHub) that automatically loads JSON bundles from the app into a format acceptable to these tools. Hauser leverages our REST API to grab your JSON bundles as they appear in the app (note that your export frequency matters here). Accordingly, data is exported to RedShift/BigQuery in batches, not as a live stream.
- Implementation requires a working knowledge of Go (version 1.7 or higher). That said, we have had users install it without having ever used Go before.
- You will need to be familiar with your GCS or AWS security setup—primarily, you must be able to configure your credentials in order to import data into BigQuery or RedShift
Common troubleshooting questions
We have supported AWS installation for longer than BigQuery, so the troubleshooting notes below primarily apply to AWS. However, you can see how they would also apply to BigQuery.
- By default, the AWS libraries (used in Hauser) will try to pick up credentials from ~/.aws/credentials, or you can set them explicitly via environment variables.
- If you run Hauser on an EC2 instance (most AWS users are likely to do this), and assuming you're using a standard machine image, EC2 will put the credentials in the ~/.aws/credentials file, which means Hauser will just work out of the box.
- Be sure to make all table names in the config lowercase.
More information about Hauser, including the link to GitHub, can be found here.
Make Sense of Your Data
Now you should have your data in a format and tool that you feel comfortable with. Next it’s time to make sense of all the data.
Below, you’ll find steps you can take to tie event data to session replays in FullStory and transform your data for a few “quick wins.” When reading this section, you may find it helpful to have this documentation of all the attributes included in Data Export and their meaning.
Because Python and SQL are the languages we use most often on our data analytics team, the documentation here is in these two languages, but you can convert our logic into any framework you choose.
Uniquely identify a session and a user
Before we demonstrate how to identify a session in FullStory directly, let’s quickly review the data structure of the Data Export Pack. There are Events, Sessions, Users, and Individuals (see diagram below) that are all represented in each JSON blurb.
A Session is made up of a group of Events, and maps directly to a replay in FullStory. A User is identified by the cookie in the client’s browser. If a User clears their cookies, the next time they visit your website, FullStory will generate a new UserId. That is why we have the notion of Individuals in FullStory: a group of unique UserIds can map to the same IndvId, so if a person logs in on multiple devices, you can track all their Sessions and Events back to a single Individual.
From Individual, User, and Session ID to replay
Now that we have the data taxonomy out of the way, we can discuss uniquely identifying a session. A session is not uniquely identified by the SessionId alone. You need both the UserId and the SessionId. Once you have both of these variables, you can easily tie the events you are looking at in the Data Export Pack to a single Session in FullStory. The URLs for all of your session replays are made up of the UserId and SessionId. Therefore, if you identify events in your data analysis and want to tie them back to session replay, you can find the correct URL to type into your browser and the exact time in the replay based on the EventStart field in the data.
If you don’t want to go directly to an exact session, you can also jump to the User profile using the IndvId (as seen below). That will bring up the user card with a list of every Session in your data history recorded by that user.
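The two-ID rule above can be sketched in a few lines. Note the URL pattern here is hypothetical—copy the real pattern from a replay open in your own browser and substitute it—but it shows how you might turn the two IDs from an exported event into a link:

```python
# Hypothetical sketch: build a session replay link from an exported event.
# The exact URL pattern is an assumption—check a real replay URL in your
# own FullStory account and substitute it here.
REPLAY_URL = "https://app.fullstory.com/ui/YOUR_ORG/session/{user_id}:{session_id}"

def replay_link(event):
    # Both UserId and SessionId are needed to uniquely identify a session
    return REPLAY_URL.format(user_id=event["UserId"], session_id=event["SessionId"])

event = {"UserId": 123, "SessionId": 456, "EventStart": "2017-11-01T14:03:22Z"}
print(replay_link(event))
```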
Incorporating custom variables
We’ve shared how to incorporate custom variables in the Guide to Data Export and some examples of applicable data to pass in. At this point, we will dive into interpreting the custom data alongside the other Data Export Pack attributes.
Analyzing custom data is particularly powerful because you can monitor how the custom variables change for a User over the course of their lifetime with your website.
Use Data Export to identify website and app interactions
There are several ways to link interactions on your webpage to Data Export events. The attributes of the events that will be most helpful to you are:
- EventType — Click, Navigate, or Change (i.e. changing the text in a drop down)
- PageUrl — The URL the user is on when the event takes place
- EventTargetText — The text visible to the user on the element they are interacting with
- EventTargetTok — The underlying CSS/HTML selector of the element the user is interacting with
In most cases, you will need logic around several of these attributes to properly identify an event. For example, you may need a ‘click’ EventType on the ‘Add to Cart’ EventTargetText on the PageUrl of your most popular product.
Many customers ask about the difference between EventTargetText and EventTargetTok, since they can be interchangeable in many instances. The EventTargetText field can be helpful because CSS selectors can change frequently with site updates, and in general can be more difficult to instrument. However, EventTargetTok is a much more specific way to identify particular events on your website. Many interactions on your webpage will not have an EventTargetText (like clicking on the search bar), and if your website is displayed in multiple languages, it is much easier to identify the underlying CSS selector than every language the EventTargetText could be listed in.
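The multi-attribute logic described above can be sketched as a simple predicate. The URL fragment and button text here are examples, not values from any real site:

```python
# Sketch: flag an "added to cart" event by combining several attributes.
# The URL fragment and button text are example values—substitute your own.
def is_add_to_cart(event):
    return (
        event.get("EventType") == "click"
        and event.get("EventTargetText") == "Add to Cart"
        and "/products/cherries" in (event.get("PageUrl") or "")
    )

events = [
    {"EventType": "click", "EventTargetText": "Add to Cart",
     "PageUrl": "https://www.example.com/products/cherries"},
    {"EventType": "navigate", "EventTargetText": "",
     "PageUrl": "https://www.example.com/products/cherries"},
]
print(sum(is_add_to_cart(e) for e in events))
```

Matching on EventTargetTok instead would follow the same shape, with the selector string in place of the visible text.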
Transforming the Data for statistical analysis and dashboarding
At this point, you should feel pretty comfortable with how to make sense of your indexed data. However, it's still not in a form that can be plugged into dashboards or used to perform statistical analysis.
Next, we'll walk through an example built from our eCommerce demo site so we can show you how to transform raw data into inputs needed for common data visualization tools or statistical analyses.
Step 1 — Grouping events into a session
First, we want to group all our data at the session level. Note that this is merely a choice we made: you could just as easily group by User, Individual, Date, or Page (depending on what you want to analyze).
In the query below, you can see that we SELECT from a SELECT statement that is grouped by SessionId and UserId (the two fields required to uniquely identify a session). We use _PartitionTime to limit our results to two days of data. In BigQuery, _PartitionTime is a way to limit your query size/cost—but it maps directly to the EventStart of each event.
SELECT *
FROM (
  SELECT SessionId, UserId
  FROM `your_table_name_here`
  WHERE _PartitionTime BETWEEN TIMESTAMP('2017-11-01') AND TIMESTAMP('2017-11-02')
  GROUP BY SessionId, UserId
) a
Step 2 — Defining characteristics for Sessions
Next, we assign different characteristics to sessions we think are interesting. You can see in the SQL below, we use a
MAX statement and a
CASE statement to assign each characteristic with a 1 or a 0 for each session.
- A “1” in our model means the characteristic is true for a session.
- A “0” in our model means the characteristic is false for a session.
This makes it easy to input into dashboards and regressions, but you can also use categorical and float variables if you want to use sums, counts, and averages.
SELECT Made_Purchase, Desktop, Mobile, Visit_Home_Page, Searched_for_Cherries
FROM (
  SELECT SessionId, UserId,
    MAX(CASE WHEN PageDevice = "Desktop" THEN 1 ELSE 0 END) AS Desktop,
    MAX(CASE WHEN PageDevice = "Mobile" THEN 1 ELSE 0 END) AS Mobile,
    MAX(CASE WHEN PageUrl LIKE "%fruitshoppe.firebaseapp.com/#/" THEN 1 ELSE 0 END) AS Visit_Home_Page,
    MAX(CASE WHEN EventType = "change" AND EventTargetText = "cherries" THEN 1 ELSE 0 END) AS Searched_for_Cherries,
    MAX(CASE WHEN EventType = "click" AND EventTargetText = "Purchase" THEN 1 ELSE 0 END) AS Made_Purchase
  FROM `your_table_name_here`
  WHERE _PartitionTime BETWEEN TIMESTAMP('2017-11-01') AND TIMESTAMP('2017-11-02')
  GROUP BY SessionId, UserId
) a
We are using simple characteristics as examples here, but this is just a demonstration of how to use SQL with MAX, CASE, and GROUP BY statements to transform your data. You can and should add more complex CASE statements, like “adding a specific product to cart < 2 minutes into a session from a mobile device.”
In the end, your data will be in the form below, making it much easier to input into regressions or data visualization tools that rely on summing and counting columns.
One last helpful trick when dissecting the data is to group it by the new characteristics you created with your CASE statements (i.e. Desktop, etc.)—see the SQL statement below. While the SQL statement above is great for creating inputs to data visualization tools and statistical analyses, sometimes we want to look at summary statistics for the data. This is a helpful sanity check to determine whether what you are looking at is worth a data visualization or a more robust analysis.
SELECT Made_Purchase, Desktop, Mobile, COUNT(*) AS Session_Num, SUM(Visit_Home_Page), SUM(Searched_for_Cherries)
FROM (
  SELECT SessionId, UserId,
    MAX(CASE WHEN PageDevice = "Desktop" THEN 1 ELSE 0 END) AS Desktop,
    MAX(CASE WHEN PageDevice = "Mobile" THEN 1 ELSE 0 END) AS Mobile,
    MAX(CASE WHEN PageUrl LIKE "%fruitshoppe.firebaseapp.com/#/" THEN 1 ELSE 0 END) AS Visit_Home_Page,
    MAX(CASE WHEN EventType = "change" AND EventTargetText = "cherries" THEN 1 ELSE 0 END) AS Searched_for_Cherries,
    MAX(CASE WHEN EventType = "click" AND EventTargetText = "Purchase" THEN 1 ELSE 0 END) AS Made_Purchase
  FROM `your_table_name_here`
  WHERE _PartitionTime BETWEEN TIMESTAMP('2017-11-01') AND TIMESTAMP('2017-11-02')
  GROUP BY SessionId, UserId
) a
GROUP BY Made_Purchase, Desktop, Mobile
By using the SQL statement directly above, we will now see every combination of whether or not a purchase was made and what device the user was on. This analysis is kind of boring because it only leaves us with Visit_Home_Page and Searched_for_Cherries, but you can imagine SQL statements with 40-50 characteristics identified.
By comparison, one FullStory customer identified 20 different calls to action on their homepage and is analyzing which are correlated with purchase, desktop users, and longer sessions. These summary statistics, like a dashboard, can lead you to areas that warrant a deep dive, a statistical analysis, or an experiment.
The SQL and methodology from this section should give you a framework for translating your indexed event data into events that are meaningful to your business. While we only used one SELECT statement in this example, you can also create statements using the EventStart timestamp that look for the first action a user took and the last action a user took. You can get creative with your joins so that you can combine data from multiple queries.
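The first-action/last-action idea above can also be sketched in Python, assuming ISO-8601 EventStart strings (which sort correctly as plain text):

```python
from collections import defaultdict

# Sketch: find the first and last action in each session, assuming
# ISO-8601 EventStart strings (which sort correctly as plain text).
def first_last_actions(events):
    sessions = defaultdict(list)
    for e in events:
        # UserId + SessionId uniquely identifies a session
        sessions[(e["UserId"], e["SessionId"])].append(e)
    result = {}
    for key, evs in sessions.items():
        ordered = sorted(evs, key=lambda e: e["EventStart"])
        result[key] = (ordered[0]["EventType"], ordered[-1]["EventType"])
    return result

events = [
    {"UserId": 1, "SessionId": 9, "EventStart": "2017-11-01T10:00:00Z", "EventType": "navigate"},
    {"UserId": 1, "SessionId": 9, "EventStart": "2017-11-01T10:05:00Z", "EventType": "click"},
]
print(first_last_actions(events))  # {(1, 9): ('navigate', 'click')}
```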
Building a “Feature Usage Dashboard” With FullStory
Now we'll take a deep dive into how we built a Feature Usage Dashboard using Data Export for our own FullStory data.
This is a great beginner’s project for two reasons:
- The benefits of this dashboard are ongoing. That means that once you—or an assigned analyst—build the dashboard, you’ll be able to benefit from it indefinitely
- It brings CX insights to all groups in your company, which is where we think they belong!
Why we wanted a “Feature Usage Dashboard”
FullStory is amazing at surfacing specific sessions, but it can be difficult to compare many different segments against each other within the app. However, by using Data Export we knew we could build a dashboard to visualize how different segments used features. We also suspected that a “Feature Usage Dashboard” could be used to help every job family at FullStory.
Going from exporting the data to building the Feature Usage Dashboard required going through a few steps.
Transforming the data for a "Feature Usage Dashboard"
To transform the data needed for the dashboard, we need only slightly modify the SQL Queries we used in Transforming the Data, Step 2 — Defining characteristics for Sessions, above.
For this project, we wanted to find when a unique user interacted with a feature on a specific day. Additionally, we wanted to filter our data by each user’s account, which required adding another variable to the GROUP BY in the SQL statement.
In the end, we want our data output to look like this:
This format works because it can be easily interpreted by a data visualization tool like Google Data Studio or Tableau. This format will also give us the ability to filter feature usage by date and account.
Next, we'll go step-by-step and transform the Data Export data into this format.
Step 1 — Group By user, day, and account
The first step in transforming the data is to identify user activity by date and account. The SQL snippet below illustrates how you would do this grouping. This snippet represents our framework for the data transformation, which we will add to in Step 2 — Identify logic for feature usage, below.
SELECT DISTINCT Usage_Date, UserAppKey, OrgID
FROM (
  SELECT DATE(_PartitionTime) AS Usage_Date,
    UserAppKey,
    JSON_EXTRACT_SCALAR(customvars, "$.user_OrgId_str") AS OrgID
  FROM `fs-staging.hugs_event_export.fs_export`
  WHERE _PartitionTime BETWEEN TIMESTAMP('2017-11-01') AND TIMESTAMP('2017-11-30')
    AND LENGTH(UserAppKey) > 5
    AND UserAppKey NOT LIKE "%@fullstory.com"
  GROUP BY DATE(_PartitionTime), UserAppKey, JSON_EXTRACT_SCALAR(customvars, "$.user_OrgId_str")
) a
- DATE(_PartitionTime) — As a reminder, _PartitionTime is the same as the EventStartTime in each JSON blurb, but it is an indexed field in BigQuery that can be used to shrink query sizes. We take the DATE of this field because we do not need the timestamp for reporting purposes. You should also be mindful of which timezone you want to use when tracking the days a feature is used. The timestamp on these events is set to UTC but can easily be converted.
- UserAppKey — This field represents the email users use to log in to our app. We populate this field with our FS.identify() API. You can also see in our WHERE statement that we are excluding users where the UserAppKey is blank or is a FullStory email address—we don’t want to count ourselves using our product.
- Account — We identify accounts at FullStory by assigning each company that subscribes to FullStory an OrgID. So for building our FullStory Feature Usage Dashboard, including OrgID in the query makes it easy to aggregate feature usage by an entire organization, not just by individual users and the total.
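The UTC caveat above matters because a late-night UTC event can land on a different calendar day in your reporting timezone. A minimal sketch using Python's zoneinfo, with America/New_York as a stand-in for your own reporting timezone:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Sketch: convert a UTC event timestamp to a calendar date in a reporting
# timezone (America/New_York is a stand-in—use your own).
def usage_date(event_start_utc, tz="America/New_York"):
    utc = datetime.fromisoformat(event_start_utc).replace(tzinfo=timezone.utc)
    return utc.astimezone(ZoneInfo(tz)).date().isoformat()

# 01:30 UTC on Nov 2 is still the evening of Nov 1 in New York
print(usage_date("2017-11-02T01:30:00"))  # 2017-11-01
```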
You are probably wondering how an OrgID applies to your business. OrgID is a custom variable that we pass in ourselves. It does not come standard with the Data Export Pack, which is why we have to use JSON_EXTRACT_SCALAR to pull OrgID out of the customvars field on the event. The customvars field for us actually holds additional information (price of plan, number of sessions, etc.), so we use BigQuery to isolate just the custom variable we want.
Step 2 — Identify logic for feature usage
The next step in this process is to identify the events that signal a specific feature's usage. As we explained above in Transforming the Data, Step 2 — Defining characteristics for Sessions, we utilize both MAX and CASE statements. This will tell us, for any given user in an account on a given day, whether they used each specific feature.
Some features are easy to identify. For instance, an Active_User is anyone who visits a URL that contains app.fullstory.com/ui. However, some of our features are a little more nuanced.
SELECT DISTINCT Usage_Date, UserAppKey, OrgID, Active_User, Watched_Session, Inspect_Mode
FROM (
  SELECT DATE(_PartitionTime) AS Usage_Date,
    UserAppKey,
    JSON_EXTRACT_SCALAR(customvars, "$.user_OrgId_str") AS OrgID,
    --Logged in / Active User
    MAX(CASE WHEN PageUrl LIKE "%app.fullstory.com/ui/%" THEN 1 ELSE 0 END) AS Active_User,
    --Watched a Session
    MAX(CASE WHEN PageUrl LIKE "%app.fullstory.com/ui/%" AND PageUrl LIKE "%/session/%" THEN 1 ELSE 0 END) AS Watched_Session,
    --Inspect Mode
    MAX(CASE WHEN EventTargetSelectorTok LIKE "%.fullSelector" AND EventType = 'change' THEN 1 ELSE 0 END) AS Inspect_Mode
  FROM `fs-staging.hugs_event_export.fs_export`
  WHERE _PartitionTime BETWEEN TIMESTAMP('2017-11-01') AND TIMESTAMP('2017-11-30')
    AND LENGTH(UserAppKey) > 5
    AND UserAppKey NOT LIKE "%@fullstory.com"
  GROUP BY DATE(_PartitionTime), UserAppKey, JSON_EXTRACT_SCALAR(customvars, "$.user_OrgId_str")
) a
Pro tip—Analyze the right action. We encourage you to think critically when identifying feature usage in your own Data Export Pack. Ask yourself, “What is an easy way I can identify this feature usage—but also, what is the right way?” Take an eCommerce company, for example. You could identify purchases by anyone who clicks the ‘Purchase’ button—sounds simple. However, users can click ‘Purchase’ but never actually complete the purchase. What if their credit card was declined or they didn't input their shipping address properly? Your analysis would be more accurate if you identified purchases by users who visit an order confirmation page.
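The pro tip above can be sketched as a predicate over a session's events. The confirmation-page URL fragment is a placeholder for whatever your own site uses:

```python
# Sketch of the pro tip: count a purchase only when the user reaches an
# order confirmation page, not merely when they click 'Purchase'. The URL
# fragment is a placeholder for your own confirmation page.
def made_purchase(session_events):
    return any(
        e.get("EventType") == "navigate"
        and "/order-confirmation" in (e.get("PageUrl") or "")
        for e in session_events
    )

clicked_only = [{"EventType": "click", "EventTargetText": "Purchase",
                 "PageUrl": "https://www.example.com/cart"}]
completed = clicked_only + [{"EventType": "navigate",
                             "PageUrl": "https://www.example.com/order-confirmation/123"}]
print(made_purchase(clicked_only), made_purchase(completed))  # False True
```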
Displaying the "Feature Usage Dashboard" with Google Data Studio
Once you have transformed your data into an output similar to the one above, you can use it as an input to a data visualization tool. We use Google Data Studio at FullStory because it's free and plays nicely with Google BigQuery. However, many of our customers use Tableau, Looker, and Periscope Data.
If you want to get started with displaying your data, we’ll walk through the steps necessary using Google Data Studio below.
Step 1 — Connecting Data Studio to Your Analytics Warehouse (i.e. BigQuery)
The first step to displaying your data is to connect the Google Data Studio dashboard to your analytics warehouse. Data Studio supports a lot of different integrations, and connecting it to BigQuery, where we write our queries, was easy.
Data Studio even allows us to drop the custom query we wrote together in a box to power the dashboard.
Step 2 — Creating visualizations with the data
Now that we have data in our dashboard, the first chart we wanted to make was feature usage over time. For this, we chose the ‘Time Series’ chart option in Google Data Studio. The Usage_Date made up our x-axis, and Data Studio added up all of the 1’s and 0’s for each specific feature for every day. Now our dashboard displays, for each day, how many users engaged with each feature.
Because we also grouped our data by UserAppKey and OrgId, we can create filters in Google Data Studio. We used an OrgId filter so we can view feature usage for single or multiple accounts. As a result, we can see how quickly FullStory feature usage expands throughout a single organization, as well as which accounts are at risk—e.g. accounts not utilizing features or whose users stop logging in.
Using the "Feature Usage Dashboard"
More data does not make your company inherently more valuable—it must guide your business to better decision making. So how does a Feature Usage Dashboard inform better strategy and decisions across teams?
- Customer Success — Our customer success team uses the Feature Usage Dashboard to monitor the health of the large, enterprise clients they work with. They also use the dashboard to determine the difference in feature usage between happy, healthy customers and customers who churn. Based on the dashboard, they give more targeted outreach and training to our customers.
- Sales — the sales team gets a better idea of what to demo to prospective accounts. Now, sales can skip demoing features clients are already successfully using and focus on the features that are underutilized—or haven’t yet been discovered.
- Product — The product team at FullStory previously had to manually track feature usage in spreadsheets. It took hours per week to compile. It was also difficult to segment feature usage by size of account, type of customers, etc. The Feature Usage Dashboard has helped analyze how discoverable our features are, guiding our product roadmap.
Building a dashboard alone won't make your company more successful. For that, data must empower teams to make better decisions. At FullStory, we've found success with our Feature Usage Dashboard because the insights it provides touch multiple key parts of our business.
Remember: we manage what we measure. Be careful you're measuring the right stuff for your company or industry. Once you identify the right KPIs and other actions, determine how FullStory Data Export can make that process bionic.
If you have any questions about the Walkthrough, reach out to us—we'd love to help and/or receive your feedback.
Last updated June 2018.