The Marketer’s Data Dictionary: 50 Terms You Need to Know
Your response to the original Marketer’s Data Dictionary was so positive that we have now enhanced it with 14 brand new terms. Enjoy!
Slay the jargon dragons at your company with The Marketer’s Data Dictionary: our user-friendly guide to 50 terms you need to know.
When it comes to data, accessibility is everything.
So, we’ve put together a data dictionary to help marketers simply understand complicated martech language and slay the jargon dragons standing in the way of meaningful, inclusive conversations about data at work.
Types of data
1st party data
1st party data is the data you have collected and are able to use.
2nd party data
2nd party data is the data you can receive from agreed partners.
3rd party data
3rd party data is the data collected from everything and everywhere else, well beyond your own interactions.
We’ve expanded on first, second and third-party data here.
A consistent identifier (e.g. a customer number or email address) used to follow a customer across different devices: think mobile web, in-app and desktop.
Personal information that cannot be associated with a specific individual. “De-identification” refers to removing personal information like an email address or a name.
Hashed data is data that has been converted by a one-way function into a fixed-length string of characters that is designed to be impractical to reverse. For example, firstname.lastname@example.org is BB8C71F261C69B19446FD88243F8E579820C5D536CCD2572A5D284EEF6081D0 in hashed form.
When a CDP like Lexer sends email addresses to Google to create an audience, we send the hashed emails and Google matches them against its own database of hashed emails – so no private information is transferred.
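Here is a minimal sketch of how an email address might be hashed before it is shared. It assumes SHA-256 and a simple normalization step (trim whitespace, lowercase); the function name hash_email and the sample address are made up for illustration and do not describe any particular platform’s pipeline.

```python
import hashlib


def hash_email(email: str) -> str:
    """Return the SHA-256 hex digest of a normalized email address."""
    # Normalize first so " Jane.Doe@Example.com " and "jane.doe@example.com"
    # produce the same hash and can be matched on the other side.
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


hashed = hash_email("  Jane.Doe@Example.com ")
```

Because the same input always produces the same output, two parties can compare hashes to find customers they have in common without exchanging the raw email addresses.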
Personally Identifiable Information (PII) is data like email address, name, date of birth, physical address, or a phone number that can be used to confidently identify a specific person. It’s important to handle it correctly, especially on social media. It is possible to securely unify digital behaviors to an identified person using Lexer Tag.
Think of cookies as the crumbs you leave behind for the internet to trace your customer journey.
A cookie is a small amount of data generated by a website and saved by a user’s web browser. Its purpose is to remember information about a specific user: storing their logins, helping them pick up where they left off and personalizing their experience.
SKU level data
SKU stands for stock keeping unit, a way of keeping track of products by assigning them a specific number. Think of it as the Dewey decimal system for retail businesses. This number denotes category, size, style and color.
The Net Promoter Score is an index ranging from -100 to 100 that measures the willingness of customers to recommend a company’s products or services to others. It is used as a proxy for gauging a customer’s overall satisfaction with a company’s product or service and a customer’s loyalty to the brand.
Lexer NPS allows customer service agents to send personalized NPS surveys on social and attribute scores to full profiles of each individual.
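To see where the -100 to 100 range comes from, here is a small sketch of the usual NPS calculation: customers rate you from 0 to 10, those scoring 9 or 10 count as promoters, those scoring 0 to 6 count as detractors, and the score is the percentage of promoters minus the percentage of detractors. The ratings below are made up for illustration.

```python
def net_promoter_score(ratings: list[int]) -> float:
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for rating in ratings if rating >= 9)
    detractors = sum(1 for rating in ratings if rating <= 6)
    return 100 * (promoters - detractors) / len(ratings)


sample_ratings = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]  # hypothetical survey responses
score = net_promoter_score(sample_ratings)  # 5 promoters, 3 detractors -> 20.0
```

If every respondent is a promoter the score is 100, and if every respondent is a detractor it is -100.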
Public data is information that can be freely used, re-used and re-distributed by anyone with no existing local, national or international legal restrictions on access or usage. An example is census data released by the Australian Bureau of Statistics. This article has 33 fascinating examples of public data.
Statistical data on individuals and households usually collected by a census. Experian ConsumerView is a powerful source of demographic data: a comprehensive data set on 80% of the population. Our partnership with Experian gives our clients access to insights on household income, gender, education, occupation, relationship status, decision making, Mosaic® segments and more.
Mastercard Advertising Insights
Mastercard Advertising Insights identifies consumer segments based on aggregated and anonymized spend data within each postal code and category, derived from billions of Mastercard transactions. Lexer has partnered with Mastercard to make this powerful source of enrichment available to our clients through our CDP.
Bringing data together
The process of bringing together data from a range of sources to form a comprehensive profile of a customer, by connecting data points using deterministic or probabilistic matching.
Deterministic matching looks for an exact match between two different pieces of data. For example, an email address or phone number in two data sets can be used to make an exact match of two records.
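As a rough sketch, deterministic matching on email could look like the code below. The two data sets, the field names and the normalize helper are all hypothetical; trimming and lowercasing the email before comparing is an assumption about how you might want "exact" to behave.

```python
crm_records = [
    {"customer_id": "C001", "email": "Jane.Doe@example.com"},
    {"customer_id": "C002", "email": "sam.lee@example.com"},
]
ecommerce_records = [
    {"order_id": "A17", "email": "jane.doe@example.com "},
    {"order_id": "A18", "email": "alex.kim@example.com"},
]


def normalize(email: str) -> str:
    """Trim whitespace and lowercase so formatting differences are ignored."""
    return email.strip().lower()


def deterministic_match(left, right, key="email"):
    """Pair up records whose normalized key values are exactly equal."""
    index = {normalize(record[key]): record for record in left}
    return [
        (index[normalize(record[key])], record)
        for record in right
        if normalize(record[key]) in index
    ]


matches = deterministic_match(crm_records, ecommerce_records)
```

Here only Jane’s CRM record and order A17 are paired, because they are the only two records that share an email address.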
Probabilistic or “Fuzzy” matching calculates the likelihood of a match using a scoring system across a range of data points. For example, two customer records with the same address and date of birth are 99.9% likely to be the same person, but two records that share only a common name, like John Smith, are not very likely to be the same person. Usually, a combination of 2-3 data points like address, DOB, name, and transactional data is used.
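The scoring idea can be sketched as follows. The weights, the threshold and the sample records are invented for illustration, and this simplified version only checks whether fields agree exactly; real fuzzy matching tools typically also compare near-matches such as misspelled names.

```python
# Hypothetical weights (out of 100): how strongly agreement on each field
# suggests that two records belong to the same person.
FIELD_WEIGHTS = {"address": 45, "date_of_birth": 45, "name": 10}
MATCH_THRESHOLD = 85  # assumed cut-off; tune this against your own data


def match_score(a: dict, b: dict) -> int:
    """Add up the weights of every non-empty field on which the records agree."""
    return sum(
        weight
        for field, weight in FIELD_WEIGHTS.items()
        if a.get(field) and a.get(field) == b.get(field)
    )


def is_probable_match(a: dict, b: dict) -> bool:
    return match_score(a, b) >= MATCH_THRESHOLD


rec_a = {"name": "John Smith", "address": "12 High St", "date_of_birth": "1985-03-14"}
rec_b = {"name": "J. Smith", "address": "12 High St", "date_of_birth": "1985-03-14"}
rec_c = {"name": "John Smith", "address": "98 Low Rd", "date_of_birth": "1990-11-02"}
```

With these weights, rec_a and rec_b score 90 (same address and date of birth) and are treated as a match, while rec_a and rec_c score only 10 (same name, nothing else) and are not.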
Data cleansing involves detecting and correcting corrupt or inaccurate data. An example would be removing the value of ‘John’ from a column called ‘Age’.
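A minimal sketch of that kind of clean-up is shown below. The rows are made up, and treating anything that is not a whole number between 0 and 120 as invalid is an assumption chosen for the example.

```python
raw_rows = [
    {"name": "Priya", "age": "34"},
    {"name": "John", "age": "John"},  # a name has slipped into the age column
    {"name": "Mei", "age": " 27 "},   # valid, but padded with spaces
    {"name": "Omar", "age": "-5"},    # not a plausible age
]


def clean_age(value):
    """Return the age as an int, or None if the value is not a plausible age."""
    text = str(value).strip()
    # isdecimal() is False for names, negative numbers and empty strings.
    if text.isdecimal() and int(text) <= 120:
        return int(text)
    return None


cleaned_rows = [{**row, "age": clean_age(row["age"])} for row in raw_rows]
```

Invalid values are replaced with None rather than deleting the whole row, so the rest of the customer record is kept.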
The process of creating a richer view of each customer record by adding data from external sources. Lexer enriches each customer record with valuable data from partners like Experian, Roy Morgan, and Mastercard to give you extra data points like their Mosaic profile or purchasing habits.
Extract, transform, and load (ETL)
ETL is a process used in data warehousing to prepare data for use in reporting or analytics. The data is taken from somewhere, shaped, and loaded into a database. It’s one of the initial stages in Lexer’s data onboarding process, which cleans client data before it is loaded as attributes and identities into our platform.
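The three steps can be sketched in a few lines using only Python’s standard library. Everything here is a stand-in: the CSV text, the column names and the in-memory SQLite database are assumptions for illustration, not a description of Lexer’s onboarding process.

```python
import csv
import io
import sqlite3

# A hypothetical CSV export, inlined so the sketch is self-contained.
RAW_CSV = """email,total_spend
Jane.Doe@Example.com , 120.50
sam.lee@example.com,80
,15.00
"""


def extract(csv_text):
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: tidy emails, parse spend as a number, drop rows with no email."""
    cleaned = []
    for row in rows:
        email = row["email"].strip().lower()
        if email:
            cleaned.append((email, float(row["total_spend"])))
    return cleaned


def load(records, connection):
    """Load: write the shaped records into a database table."""
    connection.execute(
        "CREATE TABLE customers (email TEXT PRIMARY KEY, total_spend REAL)"
    )
    connection.executemany("INSERT INTO customers VALUES (?, ?)", records)
    connection.commit()


connection = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), connection)
loaded = connection.execute(
    "SELECT email, total_spend FROM customers ORDER BY email"
).fetchall()
```

The row with no email address is dropped during the transform step, so only two clean customer records reach the database.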
Data Lakes store data in its raw format: unstructured, inconsistent and not easily queried. Companies can build data lakes by using Infrastructure-as-a-Service (IaaS) clouds including Amazon Web Services (AWS) and Microsoft Azure.
A massive database where the structure is defined before the data is captured. Think a ginormous Excel spreadsheet, with rows and column titles specified in advance. Data Warehouses are easier to analyze than data lakes but generally require technical data skills and specialized software.
Machine learning is a method of data analysis where systems can learn from data, identify patterns and make decisions with minimal human intervention. Two common examples are regression analysis (looking at trends in data to predict what happens next) and classification (predicting which class a given data point belongs to, e.g. this email looks like spam, so I will send it to your spam folder).
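To make "looking at trends in data to predict what happens next" concrete, here is a tiny regression sketch that fits a straight line to past values and extends it one step. The order figures are made up, and a straight-line fit is only the simplest possible case; real models usually draw on many more inputs.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


months = [1, 2, 3, 4, 5]
monthly_orders = [110, 120, 130, 140, 150]  # hypothetical order history

slope, intercept = fit_line(months, monthly_orders)
forecast_month_6 = slope * 6 + intercept  # extend the trend by one month
```

The "learning" here is simply finding the slope and intercept that best fit the history; the prediction then comes from applying them to a month the system has not seen yet.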
Customer churn modeling
Churn modeling is a method of identifying which data points can be used to indicate that someone is likely to stop buying from you, in other words, their likelihood of churning. It’s a quintessential use of regression analysis.
How much money a customer has spent with you, adding up all spend, across channels, removing discounts and refunds. This can be a challenge if you don’t have all transaction data in one place – this is where a CDP comes in really handy. Sometimes lifetime value can also mean how much a customer will spend with you in the future.
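The historical version of this calculation is simple arithmetic once the transactions sit in one place. The sketch below uses invented transactions and field names, and assumes discounts and refunds are recorded as positive amounts to subtract.

```python
transactions = [  # hypothetical transactions, already unified across channels
    {"customer": "C001", "channel": "online", "amount": 120.0, "discount": 20.0, "refund": 0.0},
    {"customer": "C001", "channel": "in-store", "amount": 60.0, "discount": 0.0, "refund": 60.0},
    {"customer": "C001", "channel": "app", "amount": 45.0, "discount": 5.0, "refund": 0.0},
    {"customer": "C002", "channel": "online", "amount": 30.0, "discount": 0.0, "refund": 0.0},
]


def lifetime_value(customer_id, rows):
    """Total spend across all channels, net of discounts and refunds."""
    return sum(
        row["amount"] - row["discount"] - row["refund"]
        for row in rows
        if row["customer"] == customer_id
    )


c001_value = lifetime_value("C001", transactions)  # 100 + 0 + 40 = 140.0
```

Note that the fully refunded in-store purchase contributes nothing, which is why refunds need to be captured alongside sales.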
RFM model (Recency, frequency & monetary model)
RFM modeling ranks customers by the recency and frequency of their purchases and how much they’ve spent with you in the past. It’s great for identifying high-value customers for loyalty campaigns and re-engaging lapsed customers.
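Here is one simplified way to sketch an RFM ranking. The purchase history is made up, and the scoring scheme (rank customers on each dimension, then add the three ranks) is an assumption for illustration; in practice customers are often bucketed into scores of 1 to 5 per dimension instead.

```python
from datetime import date

purchases = [  # hypothetical history: (customer, purchase date, amount)
    ("C001", date(2024, 5, 28), 80.0),
    ("C001", date(2024, 4, 2), 120.0),
    ("C001", date(2024, 1, 15), 60.0),
    ("C002", date(2023, 9, 10), 300.0),
    ("C003", date(2024, 5, 1), 140.0),
    ("C003", date(2024, 3, 20), 135.0),
]
today = date(2024, 6, 1)  # fixed date so the example is reproducible


def rfm_table(rows, as_of):
    """Return {customer: (days since last purchase, purchase count, total spend)}."""
    table = {}
    for customer, purchased_on, amount in rows:
        recency, frequency, monetary = table.get(customer, (None, 0, 0.0))
        days_ago = (as_of - purchased_on).days
        recency = days_ago if recency is None else min(recency, days_ago)
        table[customer] = (recency, frequency + 1, monetary + amount)
    return table


def rfm_scores(table):
    """Score customers 1..N on each dimension (N is best) and add the scores."""
    customers = list(table)
    totals = {customer: 0 for customer in customers}
    orderings = [
        # Recency: fewer days is better, so the longest-lapsed customer comes first.
        sorted(customers, key=lambda c: table[c][0], reverse=True),
        sorted(customers, key=lambda c: table[c][1]),  # frequency, lowest first
        sorted(customers, key=lambda c: table[c][2]),  # monetary, lowest first
    ]
    for ordering in orderings:
        for points, customer in enumerate(ordering, start=1):
            totals[customer] += points
    return totals


table = rfm_table(purchases, today)
scores = rfm_scores(table)
```

In this sample C001 scores highest (recent and frequent), while C002 has spent the most but has not purchased for months, which makes them a candidate for a re-engagement campaign.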
A way of taking an audience and expanding it to include people with similar qualities. Lookalikes may be used for prospecting and to ensure you’re reaching the largest and most relevant audience possible.
An attribution model is a way of determining how much credit each touchpoint gets for the leads and conversions coming in. Google Analytics’ Last Interaction model is one example: it assigns 100% credit to the final touchpoint (e.g. a click) before a conversion.
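A last-interaction rule is easy to sketch in code. The journeys and channel names below are invented, and this is only an illustration of the rule itself, not of how any analytics product implements it.

```python
from collections import Counter

# Hypothetical customer journeys: touchpoints in the order they happened.
journeys = [
    {"touchpoints": ["facebook_ad", "email", "paid_search"], "converted": True},
    {"touchpoints": ["paid_search", "email"], "converted": True},
    {"touchpoints": ["facebook_ad"], "converted": False},
    {"touchpoints": ["organic_search", "email"], "converted": True},
]


def last_interaction_credit(rows):
    """Give 100% of each conversion's credit to the final touchpoint before it."""
    credit = Counter()
    for journey in rows:
        if journey["converted"] and journey["touchpoints"]:
            credit[journey["touchpoints"][-1]] += 1
    return credit


credit = last_interaction_credit(journeys)
```

Email ends up with two conversions and paid search with one, while the Facebook ad that started the first journey gets no credit at all, which is the main criticism of last-interaction models.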
One of the more accessible coding languages, Python is really popular in data science and machine learning. Python is used by major tech companies like Google, Instagram, Netflix, and Dropbox.
It’s a great way to query and transform data, and we use it in our ETL process at Lexer.
Structured Query Language (SQL)
SQL is an abbreviation for structured query language and pronounced either see-kwell or as separate letters. SQL is a standardized query language for requesting information from a database.
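To give a feel for what "requesting information from a database" looks like, here is a small sketch that runs a SQL query from Python against a throwaway in-memory SQLite database. The orders table and its contents are made up for illustration.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE orders (customer TEXT, channel TEXT, amount REAL)")
connection.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("C001", "online", 120.0), ("C001", "in-store", 60.0), ("C002", "online", 30.0)],
)

# The question: for each customer, how many orders and how much total spend?
query = """
    SELECT customer, COUNT(*) AS orders, SUM(amount) AS total_spend
    FROM orders
    GROUP BY customer
    ORDER BY total_spend DESC
"""
rows = connection.execute(query).fetchall()
```

The query reads almost like a sentence: select these columns, from this table, grouped by customer, with the biggest spenders first.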
Using data in your day-to-day
Audience / customer onboarding
Audience onboarding involves uploading your customer data to an outside platform like Facebook and having them match it to their database.
Programmatic advertising is the automated process of buying and selling ad space using a few tools including a Demand Side Platform (DSP). To learn more about how a DSP fits into your marketing stack, check out Navigating Martech: DSP.
Single customer view
A comprehensive view of a customer across all channels. Single customer view is the end goal of bringing various customer data points together. A CDP achieves this rapidly and allows marketers and customer service teams to use this data to deliver personalization and contextualized customer care.
In multichannel marketing, a brand may use different channels to interact with customers, but each channel is managed separately and with a different strategy.
Omnichannel marketing seamlessly connects all consumer touchpoints to create a consistent and progressive customer experience. It is centered around customer-centric measurements like lifetime value and loyalty. Excited? Read our guide to the Top 5 benefits of omnichannel marketing.
Cost per click
Cost Per Click (CPC) refers to the actual price you pay for each click in your pay-per-click (PPC) marketing campaigns. CPC is one of the most commonly tracked indicators of success for advertising campaigns.
CPM means cost per mille (Latin for thousand), or cost per thousand impressions: a marketing term used to denote the price of 1,000 ad impressions online. CPM is a commonly used marker of success when comparing campaign results.
If a website publisher charges $2.00 CPM, that means an advertiser must pay $2.00 for every 1,000 impressions of its ad.
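The arithmetic behind CPM and CPC can be sketched as two small functions. The $2.00 CPM comes from the example above; the impression and click counts are made up, and average CPC is calculated here as total spend divided by clicks.

```python
def cost_for_impressions(cpm: float, impressions: int) -> float:
    """Total cost of a number of impressions bought at a given CPM."""
    return cpm * impressions / 1000


def average_cpc(total_spend: float, clicks: int) -> float:
    """Average price paid per click."""
    if clicks == 0:
        raise ValueError("no clicks recorded")
    return total_spend / clicks


campaign_cost = cost_for_impressions(cpm=2.00, impressions=250_000)  # 500.0
cpc = average_cpc(total_spend=campaign_cost, clicks=400)  # 1.25
```

So 250,000 impressions at a $2.00 CPM cost $500, and if those impressions produce 400 clicks, each click has effectively cost $1.25.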
Search Engine Marketing (SEM)
Search Engine Marketing involves using paid advertisements in search results to increase visibility. Companies bid on keywords that they think their customers might type into Google when looking for their product or service, so their ads appear in search results.
Service Level Agreement (SLA)
A service level agreement (SLA) is a contract between a service provider and the end-user that defines the level of service expected.
Performance against an SLA is an important metric for measuring customer satisfaction, and we provide it simply and quickly through our customer service tool, Lexer.
When it comes to Martech, the number of tools and abbreviations out there can be pretty overwhelming, which is why we developed Navigating Martech: The Complete Guide to the key platforms in the industry today. For now, let’s dive into five commonly used systems.
Manages the sales and service history with every known customer. Originating from 1:1 sales workflows, a CRM can provide cross-channel contact history and servicing tools.
Expert in connecting email addresses (PII – known customers) to cookies (unknown prospects), so marketers can create addressable segments to target or suppress across the digital advertising ecosystem.
Manages the integration of tags from third-party software into owned digital properties.
A DMP provides a centralized dataset that aggregates cookie browsing behavior (unknown prospects) to create large, de-identified audiences for ad targeting across digital channels.
A system that analyzes multiple data points on each visitor or email recipient and selects the ideal creative to serve from a library of dynamic or pre-created messages – all in real time.
The Customer Data Platform Institute is a vendor-neutral organization dedicated to helping marketers manage customer data. Have a read of the blog by its founder, David Raab, here.
Led by Data Rockstar, Todd Belcher, the CDP Resource is another great read for understanding the increasingly crowded Martech space. We were thrilled to be named as one to watch by these guys, and you can read their recent spotlight on us here.
We’re passionate about data compliance, security, and management. We’re also certified and regularly audited and love helping our clients do the same. To learn more about our approach read our Privacy and Information Security policy.
The General Data Protection Regulation (GDPR) is a European Union regulation designed to give individuals greater control of their data. It is centered on individuals and places a hefty responsibility on organizations to deliver greater transparency and accountability.
Short for information security, Infosec refers to the processes and tools companies use to protect their data.
ISO 27001 is a global information security standard for an information security management system or ISMS. (We’re certified!)
SFTP stands for SSH File Transfer Protocol (often called Secure File Transfer Protocol) and is a way of safely sending data over a secure connection. (If you’re sending PII or sensitive customer data over email, you may be breaking privacy laws and are putting your customers’ data at risk.)
SOC 2 is an audit that ensures you’re securely managing your data. It focuses specifically on controls around unusual system activity (for example, logins from unusual devices or locations), authorized and unauthorized system configuration changes, and user access levels.
A cloud computing storage option that groups huge amounts of data into buckets that you can query through the Amazon Web Services (AWS) API.
Slay the jargon dragons at your company
We hope our data dictionary helps you slay the jargon dragon at your company, leading to more meaningful and inclusive discussions about how data can drive value at every level.