What’s in a name? A lot, if you are talking databases…

Database Design – Naming Conventions


What's in a Name, by Shakespeare

Well, that’s a pretty popular line. However, most Data Architects would not agree with Shakespeare on that!

Talking about names, here is an incident that I remember. One day, a new guy on my team created a table and called it “I_WIN_YOU_LOSE”. Seriously, he did that! The names of the fields in that table were in the same spirit. At first sight it was funny, as all the queries running against that table were pretty creative. You can imagine some innovative queries on such a table.

Although all of that earned him the reputation of a “Cool Dude”, it was not possible to follow his style in our Data Warehouse. So, I had to be the geek and explain to the team the way we name database objects here…

An observation that I made based on this experience was that when we lay out standards for everyone, progress is rapid. The computer industry has long experienced the value of standardization. When we standardized the formatting of disk drives, we got drives that could be read by both Unix and Windows systems; CDs, DVDs and flash drives all have a standard format for information exchange. When we standardized on the TCP/IP protocols, we were able to connect all the computers to each other, and the internet was born! When we standardized on HTTP and HTML, the World Wide Web was born. I could give you many more examples, but you get the point. So why can’t we humans follow simple naming conventions? Particularly when it comes to naming tables and columns, we suddenly get creative. It’s almost as if we take for granted a license to be different.

When it comes to data modeling, especially in a multi-tier, team-based, fast-growing environment, the name of an object becomes crucial, as it defines the object. Naming objects becomes much more than just putting a name to a face. It gets complex when different people attach different meanings to the same name, or use different names for the same meaning. Everyone has their own style, which comes from personality, personal preference and past experience, like the one in my earlier story.

Let’s try to understand the principles behind naming conventions with specific industry standards and examples. I am going to keep this generic and not specific to our environment.

Principles of Naming Conventions:

Standardized data component names are created by combining words in a specific way. The rules will vary for each organization, but the basic principles for developing rule sets are constant.

There are three kinds of rules that form a complete naming convention:

  • Semantic rules are based on the meaning of the words used to name data components
  • Syntax rules prescribe the arrangement of words within a name
  • Lexical rules concern the language-related aspects of names

 I.     Semantic Rules:

These are rules based on the meaning of the words used to name data components.

  • Subjects: entity or subject terms are based on the names of data objects or subjects that are found in data models (entities) or object models (object classes).
  • Modifiers: can be that subject’s properties or qualifiers that are used interchangeably when naming data objects.
  • Class Words: describe the type of data that a column or attribute contains.  This is a classification of the type of data, or domain.

II.     Syntax Rules:

These rules specify the arrangement of name components. Examples of Syntax Rules are:

  • The subject or object term occupies the leftmost position in the name, unless it is used as a modifier to another subject.
  • Modifier terms follow the subject. The order of the qualifiers in a name is used to make the name complete and clear to the intended audience. Use subject, property and/or qualifier terms as needed.
  • For columns and attributes, the last term should be the class word at the rightmost position.

III.     Lexical Rules

These rules determine the standard look of names.

Examples of Lexical Rules are:

  • Nouns are used in singular form
  • Verbs are always in the present tense
  • No special characters are allowed
  • All words are separated by underscores
  • All words are in upper case
  • Only listed / approved abbreviations and acronyms are used
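
Some of these lexical rules are easy to check mechanically. Here is a minimal sketch in Python, assuming names made of upper-case words joined by underscores. The function name `check_lexical_rules` and the exact pattern are my own choices for illustration, and the rules about singular nouns, verb tense and approved abbreviations are not covered.

```python
import re

# Upper-case letters and digits, with words separated by single underscores.
# This pattern is an assumption for the example, not an industry standard.
NAME_PATTERN = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$")


def check_lexical_rules(name):
    """Return a list of lexical rule violations for a table or column name."""
    problems = []
    if name != name.upper():
        problems.append("all words must be in upper case")
    # Test the upper-cased name so a case problem is not reported twice.
    if not NAME_PATTERN.match(name.upper()):
        problems.append("only letters, digits and single underscores are allowed")
    return problems


print(check_lexical_rules("PURCHASE_ORDER_STATUS_CODE"))
print(check_lexical_rules("employee number"))
```

Note that I_WIN_YOU_LOSE passes these checks. Lexical rules only govern how a name looks; it takes the semantic and syntax rules to catch a name that says nothing about the data.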

Industry Standards:

 I.     Definitions & Common Rules

Entity or Table

An entity is the representation of a distinguishable person, place, thing, concept, event or state that has characteristics, properties and relationships.  A table is a physical collection of data about a person, place, thing, concept, event or state.  A table may correspond with an entity.

Attribute or Column

A column or attribute contains a specific detail about an entity or table.  A column or attribute should not contain multiple values such as arrays or concatenated values.

Common Rules

  • The name of an entity / table or attribute / column should enable its audience to identify and locate it within its context. Therefore, each entity / table or attribute / column name must be unique within its context (an entity within its model, a table within a database schema, an attribute within an entity or a column within a table).
  • The name of an entity / table or attribute / column should be a declaration of the classification of the data it contains or will contain and therefore it should be a noun or noun phrase in the singular form and should follow a classification declarative format.
  • The name of an entity / table or attribute / column should enable designers, developers and business personnel to effectively know what is in it or what to place in it.  It should describe its content (what it is), rather than how it is used, processed, populated or decoded.

II.     Formats & Examples of Naming Conventions

An entity or table name should be a noun phrase constructed with the following format:

Subject Modifier

An attribute or column name should be a noun phrase constructed with the following format:

Subject Modifier Class

Subject indicates the class of information that the entity or table describes; it provides the proper naming context for the modifier.  Subjects are nouns that name things.

Subjects may be composed of several terms or words.

Examples: Employee, Purchase Order, Item etc.

Modifier is an optional component of the entity or table name that further qualifies the name.  A modifier consists of one or more properties and/or qualifiers.

Examples: Project Installment, Employee Contact etc.

Class or class word classifies the type of information being represented by the column or attribute.

Examples: Employee Number, GL Account Number, Purchase Order Status Code
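
To show how the Subject Modifier Class format and the lexical rules fit together, here is a small Python sketch that assembles a physical column name from its parts. The helper `build_column_name` is hypothetical, and the split of each example into subject, modifier and class word is my own reading of it.

```python
def build_column_name(subject, modifier, class_word):
    """Join subject, optional modifier and class word into one column name.

    The subject goes leftmost and the class word rightmost; words are
    upper-cased and separated by underscores.
    """
    words = []
    for part in (subject, modifier, class_word):
        if part:  # the modifier is optional and may be empty
            words.extend(part.split())
    return "_".join(word.upper() for word in words)


print(build_column_name("Purchase Order", "Status", "Code"))
print(build_column_name("Employee", "", "Number"))
```

With these inputs the two calls produce PURCHASE_ORDER_STATUS_CODE and EMPLOYEE_NUMBER.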

All of this might sound boring to all the Shakespeare types out there, but I know for sure that all the Data Architects would agree with and appreciate these naming conventions!

After all, this makes our lives easier and more organized, and our data interchangeable. I will follow this up with a post on data types and the data quality rules that turn data into information. How do you populate missing values? How do you check the range of values for each field? How do you test data quality on the dimensions of completeness, accuracy, consistency, timeliness, etc.?



Segmentation: An Important Marketing Strategy

In the current hyper-competitive business environment, understanding customers’ changing tastes and purchasing behavior is extremely important! Companies that do not track changing consumer wants and needs have seen their fortunes decline. KODAK is just one example, but a very good one, to underline this point. Its failure to understand and predict consumers’ buying trend towards more technologically advanced digital cameras led to the bankruptcy and demise of the company.

Most companies today make a tremendous effort to track consumer behavior. Meticulous recording and use of the consumer’s digital body language is the keystone of the process. The advent of the internet and mobile technologies has changed consumer behavior: consumers search for things on the net, educate themselves on the pros and cons of the product, compare prices and evaluate vendors for timeliness and customer service. They actively seek coupons and collect feedback from friends on social networks.

In doing so the consumer leaves a trail of breadcrumbs all over the internet. Astute businesses collect, analyze and use the digital body language of the customer to improve their marketing activities. There are several ways of collecting customer data: web crawling, browser-based JavaScript, server log mining, and geolocation tracking in malls and stores. Most companies store the data in a data warehouse and analyze it to find patterns in customers’ purchasing behavior. Effective use of this information can lead to revenue-generating promotional activities for the company.

All promotional marketing activities are expensive since the response rates are very low. You cannot send a letter or catalog to every customer in the country, as it will be cost prohibitive. To make these promotional activities cost effective, it is very important to segment the customers, so that the promotional plans can be directed towards selected and potentially profitable customers.

Segmentation is a process by which a large customer base is divided into small subsets of customers having common needs and priorities. There are several conventional methods of creating segments, such as geographical segmentation, demographic segmentation and behavioral segmentation. More sophisticated approaches currently used in the market include statistical techniques such as K-means clustering and hierarchical clustering. An interesting and upcoming technique is “micro segmentation”, which is used to understand individual customer behavior and to make personalized marketing offers.

When using these methods, companies should consider the ideal characteristics of the segments such as:

  • Each segment should be measurable and profitable.
  • Each segment should be stable over time.
  • Every consumer in the segment should be easily reachable.

We will focus our discussion on the segmentation method commonly used by most B2C companies. In the retail industry, companies focus mainly on customers’ transactional behavior to create segments. Recency, Frequency and Monetary value (RFM) are the main indicators used for segmenting consumers. Some marketers use these indicators individually, while others combine several segmentation techniques to implement their marketing plan.

Traditionally, the more recently active customers are considered the more important ones. Some marketers divide customers according to the recency of their purchase, e.g. customers who made a purchase during the last 12 months, 13-24 months ago, and 25+ months ago. Marketers also use predictive modeling techniques such as linear regression and logistic regression to score their consumer base. Finally, consumers are segmented by geography as well, for example by zip code or household. These groups are then combined to make several mini segments. The resultant segments are considered potential consumers and are targeted by the company’s promotional efforts.
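
The recency buckets mentioned above can be sketched in a few lines of Python. This is only an illustration: the function name `recency_segment` is made up for the example, and counting whole calendar months (ignoring the day of the month) is a simplifying assumption.

```python
from datetime import date


def recency_segment(last_purchase, today):
    """Assign a customer to a recency bucket by months since the last purchase."""
    # Difference in calendar months; the day of the month is ignored.
    months = (today.year - last_purchase.year) * 12 + (today.month - last_purchase.month)
    if months <= 12:
        return "0-12 months"
    if months <= 24:
        return "13-24 months"
    return "25+ months"


today = date(2014, 1, 1)
print(recency_segment(date(2013, 6, 1), today))
print(recency_segment(date(2012, 6, 1), today))
print(recency_segment(date(2011, 1, 1), today))
```

The same pattern extends to frequency and monetary value, and the resulting labels can be combined with geography to form the mini segments.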

The performance of these segmentation and promotional strategies is measured for each segment against metrics such as:

  • total revenue generated,
  • average revenue per customer,
  • cost of the promotional activity.
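
As a rough sketch, these three metrics could be computed per segment as follows. The function name `segment_metrics`, the input shape and the numbers are assumptions made up for the example.

```python
def segment_metrics(orders, promotion_cost):
    """Summarize one segment.

    orders: list of (customer_id, order_amount) pairs for the segment.
    promotion_cost: what the promotional activity cost for this segment.
    """
    total_revenue = sum(amount for _, amount in orders)
    customers = {customer_id for customer_id, _ in orders}
    average = total_revenue / len(customers) if customers else 0.0
    return {
        "total_revenue": total_revenue,
        "average_revenue_per_customer": average,
        "promotion_cost": promotion_cost,
    }


# Made-up orders: customer "a" bought twice, customer "b" once.
report = segment_metrics([("a", 50.0), ("a", 25.0), ("b", 25.0)], promotion_cost=30.0)
print(report)
```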

This report is then used as feedback to continually improve the marketing strategy. Segments evolve over time depending on the performance feedback, more recent trends, and experimental testing (A/B testing).

It is worth considering those consumer segments that are not current and active buyers; they may represent potential defectors to competitors. Knowledge of their purchasing behavior and transactional data can be very helpful to target and attract them through selective marketing strategies.

Hence, marketing strategies built on selective customer segmentation can be a win-win for both the company and the consumer: the company can promote the right offer to the right consumer at the right time, and the consumer may eventually respond to the offer by making a purchase, without getting bombarded with junk mail.

Express Analytics
8001 Irvine Center Drive Suite 400
Irvine, CA 92618








How to select an analytics platform?

Recently Express Analytics was engaged by a client to help them select a platform for marketing analytics. This led us to ponder the answers to the following questions.

  1. What is an Analytics Platform and how is it different from a Transactional Platform?
  2. What prevents organizations from exploiting the existing data in their databases for Analytics?
  3. Who are the new entrants in the market with analytics platforms offerings?
  4. What is the long term direction of the market?
  5. Can I afford an analytics platform?
  6. What is the correct measure of the cost of an analytic platform?

Let us ponder the first question:

What is an Analytics Platform and how is it different from a transactional Platform?

A bit of Background

For the last 40 years or so the entire technology industry has been focused on solving just one problem.

  • How to improve the ability to record and manage transactions?

This unrelenting focus has led to enormous improvements in the ability to record transactions. We have become so good that today we are able to record transactions in microseconds and nanoseconds. This has given rise to high frequency trading on the stock markets, massive scale collaboration of billions of people on social media, and the recording of sensor data from machines. It is well understood that the core requirement of a transaction management system is to be able to

  • Create a new record
  • Read a single record
  • Update any field of that record
  • Delete a record
  • Save the record.

These few functions (CRUDS) all operate on a single record. Hence we have developed database systems that are highly efficient at picking a needle from the proverbial haystack. To achieve this singular objective, we store each complete record as a row. Each row can have fields of number, date, or text type. In the late 1990s and 2000s we struggled, without much success, to modify these database systems to accommodate audio and video content. The cost of the row layout is that every time we need to read a few fields of a record, we must bring the entire record into memory before we can operate on it. So if the record has 100 fields and I mostly query the 10 most frequently used ones, I am moving the whole record even though I have no use for the other 90% of the fields in my current operation. When I have to read millions of records into memory, this wastes precious resources such as server memory and CPU, and incurs an enormous cost in disk I/O. The result is a system that is sluggish and doesn’t respond before the train of thought has left the station.

What about Now?

In the last decade, we have found that the amount of data stored in our databases has grown enormously and we are unable to access that data efficiently. This led us to look at different approaches to storing and accessing the data. First, we analyzed the queries we frequently ran and discovered that only 5-10% of the fields of a record are used in our queries. This led to a different way of storing data: by column rather than by complete record. Systems built this way are called columnar databases. Since each column of a table has a single data type, we can use compression techniques to reduce the size of the database. This in turn reduces the I/O needed to retrieve large volumes of data and improves query performance dramatically. Second, we discovered that CPU and memory clock speeds had hit a wall, so we adopted a parallel processing approach using multicore CPUs. Taking this one step further, we created massively parallel clusters of commodity servers that work together to crunch very large amounts of data very effectively.
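
The difference between the two layouts is easy to see in a toy example. The Python sketch below is not how any real database engine is implemented; the table, the field names and the choice of run-length encoding as the compression technique are all assumptions for illustration.

```python
# Row-oriented layout: each record is stored whole.
rows = [
    {"customer_id": 1, "state": "CA", "amount": 58.25, "notes": "first order"},
    {"customer_id": 2, "state": "NY", "amount": 12.00, "notes": "gift"},
    {"customer_id": 3, "state": "CA", "amount": 30.00, "notes": "repeat buyer"},
]


def to_columns(records):
    """Pivot a list of row dictionaries into one list of values per column."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def run_length_encode(values):
    """Compress consecutive repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)

# A query on one field reads one column and never touches "notes".
total_amount = sum(columns["amount"])

# A sorted column of a single type compresses well.
compressed_states = run_length_encode(sorted(columns["state"]))
print(total_amount, compressed_states)
```

In the row layout, summing the amounts means walking through every field of every record. In the columnar layout the same query reads a single list.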

During the last decade we have also uncoupled the hardware and software in servers. Today we are able to define what a cluster delivers by the software installed on it. Completely software-defined servers give us the ability to use commodity hardware and open source software to create Big Data crunching capabilities that are easy to configure and change. They bring fluidity to our operations. We have been moving from brittle to flexible architectures.

Good but not good enough!

So currently we are able to record and retrieve large volumes of data (gigabytes, terabytes and petabytes) efficiently. But how do we make sense of these volumes effectively? We are developing machine-learning techniques to analyze data at high volume, because it is not humanly possible to read, understand and analyze large amounts of granular data. Along with the increased velocity of data generation, the data is also becoming more unstructured and sparse, and it is coming at us from all channels. Even channels that are digital and modern are starting to get left behind. Today texting is preferred over email, and written letters are not used even to fire employees or serve divorce notices. The world over, we routinely use three- or four-letter acronyms from slang, such as LOL, PFA, and GTG, in our communication. Our interactions have become informal, sparse, multi-channel and asynchronous. Yet our desire to be understood has never been greater.

We expect our service providers to divine our expectations and be ready to serve us without our so much as expressing our needs. We are migrating to an era when an organization needs to:

  • observe us,
  • understand our desires,
  • appreciate our tastes,
  • analyze our past behavior
  • serve us graciously
  • in real time

or we are ready to change our service providers in a flash.

What should an analytics platform provide?

All this has led to the desire for a platform that will allow us to analyze the data and extract meaning and nuances from it. The modern analytic platform needs to do the following functions efficiently:

  • Select a few columns from a very large number of records
  • Select sets of records based on different criteria
  • Create new sets from the intersection or union of these record sets
  • Create derived fields from the few columns selected to enrich this data
  • Create algorithms to recognize the trends in this data
  • Project discovered trends into the future
  • Describe the patterns recognized
  • Classify similar data together
  • Predict the likelihood of events
  • Prescribe corrective action
  • Infer meaning from seemingly dissimilar data
  • Present data in an easy to understand visual image

The modern analytic platform has many more requirements that run contrary to those of transactional platforms. In the following posts we will discuss the answers to the remaining questions. The list above is an incomplete one. I am sure you have a lot more functions that you feel are important. Let me know and I will keep adding them to the list.

In the next blog I will discuss the reasons why organizations are unable to exploit existing data in the company. Stay tuned.







Marketing is strategy and strategy is destiny.

In an article in the December 2013 issue of the Harvard Business Review, Professor Niraj Dawar argues that today upstream activities, such as sourcing, production, and logistics, are being commoditized or outsourced, while downstream activities aimed at reducing customers’ costs and risks are emerging as the drivers of value creation and sources of competitive advantage.

Today the locus of competitive advantage lies outside the firm, and the advantage is cumulative: rather than eroding over time as competitors catch up, it grows with experience and knowledge.

Since competitors can emulate, clone, copy or source from the same suppliers and neutralize a company’s competitive advantage, the company has to look outside the firm for competitive advantage. How the market perceives you can be a strategic advantage.

According to the author, you can choose which players to compete with, you can change the criteria on which you compete and, best of all, your competitive advantage can grow rather than diminish.

Understanding the customers’ wants and desires, their buying patterns, their socioeconomic status, the importance they attribute to certain features or to the convenience of procuring the product, and the price they are willing to pay for it can all be the foundation of your strategy. In the era of social media and mobility, the customer is sending you these signals through what they say on social media and how they rate things on the web. Collecting these signals and analyzing them to arrive at your strategy and the next best action can lead to sustainable competitive advantage.

Experimenting with control groups, and evaluating how behavior changes when you withhold your marketing messages from the holdout group, can provide greater insight into the buyer’s real motivations. This can then lead to different strategies. Costco, for example, has a triggers and treasures strategy: the things that you routinely buy, such as bread, milk and eggs, are priced very low, motivating you to go to the club every week. But once you reach the warehouse you are guided through aisles stacked with seasonal items, high fashion items, and exclusives that are only available for a few days and only at certain clubs, so if you procrastinate the first time and want to buy the item on the next visit, it is gone. You are also motivated to visit multiple clubs to discover what treasures are hidden there. This is a behavior modification technique that leads you to buy the treasures as soon as you find them, for fear of missing out. It also leads you to visit the store more frequently, and at multiple locations, in search of things that you never knew you desired.

Amazon has a convenience strategy and has adopted the club membership strategy of Costco, so now it is more like Costco than Walmart. Become a member of Amazon Prime and you are guaranteed two-day delivery at no extra cost. By ensuring that your buying experience is par excellence, and by removing the friction of delayed shipments, Amazon has won lifelong converts, who are not just couch shoppers but mobile shoppers too. These customers are so comfortable with the buying experience at Amazon that they stand in the aisle at Best Buy, Target and Walmart, compare prices and order on Amazon, confident in the knowledge that two-day shipping means two-day delivery, no excuses.

These two companies don’t produce the goods that they sell, so product innovation doesn’t play a part in their success; understanding their customers’ buying behavior is their innovation and their strategy. Marketing is the strategy and strategy is destiny. Superior logistics and centralized buying are indeed a part of their success, but understanding the customer drives these functions, not the other way around. The entire organization is executing flawlessly to deliver the habit-forming experience to the customer.


The process of segmentation

The question I attempt to answer in this post:

Is there a scientific way of segmenting customers based on a number of dimensions?

We all know that we can plot the shape of a curve on a two-dimensional graph or draw the shape of an object on a three-dimensional graph. However, once the number of dimensions goes beyond three, the mind starts to wonder how one can visualize the shape of the object. A process whose outcome depends on many dimensions is more complex than anything we can draw or picture directly.

Even more difficult is to find the optimum of that shape. Let us say that we want to find the lowest cost of marketing to a community of a million customers. We know that they interact with us via multiple channels, such as web browsers, email, chat rooms, call centers, mobile phones and tablets. We also know that the communication is initiated either by the sender (the marketer) or by the receiver (the prospect or customer). The stage of the receiver’s buying cycle also influences the outcome of the interaction. Brand awareness, price sensitivity, the affluence of the receiver and the promotional offers on the table are just a few of the influencing factors. The factors that influence the outcome of this marketing game are too many to list. So how do we model this complex world of B2C or B2B marketing?

Early in life we were taught to use a cryptic language to represent ideas: the language of mathematics. Very early on we learned that we can represent a straight line by an equation. We also learned to define the line by its slope and the height at which it intersects the Y axis, and to describe points on a two-dimensional graph by coordinate pairs like (X, Y). We can use the same approach to represent the marketing scenario mentioned above by a multidimensional shape.

This process is called modeling. We attempt to fit the abstract representation of the real world, to an equation that defines the shape of this world.  In marketing there are two questions we try to answer.

What is the probability of a favorable outcome (someone buying something)?

What is the amount of revenue that can be generated if the outcome is favorable? In other words if one were to buy, how much will they buy?

The first question has a YES/NO kind of binary answer, and the second question has a numeric answer (say, $58.25). Both are estimates, but their nature is different, so the techniques used to answer them are also different. The first is called modeling response and the second is called modeling revenue.

The first question is one of classifying the outcome as a yes or a no. The technique used for this is called logistic regression.

The second question estimates the revenue amount. The technique used for this is called linear regression.
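
To make the difference between the two techniques concrete, here is a small Python sketch of what each kind of model computes once it has been fitted. The input variables and all the coefficients are invented for illustration; in practice they would be estimated from past buyer data.

```python
import math


def purchase_probability(recency_months, past_orders):
    """Logistic model: squashes a linear score into a probability between 0 and 1."""
    # Made-up coefficients, for illustration only.
    score = -1.0 - 0.1 * recency_months + 0.5 * past_orders
    return 1.0 / (1.0 + math.exp(-score))


def expected_order_amount(past_average_amount, past_orders):
    """Linear model: the prediction is a weighted sum of the inputs."""
    # Made-up coefficients, for illustration only.
    return 10.0 + 0.8 * past_average_amount + 2.0 * past_orders


print(purchase_probability(recency_months=3, past_orders=4))
print(expected_order_amount(past_average_amount=50.0, past_orders=5))
```

The logistic function is what keeps the first answer between 0 and 1, so it can be read as a probability of a yes. The linear model has no such bound, which suits a dollar amount.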

Now humans are creatures of habit, so we assume that under normal circumstances they will keep behaving the way they have been behaving. In effect, they will tend to fall back (regress) to their habits. This hypothesis gave rise to the method of observing the behavior of the receiver and drawing conclusions about the two answers we are seeking. Marketers and statisticians have been using this method to observe the behavior of last year’s buyers and create an equation that predicts the likelihood of purchase and, better still, the amount of money the customer is likely to spend.

By using these techniques we can create a score (very much akin to the FICO score by which lenders measure us all). We can then use this score to rank the customers from the highest to the lowest by the probability of buying. We can also rank the customers by the amount of money they are likely to spend. These two scores themselves reveal a lot.

We could decide to whom we send marketing collateral and to whom we do not. By holding off sending it to the lost causes or the sleeping dogs, we can improve the return on investment of our marketing efforts. After all, marketing costs real dollars.

We could also multiply the probability of buying with the amount of money likely to be spent by the customer, to create a final score for each customer. By ranking the customers top to bottom by this final score we can get the best of the best customers to market to. Now you know why some of us are such magnets for junk mail!!!
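
The final score described above is simply the probability of buying multiplied by the likely spend. A minimal Python sketch, with made-up customers and numbers, might look like this:

```python
def rank_customers(customers):
    """Rank customers by final score, highest first.

    customers: list of (customer_id, probability_of_buying, expected_amount).
    The final score is probability times expected amount.
    """
    scored = [(customer_id, p * amount) for customer_id, p, amount in customers]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up scores for three customers.
ranking = rank_customers([
    ("a", 0.5, 100.0),
    ("b", 0.25, 40.0),
    ("c", 0.75, 80.0),
])
print(ranking)
```

Customer "a" has the highest likely spend, but "c" comes out on top because the higher probability of buying more than makes up for the smaller amount.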

Experimenting with the email content


Once you have identified the customer segment that you want to target, the next monumental task in front of the direct marketing folks is to grab the attention of the audience and keep it on the message that you want to convey. The composition of the email, starting with the subject line, the images in the body, the language in the text all can be tweaked to create the right effect. In this role you actually act like a movie director. Just like a movie director uses the light, sound, background score, the action scenes, the photographic angles, the expressions of the actors to create the right mood and setting for the story that he wants to tell, you need to harness all the skills that you have to engage the receiver.

Today, the email marketing industry is exploring the impact of design by using multiple versions of the email content. Matching the right content with the right segment of receivers works magic in improving the response rates of an email campaign. With email, trying different combinations of the creative and monitoring the responses is inexpensive and instant, so we can experiment to find the combination that works for a segment of receivers.

Subject Line testing and analysis

Generally the subject line testing is the first thing that can be tried with significant impact. Some of the questions that one can try to answer by testing are:

Is a short subject line better than a long one?

What is the optimal size of the subject line?

Is a question in the subject line better than a statement?

The objective here is to improve the open-to-sent metric. The catchier the subject line, the more likely the receiver is to open your email. Crime thriller writers use this to great effect: “Sex, Lies and Money”, they say, will always improve ratings. Perhaps there is something to a catchy heading. Our objective is to measure the improvement in the open-to-sent rate. So, let us say that you have created a set of four segments of customers, to whom you are going to send the same email. Look at some of the following headlines used by Amazon, WSJ.com, Bloomberg etc.:

“Camera, Photo & Video Lightning Deals”

“Dear China, it’s over…”

“Is Best Buy a Better Sell?”

These subject lines tend to stand out in an already overcrowded inbox of the receiver.

Image Size Analysis

Another thing that has become very effective is adding images to the email message that draw attention to themselves while conveying the product you are trying to sell. As a marketer you want to test the impact of the size of the image. Perhaps a single image of the appropriate size tells the whole story. Varying the images and their sizes gives you a set of different versions of the email. These can be sent to different groups of customers within a single segment, so that you can monitor the response rate of each version.

Play with the Call to Action (BUY) Button

The next thing to work with is the “call to action” button that the receiver can click on. The size, the color and the position of this crucial element can make the difference between the receiver buying the product and moving on to the next email. Try placing it at the top of the page, in the lower left corner, in the center of the page, right below the image, etc.

When you combine these various elements of the email (header, images, call to action buttons etc.), you can come up with quite a few versions of the same email message.

Then you can send the traditional email you have been sending to one group of the chosen segment, and send the different versions to the other groups of that segment. Most Email Service Providers (ESPs) will allow you to manage this effectively. The key activity is to gather the data from these email blasts and store it in a database for analysis. In as little as four weeks you can have enough data to make some empirical decisions about which versions work and which can be discarded. This forms your baseline for monitoring the improvement in response rates from sending the right version to the right segment. See figure 1.

Figure 1: Comparison of email campaigns by subject line

Technology needed

Matching content versions to the segments of your target list requires you to track responses over a period of time to find the optimum creative combination for each segment. So you need to track the individual email version sent, the response rate of receivers across the various segments, and a large number of other variables that influence the outcome of your email campaign. You can’t achieve this without a very large database for the marketing department. This database should focus on the individual receiver, the emails sent to them, and their responses to the various creative combinations. The size of this database can be significant, and performance can be a major consideration. Suddenly the meaning of ‘BIG DATA’ becomes abundantly clear. A typical email campaign database can store upwards of a few billion records over a 3-5 year period. To develop long-term insight, the marketing department needs to start with a long-term plan. If it is not possible to have a very large database (VLDB) in-house due to cost and IT support considerations, search for a vendor who can provide one for you. Without this kind of database, improving the response of your campaigns is quite difficult, if not impossible.
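
To make the shape of such a database concrete, here is a toy sketch using Python's built-in `sqlite3` module. The table and column names are my own assumptions, the rows are made up, and an in-memory SQLite database is obviously nowhere near the scale discussed above. It only illustrates the three things worth tracking: the receiver, the email version sent, and the response.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # toy stand-in for a real marketing database
conn.executescript("""
CREATE TABLE receiver (receiver_id INTEGER PRIMARY KEY, segment TEXT);
CREATE TABLE email_sent (
    send_id INTEGER PRIMARY KEY,
    receiver_id INTEGER REFERENCES receiver(receiver_id),
    version TEXT,
    sent_at TEXT
);
CREATE TABLE response (
    send_id INTEGER REFERENCES email_sent(send_id),
    kind TEXT,            -- 'open' or 'click'
    responded_at TEXT
);
""")

# Made-up rows: two receivers, one email each, one click.
conn.executemany("INSERT INTO receiver VALUES (?, ?)", [(1, "cluster_1"), (2, "cluster_1")])
conn.executemany(
    "INSERT INTO email_sent VALUES (?, ?, ?, ?)",
    [(1, 1, "A", "2020-01-01"), (2, 2, "B", "2020-01-01")],
)
conn.execute("INSERT INTO response VALUES (1, 'click', '2020-01-02')")

# Emails sent and emails clicked, per version.
rows = conn.execute("""
SELECT s.version,
       COUNT(DISTINCT s.send_id) AS sent,
       COUNT(DISTINCT CASE WHEN r.kind = 'click' THEN s.send_id END) AS clicked
FROM email_sent AS s
LEFT JOIN response AS r ON r.send_id = s.send_id
GROUP BY s.version
ORDER BY s.version
""").fetchall()

print(rows)  # [('A', 1, 1), ('B', 1, 0)]
```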

The current trend of cloud-based services offers a very effective solution. In a future blog I will review the characteristics of the database engine most suitable for marketing activities. Do you need a columnar or a row-oriented database engine? A massively parallel one? The database technology does uniquely differentiate the marketing analytics solution.

Improving the response rates of email marketing #2

Everything in marketing is about being relevant to the context of the customer. Having said that, when we set out to achieve this lofty goal we run into a number of challenges. One of the major considerations is defining the size of the opportunity.

Let us assume that we email every one of the 10 million members on our email list.

Let us define response rate in marketing parlance.

Response rate = number of responders/number of impressions in the performance window.

Performance window can be defined as the time after the email reaches the inbox of the target audience. This can be as short as 24 hours or up to two weeks depending on the frequency of your waves of emails.

So let us say that

  • we sent 10 million emails
  • 500,000 people opened their emails
  • and of these 5000 clicked at least one link in your email
  • within the two weeks after we sent the email,

then our response rate would be 5,000/10,000,000 = 0.05%.

We could also look at another important ratio here.

Click-to-open ratio = 5,000/500,000 = 1%. These two rates are typical of email marketing numbers. Let us see why.
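
The arithmetic above can be checked with a few lines of Python. The variable names are mine; the numbers are the ones from the example.

```python
emails_sent = 10_000_000  # impressions in the performance window
opened = 500_000          # receivers who opened the email
clicked = 5_000           # receivers who clicked at least one link

response_rate = clicked / emails_sent
click_to_open = clicked / opened

print(f"Response rate: {response_rate:.2%}")        # Response rate: 0.05%
print(f"Click-to-open ratio: {click_to_open:.2%}")  # Click-to-open ratio: 1.00%
```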

The general notion is to send more emails to increase the number of responders. However, by definition, as the number of impressions increases, the denominator of our formula increases, which reduces the response rate unless the number of responders grows at least as fast.

Our executive management reviews the response rates and mandates that we do better. Or perhaps the rates are too low to share with management, and we proactively decide to do something about them.

Let us rethink our approach

If we take the same “spray and pray” approach of the newspaper and billboard era, then we have not made any progress. The best part about the digital world is that technology allows us to sense and measure more signals than was possible in the offline world. If we don’t take advantage of this measurement, we haven’t made progress.

So let us think about narrowcasting rather than broadcasting. I want to make an offer only to those people who may be interested. That way I can keep my number of impressions to a minimum and improve the number of responders. However, wishful thinking alone doesn’t make this happen. We decide to think different.

One of the team members has a bright idea: why don’t we separate our buyers from our non-buyers? Surely they behave differently. Then the discussion drifts toward demographic and geographic segmentation. These dimensions are easy to use, so we dive headlong into this approach. But we soon discover that over our next couple of campaigns the response rates don’t budge. If you have been there, don’t worry; we have all been there.

A Better Approach 

Let us start by looking at how we can understand the context of the buyer. We can start by dividing the customers into segments using attributes drawn from their past buying behavior. Some of these attributes are:

  1. Buying stage
  2. Behavior
  3. Style preference
  4. Price Sensitivity
  5. Social acceptance
  6. Location
  7. Quality consciousness
  8. Recency
  9. Frequency
  10. Monetary
  11. Return behavior
  12. Demographics
  13. Gender
  14. Channel preference
  15. Opt In status

With today’s technology, we can observe the potential customer unobtrusively over a long period of time and collect data about them. Once we have done that for a while, we can start to group them into various clusters. Some clusters can be as follows:

Cluster #1

Just browsing/prefers to browse online/brand conscious/likes contemporary styling/wants to buy first/local < 10 miles/discerning of quality/last bought a year ago/last purchase was $300/Return behavior unknown/Professional/ male/retail buyer/Opted-In

Cluster #2

Actively buying for self/Brand Neutral/Modern style lover/Balances price to performance/seeks recommendations/local <5 miles from store/compromises on quality/last bought 90 days ago/last purchase was $100/rarely returns/upper class/female/online buyer/opt-In

Such clusters, once defined, allow us to predict the future behavior of new members of the cluster. As we gather important clues about individual customers, we can start to plan separate campaigns to address every unique cluster.

Each of these listed variables influences buyer behavior to a varying degree. So, perhaps we can study the last two years of behavior of this customer segment and rank each variable by its influence on the buying behavior of the members.

We can also find the strength of the influence of each variable on the outcome. This is the correlation between an individual variable and the outcome. So let us try to find an equation that explains, in mathematical terms, the influence of these variables on the final outcome: will the receiver respond or not? We then classify receivers based on their likelihood of response.
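
As a rough illustration of ranking variables by their correlation with the outcome, the sketch below computes a plain Pearson correlation for a few variables against a respond / did-not-respond flag. The data is a made-up toy sample, and the variable names are assumptions. A real analysis would use two years of history and probably a proper statistics package.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Toy data for six receivers (made up for illustration only).
responded = [1, 0, 1, 0, 1, 0]
variables = {
    "recency_days": [10, 200, 30, 365, 15, 90],
    "frequency": [5, 1, 4, 1, 6, 2],
    "opted_in": [1, 1, 1, 0, 1, 1],
}

# Rank variables by the strength (absolute value) of their correlation.
ranked = sorted(
    variables,
    key=lambda name: abs(pearson(variables[name], responded)),
    reverse=True,
)

for name in ranked:
    print(name, round(pearson(variables[name], responded), 2))
```

The sign matters too: in this toy sample recency in days correlates negatively with response, meaning the longer since the last purchase, the less likely a response.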

Once we have done this classification, we can calculate the probability of response. Even among likely responders, each receiver has a different probability of response. So we need to rank the likely responders by their probability of response (e.g. 0.85). Then we can set a cutoff threshold (let us say 0.75). Anyone with a probability above this threshold should be emailed; the rest can be safely ignored.
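
The ranking and cutoff can be sketched like this. The receiver IDs and probabilities are made up, and `select_recipients` is a hypothetical helper name. The scores themselves would come from whatever model you have built.

```python
def select_recipients(scores, threshold=0.75):
    """Return (receiver, probability) pairs above the threshold, most likely first."""
    chosen = [(receiver, p) for receiver, p in scores.items() if p > threshold]
    return sorted(chosen, key=lambda pair: pair[1], reverse=True)

# Made-up response probabilities for five receivers.
scores = {"r1": 0.85, "r2": 0.40, "r3": 0.78, "r4": 0.75, "r5": 0.91}

mailing_list = select_recipients(scores)
print(mailing_list)  # [('r5', 0.91), ('r1', 0.85), ('r3', 0.78)]
```

Note that r4 sits exactly on the threshold and is left out, because the rule is "above the threshold". Whether to include the boundary is a choice you would make for your own campaigns.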

This is one part of the equation. We have identified the segments of customers we want to communicate with. THE WHO of our story is defined.

However, we still need to establish the context of the receiver at the time we email them, so we need to explore the receiver’s recent activities. Our ability to track the receiver’s search terms, the websites and blogs they have visited, the products they have reviewed or pinned, and the stores they have visited tells us where they are in their journey to purchase and what may be of interest to them. This allows us to run controlled experiments with the types of emails we send them. It also answers the question of when is the best time to send them.

This approach of using data to guide us through the unknown territory of marketing to new receivers is called Data Science. Using historical data to analyze the habits of customers needs a scientific approach. It is methodical and time-consuming, but it is the surest path to marketing success.

In a future blog, I will explore the world of subject line testing, the composition of the email, the contrast colors, the sizes of the images, the positioning of the call to action links. These activities make our emails more effective as marketing messages.

Improving the response rates of email marketing

If you are like most marketing managers, the topmost thing on your mind is generating more revenue with your marketing spend. Perhaps your performance is measured on it. In effect, you are expected to invent the perpetual motion machine, which takes no input but generates infinite output, or so it seems.

Most companies today use email as a main way to communicate their marketing messages to their customers. Yet response rates are so low that the gut reaction to a poorly performing email program is to increase the email frequency in the hope of getting more responders.

However, this leads customers to perceive your messages as an irritant, if not spam. Even if you are not flagged as a spammer, there is a strong possibility that the value of your message is diluted, creating a long-term loss of brand value. Is there anything that an organization with a modest budget can do to improve response rates without barraging its customers with unwanted emails?

Fortunately, there is a way to engage your customers without bombarding them with emails and yet improve the response rates of your email marketing.

Let us look at how you can segment your customers with whom you want to communicate.

Broadly speaking there are four different types of customer categories:

  • The persuadables
  • The sure bets
  • The lost causes
  • The “Do not Disturbs”

The persuadables are those who are likely to be seeking a product or service, are familiar with your brand, and are aware of your offerings. These customers are likely to welcome your email because it addresses a problem they are trying to solve. Perhaps they are interested in buying a product you are offering, and so your email seems well-timed. Here the need is met just in time, or there is an untapped desire or unspent disposable income that you can access by sending the right message at the right time to the right person. The persuadables are also the customers who will spend more if targeted.

The sure bets are those customers who are very familiar with your brand and offering. These customers may buy irrespective of receiving an email, catalog, SMS, or coupon. You may waste your money by sending them emails or, worse still, reduce the profitable revenue generated by your marketing program by offering them coupons. This is preaching to the choir.

The lost causes are those who are never likely to respond to marketing messages as they are either not interested in buying, or they have been won over by your competition. Sending them emails may be fruitless and you are better off trying your message somewhere else.

The Do Not Disturbs: The fourth category consists of customers who are likely to be loyal but who don’t want to be disturbed by frequent emails. Sending them emails is likely to turn them off, and you can lose a good customer due to poor marketing. Generally these customers feel slighted that you don’t know them and get put off by your marketing emails. This is a risk you can’t afford to take, as it would mean losing a good but infrequent customer who buys a lot whenever they get to your store or website.
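
One way to read these four definitions is as a two-by-two grid: would the customer buy if emailed, and would they buy if left alone? The sketch below encodes that reading. It is only my illustration of the definitions, not the segmentation process itself; the two probabilities are hypothetical inputs that some model would have to supply, and the 0.5 cutoff is an arbitrary assumption.

```python
def categorize(p_buy_if_emailed, p_buy_if_not_emailed, cutoff=0.5):
    """Map two hypothetical purchase probabilities to one of the four categories."""
    buys_when_emailed = p_buy_if_emailed >= cutoff
    buys_when_left_alone = p_buy_if_not_emailed >= cutoff
    if buys_when_emailed and not buys_when_left_alone:
        return "persuadable"       # the email makes the difference
    if buys_when_emailed and buys_when_left_alone:
        return "sure bet"          # buys either way
    if buys_when_left_alone:
        return "do not disturb"    # the email turns them off
    return "lost cause"            # does not buy either way

print(categorize(0.8, 0.2))  # persuadable
print(categorize(0.9, 0.9))  # sure bet
print(categorize(0.2, 0.8))  # do not disturb
print(categorize(0.1, 0.1))  # lost cause
```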

By now you must be asking: all this is good, but how do I segment my customers into these four categories? I will get to the process of effective segmentation later. First, let us look at the historical and current situation.

Historically, experienced marketing managers have developed an intuition based on observing the behavior of their customers:

  1. When did they last buy?
  2. How frequently do they buy?
  3. How much do they buy?

The trade term for this expertise is RFM (recency, frequency, and monetary value). For years it has been used to segment customers along these three dimensions and target them with marketing messages. But this technique has been overused. Meanwhile, the avenues for buying have increased significantly. Besides retail stores, there are now e-commerce sites and mobile apps where the buyer can exercise the right to buy. They can buy in their bedroom late at night, in their pajamas, or while riding in a car during their daily commute. So the customer is empowered to buy anything, anywhere, anytime.
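
For readers who have not worked with RFM, here is a minimal sketch of how the three values can be computed from a customer's purchase history. The purchase dates and amounts are made up, and `rfm` is a hypothetical helper name. Real RFM segmentation usually goes a step further and bins each value into scores.

```python
from datetime import date

def rfm(purchases, today):
    """purchases: list of (purchase_date, amount) pairs for one customer.

    Returns (recency in days, number of purchases, total spend).
    """
    last_purchase = max(purchase_date for purchase_date, _ in purchases)
    recency_days = (today - last_purchase).days  # when did they last buy?
    frequency = len(purchases)                   # how frequently do they buy?
    monetary = sum(amount for _, amount in purchases)  # how much do they buy?
    return recency_days, frequency, monetary

# Made-up purchase history for one customer.
history = [
    (date(2020, 1, 10), 120.0),
    (date(2020, 3, 5), 80.0),
    (date(2020, 6, 1), 100.0),
]

print(rfm(history, today=date(2020, 7, 1)))  # (30, 3, 300.0)
```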

The advertising influences on a customer are multiplying. Google searches, ratings and reviews, and friends bragging on social media about what a great deal they got are routine. So what in your marketing really worked? What can you attribute the sale to? This is the holy grail of marketing today. My point is that a three-dimensional analysis of customers alone doesn’t give enough insight into their buying behavior. Clearly, a better way to analyze customer behavior is needed.

Over the years, direct marketing companies have used predictive modeling to create multiple segments of customers based on a large number of variables that are likely to influence buying behavior. Obviously you couldn’t mail the catalog to everyone in the country, as it costs real money to get the catalog into the hands of a customer. Even if it costs $0.50 to send a 50-page catalog to a customer, the numbers quickly add up when you mail the whole population multiple times a year. Hence the need to improve targeting. Marketing managers have developed very deep expertise in increasing the return on investment of their marketing dollars. In direct marketing, predictive modeling is used to calculate a purchase propensity score (the probability of purchase multiplied by the amount of money the customer is likely to spend) for each customer. This gives a sense of the success of the campaign before any mailing is done. This technique has not been applied to email marketing, mostly due to the cost of modeling and scoring the customers. There is also a notion that it costs very little (at least relatively!) to send an email blast, so I might as well send it to every one of my customers.
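
The propensity score described above is simple to compute once you have the two inputs. In this sketch the customers, probabilities, and spend figures are made up. Comparing the score directly with the $0.50 mailing cost is my own simplification, since a real decision would also account for margin.

```python
COST_PER_CATALOG = 0.50  # mailing cost per customer, from the example above

def propensity_score(p_purchase, expected_spend):
    """Probability of purchase multiplied by the likely spend."""
    return p_purchase * expected_spend

# Made-up customers: (probability of purchase, expected spend in dollars).
customers = {"a": (0.10, 200.0), "b": (0.02, 50.0), "c": (0.005, 80.0)}

# Simplified rule: mail only when the score exceeds the mailing cost.
worth_mailing = [
    name
    for name, (p_purchase, spend) in customers.items()
    if propensity_score(p_purchase, spend) > COST_PER_CATALOG
]

print(worth_mailing)  # ['a', 'b']
```

Customer "c" scores 0.40, below the cost of the catalog, so under this simplified rule the mailing would not pay for itself.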

Both the cost of the modeling and the almost negligible cost of emailing have kept this approach from being used for email marketing.

However, our experience over the last few years has been quite the opposite. Typically, most companies are happy if they get a 1% to 2.5% rate of response to their email marketing. But using the approach I am about to outline, we have experienced response rates in the 12-15% range. Initially, when we reviewed the numbers, we didn’t believe them, but as the rates kept coming up again and again, we became convinced that we were on to something.

In the next few posts I will attempt to articulate this approach and look forward to your feedback. What we will attempt to learn together are the issues involved and how to overcome these to attain the marketing nirvana of “sending the right message to the right customer at the right time based on their moods, likings, buying stage and buying behavior”. Stay tuned.
