Thursday, October 22, 2009

Data: Screening, Diagnosis and Treatment

Screening Phase

Examine data for five different kinds of possible errors:
  1. Lack of data – Do some questions have far fewer answers than surrounding questions?
  2. Excess of data – Are there duplicate responses?
  3. Outliers/inconsistencies – Are there values that are so far beyond the typical that they seem potentially erroneous?
  4. Strange patterns – Are there patterns that imply cheating rather than honest answers?  For instance, does a respondent alternate between ratings of 4 and 5 on every other topic in a matrix question?
  5. Suspect analysis results – Do the answers to some questions seem counterintuitive or extremely unlikely?
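The first four checks above can be partly automated. The following is a minimal sketch, assuming each response is a dictionary mapping a question ID to an answer, with None marking a missing answer. The function names, the data layout and the outlier threshold of 3.5 are illustrative assumptions, not something the original method prescribes.

```python
from statistics import median


def count_answers(responses, questions):
    """Lack of data: count the non-missing answers for each question."""
    return {q: sum(1 for r in responses if r.get(q) is not None)
            for q in questions}


def find_duplicates(responses, questions):
    """Excess of data: indexes of responses that repeat an earlier one."""
    seen = set()
    dupes = []
    for i, r in enumerate(responses):
        key = tuple(r.get(q) for q in questions)
        if key in seen:
            dupes.append(i)
        seen.add(key)
    return dupes


def find_outliers(values, threshold=3.5):
    """Outliers: values whose modified z-score (median/MAD) is large.

    The 3.5 cutoff is a common rule of thumb, assumed here for illustration.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


def alternates(ratings):
    """Strange patterns: True if ratings strictly alternate between two values."""
    if len(ratings) < 4 or len(set(ratings)) != 2:
        return False
    return all(ratings[i] == ratings[i % 2] for i in range(len(ratings)))
```

These functions only flag candidates for the Diagnosis Phase; a flagged value may still turn out to be a true extreme or a true normal.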

Diagnosis Phase

From the Screening Phase you have highlighted data that needs investigation. To clarify suspect data, you often must review all of a respondent's answers to determine if the data makes sense taken in context. Sometimes you must review a cross-section of different respondents' answers, to identify issues such as a skip pattern that was specified incorrectly. 

With this research complete, what is the true nature of the data that you've highlighted?  The authors give five possible diagnoses:
  1. Missing data – Answers omitted by the respondent or questions skipped over 
  2. Errors – Typos or answers that indicate the question was misunderstood
  3. True extreme – An answer that seems high but can be justified by other answers (e.g., the respondent working 100 hours a week because they work a full-time job and two part-time jobs)
  4. True normal – A valid answer 
  5. No diagnosis, still suspect – The verdict is still out on this "idiopathic" data. When it comes time for the Treatment Phase, you may need to make a judgment call on how to treat this data.

Treatment Phase

You've screened the data and tried to come to a verdict on whether suspect data is guilty or innocent. You have three choices for what to do with suspect data:
  1. Leave it unchanged – The most conservative course of action is to accept this data as a valid response and make no change to it. The larger your sample size, the less that one suspect response will affect the analysis; the smaller your sample size, the more difficult the decision.
  2. Correct the data – If the respondent's original intent can be determined, then I am in favor of fixing their answer.  For instance, perhaps it is clear from the respondent's explanation for their ratings that they reversed the scale in their minds; you can invert each of their answers to this question to correct the issue. Some statisticians will argue for imputation, replacing the answers with imputed values, such as the mean for that variable, but the techniques for imputation can become quite elaborate and are best left to professional statisticians.
  3. Delete the data – The data seems illogical and the value is so far from the norm that it will affect descriptive or inferential statistics. What to do? Delete just this response or delete the entire record? Whenever you begin to toss out data, it raises the possibility that you are "cherry picking" the data to get the answer you want. 
However you choose to treat the data, make sure to document in your survey report what steps you took, how many responses were affected and for which questions.
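Two of the corrections described above are simple enough to sketch in code: inverting a respondent's ratings when they clearly reversed the scale, and the most basic form of imputation, replacing missing answers with the mean. This is only an illustration under assumptions: the function names and the 1-to-5 scale are hypothetical, and mean imputation is the simplest of the imputation techniques, not a recommendation.

```python
from statistics import mean


def reverse_scale(ratings, low=1, high=5):
    """Invert ratings on a low..high scale (on 1-5, a 1 becomes a 5)."""
    return [low + high - r for r in ratings]


def impute_mean(values):
    """Replace missing answers (None) with the mean of the observed answers."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Keep a record of each treatment so it can go into the survey report.
treatment_log = []
corrected = reverse_scale([1, 2, 1, 5])
treatment_log.append({"question": "Q7", "action": "reversed scale", "responses": 1})
```

The question ID "Q7" in the log entry is made up for the example; the point is to record what was changed, for which question, and how many responses were affected.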

Market Basket Analysis

Market Basket Analysis is used to gain insight into the way customers buy items. It identifies the combinations of items that are most likely to be bought together. For example, customers who buy milk and bread together may also be likely to buy butter. Identifying such patterns in transaction data collected over the years provides a new window of opportunity to explore and understand purchasing behavior, and the patterns can be used to increase sales through cross-selling and targeted marketing. Market basket analysis is not limited to retail sales; it is also applied in a wide range of areas, such as analyzing credit card purchases and detecting fraud in insurance claims. The data mining technique usually used is association rule mining, and algorithms such as Apriori and FP-Growth are commonly used to mine the transaction data for the desired patterns.
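To make the idea of an association rule concrete, here is a minimal sketch that counts item pairs by brute force and reports rules of the form "customers who buy X also buy Y" with their support and confidence. It is not Apriori or FP-Growth, which prune the search so that larger itemsets become feasible; the function name, thresholds and sample baskets are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for frequent item pairs.

    support    = share of baskets containing both items
    confidence = share of baskets with lhs that also contain rhs
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore repeats within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


baskets = [
    ["milk", "bread", "butter"],
    ["milk", "bread"],
    ["bread", "butter"],
    ["milk", "eggs"],
]
rules = pair_rules(baskets)
```

With these made-up baskets, butter appears in two baskets and both also contain bread, so the rule butter -> bread has support 0.5 and confidence 1.0.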

Dos and Don'ts: Online copywriting

Consumers interact differently with copy on the Web than they do with traditional marketing media. Transferring copy from a printed brochure online is not a recipe for success. Web copy must embrace online consumer behavior and be relevant to the audience's needs.

Don't: Forget to listen

Make sure you are creating a dialogue, not a monologue. Give the audience a part to play. Be attentive. With every reaction and interaction, they're telling you something about themselves. If you haven't given them anything relevant or a way to engage/respond, you'll lose them fast. Use this information to lead them through click by click. Are they looking for cold, hard facts? Do they want to be entertained? Make the information accessible, easy to understand and interactive. Listen and learn.

Do: Optimize your copy for people in addition to search engines

Sure, having searchable words and targeted keywords is crucial if you want the search engines to find you, but you don't want to lose your readers in the process. Selling is about connecting with people and building relationships. Your words are your virtual handshake, extending to those who probably trust a stranger more than they trust your brand. What you write needs to inform, educate and entertain, but it also needs to connect and build trust. Your readers want to know that you get what's going on in their lives and that you actually care about them. So get to know who they are. And then write the way they speak. People instinctively trust those who speak like they do. Keep it conversational, concise and simple. Big words may impress, but your job is to communicate and engage. Avoid words that sound like you're selling something because it will just sound like you're selling something. And the only thing that will build is resentment.

Don't: Try to retrofit a static piece of offline copy into an interactive medium

You're now speaking to an impatient online reader or, more appropriately, an impatient online scanner. Gone is the luxury of the beautifully crafted setup. On the Web, your reader wants the conclusion up front. Think the "inverted pyramid" approach to writing copy. Every word has to hold their attention and move them toward whatever it is they're looking for. Headlines have to be meaningful rather than clever. One idea per paragraph is a good rule of thumb. Speaking of which, stay away from clichés. Include searchable words and targeted keywords so the search engines will find you. Of course, there are rules in the offline world that also apply to the Web. Know who your target is and convince them of what the product can do for their life rather than how many cool features it has to offer. And when you have figured out that magnificent amalgamation of the traditional and the technological, don't let a typo be the thing they remember.

Do: Try writing your next line of copy in 140 characters or less

If you look back four to five years, people interacted with Web copy in a somewhat passive way. They would navigate to a Web site or see a banner and then take action – hopefully click and buy. Today we are consuming digital content in a completely different way: one that is short, dynamic and increasingly personalized. The task in front of us as digital copywriters is to get our complex and sometimes long-winded thoughts condensed into smaller segments. To start, the approach needs to maintain some fundamentals, like understanding the audience, being concise and, most importantly, bringing a sense of humanity into your work. When constructing the line, script or piece of Web content, remember that the consumer is time-starved and that success lies in making each word count in the Facebook and Twitter world.