Online marketing research
IBM Journal of Research and Development, Sep-Nov 2004 by Agrawal, A, Basak, J, Jain, V, Kothari, R, Et al
Active learning
In order to conduct OMR rapidly and to limit the exposure of an experiment to the smallest possible subset of users, it is necessary to choose the respondents with care. Conceptually, the most informative participants should be chosen in such a way that it is possible to collect the required data using the minimum number of respondents. Learning from chosen participants (or "data points" in a generic context) as opposed to learning from the available data (or randomly sampled data) is often called active learning and has been the object of sustained study [8-12]. Typically, one begins with a small set of labeled data points (previous participants whose responses are known) to find the unlabeled data point (the next visitor to the site) which, if labeled (chosen as a participant), would provide the maximal gain in information. In the present context, one may begin with a few users whose behavior is known (through observation or through manual curation) and use an algorithm to find a user whose responses to an OMR experiment would be maximally informative. Technically, the prior approaches to active learning have been based on using the known (or labeled) data to find the next most informative data point. We have developed an innovative algorithm that actually reverses the role of the unlabeled and labeled data [13] and that uses available information such as demographics and clickstream to evaluate the anticipated gain in information that would result from the individual's response. Informative individuals are chosen for participation in the online marketing research experiment.
To clarify, let the attributes derived from demographics, clickstream, and historical transactions be denoted by the vector x and the total information provided by an individual be denoted by I(x|X), where X represents the individuals who have already been sampled. Then, the next most informative respondent satisfies the relation
$$\underset{x,\; x \notin X}{\arg\max}\; \left[ I(x \mid X) \right].$$
The most informative visitor, as determined by the above equation, may in fact never arrive during the course of the experiment. We thus recommend discretizing the entire feature space and computing the information content of the features in each feature cell. Each feature cell corresponds to an idealized user, and the most informative feature cells provide the set of most informative users. Any real user who visits the site and matches any one of the informative idealized users can be selected as a potential respondent. Choosing a large set of idealized users ensures that informative users are not discarded simply because they are not the single most informative ones.
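The cell-based selection described above can be sketched as follows. This is only an illustration under stated assumptions: the bin edges, the two features (age and visits per month), the helper names `cell_of` and `top_cells`, and the placeholder scoring function are all hypothetical and stand in for the information measure I(x|X), which the article does not spell out.

```python
from itertools import product

def cell_of(x, edges):
    """Map a feature vector to a tuple of bin indices, one index per feature."""
    return tuple(sum(v >= e for e in feature_edges)
                 for v, feature_edges in zip(x, edges))

def top_cells(edges, score, k):
    """Return the k highest-scoring feature cells (idealized users) as a set."""
    cells = product(*(range(len(e) + 1) for e in edges))
    return set(sorted(cells, key=score, reverse=True)[:k])

# Hypothetical bin edges: age split at 30 and 50, visits per month at 2 and 10.
edges = [[30, 50], [2, 10]]

# Placeholder score standing in for the information content of a cell.
# A real system would estimate this from the already-sampled respondents X.
def placeholder_score(cell):
    return 3 * cell[1] + cell[0]

informative_cells = top_cells(edges, placeholder_score, k=3)

def is_candidate(visitor):
    """A real visitor qualifies if they fall in any informative cell."""
    return cell_of(visitor, edges) in informative_cells

print(is_candidate((35, 12)))  # frequent visitor
print(is_candidate((35, 5)))   # occasional visitor
```

Raising `k` widens the set of idealized users, which makes it more likely that some arriving visitor matches before the experiment ends.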
To assign an information content to a given feature vector (user), we form multiple models that predict the behavior of a user given x. The entropy (degree of disagreement) among these models is then used to characterize the gain in information that is likely to result from an individual's response. Additional details of the algorithms are available elsewhere [8]. Note that the true behavior of a user is not required in this evaluation; instead, the degree of relative disagreement between the models is used.
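One common way to turn model disagreement into a number is the entropy of the committee's votes. The sketch below assumes that reading; it is not taken from the article's algorithm, and the three rule-based "models", the feature names `visits` and `age`, and the labels `buy` and `skip` are invented purely for illustration.

```python
import math
from collections import Counter

def vote_entropy(predictions):
    """Entropy (in bits) of the labels predicted by a committee of models."""
    counts = Counter(predictions)
    total = len(predictions)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical committee: each model maps a user's features to a predicted response.
committee = [
    lambda x: "buy" if x["visits"] > 3 else "skip",
    lambda x: "buy" if x["age"] < 40 else "skip",
    lambda x: "buy" if x["visits"] > 1 and x["age"] < 50 else "skip",
]

def information_score(x):
    """Score a user by how much the committee disagrees about them.

    No true response is needed: only the models' predictions are compared.
    """
    return vote_entropy([model(x) for model in committee])

agreed = information_score({"visits": 5, "age": 30})    # all models predict "buy"
disputed = information_score({"visits": 5, "age": 45})  # models split 2 to 1
print(agreed, disputed)
```

A user on whom all models agree scores zero and adds little; a user on whom they split scores higher and is a better candidate respondent.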



