Online marketing research

IBM Journal of Research and Development, Sep-Nov 2004, by Agrawal, A., Basak, J., Jain, V., Kothari, R., et al.

Active learning

To conduct online marketing research (OMR) rapidly and to limit the exposure of an experiment to the smallest possible subset of users, respondents must be chosen with care. Conceptually, the most informative participants should be chosen so that the required data can be collected from the minimum number of respondents. Learning from chosen participants (or "data points" in a generic context), as opposed to learning from whatever data is available (or from randomly sampled data), is often called active learning and has been the object of sustained study [8-12]. Typically, one begins with a small set of labeled data points (previous participants whose responses are known) and uses it to find the unlabeled data point (the next visitor to the site) that, if labeled (chosen as a participant), would provide the maximal gain in information. In the present context, one may begin with a few users whose behavior is known (through observation or through manual curation) and use an algorithm to find a user whose responses to an OMR experiment would be maximally informative. Prior approaches to active learning have used the known (or labeled) data to find the next most informative data point. We have developed an algorithm that reverses the roles of the unlabeled and labeled data [13] and that uses available information, such as demographics and clickstream, to evaluate the anticipated gain in information that would result from an individual's response. Individuals judged informative are then chosen for participation in the OMR experiment.

To clarify, let the attributes derived from demographics, clickstream, and historical transactions be denoted by the vector x and the total information provided by an individual be denoted by I(x|X), where X represents the individuals who have already been sampled. Then, the next most informative respondent satisfies the relation

\[
\arg\max_{x,\; x \notin X} \left[ I(x \mid X) \right].
\]
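The selection rule above can be sketched in a few lines of Python. This is only an illustration: the article does not define I(x|X) here, so `toy_information` is a hypothetical stand-in (squared distance to the nearest already-sampled individual), and the names `next_respondent`, `candidates`, and `sampled` are assumptions introduced for the example.

```python
from typing import Callable, Iterable, Optional, Set, Tuple

Features = Tuple[float, ...]


def toy_information(x: Features, sampled: Set[Features]) -> float:
    """Hypothetical stand-in for I(x|X): squared distance to the nearest sampled individual."""
    if not sampled:
        return 0.0  # nothing sampled yet, so every candidate scores the same
    return min(sum((a - b) ** 2 for a, b in zip(x, s)) for s in sampled)


def next_respondent(
    candidates: Iterable[Features],
    sampled: Set[Features],
    information: Callable[[Features, Set[Features]], float],
) -> Optional[Features]:
    """Return the candidate outside `sampled` that maximizes information(x, sampled)."""
    remaining = [x for x in candidates if x not in sampled]  # enforce x not in X
    if not remaining:
        return None
    return max(remaining, key=lambda x: information(x, sampled))


# Illustrative data only: two-dimensional feature vectors.
sampled = {(0.1, 0.2), (0.2, 0.1)}
candidates = [(0.1, 0.2), (0.9, 0.8), (0.5, 0.5)]
best = next_respondent(candidates, sampled, toy_information)
```

With these made-up values, `best` is the candidate farthest from the sampled individuals, `(0.9, 0.8)`. Any other scoring function with the same signature could be passed in place of the stand-in.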

The most informative visitor, as determined by the above equation, may in fact never arrive during the course of the experiment. We thus recommend discretizing the entire feature space and computing the information content of the features in each feature cell. Each feature cell corresponds to an idealized user, and the most informative feature cells provide the set of most informative idealized users. Any real user who visits the site and matches any member of that set can be selected as a potential respondent. Choosing a large set of idealized users ensures that informative users are not discarded simply because they are not the single most informative ones.
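A minimal sketch of this cell-based selection follows, assuming each feature is cut into bins at fixed cut points. The cut points, the per-cell scores in `cell_scores`, and the function names are all hypothetical; the article does not specify how the cells are formed or scored.

```python
from bisect import bisect_right
from itertools import product
from typing import Dict, List, Sequence, Set, Tuple

Cell = Tuple[int, ...]


def all_cells(edges: Sequence[Sequence[float]]) -> List[Cell]:
    """Enumerate every feature cell; a feature with k cut points has k + 1 bins."""
    return list(product(*(range(len(e) + 1) for e in edges)))


def cell_of(x: Sequence[float], edges: Sequence[Sequence[float]]) -> Cell:
    """Map a real visitor's feature vector to the cell (idealized user) it falls in."""
    return tuple(bisect_right(e, v) for e, v in zip(edges, x))


def informative_cells(scores: Dict[Cell, float], k: int) -> Set[Cell]:
    """Keep the k highest-scoring cells as the set of informative idealized users."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def is_potential_respondent(x, edges, chosen: Set[Cell]) -> bool:
    """A visitor qualifies if their cell is one of the chosen informative cells."""
    return cell_of(x, edges) in chosen


# Hypothetical features: age (cut at 30 and 50) and pages viewed (cut at 5).
edges = [[30, 50], [5]]
# Hypothetical information content per cell; in practice this would be computed.
cell_scores = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2,
               (1, 1): 0.9, (2, 0): 0.8, (2, 1): 0.4}
chosen = informative_cells(cell_scores, k=2)
accept = is_potential_respondent((42, 7), edges, chosen)
reject = is_potential_respondent((25, 2), edges, chosen)
```

Raising `k` enlarges the set of idealized users, which trades some information per respondent for a better chance that a matching visitor actually arrives during the experiment.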

To associate an information content with a given feature vector (user), we form multiple models that predict the behavior of a user given x. The notion of entropy (degree of disagreement) between these multiple models is then used to characterize the gain in information that is likely to result from an individual's response. Additional details of the algorithms are available elsewhere [8]. One may observe that the true behavior of a user is not required in this evaluation; instead, the degree of relative disagreement between the models is used.
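One common way to turn disagreement among several models into a number is the entropy of their votes. The sketch below uses that measure as an assumption; it is not necessarily the authors' algorithm, whose details are in the cited reference. The threshold models and the names `threshold_model` and `vote_entropy` are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Hashable, Sequence, Tuple

Features = Tuple[float, ...]
Model = Callable[[Features], Hashable]


def threshold_model(threshold: float) -> Model:
    """Hypothetical model: predicts a positive response when x[0] exceeds a threshold."""
    return lambda x: x[0] > threshold


def vote_entropy(models: Sequence[Model], x: Features) -> float:
    """Entropy (in bits) of the models' predictions for x; 0 means full agreement."""
    votes = Counter(m(x) for m in models)
    n = len(models)
    return -sum((c / n) * math.log2(c / n) for c in votes.values())


# A small committee of models that differ only in their threshold.
models = [threshold_model(t) for t in (0.3, 0.5, 0.6, 0.7)]

split = vote_entropy(models, (0.55,))   # two models say True, two say False
agreed = vote_entropy(models, (0.9,))   # every model says True
```

No observed response appears anywhere in `vote_entropy`: the score depends only on how the models' predictions differ, which is why it can be computed for a visitor before they are asked to participate.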


 

Content provided in partnership with ProQuest