Thursday, October 10, 2013

Randomization on mechanical turk

Amazon Mechanical Turk is a fabulous way to do online psychology experiments. There are a bunch of good tutorial papers showing why (e.g. here, here, and here). One issue that comes up frequently, though, is how to do random assignment to condition. Turk is all about allowing workers to do many HITs (human intelligence tasks, Turk's name for a work assignment) of the same type, one after another. In contrast, most experimental psychologists want to make each condition of their experiment a single HIT and to have participants do only one condition.

If you are using the web interface to Turk, you create a single HTML template, populated with different values for each distinct HIT. That means that each condition is a different HIT. In this case, if you want random assignment to (a single) condition, all you can do is write prominently, "please do only one of these HITs." The problem is that Amazon displays HITs from the same job one after another, so you have to trust that every worker stops after doing just one. This strategy generally works until some worker does 7 or 30 conditions of your experiment - messing up your randomization and putting you in the awkward position of paying for data you (typically) can't use. Nevertheless, I and many other people used the "do this HIT only once" method for years - it's easy and doesn't go wrong too often if the instructions are clear enough.

In the last couple of years, though, folks in my lab have moved to using "external HITs," where we use Turk's Command Line Tools to direct workers to a single HTML/JavaScript-based HIT that can do all kinds of more interesting stuff, including having multiple screens, lots of embedded media, and a more complex control flow. The HTML/JS workflow is generally great for this, and there is quite a bit of code floating around the web that can be reused for this purpose. Now there is only one underlying HIT, so workers can complete it only once.
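For reference, an external HIT works by posting a small XML "question" file that points Turk at your page, which Turk then shows to workers in an iframe. A minimal sketch (the URL and frame height here are placeholders, not values from our experiments):

```xml
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://website.com/myexpt.html</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>
```

The Command Line Tools take a file like this (along with a properties file specifying title, reward, and number of assignments) and create the HIT for you.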

The easiest way to do random assignment to condition from within a JavaScript HIT is to have the js assign condition completely at random for each participant. This just involves writing some randomization in the code for the experiment and makes things very simple. With 2 conditions and many participants, this works pretty well (maybe you get 48 in one condition and 52 in another), but with many conditions and fewer participants, it fails quite badly. (Imagine trying to get 5 conditions with 10 participants each. You might get 6, 14, 8, 4, and 18 subjects, respectively, which would not be optimal from the perspective of having equally precise measures about each condition.)
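To see why pure randomization goes wrong with small samples, here is a minimal sketch (function and variable names are hypothetical, not from our actual experiment code):

```javascript
// Sketch: pure per-participant randomization.
// Each participant is assigned uniformly at random, so with few
// participants per condition the counts can drift far from equal.
function assignRandom(numConds) {
    // Returns a condition number in 1..numConds, chosen uniformly.
    return Math.floor(Math.random() * numConds) + 1;
}

// Tally 50 simulated participants across 5 conditions.
var counts = {};
for (var i = 0; i < 50; i++) {
    var cond = assignRandom(5);
    counts[cond] = (counts[cond] || 0) + 1;
}
console.log(counts); // e.g. {1: 6, 2: 14, 3: 8, 4: 4, 5: 18} - rarely 10 each
```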

Our solution to this problem is as follows: We use a simple PHP script, the "maker getter," that is called with an experiment filename and a set of initial condition numbers (in the example below, it's "myexpt_v1" and conditions 1 and 2, each with 50 participants). The first time it's called, it sets up a filename for that experiment and populates the conditions. Every subsequent time it's called, it returns a condition. Then, if this is a true Turk worker (and not a test run), a separate script decrements the counts for that condition. This gives us true random assignment to condition.
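The server-side logic can be sketched as follows (in JavaScript for illustration; the actual implementation is the linked PHP, and all names here are hypothetical): keep a count of remaining slots per condition, hand out a condition at random weighted by its remaining slots, and decrement a slot only when a real worker completes.

```javascript
// Parse an initial condition string like "1,50;2,50" into a slot table:
// {"1": 50, "2": 50}.
function makeStore(condCounts) {
    var store = {};
    condCounts.split(";").forEach(function (pair) {
        var parts = pair.split(",");
        store[parts[0]] = parseInt(parts[1], 10);
    });
    return store;
}

// Draw a condition at random, weighted by how many slots remain.
function getCondition(store) {
    var pool = [];
    for (var cond in store) {
        for (var i = 0; i < store[cond]; i++) pool.push(cond);
    }
    if (pool.length === 0) return null; // experiment is full
    return pool[Math.floor(Math.random() * pool.length)];
}

// Called only for actual Turk workers, not test runs.
function decrement(store, cond) {
    if (store[cond] > 0) store[cond] -= 1;
}
```

Separating "get a condition" from "decrement its count" is what lets you preview and debug the HIT yourself without eating into the participant slots.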

(Note: Todd Gureckis's PsiTurk is a more substantial, more general way to solve this same problem and several others, but requires a bit more in the way of setup and infrastructure.)

---- DETAILS AND CODE ----

The JavaScript block for setting up and getting conditions:

// Condition - call the maker getter to get the cond variable
try {
    var filename = "myexpt_v1";
    var condCounts = "1,50;2,50"; // condition,slots pairs
    // Synchronous GET so that cond is set before the experiment starts.
    var xmlHttp = new XMLHttpRequest();
    xmlHttp.open("GET", "http://website.com/cgi-bin/maker_getter.php?conds=" +
                 condCounts + "&filename=" + filename, false);
    xmlHttp.send(null);
    var cond = xmlHttp.responseText;
} catch (e) {
    var cond = 1; // fall back to condition 1 (e.g., when testing locally)
}

The JavaScript block for decrementing conditions:

// Decrement only if this is an actual turk worker!
// (The turk object comes from a helper library that parses the worker ID
// out of the HIT's URL parameters.)
if (turk.workerId.length > 0) {
    var xmlHttp = new XMLHttpRequest();
    xmlHttp.open("GET", "http://website.com/cgi-bin/decrementer.php?filename=" +
                 filename + "&to_decrement=" + cond, false);
    xmlHttp.send(null);
}

maker_getter PHP script (courtesy of Stephan Meylan, now a grad student at Berkeley), which runs in the executable portion of your hosting space: maker_getter.php.

decrementer PHP script (also courtesy Stephan): decrementer.php.

Comments:

  1. Hi there, to conduct this randomization, do I need to run my experiment on an external server?

  2. Yes, probably - you need to have something to run the PHP and to host the JavaScript that queries it.

  3. I see. Thank you very much!!! So there is no way to achieve even randomization if I only use the template provided by Turk?

    Replies
    1. Well, the basic HTML interface allows you to upload multiple different assignments, so you could just randomly assign via that method (plus use Unique Turker https://uniqueturker.myleott.com/) - but it's less versatile.

    2. Really appreciate your reply, Michael.

    3. Sorry if I'm repeating myself. I just wanted to clarify one more question and make sure that my understanding is right.
      (1) By multiple different assignments, do you actually mean 'multiple HITs'? If this is the case, do you achieve 'random assignment' through JavaScript?
      (2) If you did mean 'multiple assignments', I am a little confused and hope that you can give me some guidance. To my understanding, multiple assignments within one HIT correspond to the same task. Even though Turk itself can randomly show these assignments, I cannot use different assignments to display different groups of tasks.
      Sorry that I have made the question so long. If there is anything wrong with my understanding, please feel free to let me know. And thanks again for the suggestion of the 'Unique Turker' link.

  4. I tried this approach and it has an important limitation. If a worker previews or refreshes the HIT, s/he might receive another treatment (due to the possibility of a worker refreshing the page, disabling the preview would not suffice). In my study, this would introduce an important bias. Therefore, I extended this approach by storing all workers and their initial experimental group assignment in a MySQL database (with an index on WorkerID). Before a worker is shown a page, I check whether s/he is stored in the DB. If yes, I present him/her the same treatment as the initial treatment. If not, I present him/her a random treatment.

    I also log every time a worker loads the page the time, his/her id, and the assigned group. This way, I can check which user refreshes or drops out.

  5. If you're interested in having a stronger backend, you might consider PsiTurk (https://psiturk.org/) - I deal with the issue you mentioned by blocking turkers from previewing beyond the instructions of my experiment unless they accept.
