This repo contains a toolkit for running Amazon Mechanical Turk tasks that measure Minimum Viewing Time (MVT), as presented in (TODO cite paper).
MVT is a measure of image recognition difficulty that is grounded in human psychophysics. It is defined as the minimum amount of time a human needs to see an image in order to recognize it. For example, if an image has an MVT of 100ms, then on average humans are unable to recognize the image when they see it for less than 100ms, but can recognize it when they see it for at least 100ms. MVT is measured by showing images to people at various durations and asking them to report what they saw.
The experimental design is somewhat involved, so the following sections give a high-level overview of how the experiment stimuli are made, how tasks are posted, and how responses are stored.
First, we convert the experiment images into videos that get embedded in the tasks. Each video shows a fixation cross for 500ms, the image stimulus for some varied duration, and then a backward mask for 500ms. The backward masks are made by randomizing the phase component of the experiment stimulus in the Fourier domain and then recombining it with the amplitude component.
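As a rough illustration of the phase-scrambling step, here is a minimal sketch for a single grayscale image using NumPy (the function name is ours, not the repo's, and a color image would be processed per channel):

```python
import numpy as np

def phase_scrambled_mask(image: np.ndarray) -> np.ndarray:
    """Build a backward mask: keep the Fourier amplitude spectrum of
    `image` but replace its phase with random phase."""
    amplitude = np.abs(np.fft.fft2(image))
    # Using the phase of white noise preserves the Hermitian symmetry
    # needed for the inverse transform to be (numerically) real-valued.
    noise_phase = np.angle(np.fft.fft2(np.random.rand(*image.shape)))
    mask = np.fft.ifft2(amplitude * np.exp(1j * noise_phase)).real
    # Clip back into the displayable range of the original image.
    return np.clip(mask, image.min(), image.max())
```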
The user specifies which image presentation durations to make videos for. In the original MVT experiments, images were shown for 17ms, 50ms, 100ms, 150ms, 250ms, and 10s. Durations are specified as a list of frame counts: videos are rendered at 60 frames per second, matching the 60Hz refresh rate of a standard computer monitor, so 1 frame = 1/60s ≈ 17ms. A video is made showing each image at each of the desired frame counts.
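For example, the frame counts corresponding to the original durations work out as follows:

```python
# Frame counts from the original experiments, converted to durations at 60 fps
frame_counts = [1, 3, 6, 9, 15, 600]
durations_ms = [n / 60 * 1000 for n in frame_counts]
# durations_ms ≈ [16.7, 50.0, 100.0, 150.0, 250.0, 10000.0]
```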
After the videos are made, we need to group them together to be shown in the MTurk tasks. We allow workers to do multiple tasks, but go to lengths to ensure that no worker sees the same image more than once. If a worker sees an image more than once, it may improve their ability to recognize it, which would confound the MVT results.
First, we group the experiment images into disjoint, class-balanced sets with a size specified by the user (say, 50). Then, for each set of images, we create the lists of videos that will be shown to the workers. If the user would like each image to be seen by a given number of workers at each duration, one video list is created per worker-duration combination (number of workers × number of durations); every list contains the same images, with durations rotated so that across lists each image is shown at each duration to the requested number of workers.

Then, each set of images is assigned a URL to the app running on the experiment host. These links are uploaded to MTurk as tasks for workers to accomplish by following the link. Each task should have one assignment per video list. See experiment/server/server.js for more information on how this is done.
This procedure is important because it ensures that each link (task) only ever shows the same fixed set of images, and that no two assignments from different tasks contain any of the same images. That way, workers can complete multiple tasks that we post, but will never accidentally see the same image twice at any duration.
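As a sketch of the grouping step above (a hypothetical helper, not the repo's actual code; it assumes the set size divides evenly across classes):

```python
import random

def make_image_sets(images_by_class, set_size):
    """Split images into disjoint, class-balanced sets of `set_size`.

    `images_by_class` maps class name -> list of image paths, e.g. as
    read from a PyTorch ImageFolder directory tree.
    """
    pools = {c: random.sample(paths, len(paths))  # shuffled copies
             for c, paths in images_by_class.items()}
    per_class = set_size // len(pools)
    sets = []
    # Keep drawing until some class runs out of unused images.
    while all(len(p) >= per_class for p in pools.values()):
        current = []
        for paths in pools.values():
            current.extend(paths.pop() for _ in range(per_class))
        sets.append(current)
    return sets
```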
We have set up this toolkit to use the MTurk web application rather than the API for improved transparency. The web interface makes following task progress, creating HITs, and managing tasks more convenient and less bug-prone.
First, the user sets up the project and task landing page; we've included the starter code for this. Then, they upload a CSV of links to post the tasks for workers to complete. Each link is one task, but each task can have multiple assignments so that multiple workers can complete a single task. MTurk prevents workers from performing multiple assignments of the same task. As explained in the previous section, each assignment is associated in the backend with a different video list, so as different workers accept assignments from the same task (link), they see the same images but at different presentation times.
We've included in the landing page a callback to the experiment server that checks whether the worker should be allowed to accept the task. The 'Accept HIT' button is hidden until this check completes. If the worker has already completed an assignment with this link, or if the worker is on a list of blocked workers stored in the backend, the 'Accept HIT' button remains hidden and the worker is instructed to return the HIT. Otherwise, the button is made visible and the worker can proceed. Because the MTurk page is served over HTTPS, this direct interaction with the experiment server requires a secure connection, so the experiment server must have HTTPS credentials.
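The actual check lives in experiment/server/server.js; as an illustration of the logic, here is a sketch in Python with the redis-py client, using the key names documented in the next section (the function name is ours):

```python
import redis

r = redis.Redis()  # in practice, connection details come from the environment

def worker_may_accept(worker_id: str, link_id: str) -> bool:
    """Mirror of the landing-page eligibility check."""
    if r.sismember("blocked_worker_ids", worker_id):
        return False  # worker is on the block list
    seen_links = r.lrange(f"{worker_id}:links", 0, -1)
    return link_id.encode() not in seen_links  # reject links already accepted
```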
Results are stored in a Redis backend with the following key-value structure:
- `<link_id>`: Task metadata, including the path to the video list files for this particular link.
- `<link_id>:worker_id:<worker_id>:response:<trial_number>`: Set containing the worker's response and metadata for the `<trial_number>`-th video in the task.
- `<link_id>:worker_id:<worker_id>:responses`: Set of the trial numbers of videos that the worker has completed for this link. This is used to keep track of the worker's progress and allow them to resume where they left off after a break.
- `<worker_id>:links`: A list of links that the worker has accepted assignments for. This is used to ensure that the worker cannot accept an assignment from a link they have already seen.
- `blocked_worker_ids`: A set of workers that have been blocked. Workers in this set are no longer allowed to accept more HITs.
- `assignment_ids:<link_id>`: A list of assignments that have been accepted for this link. The order of these assignments is used to select which video list is shown to each worker.
- `assignments_to_replace`: A set of assignment IDs that need to be redone. When assigning a worker to a video list, previous assignments in this set are replaced by the new worker's assignment ID. We recommend putting in this set all assignment IDs from workers who did not pass quality control.
- `<assignment_id>:timestamp`: The timestamp of the most recently submitted response from this assignment, used to keep track of which assignments have expired. Assignments that expire without being completed are reposted by MTurk automatically and will be replaced in the backend when new assignments are accepted for that link.
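To make the last three keys concrete, here is a hedged sketch of how a newly accepted assignment might be slotted into a video list, reusing slots marked for replacement (the real logic lives in experiment/server/server.js; `r` is a redis-py client as above):

```python
def assign_video_list(r, link_id: str, assignment_id: str) -> int:
    """Return the index of the video list for a newly accepted assignment."""
    key = f"assignment_ids:{link_id}"
    for i, old in enumerate(r.lrange(key, 0, -1)):
        if r.sismember("assignments_to_replace", old):
            r.lset(key, i, assignment_id)      # take over the abandoned slot
            r.srem("assignments_to_replace", old)
            return i
    return r.rpush(key, assignment_id) - 1     # otherwise append to the end
```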
These steps presume that you have set up Redis and that it is running on a machine and port reachable from the experiment host.
- Install requirements.
$ pip install -r requirements.txt
- Set environment variables. Edit the `set_environment.sh` file to include the necessary information. The environment variables that need to be set are as follows:
  - `EXPERIMENT_HOST`: The host where the experiment server is running
  - `EXPERIMENT_PORT`: The port the experiment server is running on
  - `REDIS_HOST`: The hostname of the machine that Redis is running on
  - `REDIS_PORT`: The port that Redis is listening on
  - `REDIS_DB`: The Redis database number that the responses will be stored in
  - `REDIS_PASSWORD`: The password for accessing Redis
  - `SERVER_PRIVATE_KEY_FILE`: The path to the server's HTTPS private key
  - `SERVER_CERTIFICATE_FILE`: The path to the server's HTTPS certificate
$ source set_environment.sh
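Sourcing the file exports the variables to anything you start from that shell. For a quick sanity check that the Redis settings work, you can connect from Python (assuming the redis-py package is installed):

```python
import os
import redis

r = redis.Redis(
    host=os.environ["REDIS_HOST"],
    port=int(os.environ["REDIS_PORT"]),
    db=int(os.environ["REDIS_DB"]),
    password=os.environ["REDIS_PASSWORD"],
)
r.ping()  # raises an error if the connection details are wrong
```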
Experiment stimuli, video lists, and task metadata files are generated by running `main.py`, which requires a few command-line arguments.
- `image_directory`: The path to the directory containing the experiment images. The directory should be in PyTorch ImageFolder format, with human-readable names for the class directories.
- `--experiment_name`: The name of this experiment.
- `--frame_counts`: List of integers separated by spaces. Each number is a frame count at which images will be presented; 1 frame is 1/60 of a second.
- `--num_images_per_task`: The number of images included in each task for workers to complete.
- `--num_workers`: The number of workers who will see each image at each duration in `frame_counts`.
Here's an example with the same configuration as the original MVT experiments:
$ python main.py /path/to/image/folder/ --experiment_name example_experiment
--frame_counts 1 3 6 9 15 600 --num_images_per_task 50 --num_workers 7
Navigate to the experiment server folder.
$ cd experiment/server
There are two fields in server.js
that you may want to edit before you start the
server.
- `allotted_time`: The amount of time (in milliseconds) that a worker has to complete a task before it expires. This should match the value you set when creating the MTurk task in the web app.
- `max_contribution`: The maximum number of tasks a single worker can do. This limits bias in your data by restricting how many tasks any one worker can contribute.
Start your experiment server:
$ npm install
$ node server.js
If everything has worked properly, you should be able to try out the experiment yourself by visiting https://<EXPERIMENT_HOST>:<EXPERIMENT_PORT>/experiment.html?link=<LINK_ID>&b=ASSIGNMENT_ID&c=HIT_ID&d=WORKER_ID, substituting in your experiment hostname, port number, and any of the link_ids found in `<EXPERIMENT_NAME>_links.csv`.
Next, you need to go to the MTurk Requester page and create a new project with the default template. Fill out the project properties to your liking. In our in-lab experiments, participants took around an hour to complete 200 images, which you can use to estimate a reasonable reward. We encourage everyone who uses our code to comply with minimum wage regulations and to be generous with their MTurk rewards. For "Number of assignments per task", enter the number of video lists per link (number of workers × number of image durations; 7 × 6 = 42 in the example configuration above).
Then, delete everything in Design Layout and replace it with the contents of `mturk_landing_page.html`. Save the project and go back to the "Create" page.
To post your experiments for MTurk workers to complete, press "Publish Batch" on your new project and upload `<EXPERIMENT_NAME>_links.csv`. This file contains all the URLs for your experiment. It may be quite large depending on how many images you are using, so you may not want to upload it all at once. In that case, you can split the file into smaller ones and upload them separately, as long as the first row in each file is `HIT_Link`.
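For example, one way to split the file while keeping the `HIT_Link` header row in every piece, using pandas (file names here are illustrative):

```python
import pandas as pd

links = pd.read_csv("example_experiment_links.csv")  # your <EXPERIMENT_NAME>_links.csv
batch_size = 100  # links per upload; pick whatever is convenient
for i in range(0, len(links), batch_size):
    # to_csv writes the HIT_Link header into each batch file
    links.iloc[i:i + batch_size].to_csv(f"links_batch_{i // batch_size}.csv", index=False)
```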
We've provided a small notebook for you to keep track of your data as it comes in from MTurk. `analyze_results.ipynb` allows you to build a dataframe of responses and keep track of which tasks need to be reposted in order to complete the whole experiment. We recommend removing the results from workers with poor performance at long control durations and marking their assignments for replacement. To replace those assignments after marking them in redis, just upload a CSV file with the relevant link IDs to MTurk after changing the "Number of assignments per task" project property to match the number of assignments you want replaced.
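Marking assignments for replacement amounts to adding their IDs to the `assignments_to_replace` set described earlier, e.g. with redis-py (the IDs below are placeholders):

```python
import redis

r = redis.Redis()  # use the same connection settings as the experiment server

# Placeholder IDs; use the assignment IDs flagged in analyze_results.ipynb.
bad_assignment_ids = ["EXAMPLE_ASSIGNMENT_ID_1", "EXAMPLE_ASSIGNMENT_ID_2"]
r.sadd("assignments_to_replace", *bad_assignment_ids)
```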