BulkCreate
The `CreateHIT` function allows a requester to create a single HIT. As of MTurkR v0.6.4, it is also possible to create multiple HITs in a single function call using `BulkCreate`. This function takes multiple `question` values and creates one HIT for each value, using a fixed set of other parameters. While this does not create a "batch" in the sense used by the Requester User Interface, `BulkCreate` requires an `annotation` argument so that all of the HITs can easily be operated on using functions such as `ExpireHIT`, `ExtendHIT`, etc.
In addition to `BulkCreate`, three bulk creation wrapper functions have been added:
- `BulkCreateFromURLs` is an easy way to create ExternalQuestion HITs by simply supplying a vector of HIT URLs. This will create one HIT for each URL and group them under a common title, description, etc. The `annotation` field is required:

  ```r
  BulkCreateFromURLs(url = paste0("https://www.example.com/", 1:3, ".html"),
                     frame.height = 400,
                     annotation = paste("Bulk From URLs", Sys.Date()),
                     title = "Categorize an image",
                     description = "Categorize this image",
                     reward = ".05",
                     expiration = seconds(days = 4),
                     duration = seconds(minutes = 5),
                     auto.approval.delay = seconds(days = 1),
                     keywords = "categorization, image, moderation, category")
  ```
- `BulkCreateFromTemplate` can be used to create a set of HITs from a template HTML file, in the style of the Requester User Interface (i.e., the CSV upload feature of the RUI). If you (a) create an HTML file with placeholders for a set of variables (e.g., `${varname}`) and (b) create a data.frame of variable values, this function will create a HIT structure from the template for each row of the data.frame and then create a HIT from each of those completed templates. Here's an example of an HTML template:

  ```html
  <!DOCTYPE html>
  <html>
   <head>
    <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>
    <script type='text/javascript' src='https://s3.amazonaws.com/mturk-public/externalHIT_v1.js'></script>
   </head>
   <body>
    <form name='mturk_form' method='post' id='mturk_form' action='https://www.mturk.com/mturk/externalSubmit'>
     <input type='hidden' value='' name='assignmentId' id='assignmentId'/>
     <h1>${hittitle}</h1>
     <p>${hitvariable}</p>
     <p>What do you think?</p>
     <p><textarea name='comment' cols='80' rows='3'></textarea></p>
     <p><input type='submit' id='submitButton' value='Submit' /></p>
    </form>
    <script language='Javascript'>turkSetAssignmentID();</script>
   </body>
  </html>
  ```

  And here's MTurkR code:

  ```r
  # Path to the HTML template file
  temp <- system.file("template.html", package = "MTurkR")

  # One row per HIT; column names match the ${...} placeholders in the template
  a <- data.frame(hittitle = c("HIT title 1", "HIT title 2", "HIT title 3"),
                  hitvariable = c("HIT text 1", "HIT text 2", "HIT text 3"),
                  stringsAsFactors = FALSE)

  BulkCreateFromTemplate(template = temp,
                         input = a,
                         annotation = paste("Bulk From Template", Sys.Date()),
                         title = "Categorize an image",
                         description = "Categorize this image",
                         reward = ".05",
                         expiration = seconds(days = 4),
                         duration = seconds(minutes = 5),
                         auto.approval.delay = seconds(days = 1),
                         keywords = "categorization, image, moderation, category")
  ```
- The final function, `BulkCreateFromHITLayout`, uses the same logic of a template HTML file and an input data.frame. In this workflow, however, the template is created in the Requester User Interface (RUI), the "HITLayoutId" for that template is retrieved from the RUI, and the variable values are passed to the `BulkCreateFromHITLayout` function. You can find an example of this workflow here, which closely mirrors the previous example.
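
The template logic described above (one completed template per row of the input data.frame) can be pictured as plain placeholder substitution. The following is a minimal base R sketch of that idea only. `fill_template` is a hypothetical helper written for illustration; it is not part of MTurkR and is not necessarily how MTurkR fills templates internally.

```r
# Hypothetical helper (not an MTurkR function): replace each ${name}
# placeholder in a template string with the value from one data.frame row.
fill_template <- function(template, values) {
  out <- template
  for (name in names(values)) {
    placeholder <- paste0("${", name, "}")
    # fixed = TRUE matches the placeholder literally rather than as a regex
    out <- gsub(placeholder, as.character(values[[name]]), out, fixed = TRUE)
  }
  out
}

# A shortened stand-in for the HTML template
template <- "<h1>${hittitle}</h1> <p>${hitvariable}</p>"

# Column names correspond to the placeholder names
input <- data.frame(hittitle = c("HIT title 1", "HIT title 2"),
                    hitvariable = c("HIT text 1", "HIT text 2"),
                    stringsAsFactors = FALSE)

# Build one completed template per row of the input data.frame
filled <- vapply(seq_len(nrow(input)),
                 function(i) fill_template(template, input[i, , drop = FALSE]),
                 character(1))
print(filled)
```

Running a substitution like this on a couple of rows before creating any HITs is one way to check that every placeholder in the template has a matching column in the data.frame.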
- Test out your batch in the sandbox first using a small number of input values to make sure the code and the HITs themselves work.
- Use `SufficientFunds()` to estimate the cost of your project. Because HITs created in bulk are likely to be low-paying, it can be hard to estimate the cost of a project yourself due to Amazon's $0.005 minimum per-assignment commission.
- All the bulk creation functions require an `annotation` argument. This makes it easy to perform operations on the full set of HITs (e.g., `ExtendHIT`, `ExpireHIT`, `GetAssignments`, etc.) using just a single function call (as opposed to calling each function on each individual HIT).
- Specify the `auto.approval.delay` argument. By default, this is set to 30 days (or `seconds(days = 30)`). Approving each assignment in bulk creation mode will be time-consuming because approving each HIT requires a separate API call. Specifying a shorter approval delay will allow the MTurk system to approve the work for you without the need to call `ApproveAssignment` yourself.
- Performing operations on a bulk creation batch involves a (potentially very large) number of separate API calls. By default, MTurkR records all API calls in a local file (`MTurkRlog.tsv`) in the working directory. You can save some time and avoid this record-keeping by using the global option `options(MTurkR.log = FALSE)` or by passing `MTurkR.log = FALSE` in your bulk creation function. This has the potential to speed up the HIT creation process and other operations performed in bulk.
- Set aside enough time for code to run. Again, because MTurkR has to make a separate API call for each HIT being created, it can be time-consuming to do bulk creation and, especially, to perform any subsequent operation on a batch (such as assignment approval).
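
To see why the per-assignment minimum commission matters for low-paying HITs, here is a rough back-of-the-envelope sketch in base R. `estimate_batch_cost` is a hypothetical helper, not an MTurkR function, and the commission rate used in the example is a placeholder assumption; check Amazon's current pricing before relying on any figure. Only the $0.005 minimum comes from the tip above.

```r
# Hypothetical helper (not an MTurkR function).
# commission_rate must be supplied by the caller and is an assumption here;
# min_commission defaults to the $0.005 per-assignment minimum noted above.
estimate_batch_cost <- function(n_hits, assignments, reward,
                                commission_rate, min_commission = 0.005) {
  # Per-assignment commission: the larger of the percentage fee and the minimum
  commission <- max(reward * commission_rate, min_commission)
  n_hits * assignments * (reward + commission)
}

# Illustrative numbers only: 3 HITs, 5 assignments each, a $0.02 reward,
# and a placeholder 10% commission rate
cost <- estimate_batch_cost(n_hits = 3, assignments = 5, reward = 0.02,
                            commission_rate = 0.10)
print(cost)
```

In this example the percentage fee (10% of $0.02) falls below the minimum, so the minimum applies to every assignment. That is the situation in which a naive reward-times-rate estimate understates the total.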