Data processing¶

proliferate automatically converts your experiment data into multiple CSV files that can be easily loaded into R or other statistical analysis packages.

As explained in the JavasScript library documentation, your experiment submits participant data as a JSON object. proliferate stores this data as a JSON object but automatically converts the data when you download it via the command line interface or the web interface.

To demonstrate how this works, consider the following simple JSON object with data from 5 trials as well as information about the subject:

{
  "trials": [
    {
      "stimulus": "horse",
      "reaction_time": 235,
      "correct_answer": 1
    },
    {
      "stimulus": "rabbit",
      "reaction_time": 219,
      "correct_answer": 0
    },
    {
      "stimulus": "raccoon",
      "reaction_time": 177,
      "correct_answer": 1
    },
    {
      "stimulus": "bear",
      "reaction_time": 364,
      "correct_answer": 1
    },
    {
      "stimulus": "sloth",
      "reaction_time": 271,
      "correct_answer": 1
    }
  ],
  "subject_information": {
    "native_language": "English",
    "other_languages": "Spanish, French",
    "age": 30,
    "problems": "One image didn't load",
    "comments": "Great experiment!",
    "condition": "animals"
  }
}

In this made-up example, the experiment stores the stimulus, the participant’s reaction time and whether they gave the correct answer to a comprehension question. This information is stored in the trials variable as a list of objects.

The information about the subject is stored in the subject_information variable as an object.

proliferate will create a separate CSV file for each variable at the highest level of the JSON object. In this example it would therefore create two CSV files: <experiment_name>-trials.csv and <experiment_name>-subject_information.csv.

For lists (such as trials), proliferate will create one row for each element in the list. The columns in the CSV file correspond to the variables of each trial (stimulus, reaction_time, and correct_answer in this example). Given that there are 5 elements in the trials list, proliferate would add 5 rows for each participant to <experiment_name>-trials.csv. For objects (such as subject_information) proliferate adds one row per participant. The CSV again contains one column for each variable in the object (native_language, other_languages, age, problems, comments, and condition in this example). proliferate also automatically adds three columns to each CSV, a anonymous participant ID (workerid), the name of the condition (will be generally condition1 except when you implement different conditions in different experiments as described here TODO), and an error column that shows errors if they occur during the data export (e.g., missing data from individual participants).

Special files¶

proliferate also creates an additional CSV file named <experiment_name>-workerids.csv which contains a mapping between the Prolific participant IDs and the anonymous IDs used in all other CSV files. You can use this file to map responses to the Prolific IDs (for example, if you want to pay some participants a bonus depending on their performance). For privacy reasons, this file should never be published along with your dataset.

If your experiment data JSON object contains exactly one list of objects (such as trials in the example above), proliferate will also create a CSV file named <experiment_name>-merged.csv that contains one line for each trial and also contains all the information from all the other variables (e.g., all the information stored in subject_information).

Note

Make sure that the data format is consistent across participants. proliferate extracts the structure of the data from one participant and then assumes that the data from all other participants follows the same structure.