Annotating frames can be the most time-consuming aspect of supervised machine learning, and computational tools are developed at pace to address this time-sink. For example, pose-estimation tools such as DeepLabCut and SLEAP, promote workflows that significantly decrease the required body-part labelling time; DeepLabCut asks users to extract/refine/merge body-part outlier, and SLEAP asks users to take a computer-assisted labelling approach and run non-optimized models and correct their prediction results until they are perfected. JAABA (Janelia Automatic Animal Behavior Annotator), another excellent and popular tool for machine classifation of animal behavior, is also based on a workflow composed of iterative re-training of classifiers based on machine-generated predictions.
These workflows are built upon feeding computer-generated predictions, initially with human oversight, back into the model as training examples and thus improving its predictions through more data, all while spending significantly less time annotating images. This is approach is not restricted to animal tracking / pose-estimation, but can also use for improving the performance of behavior classifiers in SimBA.
Before using the 'pseudo-labelling' tool in SimBA, we need to generate some machine learning classification predictions for videos that has not been used to train the model. Detailed information on how to generate such machine classifications can be found in the Scenario 1 - building classifiers from scratch tutorial. More documentation on how to generate machine predictions can also be found in the generic SimBA documentation. We recommend reading the of these tutorials up to and including Step 8 - Run Machine Model.
When you've successfully generated your machine predictions, you will have CSV files containing your machine classifications, with one CSV file for each video in your project, in your project_folder/csv/machine_results
directory. We will now use the 'Pseudo-labelling` tool in SimBA to look at these machine classifications and correct them when necessery.
After loading your project in SimBA, navigate to the Label behavior
tab and uou should see the following menu:
-
In the sub-menu titled
Pseudo Labelling
, there is a entry-box calledFrame folder
. Click onBrowse
and select to the folder containing the frames for the video you wish to correct the machine predictions for. -
Beneath the
Frame folder
entry-box, there is a sub-menu titledThreshold
. Within this sub-menu there are entry-boxes, with one entry-box for each classifier in the project. in each of these entry-boxes, insert a value between0.00
and1.00
. These numbers represents the classification threshold for each classifier. If you are unfamiliar withclassification thresholds
, then you can read more about them in the tutorial for Scenario 1 - Step 8, or Scenario 2 - Part3. In brief, theclassification threshold
represents how sure the computer has to be before it classifies a frame as containing the behavior. Aclassification threshold
value of 0.80, for example, would mean that the computer has to be 80% sure that the frame contains the behavior to classify it as containing the behaviour. When theclassification threshold(s)
have all been entered, click theCorrect label
button. -
After clicking the
Correct label
button, a window will pop open that looks similar to this:
This interface is near-identical to the labelling interface described in the SimBA behavioral annotator GUI tutorial and users should head over to this tutoral for detailed instructions on how to use this interface to label behaviours.
This interface, however, contains one key difference: the Check Behavior
menu has been pre-filled with the models classifcation results according to the threshold the user set in Threshold
menu described in above:
Now navigate the frames and the machine model predictions, by using the arrow keys, and view the video by clicking the Open video
button. Proceed to add any ticks in tick-boxes that may be missing ticks in the Check behavior
menu, or remove ticks that are incorrectly present. Again, please see the SimBA behavioral annotator GUI tutorial for more information on how to use this annotator interface. Once you have finished correcting the machine-generated predictions, click on Save csv
. A new CSV, named after the video, with the user-corrected labels, is then saved in the project_folder/csv/targets_inserted
folder and can be used, together with other annotated CSV files, to generate new machine classiciation models for the behaviors of interest.
Note: If the representations of the machine predictions (e.g., the tick-boxes) are too sensative (e.g., too many boxes are ticked when they shouldn't be ticked) or too weak (e.g., too many boxes are un-ticked when they should be ticked), consider closing the labelling interface window, and revise the thresholds in the
Pseudo Labelling - thesholds
sub-menu. If too many boxes are ticked for a specific behavior classifier, then decrease the threshold for this classifier. If too few boxes are ticked, then increase the threshold for the specific classifier. When you have titrated the thresholds, re-start the labelling interface by clicking theCorrect labels
button.
If you want to correct further videos with machine-generated predictions, then close the labelling interface, and navigate back to the sub-menu titled Pseudo Labelling
and the Frame folder
entry-box and browse and select a new folder containing frames from a different video.
Note: What happens during this process, is that SimBA grabs the CSV file associated with the name of the frame folder from the
project_folder/csv/machine_results
directory when the user clicks theCorrect labels
button. Moreover, when the user clicks theSave csv
button, a new CSV file is generated in theproject_folder/csv/targets_inserted
folder, and this new file is identical to the file in theproject_folder/csv/machine_results
folder, except that the relevant prediction columns have been amended to reflect the users-intruduced changes indicated through theCheck behavior
tick-box menu.
Congrats! You have now successfully generated more machince model training data using 'Pseudo-labelling' in SimBA.
For details tutorial on how to generate machine classifiers based on the pseudo-labelled data, and all other CSV data files in your project_folder/csv/targets_inserted
directory, see the Train Machine Model tutorial.