Brainspace

Supervised Learning - CMML Workflow

Before creating a CMML classifier, you must first create a mutually exclusive tag in Reveal with two choices. Typically, the choices are named Positive/Negative or Responsive/Non-Responsive. You can add further choices to the tag for use in Reveal, for example “Further Review Required” or “Tech Issue”, but these will not be used in the CMML session.

Note

Starting with Brainspace 6.7, it is possible to add multiple choices, up to 5 positive, 5 negative and 5 neutral.

BRS67_MultiChoice_Tags_for_Classifier.png

Each positive choice sends the same Yes classification, each negative choice sends the same No, and each neutral choice shows the document has been seen but not classified.
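The mapping from tag choices to classifier signals can be pictured as a small lookup table. The following is a minimal sketch of that idea, not Brainspace's implementation: the `Signal` enum, the `build_choice_map` helper, the duplicate-name check, and the choice names are all hypothetical and exist only for illustration. Only the limit of 5 choices per group comes from the note above.

```python
from enum import Enum


class Signal(Enum):
    YES = "yes"    # sent by every positive choice
    NO = "no"      # sent by every negative choice
    SEEN = "seen"  # neutral: document was seen but not classified


MAX_CHOICES_PER_GROUP = 5  # limit stated for Brainspace 6.7 and later


def build_choice_map(positive, negative, neutral=()):
    """Map each tag choice name to the signal it sends to the classifier."""
    groups = {Signal.YES: positive, Signal.NO: negative, Signal.SEEN: neutral}
    choice_map = {}
    for signal, names in groups.items():
        if len(names) > MAX_CHOICES_PER_GROUP:
            raise ValueError(f"too many {signal.value} choices: {len(names)}")
        for name in names:
            # Assumption: a choice name belongs to exactly one group.
            if name in choice_map:
                raise ValueError(f"choice {name!r} is assigned twice")
            choice_map[name] = signal
    return choice_map


# Hypothetical choice names, for illustration only.
choices = build_choice_map(
    positive=["Responsive", "Hot"],
    negative=["Non-Responsive"],
    neutral=["Tech Issue"],
)
```

In this sketch, "Responsive" and "Hot" both resolve to `Signal.YES`, which mirrors the statement that every positive choice sends the same Yes classification.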

There is no need to put the tag into a tag profile in Reveal; that is done automatically once the CMML session has been created. Follow the instructions earlier in this document for creating a connected tag.

Once the connected tag is created, click Supervised Learning -> New Classifier -> CMML.

Add_CMML_Classifier_for_Connector.png

Give the Classifier a descriptive name. Under Assign Tag select the Reveal connected tag choice to use as the positive and negative tag for Brainspace. In most cases, you won’t pre-review documents in Reveal using the same connected tag before creating the CMML session, but if documents are reviewed ahead of time, then those documents will be used as initial seed documents.

If you wish to export scores back to Reveal automatically after each round, enable the option Export scores after each training round completes.

A CMML session can be run in manual or auto mode. In manual mode, you create a training round, tag the documents in Reveal, pull the tags from Reveal into Brainspace, and then export scores to Reveal (if desired). Auto mode streamlines the process by automatically supplying training round documents on a timed basis as needed. It will also export scores after each round if that option is enabled.

To enable automatic training, click Enable automatic training. Enter the number of documents you wish to review for each round along with the method for selecting documents and how often to poll the Reveal API to update tagging progress.

Training_Round_Settings_6-5.png

Immediately after the CMML session is created, and while auto mode is enabled, messages will occasionally appear asking you to refresh the screen.

Training_Round_auto_update_status.png

Each CMML session has a unique identifier that is also used within Reveal to allow for multiple sessions at the same time. For example:

Classifier_Unique_ID.png

When a CMML session is created in Brainspace, the following items are automatically created in Reveal.

  • CMML review team with admin access by default.

  • A unique suggested training field.

  • A unique tag profile to which the connected tag is added.

  • A unique score field.

  • A unique field profile with admin and CMML review team access by default. The training field, score field, and tag field are added to this profile.

  • Main CMML root work folder under the Brainspace root folder.

  • Unique classifier folder under the main CMML root folder with admin and CMML review team access by default.

  • A work folder with all documents used for training under the classifier root folder.

  • A work folder, under the classifier root folder, containing the documents suggested for training that still need review.

At the bottom of the CMML session, you can view the number of documents in the current round along with the count of documents coded. In auto mode, the Reveal API is polled for tagging information. After all documents in the training round are tagged, Brainspace closes the round, calculates scores, and creates a new training round.

5a.jpg
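The auto-mode cycle described above (poll for tagging progress, then close the round, score, and start the next round) can be sketched as a simple loop. This is an illustrative simulation only: `run_auto_round` and `fetch_coded_count` are hypothetical names, the callable stands in for whatever Brainspace does when it polls the Reveal API, and the progress numbers are made up.

```python
import time
from typing import Callable, List


def run_auto_round(
    round_size: int,
    fetch_coded_count: Callable[[], int],
    poll_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until every document in the round is coded, then list the follow-up steps."""
    events = []
    while True:
        coded = fetch_coded_count()  # stand-in for polling the Reveal API
        events.append(f"polled: {coded}/{round_size} coded")
        if coded >= round_size:
            break
        sleep(poll_seconds)  # wait for the configured polling interval
    # Steps the text says follow once the whole round is tagged.
    events += ["close round", "calculate scores", "create new training round"]
    return events


# Simulated tagging progress over three polls (made-up numbers).
progress = iter([3, 7, 10])
events = run_auto_round(10, lambda: next(progress), poll_seconds=0, sleep=lambda _: None)
```

The round size and polling interval here correspond to the values entered under Enable automatic training. In manual mode, the same steps happen, but you trigger each one yourself.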

To review the documents in Reveal, navigate to the suggested training folder under the classifier folder for the session and review the documents using the connected tag. You should also pick the field profile for the session. If Review is already open, you need to refresh your browser to reload the field and tag profiles. As always, you can create an assignment job if needed.

In manual mode, you can pull tags into Brainspace using the following icon at the top of the session area.

1.png

Once training is complete, you can manually export scores to Reveal using the following icon at the bottom of the session area.

Export_Scores_button.png

Session example after closing the first training round.

Classifier_session_example.png

Scores in Reveal range from 0.00 to 1.00 with 1.00 being most responsive.

Training_scores_in_Review.png
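Because scores run from 0.00 to 1.00, a common way to use them is to order documents from most to least likely responsive. The following is a minimal sketch of that ordering, assuming the scores have already been obtained as a mapping of document ID to score. The `rank_by_score` helper, the document IDs, and the score values are hypothetical.

```python
def rank_by_score(scores):
    """Return (doc_id, score) pairs sorted from most to least likely responsive."""
    for doc_id, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {doc_id} is outside 0.00-1.00: {score}")
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up document IDs and scores, for illustration only.
ranked = rank_by_score({"DOC-001": 0.12, "DOC-002": 0.97, "DOC-003": 0.55})
```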

You can also create a control set if desired within a CMML session.

Use_Control_Set_for_Recall.png

When a control set is created in Brainspace, the following items are automatically created in Reveal.

  • A work folder with all control set documents that need to be reviewed under the classifier root folder.

  • A unique field indicating the document is a control set member.

  • A unique tag profile, tag set, and choices used to review the control set documents.

  • A unique field profile with admin and CMML review team access by default. The control set member field, score field (from the original session creation), and control set tag field are added to this profile.

If Review is already open, you need to refresh your browser to reload the field and tag profiles.

Note that the tag and field profiles for the control set review have CtrlSet in the name.

Ensure that you select these versions while reviewing control set documents.

2.png

You can use the following button to update the coded status in Brainspace, either during the review or after all control set documents are coded.

Get_Control_set_status.png

Once all documents are coded, the control set is processed.

Processing_Control_Set.png

If necessary, you can add more documents to a control set after it is created using the Modify button.

Training_Statistics.png