Arcus Documentation

Candidate Evaluation and Selection

To use Arcus Model Enrichment, you first run a trial, which matches your model to external data candidates through the platform. In this section, you will evaluate the results of the trial and select the best candidate for your model.

During the trial, Arcus recorded key performance metrics by running a training loop with each candidate to measure how it affected the model’s performance. You can use these results to evaluate the different data candidates, either by visualizing them with the Model SDK or by viewing them in the Arcus Platform.

Once you’ve selected the candidate that suits your needs, you can use it in your ML workflows. Your existing training and serving workflows require no changes to train and serve the enriched model: the model automatically consumes the selected data from the platform during both training and inference.

Evaluation with the Model SDK

Continuing the example from the trialing section, you used Arcus’ PyTorch Lightning module to run a trial for the model. The arcus_module and arcus_trainer objects below maintain the same interface as PyTorch Lightning’s LightningModule and Trainer objects, so they can be used with no changes to your code. The Arcus-wrapped objects dynamically fetch and consume external data throughout your existing workflows.

# Wrap your LightningModule around the Arcus model from the trialing section.
arcus_module = MyLightningModule(arcus_model)

# The Arcus Trainer keeps PyTorch Lightning's Trainer interface; trial()
# trains the model against each matched data candidate.
arcus_trainer = arcus.model.torch.Trainer()
trial = arcus_trainer.trial(arcus_module)

At the end of the trial, the Arcus Model SDK returns an arcus.model.torch.Trial object. You can print the results of the trial by calling its summary() method.

>>> print(trial.summary())
                                  Trial Results                           

| External Data Candidate | Features |  Epochs | val_accuracy | val_loss |
|-------------------------|----------|---------|--------------|----------|
|     baseline (none)     |    n/a   |   100   |     0.62     |   1.33   |
|       2a41f36af3        |    288   |   100   |     0.94     |   0.57   |
|       4f94b81ead        |    427   |   100   |     0.97     |   0.42   |
|       fa348dc2f3        |    135   |   100   |     0.65     |   1.15   |
|       wttehak9n2        |    118   |   100   |     0.81     |   0.82   |

For more details, visit https://app.arcus.co/project/MY_PROJECT_ID/.

The trial summary above shows the four data candidates that the platform matched to the model, along with the baseline model, which was not enriched with any additional data. Each row shows a data candidate and its key performance metrics on the validation set after training for 100 epochs. The val_accuracy and val_loss columns show performance on the validation set at the end of training, as reported to Arcus by the LightningModule object. The Features column shows how many additional features each data candidate added to the model.

From this summary, you can see that all four matched candidates improved the model’s performance, and that two of them (candidate IDs 2a41f36af3 and 4f94b81ead) improved the model’s accuracy and loss most significantly. Candidate 2a41f36af3 added 288 features to the model, while candidate 4f94b81ead added 427. These feature counts represent the volume of external context that each data candidate added to the model.
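
If you want to compare candidates programmatically, a quick sketch like the one below works from the numbers in the summary table. The values are copied from the table above rather than read from the SDK, since the Trial object’s accessor methods aren’t covered here:

# Metrics copied from the trial summary table above.
baseline = {"val_accuracy": 0.62, "val_loss": 1.33}
candidates = {
    "2a41f36af3": {"features": 288, "val_accuracy": 0.94, "val_loss": 0.57},
    "4f94b81ead": {"features": 427, "val_accuracy": 0.97, "val_loss": 0.42},
    "fa348dc2f3": {"features": 135, "val_accuracy": 0.65, "val_loss": 1.15},
    "wttehak9n2": {"features": 118, "val_accuracy": 0.81, "val_loss": 0.82},
}

# Gain over baseline: higher accuracy and lower loss are both improvements.
for cid, m in candidates.items():
    acc_gain = m["val_accuracy"] - baseline["val_accuracy"]
    loss_drop = baseline["val_loss"] - m["val_loss"]
    print(f"{cid}: +{acc_gain:.2f} accuracy, -{loss_drop:.2f} loss "
          f"({m['features']} features)")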

Evaluation and Selection

The results of the trial are also visualized on the Arcus Platform. On the project detail page, you can see the results of the trial compared to the baseline.

You can also see other factors that determine which candidate is best for your use case. For example, you can see:

  • A summary of the underlying data each candidate added to your model, such as whether it was demographic data, location data, etc.
  • The freshness of the data, which represents how frequently the data is updated. For tasks that rely on live indicators, it’s important to take data freshness into account.
  • The cost of each data candidate, expressed as credits per 1000 accesses and determined dynamically during the auction process (see the sketch after this list for a back-of-the-envelope estimate).
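
As a rough illustration of how to reason about cost, the sketch below estimates monthly credit spend for one candidate. Both input numbers are hypothetical placeholders, not values from the platform:

# Hypothetical numbers for illustration only.
credits_per_1000_accesses = 5      # assumed auction price for a candidate
monthly_inferences = 2_000_000     # assumed serving volume, one access each

monthly_cost = monthly_inferences / 1000 * credits_per_1000_accesses
print(f"Estimated monthly cost: {monthly_cost:,.0f} credits")  # 10,000 credits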

Once you’ve determined which candidate best fits your use case, you can select it on the Arcus Platform. The selection is then used for training and serving the model, with no changes to your existing workflows.
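
For example, assuming interface parity with PyTorch Lightning as described above, and hypothetical train_loader and predict_loader objects from your existing pipeline, the enriched model trains and serves through the usual Lightning calls:

# Arcus fetches the selected candidate's data automatically behind the scenes.
arcus_trainer.fit(arcus_module, train_dataloaders=train_loader)
predictions = arcus_trainer.predict(arcus_module, dataloaders=predict_loader)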