Recently I wrote an article about getting all prediction scores from your ML.NET model. So, when we want to do this for an ONNX model we have loaded with ML.NET, that should work the same, right?
Not really. Depending on the version of ML.NET, the datatypes of the downloaded Azure Custom Vision ONNX model are very hard to map on the .NET datatypes we use.
In this article I will show how we can get the labels and scores from an ONNX model prediction in C#.
In my article on classifying a bitmap with a downloaded ONNX model, I have described how to set up the code to get predictions from the model. In this article we will build on top of that code.
The prediction engine
Let us start with creating an instance of the prediction engine as we did in the previous post. The engine is loaded with a model that we exported from the custom vision portal.
var mlContext = new MLContext(); var transformer = mlContext.Transforms.ResizeImages("image", 224, 224, "Image") .Append(mlContext.Transforms.ExtractPixels("data", "image")) .Append(mlContext.Transforms.ApplyOnnxModel("classLabel", "data", "model.onnx")) .Fit(mlContext.Data.LoadFromEnumerable(new List<ImageInputData>())); var predictionEngine = mlContext.Model .CreatePredictionEngine<ImageInputData, ImagePrediction>(transformer);
At this moment, the
class holds only the predicted label. For this article, the ImagePrediction
ImageInputData
class is not relevant.
public class ImagePrediction { [ColumnName("classLabel")] public string[] Labels; }
Calling the Predict
method on the engine will return a ImagePrediction
object with Labels
holding the value of the highest scoring label.
var modelOutput = this.predictionEngine.Predict(inputData); // modelOutput.Labels[0] <- highest scoring label
Getting the scores
I have already written what the benefit of getting the scores on your prediction is in the previous article, but to repeat myself:
Trusting the model and accepting the highest score blindly might not be wise. If none of the labels really match, all values are close to 0%. And you get the label that scored just a tiny bit better than the others. You might see strange behavior in your application as prediction might be random and/or completely wrong.
To prevent this, consider a threshold that needs to be exceeded before accepting the prediction for further use. But in the current model, there is no score. So how can we retrieve a prediction score?
– Get all prediction scores from your ML.NET model
Extending the output model
First, extend the model to get all scores back. And here it gets interesting. In Netron, the datatype of the loss
property is shown as a sequence
of map
s with a string
key and a float
value.
Extending the class with the loss property.
private class ImagePrediction { [ColumnName("classLabel")] public string[] Labels { get; set; } [ColumnName("loss")] public List<Dictionary<string, float>> Scores { get; set; } }
We must also make sure the transformer will output the loss
property as a column. For this we need to change the parameters of the ApplyOnnxModel
method.
var transformer = mlContext.Transforms.ResizeImages("image", 224, 224, "Image") .Append(mlContext.Transforms.ExtractPixels("data", "image")) .Append(mlContext.Transforms.ApplyOnnxModel(new[] { "classLabel", "loss" }, new[] { "data" }, "model.onnx")) .Fit(mlContext.Data.LoadFromEnumerable(new List<ImageInputData>()));
The issue
Then running the code, this will result in the following runtime exception:
System.ArgumentOutOfRangeException: 'Could not determine an IDataView type for member Scores (Parameter 'rawType')'
Changing the datatypes to an array or any of the different collection types did not seems to matter. Also specifying a OnnxSequenceTypeAttribute
class with several datatypes did not resolve the situation.
Sometimes a different exception is thrown, but this is not helping to get it resolved.
System.InvalidOperationException: 'Can't bind the IDataView column 'loss' of type 'Microsoft.ML.Transforms.Onnx.OnnxSequenceType' to field or property 'Scores' of type 'IEnumerable`1[Dictionary`2[String, Single]]'.'
Looking at the source code of the OnnxTypeParser
class, we can find some interesting lines of code and comments that can get us on the right track.
... else if (typeProto.ValueCase == OnnxCSharpToProtoWrapper.TypeProto.ValueOneofCase.SequenceType) { // Now, we see a Sequence in ONNX. If its element type is T, the variable produced by // ONNXRuntime would be typed to IEnumerable<T>. ... else if (typeProto.ValueCase == OnnxCSharpToProtoWrapper.TypeProto.ValueOneofCase.MapType) { // Entering this scope means a ONNX Map (equivalent to IDictionary<>) will be produced.
So, it seems that sequences are tightly connected to IEnumerable
, and maps are tightly connected to IDictionary
. Any deviation from this will result in an exception.
So, let us change the datatype to IEnumerable<IDictionary<string, float>>
and set the OnnxSequenceTypeAttribute
to IDictionary<string, float>
.
private class ImagePrediction { [ColumnName("classLabel")] public string[] Labels { get; set; } [ColumnName("loss")] [OnnxSequenceType(typeof(IDictionary<string, float>))] public IEnumerable<IDictionary<string, float>> Scores { get; set; } }
This removes the exception, and the prediction now holds a lot more information.
ML.NET 1.5.1+
With the release of ML.NET 1.5.1, when the OnnxSequenceTypeAttribute
is not defined on the Scores
property, a much more helpful exception is thrown:
System.ArgumentOutOfRangeException: 'The expected type 'IEnumerable`1[IDictionary`2[String, Single]]' does not match the type of the 'loss' member: 'List`1[IDictionary`2[String, Single]]'. Please change the loss member to 'IEnumerable`1[IDictionary`2[String, Single]]' (Parameter 'actualType')'
This is a great improvement if you run into this exact situation.
As of version 1.5.2, this same exception is now also thrown if the OnnxSequenceTypeAttribute
is defined, awesome!
Using the scores
With everything wired up correctly, it is now easy to get the actual score of the highest scoring label.
var label = modelOutput.Labels[0]; var score = modelOutput.Scores.First()[modelOutput.Labels[0]];
Or if we want to get top scoring labels, we can sort the dictionary and take the top ten.
var top10scores = modelOutput.Scores.First() .OrderByDescending(kv => kv.Value) .Take(10);
For ONNX models in ML.NET, the mapping of datatypes is a little tricky, but knowing how to define this correctly, we can do more with a prediction than only taking the highest scoring label.