Models

We tackle models in V7, from how to prepare your training datasets to deployment.

In this session, we are exploring the process of training and utilizing your own models in V7, the MLOps platform designed for efficient AI product development. Custom models provide you with the ability to develop AI solutions that can effortlessly handle any computer vision task, whether it involves image classification, object detection, or instance segmentation.

To get started, we will guide you through the models tab in V7, where you can initiate the training process. We will explain the distinctions between various computer vision models, highlighting the advantages and use cases of each type. For the purpose of this video, we will specifically focus on training an instance segmentation model for bird species detection.

You will learn the significance of having an adequate number of labeled instances for each class in your dataset, as this greatly impacts the performance of the model. With just 100 labeled instances, you can train a dependable custom model to assist with various tasks such as data labeling, quality testing, and even for production use via the V7 API.

We will walk you through the training process step by step. Once the training is complete, we will demonstrate how to deploy the model with a simple click. Moreover, we will delve into how you can integrate your model with your Python script using V7's REST API. With just a few lines of code, you can submit an image to the model and receive predictions for the detected objects, along with their labels and instance segmentation.

Whether you are new to V7 or an experienced user, this video will empower you to harness the full potential of custom models to expedite your AI projects and streamline your data annotation process.

You can train models in V7 with as few as 100 labeled instances. These models can then be used to help label additional data, test labeling quality, and run in production by calling them via the API. Alternatively, you can use one of the pre-trained public models for tasks like document scanning.

Okay, that sounds like a lot, but let me show you how easy it is to train and use a model with V7.

To train a model we first need to go to the models tab. Here we can click on the "Train a Model" button, which presents the first decision we need to make: we can train an instance segmentation model, an object detection model, or a simple classification model.

Classification models assign a tag to a whole image or frame. In this example, the model sees an image, or in this case multiple frames, and associates a class label with each one.

Object detection models draw bounding boxes around items, but a bounding box is not the most accurate annotation: it always encloses some dead space around the object that it cannot exclude.

Finally, instance segmentation models use polygon shapes that mold around every item we want to detect, which generally gives the highest accuracy.
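To make the difference concrete, here is a rough, illustrative sketch of what each model type returns for a single image; the field names are made up for clarity and are not V7's actual response format.

```python
# Illustrative only: a rough sketch of what each model type returns for one image.
# The field names here are invented for clarity, not taken from any specific API.

classification_output = {
    "label": "European Robin",  # one tag for the whole image or frame
}

object_detection_output = {
    "label": "European Robin",
    # a rectangle; it always includes some dead space around the bird
    "bounding_box": {"x": 120, "y": 80, "w": 200, "h": 160},
}

instance_segmentation_output = {
    "label": "European Robin",
    "bounding_box": {"x": 120, "y": 80, "w": 200, "h": 160},
    # a polygon that molds around the bird's outline (only a few points shown)
    "polygon": {"path": [{"x": 130, "y": 85}, {"x": 155, "y": 82}, {"x": 180, "y": 95}]},
}
```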

Of those three options, let's select an instance segmentation model and call it "bird species", because we want a model that can detect different bird species. We can already see which classes this dataset contains and how many annotated instances we have for each.

For a decently performing model you want at least 100 instances per class; the amount and quality of your data will be the main bottleneck for model performance.

Now that we have selected our dataset and chosen which classes to include, we could train the model on just European Robins and Nightingales, but we'll train on all three classes and continue. We are already at the point where we can click Start Training; the model will be trained automatically, which can take up to three hours because our dataset is pretty large.

As usual in machine learning, we want a training, validation, and testing split to reduce overfitting. V7 uses an 80-10-10 split: 80% of our data is used for training, 10% for the validation set, and 10% for the testing set.
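V7 handles this split for you automatically, but as a rough sketch of what an 80-10-10 split looks like in practice (assuming a simple random shuffle, which is only an illustration, not necessarily how V7 assigns items):

```python
import random

def split_80_10_10(items, seed=42):
    """Shuffle a list of items and split it into train/val/test (80/10/10)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * 0.8)
    n_val = int(len(items) * 0.1)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

train, val, test = split_80_10_10(range(1000))
print(len(train), len(val), len(test))  # 800 100 100
```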

Let me just go ahead and start the training and we'll see each other when the training is done.

The training only took 45 minutes, and we can now deploy the model; it's as simple as pressing one button to start it.

Here we can specify some parameters: how many requests we want to handle per minute, what the performance and throughput should be, plus options to start the model automatically when it receives a request and stop it when it is idle.

It will cost 0.03 credits per minute, which is really not that much. Let's go ahead and start our model, and that's it!

Our model is now up and running.

We can click on our model and look at some metrics. We have a 95% mean average precision score, which is really good, and we can see how our loss progressed during training; the model performs really well. We can also see on this page how to integrate the model into a Python script, which we will look at in a second.

At this point, you could just drag in an example image and have a look at the prediction your model would return. You could look at the predicted segmentation map overlaid on your example image, but let's implement that ourselves and look at how we would call this model using the REST API.

Okay, since we're using the REST API to get predictions from the model running on V7, we only really need to import the requests library. We also need the API key, which I stored in a separate file so that it doesn't show up here, and the URL through which we will access the model.

If you're wondering where I got this URL from, you can go to your model card and, under the Python example, find the URL itself. Everything you really need is directly in the model card. So let's run this cell and move on with the code.
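As a minimal sketch of what that setup cell might look like (the file name, variable names, and URL below are placeholders; copy the real inference URL and authorization details from your own model card):

```python
import requests

# Read the API key from a separate file so it never appears in the notebook.
with open("api_key.txt") as f:  # hypothetical file name
    API_KEY = f.read().strip()

# Placeholder: copy the real inference URL from your model card in V7.
MODEL_URL = "https://darwin.v7labs.com/ai/models/<your-model-id>/infer"
```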

The first step is to load one example image.

Why am I doing this base64 encoding here? The raw image bytes need to be converted to text so they can be sent inside the JSON payload; once that's done, we can proceed.
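A minimal sketch of that step, assuming a local example file called bird_1.jpg (a hypothetical file name):

```python
import base64

# Read the example image and base64-encode it so it can be embedded in JSON.
with open("bird_1.jpg", "rb") as f:  # hypothetical example image
    image_bytes = f.read()

image_data = base64.b64encode(image_bytes).decode("utf-8")
```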

Getting the prediction is really simple. We build up our payload and our header, and as you can see, it's only two lines of code.
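A minimal sketch of those two lines plus the actual request; the exact payload structure and authorization scheme come from the model card's Python snippet, so treat the field names below as assumptions:

```python
# Authorization header: the exact scheme ("ApiKey ...") is shown in the model
# card -- treat it as an assumption here.
headers = {
    "Authorization": f"ApiKey {API_KEY}",
    "Content-Type": "application/json",
}

# Payload: the base64-encoded image; the field names are assumptions, so copy
# the exact structure from your model card.
payload = {"image": {"base64": image_data}}

response = requests.post(MODEL_URL, headers=headers, json=payload)
response.raise_for_status()
prediction = response.json()
```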

The header only includes our authorization, which is our API key, and the payload, our JSON, includes the base64-encoded image data. If I run this cell and the next one to dissect the response a bit, we can look at what it contains. Now, when dissecting the response, what can we see?

The response contains a result field with all the predictions the model has made, one entry per detected instance. In this case, it has detected only one instance. We access this first instance, the "0" element, and look at the keys it has.

For this one detected object we get a bounding box (even though we trained an instance segmentation model), the label, and the polygon, which is the actual instance segmentation. When we look at the label, we see it is an African Grey Parrot, which is correct, as I know what the image is.

Looking at the polygon and its path, we see the actual polygon with all its key points.
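Putting that together, here is a sketch of how the response can be dissected; the key names result, bounding_box, label, and polygon/path follow what we see in the video, but your response may differ slightly:

```python
# Continuing from the cells above: the response holds a list of detected
# instances under "result" (key name assumed from what we see in the video).
instances = prediction["result"]
print(len(instances))        # 1 -- only one bird detected in this image

first = instances[0]
print(first.keys())          # e.g. bounding_box, label, polygon

print(first["label"])                 # "African Grey Parrot"
print(first["polygon"]["path"][:5])   # the first few polygon key points
```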

Now from here on, you can do whatever you want with your prediction. I just went ahead and wrote a custom script to plot this data to have a nice visualization.
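My plotting script isn't shown in full in the video, but here is a minimal sketch of the idea, assuming matplotlib and Pillow are installed and the polygon path is a list of points with x and y coordinates:

```python
import matplotlib.pyplot as plt
from PIL import Image

def plot_prediction(image_path, instance):
    """Overlay the predicted polygon and label on top of the image."""
    image = Image.open(image_path)
    xs = [point["x"] for point in instance["polygon"]["path"]]
    ys = [point["y"] for point in instance["polygon"]["path"]]

    plt.imshow(image)
    plt.fill(xs, ys, color="green", alpha=0.4)  # the segmentation overlay
    plt.title(instance["label"])
    plt.axis("off")
    plt.show()

plot_prediction("bird_1.jpg", prediction["result"][0])
```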

If I just run this script, let me hide this one here, we can see that we have a prediction for an African Grey Parrot. The green part right here is the actual segmentation map, and we can see that it ran successfully.

We can again go ahead, and change the image.

We can go to image 2. Rerun every cell. Get our prediction.

We can see we again have one object, an African Grey Parrot, and we can visualize it again. And, would you look at that, we have a really nice prediction of our African Grey Parrot. I also went ahead and wrote a script to run predictions on a video, which I will show you right now.
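That video script isn't shown in full either; here is a rough sketch of the approach, assuming OpenCV is available and simply calling the same endpoint once per sampled frame (slow, but simple):

```python
import base64

import cv2

def predict_video(video_path, every_nth_frame=10):
    """Run the model on every nth frame of a video and collect the predictions."""
    capture = cv2.VideoCapture(video_path)
    predictions = []
    frame_index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if frame_index % every_nth_frame == 0:
            # Encode the frame as JPEG, base64 it, and reuse the same request
            # structure as for single images (headers and MODEL_URL from above).
            _, buffer = cv2.imencode(".jpg", frame)
            frame_data = base64.b64encode(buffer.tobytes()).decode("utf-8")
            response = requests.post(
                MODEL_URL,
                headers=headers,
                json={"image": {"base64": frame_data}},
            )
            predictions.append(response.json())
        frame_index += 1
    capture.release()
    return predictions

video_predictions = predict_video("birds.mp4")  # hypothetical video file
```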

You can see everything works really fine. But you know, that's only one thing that you can do with this model. Let me show you what else you can do with this powerful tool. This is the dataset that I used for training and as you can see, everything already has an annotation, and everything has a label, except for those two images right here.

Now we can click on the workflow used for this specific dataset and see that it's a very simple one. We can add a model stage and look at what it does: it lets us connect a model. Our bird species model is running right here, so we connect it to this stage.

We can now plug this model stage right between the dataset stage and the annotation stage. When images progress from the dataset stage to the next stage, they will pass through the AI model stage.

Let's see what happens there.

Let's look at the two images that are missing.

Let's open this image right here; we can see there is no label, but we can send it to the model stage. This bird is the second image that has no annotation, and we will send it to the model stage as well. We can now go back and look at all images that are not complete and are now in the annotation stage, and if we open our images again, we can see they are already annotated.

We can pre-annotate our images simply by passing them through this model stage automatically. They arrive in the annotation stage with annotations already in place, so all we need to do is zoom in and do some fine-tuning of the annotations, and this is just really, really powerful.

We can use the model we trained to help us label new data, and with this newly labeled data we could, in theory, train an even better-performing model. This is a human-in-the-loop system: you continuously train a model, use it for labeling, train a new model, and so on.

What we have seen so far is that we can easily train a custom model on our own dataset. We can then deploy this model in just a few clicks and run inference calls using the REST API in really just a few lines of code.

What we have also seen is that we can use this model in our workflow, in our annotation pipeline, to pre-annotate our data automatically and then just do a little fine-tuning. If you don't have enough data to start this training process, you can use one of the public models that have already been trained to speed up your annotation process.

For example, this receipt scanner is a text scanner, an OCR model, that we can just start. In the same way we used our bird species model, we can integrate it into a workflow and use it there.

Let's relax a bit. That was a lot, but nothing too difficult.

You now know how to train a model on your own custom data, how to use that model or any public model to help with the annotation process, and how to use it in production via the REST API.

I hope this video helped you with getting started with V7.