Darwin JSON

The Darwin JSON format is crucial when using V7 for machine learning projects - we explain how it works.

Darwin JSON serves as the core export format for V7 annotations and model predictions. In this video, we will explore its structure and advantages over other data annotation formats. You’ll also learn how to convert your labels to and from Darwin JSON. Whether you want to import existing annotations into V7 or export your annotations to other formats, this tutorial has got you covered.

Darwin JSON contains essential specifications, including the item and annotations fields, providing all the necessary information about the exported resource. It includes details about datasets, teams, and resource URLs, making it a rich source of metadata.

You'll learn how to access essential data, such as image dimensions, thumbnail URLs, and the original image file name and URL. Then, we’ll explore different annotations, covering various types like bounding boxes or polygons. You'll discover how to represent annotation data in Darwin JSON, including optional sub-annotation data and annotation authorship information. For video annotations, the video annotation object is explained, encompassing frame annotation data and interpolation information.

Converting annotations to and from Darwin JSON is a crucial skill for integrating V7 into your existing workflows and tech stack. The video demonstrates how to convert annotations using the Darwin command line interface function library. Converting from Darwin to other formats, like YOLO, is as simple as calling a function with the right arguments.

By the end of this video, you'll have a solid understanding of Darwin JSON and its benefits. You'll be equipped with the knowledge to import your existing annotations into V7, export your annotations to other formats, and leverage Darwin JSON's flexibility for seamless integration into your computer vision projects.

V7 Darwin JSON reference: https://docs.v7labs.com/reference/darwin-jso

Our Darwin JSON output format is the most versatile format out of all common computer vision formats.

Let's have a look at its structure, advantages, and also how to convert your labels to Darwin JSON, so that you can easily import the existing annotations you have created outside of V7.

Oh, and why not also the other way around?

The content of each Darwin JSON file can be summarized by the following specifications.

An export includes all the information about the exported resource. The most important information here is the item and annotations field. The dataset field simply encodes the Darwin dataset name that the resource belongs to.

The version field will be 2.0 when the export format is Darwin JSON 2.0. The slots field again describes a resource that can be an image or video uploaded to V7, including its original metadata.

When we're dealing with images, this item includes the internal file name of a particular slot on Darwin, the height and width of the image, the URL of the image thumbnail, and a list of the source file's metadata, which itself contains the name of the original image and its URL.

A video item contains similar information to an image, since it is just a sequence of images. It also includes the frame rate at which the video has been annotated on Darwin, the number of frames generated from the original video, the URLs of the frames generated from the original video, and the path of the file within Darwin.

The item field includes information about the referenced resource. That is - the image or video uploaded to Darwin within particular slots.

This, for example, includes the internal filename of the item on V7 - not of the particular slot, but the entire item that includes the slots. Then we have a list of slots metadata, and source information like the dataset and team that the item belongs to. This includes the URL of the image on Darwin, which routes directly to the Darwin work view.
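To make the layout described so far a bit more concrete, here is a minimal sketch of an image export and how you might read the basics out of it. The key names (`source_info`, `slot_name`, `source_files`, `workview_url`, and so on) are assumptions inferred from the description above, and every value is a made-up placeholder, so please check them against the Darwin JSON reference before relying on them.

```python
import json

# Hypothetical, hand-written example of a Darwin JSON 2.0 image export.
# Key names are assumptions based on the description above; values are placeholders.
EXAMPLE_EXPORT = """
{
  "version": "2.0",
  "item": {
    "name": "scan_001.png",
    "path": "/",
    "source_info": {
      "dataset": {"name": "example-dataset"},
      "team": {"name": "example-team"},
      "workview_url": "https://example.com/workview"
    },
    "slots": [
      {
        "type": "image",
        "slot_name": "0",
        "width": 1920,
        "height": 1080,
        "thumbnail_url": "https://example.com/thumbnail.png",
        "source_files": [
          {"file_name": "scan_001.png", "url": "https://example.com/scan_001.png"}
        ]
      }
    ]
  },
  "annotations": []
}
"""


def summarize_export(raw: str) -> dict:
    """Collect the item name, dataset name, and per-slot image size."""
    doc = json.loads(raw)
    item = doc["item"]
    slots = [
        {
            "slot_name": slot["slot_name"],
            "size": (slot["width"], slot["height"]),
            "original_file": slot["source_files"][0]["file_name"],
        }
        for slot in item["slots"]
    ]
    return {
        "version": doc["version"],
        "item_name": item["name"],
        "dataset": item["source_info"]["dataset"]["name"],
        "slots": slots,
    }


summary = summarize_export(EXAMPLE_EXPORT)
print(summary)
```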

Slots are a powerful concept where you can view multiple images or files at once in the V7 UI. These are especially useful, for example, in medical image labeling when dealing with mammography hanging protocols.

We have a full video on slots, so feel free to check that one out and also look at the documentation.

Finally, the annotations field actually lists all the annotations created on Darwin for that resource, be it bounding boxes or polygons in images or videos.

This is how an image annotation is defined. We have an optional list of annotators and reviewers of the image, the annotation class name, a list of the different slot names, and the actual data of the annotation with optional sub-annotation data. A main annotation type simply encodes a string of one of the main annotation types available on Darwin and is used as a key to the actual annotation data.

Note that a single annotation includes one and only one main annotation type. The annotation data represents the core data of an annotation created on Darwin. Here we have many options to choose from, like bounding boxes, instance ID numbers, polygon paths, and more. For a full, up-to-date definition, please have a look at the amazing documentation.
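The "one and only one main annotation type" rule is easy to check in code. The sketch below is only an illustration: the set of type keys and the shape of the `instance_id` sub-annotation are assumptions, and `main_annotation_type` is a hypothetical helper, not part of any Darwin library.

```python
# Assumed (and probably incomplete) set of main annotation type keys.
MAIN_TYPES = {"bounding_box", "polygon", "ellipse", "keypoint", "line", "skeleton", "tag"}

# Hypothetical image annotation with placeholder values.
annotation = {
    "name": "car",                    # annotation class name
    "slot_names": ["0"],              # slots the annotation belongs to
    "bounding_box": {"x": 10.0, "y": 20.0, "w": 100.0, "h": 50.0},
    "instance_id": {"value": 7},      # optional sub-annotation (assumed shape)
}


def main_annotation_type(annotation: dict) -> str:
    """Return the single main annotation type key, or raise if there is not exactly one."""
    found = [key for key in annotation if key in MAIN_TYPES]
    if len(found) != 1:
        raise ValueError(f"expected exactly one main annotation type, found {found}")
    return found[0]


main_type = main_annotation_type(annotation)
data = annotation[main_type]  # the main type doubles as the key to the actual data
print(main_type, data)
```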

The video annotation object is built very similarly and encodes the information of an annotation created on a video in Darwin. It includes all the frame annotation data, but also the annotation class name, and interpolation information, and it may optionally include authorship information. Each frame can now be described as annotation data, but with additional data depending on the way the frame was created on Darwin - for example, manually by an annotator, or programmatically by the interpolation algorithm.
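As a rough sketch of what that can look like, the example below models a video annotation with per-frame data and separates frames created manually from frames produced by interpolation. The `frames`, `keyframe`, `interpolated`, and `ranges` keys are assumptions about the layout, and the values are placeholders, so treat this as an illustration rather than the official definition.

```python
# Hypothetical video annotation; key names and values are assumptions.
video_annotation = {
    "name": "car",
    "slot_names": ["0"],
    "interpolated": True,
    "ranges": [[0, 3]],
    "frames": {
        "0": {"bounding_box": {"x": 10, "y": 20, "w": 100, "h": 50}, "keyframe": True},
        "1": {"bounding_box": {"x": 15, "y": 20, "w": 100, "h": 50}, "keyframe": False},
        "2": {"bounding_box": {"x": 20, "y": 20, "w": 100, "h": 50}, "keyframe": True},
    },
}


def split_frames(annotation: dict) -> tuple:
    """Split frame indices into manually created keyframes and interpolated frames."""
    manual, generated = [], []
    for index in sorted(annotation["frames"], key=int):
        frame = annotation["frames"][index]
        if frame.get("keyframe", False):
            manual.append(int(index))
        else:
            generated.append(int(index))
    return manual, generated


manual_frames, interpolated_frames = split_frames(video_annotation)
print(manual_frames, interpolated_frames)
```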

Outside the annotation data, the only required field for imports in Darwin JSON is the file name. That is what's used for associating the name with the corresponding file in V7 upon import.

Okay, that was a lot.

Now, let's have a look at how simple it is to convert annotations to and from Darwin JSON, despite the complexity of the format.

Converting from Darwin to another annotation format, as you can see, is really simple. We luckily already have a function that does exactly what we want in the Darwin command line interface function library.

So let's go ahead and import that dependency and directly get to the function call.

We need to provide three arguments: the annotation format that we want to convert to (YOLO in this case), the list of Darwin files that we want to convert, and the output directory where we want to store the converted annotations.

And that's really it.

Let me just execute the cell. You can see that I have successfully created the converted file in this directory right here. I've already opened it, and you can see that this is the YOLO-format file for a bounding box that was stored in the Darwin JSON format.

It's as simple as that.
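If you're curious what such a conversion boils down to for a single bounding box, here is a hand-rolled sketch. It is not the library function used in the video. It assumes the Darwin box stores the top left corner plus width and height in pixels under the keys `x`, `y`, `w`, and `h`, and that the target is the common YOLO text layout of a class index followed by the normalized box center and size. `darwin_box_to_yolo` is a hypothetical helper name.

```python
def darwin_box_to_yolo(box: dict, image_width: int, image_height: int, class_index: int) -> str:
    """Convert a top-left pixel box into one line of a YOLO label file."""
    # YOLO uses the box center, normalized by the image dimensions.
    x_center = (box["x"] + box["w"] / 2) / image_width
    y_center = (box["y"] + box["h"] / 2) / image_height
    width = box["w"] / image_width
    height = box["h"] / image_height
    return f"{class_index} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x100 box starting at (100, 50) in a 400x200 image, class index 0.
yolo_line = darwin_box_to_yolo({"x": 100, "y": 50, "w": 200, "h": 100}, 400, 200, 0)
print(yolo_line)  # 0 0.500000 0.500000 0.500000 0.500000
```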

So let's have a look at how to do it the other way around.

Okay, if you want to convert proprietary or your own custom annotation format to Darwin JSON, you will need to write a custom script. But since we know how the Darwin JSON format is defined, we can simply plug in the missing values.

For example, simply filling in the name of the file, the corresponding dataset, and the team name into the respective item entry.
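A minimal sketch of that "plug in the missing values" step could look like this. The nesting of the dataset and team names under `source_info` is an assumption, `build_darwin_document` is a hypothetical helper, and all names are placeholders. Remember that, outside the annotation data, the file name is the one field an import really needs.

```python
import json


def build_darwin_document(file_name: str, dataset_name: str, team_name: str, annotations: list) -> dict:
    """Assemble a bare-bones Darwin-JSON-like document for a single file."""
    return {
        "version": "2.0",
        "item": {
            "name": file_name,  # used to match the file in V7 on import
            "source_info": {    # assumed nesting, check the reference
                "dataset": {"name": dataset_name},
                "team": {"name": team_name},
            },
        },
        "annotations": annotations,
    }


document = build_darwin_document("scan_001.png", "example-dataset", "example-team", [])
serialized = json.dumps(document, indent=2)
print(serialized)
```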

When it comes to the actual annotation data, you'll need to write a custom conversion function to convert your format to the respective Darwin definition. If you're looking for a hand when creating those custom conversion scripts, don't forget that there are products like ChatGPT or GitHub Copilot that can do a lot of the legwork for you.

Bounding boxes require you to specify the height and width of the bounding box. The reference coordinate on the image is the top left corner, so naturally the x and y values for the bounding box are also the leftmost and topmost coordinates.

If you want to convert your skeleton annotations to Darwin's definition, you will need to provide a list with all the nodes. These nodes include the name of the node, a boolean value that is true when the node is occluded, and the key points.
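Here is a small sketch of both conversions. The source formats are made up for illustration (corner coordinates for boxes, tuples for keypoints), the helper names are hypothetical, and the output keys (`x`, `y`, `w`, `h`, `nodes`, `name`, `occluded`) are assumptions based on the description above.

```python
def corners_to_darwin_box(x_min: float, y_min: float, x_max: float, y_max: float) -> dict:
    """Turn corner coordinates into a top-left box with width and height."""
    return {"x": x_min, "y": y_min, "w": x_max - x_min, "h": y_max - y_min}


def to_darwin_skeleton(keypoints: list) -> dict:
    """Turn (name, x, y, occluded) tuples into a list of skeleton nodes."""
    return {
        "nodes": [
            {"name": name, "occluded": bool(occluded), "x": x, "y": y}
            for name, x, y, occluded in keypoints
        ]
    }


box = corners_to_darwin_box(10, 20, 110, 70)
skeleton = to_darwin_skeleton([("head", 50, 10, False), ("left_hand", 20, 60, True)])
print(box)
print(skeleton)
```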

For polygons, you will need to simply provide a list with all vertices, as we have already seen before. Darwin supports a few more annotation types like these. For a detailed overview, have a look at the documentation.
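For polygons, a conversion sketch can be as short as the one below. It assumes the polygon is stored as a list of paths under a `paths` key, where each path is a list of `x`/`y` vertices. That layout and the `to_darwin_polygon` helper are assumptions, so double-check them against the documentation.

```python
def to_darwin_polygon(vertices: list) -> dict:
    """Wrap a list of (x, y) tuples as a single-path polygon."""
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")
    path = [{"x": float(x), "y": float(y)} for x, y in vertices]
    return {"paths": [path]}


polygon = to_darwin_polygon([(0, 0), (10, 0), (10, 10)])
print(polygon)
```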

Great, we're done. You now have a very good understanding of the Darwin JSON format and how to convert your existing annotations to the Darwin JSON format, and vice versa.

I hope this video helped you with getting started with V7.