
Definition of "annotation" (Chap12)



The term "annotation" is explained as follows (in "Annotations: Overview", under "Torchvision Operators", in Chap12):

The process of drawing bounding boxes around the object of interest in an image is called annotation. Annotating a dataset is an intensive and laborious task, but there are a few tools that may help.

This is a little different from my understanding.
For example, one site (just the first hit from a quick search) says:

In machine learning, data annotation is the process of detecting raw data i.e. images, videos, text files, etc. and tagging them.

This is what I understand as "annotation". (I had thought that tagging itself is annotation.)

By the way, is there a clear definition of "annotation"? I would like to make my understanding as precise as possible.



  • dvgodoy

    Hi @m.taniguchi

    Thank you for your question. In the case of object detection, drawing bounding boxes (and then determining what's inside each box - its label) is annotating an image.

    In this context, you can also find in the site you provided, in the "Object Recognition and Classification" section, that it actually agrees with the provided definition (highlights are mine):

    "When it comes to object recognition and classification, this is where humans i.e. annotators come in, take the unlabeled, raw data and give them meaningful context i.e. labels."

    A little bit down further, you'll also find:

    "These labeling tools are equipped with features designed for the annotators to outline objects on an image and classify them. You’ll notice, object recognition is not the only goal of image annotation; once the objects are outlined on an image, you need to classify them."

    In a nutshell, annotating the data means determining the ground truth for each data point. In simple cases, such as image classification, the ground truth is just the label ("cat", "dog", etc.), and the process is commonly known simply as labeling. In more complex tasks, such as object detection or recognition, there may be multiple sets of coordinates for the bounding boxes, which the model is trained to detect, and defining these boxes (together with their labels) is known as annotating.
    If it helps, you can think of labeling as a special case (the simplest) of annotating.
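
To make the distinction concrete, here is a minimal sketch of what the two kinds of ground truth might look like as plain Python data. The file names, labels, coordinates, and helper functions are all hypothetical, and the boxes are assumed to be in (x_min, y_min, x_max, y_max) pixel format; real annotation tools and datasets may use other layouts.

```python
# Hypothetical annotations for illustration only: file names, labels,
# and coordinates are made up.

# Image classification: the ground truth is a single label per image
# ("labeling", the simplest case of annotating).
classification_annotation = {"image": "img_001.jpg", "label": "cat"}

# Object detection: one image may carry several bounding boxes, each with
# its own label. Boxes are assumed to be (x_min, y_min, x_max, y_max).
detection_annotation = {
    "image": "img_002.jpg",
    "boxes": [(34, 50, 210, 300), (250, 80, 400, 310)],
    "labels": ["cat", "dog"],
}


def box_area(box):
    """Return the area in pixels of an (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) * (y_max - y_min)


def is_valid_detection(annotation):
    """Check that every box has exactly one label and a positive area."""
    boxes, labels = annotation["boxes"], annotation["labels"]
    if len(boxes) != len(labels):
        return False
    return all(
        x_max > x_min and y_max > y_min
        for (x_min, y_min, x_max, y_max) in boxes
    )


if __name__ == "__main__":
    print(classification_annotation["label"])
    print(is_valid_detection(detection_annotation))
    print([box_area(b) for b in detection_annotation["boxes"]])
```

In this sketch the classification record is just the detection record with the boxes dropped, which is one way to see labeling as a special case of annotating.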


