AILabs released an annotation system: Long live the medical diagnosis experience

The Dilemma of Taiwan’s Medical Laboratory Sciences

Thanks to Taiwan’s Bureau of National Health Insurance, abundant medical data have been properly recorded. This is good news for an AI company like ours. However, most of these data have not yet been labeled. Worse, Taiwan faces a severe shortage of medical talent: the number of experienced masters of medical laboratory sciences keeps shrinking. Take malaria diagnosis as an example. Malaria parasites belong to the genus Plasmodium (phylum Apicomplexa). In humans, malaria is caused by P. falciparum, P. malariae, P. ovale, P. vivax and P. knowlesi. Detecting infected cells and classifying them into these five species is arduous work for a human. Unfortunately, only one master in this field in Taiwan, who is close to retirement, can confirm whether a diagnosis is correct. We must act right away, yet training a person to become a malaria master costs too much time and money. Only through technology can we preserve this valuable diagnostic experience.

We decided to solve the problem by transferring human experience to machines, and the first step is to annotate the medical data. Since the one remaining master cannot handle the overwhelming volume of data by himself, he needs helpers to do a first pass of detection, after which he gives the final confirmation. We therefore need a system that allows multiple users to cooperate. It should record the file path of the label data, the annotators, and the labeling time. We searched for off-the-shelf annotation systems, but unfortunately none of them met our requirements. So we decided to roll up our sleeves and adapt the most relevant one for our purpose.
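
The bookkeeping requirements above could be captured in a small record type. The following is a minimal sketch in Python, not the system's actual data model: the class `AnnotationRecord`, its field names, the `confirmed` flag, and the helper `latest_annotator` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AnnotationRecord:
    """Hypothetical record of one labeling pass over one image."""
    label_path: str   # file path of the label data (for example, the XML file)
    annotator: str    # who created or revised the labels
    # Labeling time; defaults to the moment the record is created (UTC).
    labeled_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    confirmed: bool = False  # assumed flag: True once the master has signed off


def latest_annotator(history):
    """Return the annotator of the most recent record, or None if empty."""
    if not history:
        return None
    return max(history, key=lambda record: record.labeled_at).annotator
```

Keeping one record per pass, rather than overwriting a single entry, is one way to let a reviewer see who touched a file last before revising it.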

An Improved Annotation System

Building on an open-source annotation tool [1], AILabs released a new, easy-to-use annotation system to help anyone who wants to create valuable labeled data. With our labeling system, you can see who the previous annotator was and systematically revise others’ work.

This system is designed for object labeling. By drawing a rectangular box around your target object, you assign a label name and label coordinates to it. See the example below.

You will also obtain the label categories and label coordinates in an XML file in PASCAL VOC format, which you can use as input to your machine learning programs.
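
To illustrate how such a file might be consumed, here is a minimal Python sketch that reads the label names and box coordinates from a PASCAL VOC style XML string using the standard library. The element names (`object`, `name`, `bndbox`, `xmin`, `ymin`, `xmax`, `ymax`) follow the PASCAL VOC layout; the sample file name, the coordinates, and the function name `read_voc_boxes` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Illustrative annotation; the image name and coordinates are invented.
SAMPLE_XML = """<annotation>
  <filename>smear_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>P. falciparum</name>
    <bndbox><xmin>120</xmin><ymin>85</ymin><xmax>190</xmax><ymax>150</ymax></bndbox>
  </object>
  <object>
    <name>P. vivax</name>
    <bndbox><xmin>300</xmin><ymin>210</ymin><xmax>360</xmax><ymax>275</ymax></bndbox>
  </object>
</annotation>"""


def read_voc_boxes(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # Coordinates are stored as text, so convert them to integers.
        coords = [int(bndbox.findtext(tag))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((label, *coords))
    return boxes


for box in read_voc_boxes(SAMPLE_XML):
    print(box)
```

Each tuple pairs a label with its bounding box, which is the shape many object detection training pipelines expect after some conversion.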

How does it work?

Three steps: Load, Draw and Save.

In Load: Feed an image into the system. It is fine if you do not have an XML file yet, for example when you are using the system for the first time.

In Draw: Create as many labels in an image as you want. Remember that you can zoom in or out if the image is not clear enough.

In Save: Click the save button, and you are done. The system will output an XML file containing all the label data for the image.
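
The three steps can be mimicked without the graphical interface. The Python sketch below is only an approximation of the workflow under stated assumptions, not the tool's own code: the function names `load_labels`, `draw_box`, and `save_labels` are hypothetical, and the saved file contains only a subset of the PASCAL VOC fields.

```python
import os
import xml.etree.ElementTree as ET

BOX_TAGS = ("xmin", "ymin", "xmax", "ymax")


def load_labels(xml_path):
    """Load: return existing boxes, or an empty list if no XML file exists yet."""
    if not os.path.exists(xml_path):
        return []
    boxes = []
    for obj in ET.parse(xml_path).getroot().iter("object"):
        bndbox = obj.find("bndbox")
        box = {"name": obj.findtext("name")}
        for tag in BOX_TAGS:
            box[tag] = int(bndbox.findtext(tag))
        boxes.append(box)
    return boxes


def draw_box(boxes, name, xmin, ymin, xmax, ymax):
    """Draw: append one rectangular label to the in-memory list."""
    boxes.append({"name": name, "xmin": xmin, "ymin": ymin,
                  "xmax": xmax, "ymax": ymax})


def save_labels(xml_path, image_name, boxes):
    """Save: write all boxes for one image to a VOC-style XML file."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = image_name
    for box in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = box["name"]
        bndbox = ET.SubElement(obj, "bndbox")
        for tag in BOX_TAGS:
            ET.SubElement(bndbox, tag).text = str(box[tag])
    ET.ElementTree(root).write(xml_path, encoding="utf-8",
                               xml_declaration=True)
```

Because `load_labels` returns an empty list when the file is missing, a first-time session and a revision session go through the same code path.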

What’s Next?

With sufficient annotated data, we can then train our machines on the labels annotated by the medical master, so that the machines can make diagnoses as brilliant as the last master’s. We will keep working on it!


[1] Tzutalin. LabelImg. Git code (2015). https://github.com/tzutalin/labelImg