KaIDA: a modular tool for assisting image annotation in deep learning

Abstract Deep learning models achieve high-quality results in image processing. However, robustly optimizing the parameters of deep neural networks requires large annotated datasets. Image annotation is often performed manually by experts without a comprehensive assistance tool, which is time-consuming, burdensome, and unintuitive. The modular Karlsruhe Image Data Annotation (KaIDA) tool presented here enables, for the first time, assisted annotation across various image processing tasks to support users during this process. It aims to simplify annotation, increase user efficiency, enhance annotation quality, and provide additional useful annotation-related functionalities. KaIDA is available open-source at https://git.scc.kit.edu/sc1357/kaida.


Implementation

General
Software tools commonly suffer from a lack of open-source accessibility, a difficult installation process, or availability for only one specific operating system. A user manual in the form of a README, located in an open-source repository (https://git.scc.kit.edu/sc1357/kaida), guides researchers through the installation and usage of KaIDA. Because KaIDA is developed as a pip package and its dependencies are provided as a conda environment, the tool can be installed easily on users' devices by following the manual. We have tested the cross-platform software package on different operating systems (Windows 10, Ubuntu 20.04) using Python 3.8.5. All requirements and dependencies are listed in the open-source repository.

Extendable Software Development Concept
Tools are frequently either adaptable but unusable for non-developers, or easy to use but not adaptable. A major strength of our tool is that it serves both domain experts and computer scientists. The provided standard functionalities and GUI enable direct usage by domain experts. In addition, an important design objective of KaIDA is to provide a modular and generic tool that allows adaptations.
This flexibility can be understood on two levels: the supported image processing tasks, and the methods provided in modules for an individual task. We therefore designed our concept to be task-independent from the outset in order to maximize flexibility in application.
By using interchangeable GUIs [1,4] for annotation and applying inheritance during software development, developers can add further tasks with little effort. Figure 1 presents the structure used to achieve the objective of providing an extendable tool. Strictly following the concept of inheritance, basic architectures are defined as parents. An individual task (child) is formed as a specialization of the basic definition (parent). This supports the clarity and extendibility of the code. We natively support four image processing tasks.
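The parent/child relationship described above can be illustrated with a minimal Python sketch. It is not taken from the KaIDA code base: the class names BaseAnnotationTask, ClassificationTask, and SegmentationTask, as well as the method empty_annotation, are hypothetical and chosen only to show how a child task specializes a basic definition.

```python
from abc import ABC, abstractmethod


class BaseAnnotationTask(ABC):
    """Hypothetical parent: the basic definition shared by all tasks."""

    name = "base"

    @abstractmethod
    def empty_annotation(self, image_shape):
        """Return a blank annotation for an image of shape (height, width)."""

    def describe(self):
        # Shared behaviour is written once in the parent and inherited.
        return f"{self.name} task ({type(self).__name__})"


class ClassificationTask(BaseAnnotationTask):
    """Hypothetical child: one label per image."""

    name = "classification"

    def empty_annotation(self, image_shape):
        return {"label": None}


class SegmentationTask(BaseAnnotationTask):
    """Hypothetical child: one label per pixel."""

    name = "segmentation"

    def empty_annotation(self, image_shape):
        height, width = image_shape
        return [[0] * width for _ in range(height)]


if __name__ == "__main__":
    for task in (ClassificationTask(), SegmentationTask()):
        print(task.describe(), task.empty_annotation((2, 3)))
```

In such a design, adding a further task amounts to writing one more child class that implements the abstract methods, while code written against the parent interface remains unchanged.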
Methods for assistance in image annotation are diverse and depend on the underlying task or dataset. Analogous to the inheritance concept for the tool/datamodule, our concept of extendibility w.r.t. modules is shown in Figure 2. An abstract module (parent) defines the basic structure, including interface definitions, for developing the individual methods available in the corresponding module. The methods are grouped into general and task-related categories. Taking meta-information of the underlying task into account, KaIDA automatically loads both the general and the corresponding task-related methods. We already provide various methods for each natively supported image processing task, which are explained in the manuscript. Coding details for both ways of extension are given in the open-source repository.
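As an illustration of this loading scheme, the following Python sketch shows one possible way to combine an abstract module with general and task-related methods. All names (SelectionMethod, AlphabeticalSelection, ShortestNameFirst, load_methods, and the category attribute) are assumptions made for this example and do not reflect the actual KaIDA implementation; the ordering logic inside the methods is placeholder logic.

```python
from abc import ABC, abstractmethod

GENERAL = "general"


class SelectionMethod(ABC):
    """Hypothetical abstract module (parent) defining the method interface."""

    # Either GENERAL or the name of the task the method belongs to.
    category = GENERAL

    @abstractmethod
    def order(self, image_names):
        """Return the image names in the order they should be annotated."""


class AlphabeticalSelection(SelectionMethod):
    """Hypothetical general method, usable for every task."""

    def order(self, image_names):
        return sorted(image_names)


class ShortestNameFirst(SelectionMethod):
    """Hypothetical task-related method, tagged for segmentation only."""

    category = "segmentation"

    def order(self, image_names):
        return sorted(image_names, key=len)


def load_methods(parent, task_name):
    """Collect general methods plus those tagged for the given task."""
    return {
        cls.__name__: cls()
        for cls in parent.__subclasses__()
        if cls.category in (GENERAL, task_name)
    }


if __name__ == "__main__":
    # The task name stands in for the meta-information of the underlying task.
    print(sorted(load_methods(SelectionMethod, "segmentation")))
    print(sorted(load_methods(SelectionMethod, "classification")))
```

With this kind of registry, a newly written method becomes available as soon as its class is defined and tagged with a category, without changes to the loading code.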
To sum up, KaIDA can be extended to other image annotation tasks (e.g., classification of 3D image stacks) and customized by adding new methods to a specific module (e.g., a new selection strategy or pre-annotation algorithm). This feature is of particular interest to computer scientists, since newly developed methods can be deployed to users with little effort.

Concept
Using such tools may be burdensome for domain experts due to high software/hardware requirements [2]. We therefore go one step further and present a practical setup for scaling the application of KaIDA to a broader context. Figure 3 illustrates the proposed concept. Experimenters can use remote desktop on their device (e.g., tablet, desktop computer, or mobile computer) to connect to the processing server. This idea is advantageous from several points of view. First, our concept is device-independent. Second, remote desktop is available on various operating systems (macOS, Windows, Linux). Third, devices such as tablets can be used, which facilitates pixel-wise image annotation for users. Further, a central processing server equipped with powerful GPU and CPU hardware reduces the computational time during assisted annotation without imposing local requirements on the domain experts' devices. Moreover, because all required software is installed on the server, KaIDA can be used comfortably without additional local installations. A connection for software updates is needed to keep up with the dynamic development in DL projects, so that new features such as newly developed assistance methods can be deployed easily. In addition, the required data transfer is realized by a connection to a shared data server.

Prototype
We have realized a prototype of our proposed concept, which is given in Table 1. The processing server runs Windows Server 2019 to enable remote desktop connections. The hardware setup is composed of an Intel Xeon 4210R CPU, an Nvidia Quadro RTX 4000 GPU, 32 GB RAM, and 512 GB of local temporary storage. We offer the option to borrow a Lenovo x12 Detachable so that a touchscreen can be used for image annotation. Git is used for version control and software updates. Further, the Large Scale Data Facility (LSDF) [3] serves as the data server.