Labelling a dataset for supervised learning is particularly expensive in computer security, as expert knowledge is required for annotation. Some research works rely on active learning to reduce the labelling cost, but they often treat annotators as mere oracles providing ground-truth labels.
Most of them completely overlook the user experience, even though active learning is an interactive procedure. In this paper, we introduce ILAB, an end-to-end active learning system tailored to the needs of computer security experts.
We have designed the active learning strategy and the user interface jointly to effectively reduce the annotation effort. Our user experiments show that ILAB is an efficient active learning system that computer security experts can deploy in real-world annotation projects.
We provide an open-source implementation to foster comparison in future research works and to allow security experts to annotate their own datasets.
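To illustrate the kind of annotation loop described above, here is a minimal uncertainty-sampling sketch. It is a generic illustration, not ILAB's actual query strategy or interface: a nearest-centroid classifier is fit on a small labelled pool, and at each iteration the "expert" is asked to label the unlabelled instance closest to the decision boundary. The synthetic data and all names (`labelled`, `unlabelled`, `margin`) are assumptions for the example.

```python
# Generic uncertainty-sampling active learning loop (illustrative sketch,
# not ILAB's actual strategy).
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2-class data standing in for, e.g., benign vs malicious events.
X = np.vstack([rng.normal(0, 1, (250, 5)), rng.normal(2, 1, (250, 5))])
y = np.array([0] * 250 + [1] * 250)

labelled = list(range(0, 500, 100))     # 5 seed annotations (both classes)
unlabelled = [i for i in range(500) if i not in labelled]

for _ in range(20):                     # 20 annotation queries
    Xl, yl = X[labelled], y[labelled]
    centroids = np.stack([Xl[yl == c].mean(axis=0) for c in (0, 1)])
    # Distance of each unlabelled instance to both class centroids.
    d = np.linalg.norm(X[unlabelled][:, None, :] - centroids[None], axis=2)
    margin = np.abs(d[:, 0] - d[:, 1])  # small margin = most uncertain
    picked = unlabelled.pop(int(np.argmin(margin)))
    labelled.append(picked)             # the oracle reveals y[picked]

# Final nearest-centroid predictions on the whole dataset.
centroids = np.stack([X[labelled][y[labelled] == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(np.linalg.norm(X[:, None, :] - centroids[None], axis=2), axis=1)
accuracy = (pred == y).mean()
```

The key point the sketch conveys is that the model, not the expert, chooses which instance to annotate next, so a small number of well-chosen labels can replace exhaustive annotation.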
A scientific paper presented at the AICS (Artificial Intelligence for Computer Security) workshop.