Recent Updates


Important Dates

* All deadlines are calculated at 11:59 pm UTC-12.

Trial Data Ready: Jul 31 (Sat), 2021
Training Data Ready: Sep 3 (Fri), 2021
Test Data Ready: Dec 3 (Fri), 2021
Evaluation Start: Jan 10 (Mon), 2022
Evaluation End: Jan 31 (Mon), 2022
System Description Paper Submission Due: Feb 23 (Wed), 2022
Notification to Authors: Mar 31 (Thu), 2022
Camera-ready Due: TBD
Workshop: Summer, 2022
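
Because the deadlines are stated in UTC-12, it can help to convert them to UTC (or to a local time zone) before planning a submission. The following is a minimal sketch using only the Python standard library; the helper name `deadline_utc` is hypothetical and chosen for illustration, and the example date is the Evaluation End date listed above.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset time zone for UTC-12, the zone in which deadlines are stated.
UTC_MINUS_12 = timezone(timedelta(hours=-12))


def deadline_utc(year, month, day):
    """Return the 11:59 pm UTC-12 deadline on the given date, expressed in UTC."""
    local_deadline = datetime(year, month, day, 23, 59, tzinfo=UTC_MINUS_12)
    return local_deadline.astimezone(timezone.utc)


# Evaluation End: Jan 31 (Mon), 2022
evaluation_end = deadline_utc(2022, 1, 31)
print(evaluation_end.isoformat())  # 2022-02-01T11:59:00+00:00
```

In other words, a deadline at 11:59 pm UTC-12 falls at 11:59 am UTC on the following calendar day.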

Baseline System

We have released a baseline system that fine-tunes the XLM-RoBERTa base model for each of the datasets.

This code repository provides the baseline approach for Named Entity Recognition (NER). In this repository the following functionalities are provided:

We expect this baseline to serve as a starting point from which participants can build their own systems.

Baseline Results

The following table presents the NER performance of the baseline system in different languages. For each language, we report the Precision (P), Recall (R), and F1 score for identifying each entity class label. We then provide the micro-averaged performance over all classes. Additionally, we report the performance on Mention Detection (MD), where the task is only to identify entity boundaries, regardless of entity type.

Note that these results are for the development set.


Class       Metric  BN     DE     ES     TR     FA     RU     ZH     NL     KO     EN     HI
PROD        P       0.568  0.77   0.647  0.651  0.569  0.675  0.669  0.639  0.703  0.643  0.58
PROD        R       0.577  0.805  0.773  0.701  0.554  0.733  0.707  0.717  0.715  0.687  0.58
PROD        F1      0.572  0.787  0.704  0.675  0.561  0.703  0.687  0.676  0.709  0.664  0.58
GRP         P       0.722  0.808  0.698  0.76   0.718  0.708  0.667  0.798  0.776  0.797  0.806
GRP         R       0.703  0.737  0.702  0.76   0.762  0.801  0.48   0.822  0.733  0.805  0.703
GRP         F1      0.712  0.771  0.7    0.76   0.74   0.752  0.558  0.81   0.754  0.801  0.751
CORP        P       0.647  0.741  0.794  0.833  0.743  0.779  0.75   0.841  0.717  0.794  0.629
CORP        R       0.693  0.782  0.794  0.845  0.706  0.799  0.796  0.779  0.816  0.777  0.624
CORP        F1      0.669  0.761  0.794  0.839  0.724  0.789  0.772  0.809  0.763  0.785  0.626
CW          P       0.627  0.73   0.691  0.694  0.665  0.719  0.656  0.696  0.673  0.644  0.479
CW          R       0.533  0.688  0.641  0.679  0.7    0.732  0.664  0.731  0.736  0.597  0.504
CW          F1      0.577  0.708  0.665  0.686  0.682  0.726  0.66   0.713  0.703  0.619  0.491
PER         P       0.862  0.913  0.914  0.822  0.678  0.744  0.85   0.871  0.787  0.912  0.746
PER         R       0.91   0.922  0.862  0.857  0.881  0.802  0.919  0.892  0.757  0.928  0.774
PER         F1      0.885  0.918  0.887  0.839  0.766  0.772  0.883  0.881  0.771  0.92   0.76
LOC         P       0.745  0.874  0.827  0.798  0.809  0.717  0.89   0.893  0.783  0.844  0.636
LOC         R       0.812  0.916  0.836  0.843  0.849  0.733  0.856  0.87   0.798  0.88   0.786
LOC         F1      0.777  0.894  0.831  0.82   0.828  0.725  0.873  0.881  0.791  0.862  0.703
Micro Avg.  P       0.69   0.825  0.773  0.767  0.71   0.724  0.76   0.803  0.746  0.794  0.645
Micro Avg.  R       0.697  0.83   0.777  0.792  0.76   0.766  0.771  0.814  0.762  0.8    0.663
Micro Avg.  F1      0.694  0.827  0.775  0.779  0.734  0.744  0.765  0.809  0.754  0.797  0.654
MD          P       0.802  0.912  0.838  0.802  0.741  0.767  0.829  0.857  0.792  0.864  0.78
MD          R       0.81   0.917  0.842  0.828  0.793  0.811  0.841  0.869  0.81   0.871  0.8
MD          F1      0.806  0.914  0.84   0.815  0.766  0.788  0.835  0.863  0.801  0.867  0.79
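
To make the reported metrics concrete, the sketch below shows one way per-class, micro-averaged, and Mention Detection (MD) scores could be computed. It is not the official scorer. It assumes BIO-style tags (for example `B-PER`, `I-PER`, `O`) and exact-match span scoring, neither of which is specified above, and the function names `bio_to_spans`, `prf`, and `evaluate` are hypothetical.

```python
from collections import Counter


def bio_to_spans(tags):
    """Convert a BIO tag sequence into a set of (start, end, type) spans, end exclusive."""
    spans = set()
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # trailing "O" closes an open span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def prf(tp, n_pred, n_gold):
    """Precision, recall, and F1 from counts; zero when a denominator is zero."""
    p = tp / n_pred if n_pred else 0.0
    r = tp / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def evaluate(gold_seqs, pred_seqs):
    """Score predictions per class, micro-averaged over classes, and for MD."""
    tp, n_pred, n_gold = Counter(), Counter(), Counter()
    md_tp = md_pred = md_gold = 0
    for gold_tags, pred_tags in zip(gold_seqs, pred_seqs):
        gold, pred = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
        n_gold.update(t for _, _, t in gold)
        n_pred.update(t for _, _, t in pred)
        tp.update(t for _, _, t in gold & pred)  # boundaries and type both match
        # MD ignores the entity type and compares boundaries only.
        gold_bounds = {(s, e) for s, e, _ in gold}
        pred_bounds = {(s, e) for s, e, _ in pred}
        md_tp += len(gold_bounds & pred_bounds)
        md_pred += len(pred_bounds)
        md_gold += len(gold_bounds)
    scores = {t: prf(tp[t], n_pred[t], n_gold[t]) for t in set(n_gold) | set(n_pred)}
    # Micro average: pool the counts over all classes before computing P, R, F1.
    scores["micro"] = prf(sum(tp.values()), sum(n_pred.values()), sum(n_gold.values()))
    scores["MD"] = prf(md_tp, md_pred, md_gold)
    return scores


# Toy example: the second entity has correct boundaries but the wrong type.
gold = [["B-PER", "I-PER", "O", "B-LOC"], ["B-CW", "I-CW", "O"]]
pred = [["B-PER", "I-PER", "O", "B-CORP"], ["B-CW", "I-CW", "O"]]
scores = evaluate(gold, pred)
```

In this toy example the mistyped entity counts as an error for the LOC and CORP classes and lowers the micro average, but it still counts as correct for MD because its boundaries match. This is why the MD rows in the table are at least as high as the micro-averaged rows.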

Communication