Recent Updates

Important Dates

* All deadlines are calculated at 11:59 pm
UTC-12 hours

Trial Data Ready Jul 31 (Sat), 2021
Training Data Ready Sep 3 (Fri), 2021
Evaluation Start Jan 24 (Mon), 2022
Evaluation End Jan 28 (Fri), 2022
System Description Paper Submission Due Feb 23 (Wed), 2022
Notification to Authors Mar 31 (Thu), 2022
Camera-ready Due TBD
Workshop July, 2022 with NAACL

This shared task challenges NLP enthusiasts to develop complex Named Entity Recognition systems for 11 languages. The task focuses on detecting semantically ambiguous and complex entities in short and low-context settings. Participants are welcome to build NER systems for any number of languages. And we encourage to aim for a bigger challenge of building NER systems for multiple languages. The languages are: English, Spanish, Dutch, Russian, Turkish, Korean, Farsi, German, Chinese, Hindi, and Bangla. For some languages, an additional track with code-mixed data will be offered. The task also aims at testing the domain adaption capability of the systems by testing on additional test sets on questions and short search queries.

Highlights of the task

Data Examples

Here are some examples from all the 11 languages.
The entities are highlighted with the entity type.


Processing complex and ambiguous Named Entities (NEs) is a challenging NLP task in practical and open-domain settings, but has not received sufficient attention from the research community.

Complex NEs, like the titles of creative works (movie/book/song/software names) are not simple nouns and are harder to recognize (Ashwini and Choi, 2014). They can take the form of any linguistic constituent, like an imperative clause (“Dial M for Murder”), and do not look like traditional NEs (Person names, locations, organizations). This syntactic ambiguity makes it challenging to recognize them based on their context. Such titles can also be semantically ambiguous, e.g., “On the Beach” can be a preposition or refer to a movie. Finally, such entities usually grow at a faster rate than traditional categories, and emerging entities pose yet another challenge. Table 1 lists more details about these challenges, and how they can be evaluated.

Neural models (e.g., Transformers) have produced high scores on benchmark datasets like CoNLL03/OntoNotes (Devlin et al., 2019). However, as noted by Augenstein et al. (2017), these scores are driven by the use of well-formed news text, the presence of “easy” entities (person names), and memorization due to entity overlap between train/test sets; these models perform significantly worse on complex/unseen entities. Researchers using NER on downstream tasks have noted that a significant proportion of their errors are due to NER systems failing to recognize complex entities (Luken et al., 2018; Hanselowski et al., 2018).

Examples of Complex Entities



Anti-Harassment Policy

SemEval highly values the open exchange of ideas, freedom of thought and expression, and respectful scientific debate. We support and uphold the NAACL Anti-Harassment policy. Participants are encouraged to send any concerns or questions to the NAACL Board members, Priscilla Rasmussen and/or the workshop organizers.