WANLP 2020: The Fifth Workshop for Arabic Natural Language Processing Nuanced Arabic Dialect Identification (NADI) task

  • Create Date December 17, 2020
  • Last Updated December 17, 2020

Introduction
Arabic has a widely varying collection of dialects. Many of these dialects remain under-studied due to the scarcity of resources. The goal of the shared task is to alleviate this bottleneck in the context of fine-grained Arabic dialect identification. Dialect identification is the task of automatically detecting the source variety of a given text or speech segment. Previous work on Arabic dialect identification has focused on coarse-grained regional varieties such as Gulf or Levantine (e.g., Zaidan and Callison-Burch, 2013; Elfardy and Diab, 2013; Elaraby and Abdul-Mageed, 2018) or on country-level varieties (e.g., Bouamor et al., 2018; Zhang and Abdul-Mageed, 2019), as in the MADAR shared task at WANLP 2019 (Bouamor, Hassan, and Habash, 2019). The MADAR shared task also involved city-level classification on human-translated data.

Shared Task
The Nuanced Arabic Dialect Identification (NADI) shared task targets province-level dialects and, as such, will be the first to focus on naturally occurring fine-grained dialects at the sub-country level. The data covers a total of 100 provinces from all 21 Arab countries and comes from the Twitter domain. Evaluation and task setup follow the MADAR 2019 shared task. The subtasks include Subtask 1, country-level dialect identification: a total of 21,000 tweets covering all 21 Arab countries. This is a new dataset created for this shared task.

The QMUL/HRBDT contribution to the NADI Arabic Dialect Identification Shared Task
We present the Arabic dialect identification system that we used for the country-level subtask of the NADI challenge. Our model consists of three components: a BiLSTM-CNN, character-level TF-IDF features, and topic modeling features. We represent each tweet using these features and feed them into a deep neural network. We then add an effective heuristic that improves the overall performance. We achieved a macro-F1 score of 20.77% and an accuracy of 34.32% on the test set. The model was also evaluated on the Arabic Online Commentary dataset, achieving results better than the state of the art.
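
The abstract reports two metrics that differ noticeably (macro-F1 of 20.77% against an accuracy of 34.32%). The following is a minimal sketch of how the two are commonly computed, intended only to illustrate why macro-F1 can sit well below accuracy when some classes are rarely predicted correctly. The function names and the toy labels are assumptions made for illustration. They are not taken from the NADI data or from the authors' system, and the official scorer may differ in detail.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of items whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # gold class g was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up example: one frequent class and two rare ones.
gold = ["Egypt"] * 6 + ["Iraq"] * 2 + ["Oman"] * 2
pred = ["Egypt"] * 6 + ["Egypt", "Iraq"] + ["Egypt", "Egypt"]

print(f"accuracy = {accuracy(gold, pred):.3f}")
print(f"macro-F1 = {macro_f1(gold, pred):.3f}")
```

Because macro-F1 weights every class equally, a class that is never predicted correctly (here the hypothetical "Oman" label) contributes an F1 of zero and pulls the average down, while accuracy is dominated by the frequent class.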