NAMED ENTITY RECOGNITION USING OPENAI GPT SERIES MODELS
Abstract
The amount of information available across all sources has grown dramatically. A large share of it is textual data, which makes Natural Language Processing one of the most important areas of research. This growing volume of information demands more sophisticated and effective models and approaches. Named entity recognition is a key part of text processing and plays an important role in text understanding, automatic text summarization, machine translation, and other tasks. A wide range of approaches has been applied to named entity recognition; however, the introduction of the transformer architecture with its self-attention mechanism has had a significant impact on current approaches to Natural Language Processing tasks in general.
Most tasks currently leverage transformers as the state-of-the-art approach. At the same time, the relative simplicity of the transformer architecture compared with earlier ones makes it possible to train large language models with a huge number of parameters, such as GPT-3.
The main purpose of this article is to investigate how effectively OpenAI GPT series models can recognize named entities in English and Ukrainian texts. The research is based on the CoNLL 2003 dataset, one of the most widely used for this kind of research, and on the dataset labeled by the lang-uk team. Since GPT series models are known to perform better when given few-shot learning examples, the experiments were built with zero, one, and three shots. Moreover, experiments were performed both on whole articles and sentence by sentence from the same articles to compare results. Different prompts were investigated, and one was chosen for the whole experiment. The results were evaluated using the F1 score together with an analysis of their specifics. The results demonstrate the overall strong performance of the most recent models and an increase in performance from older to newer models. Furthermore, our findings indicate that there is still room for improvement and further investigation.
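For illustration, a zero- or few-shot NER request of the kind described above can be composed as in the following minimal sketch, which uses the OpenAI Python client. The prompt wording, entity tag set, demonstration sentence, and model name are illustrative assumptions rather than the exact ones used in the experiments (the paper also references the Promptify library as a way to build such prompts).

```python
# Minimal sketch of zero-/few-shot NER with an OpenAI chat model.
# Prompt wording, tag set, examples, and model name are assumptions,
# not the exact configuration used in the paper.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

FEW_SHOT_EXAMPLES = [
    # (text, labeled entities) pairs used as one-/three-shot demonstrations;
    # the sentence is a classic CoNLL-2003 example.
    ("U.N. official Ekeus heads for Baghdad.",
     "U.N.: ORG | Ekeus: PER | Baghdad: LOC"),
]

def build_prompt(text, shots):
    """Compose the prompt: task description, optional demonstrations,
    then the target text."""
    parts = [
        "Extract all named entities (PER, ORG, LOC, MISC) from the text "
        "and label each one."
    ]
    for example_text, example_labels in shots:
        parts.append(f"Text: {example_text}\nEntities: {example_labels}")
    parts.append(f"Text: {text}\nEntities:")
    return "\n\n".join(parts)

def recognize_entities(text, shots=()):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # any GPT series chat model
        temperature=0,          # deterministic output for evaluation
        messages=[{"role": "user", "content": build_prompt(text, list(shots))}],
    )
    return response.choices[0].message.content

# Zero-shot call; pass 1 or 3 demonstrations for one- or three-shot runs:
print(recognize_entities("Taras Shevchenko was born in Moryntsi, Ukraine."))
# print(recognize_entities(sentence, shots=FEW_SHOT_EXAMPLES[:1]))
```

Entities parsed from such responses can then be compared against the gold labels to compute the F1 score used for evaluation.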
Keywords: Named entity recognition, natural language processing, GPT, OpenAI
References
- Li, Jing, Aixin Sun, Jianglei Han, and Chenliang Li. "A survey on deep learning for named entity recognition." IEEE Transactions on Knowledge and Data Engineering 34, no. 1 (2020): 50-70.
- Roy, Arya. "Recent trends in named entity recognition (NER)." arXiv preprint arXiv:2101.11420 (2021).
- Grishman, Ralph, and Beth M. Sundheim. "Message Understanding Conference-6: A brief history." In COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics. 1996.
- LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521, no. 7553 (2015): 436-444.
- Yadav, Vikas, and Steven Bethard. "A survey on recent advances in named entity recognition from deep learning models." arXiv preprint arXiv:1910.11470 (2019).
- Shen, Yanyao, Hyokun Yun, Zachary C. Lipton, Yakov Kronrod, and Animashree Anandkumar. "Deep active learning for named entity recognition." arXiv preprint arXiv:1707.05928 (2017).
- Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. "Attention is all you need." Advances in Neural Information Processing Systems 30 (2017).
- Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
- Baevski, Alexei, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, and Michael Auli. "Cloze-driven pretraining of self-attention networks." arXiv preprint arXiv:1903.07785 (2019).
- Li, Xiaoya, Xiaofei Sun, Yuxian Meng, Junjun Liang, Fei Wu, and Jiwei Li. "Dice loss for data-imbalanced NLP tasks." arXiv preprint arXiv:1911.02855 (2019).
- Brown, Tom, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan et al. "Language models are few-shot learners." Advances in Neural Information Processing Systems 33 (2020): 1877-1901.
- Wang, Shuhe, Xiaofei Sun, Xiaoya Li, Rongbin Ouyang, Fei Wu, Tianwei Zhang, Jiwei Li, and Guoyin Wang. "GPT-NER: Named entity recognition via large language models." arXiv preprint arXiv:2304.10428 (2023).
- Ye, Junjie, Xuanting Chen, Nuo Xu, Can Zu, Zekai Shao, Shichun Liu, Yuhan Cui et al. "A comprehensive capability analysis of GPT-3 and GPT-3.5 series models." arXiv preprint arXiv:2303.10420 (2023).
- Sang, Erik F., and Fien De Meulder. "Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition." arXiv preprint cs/0306050 (2003).
- Lang-uk team dataset for NER repository. URL: https://github.com/lang-uk/ner-uk (accessed on April 10, 2023)
- OpenAI homepage, access to UI prompt and API. URL: https://openai.com (accessed on April 25, 2023)
- Promptify library repository. URL: https://github.com/promptslab/Promptify (accessed on April 25, 2023)
- spaCy NLP framework homepage. URL: https://spacy.io/ (accessed on April 25, 2023)
DOI: http://dx.doi.org/10.30970/eli.23.5