This study explores the potential of Small Language Models (SLMs) as an efficient and secure alternative to larger models such as GPT-4 for a variety of natural language processing (NLP) tasks. With growing concerns about data privacy and the resource demands of large models, SLMs are a promising option for research and applications that require fast, cost-effective, and locally deployable models. The research evaluates several SLMs on translation, summarization, Named Entity Recognition (NER), text generation, classification, and retrieval-augmented generation (RAG), comparing their performance against larger counterparts. Each model was assessed with metrics specific to the task being evaluated. Results show that smaller models perform well on complex tasks, often rivalling or even outperforming larger models like Phi-3.5. The study concludes that SLMs offer an optimal trade-off between performance and computational efficiency, particularly in environments where data security and resource constraints are critical. The findings highlight the growing viability of smaller models for a wide range of real-world applications.
Keywords
AI, SLM, IoT, NLP, Model Evaluation Metrics.