<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">izvestswsu</journal-id><journal-title-group><journal-title xml:lang="ru">Известия Юго-Западного государственного университета</journal-title><trans-title-group xml:lang="en"><trans-title>Proceedings of the Southwest State University</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">2223-1560</issn><issn pub-type="epub">2686-6757</issn><publisher><publisher-name>ЮЗГУ</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.21869/2223-1560-2025-29-3-86-98</article-id><article-id custom-type="elpub" pub-id-type="custom">izvestswsu-1499</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ИНФОРМАТИКА, ВЫЧИСЛИТЕЛЬНАЯ ТЕХНИКА И УПРАВЛЕНИЕ</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>COMPUTER SCIENCE, COMPUTER ENGINEERING AND CONTROL</subject></subj-group></article-categories><title-group><article-title>Применение глубокого обучения сверточной нейронной сети для классификации жестов из набора данных Sign Language MNIST</article-title><trans-title-group xml:lang="en"><trans-title>Applying deep learning convolutional neural network to classify gestures from MNIST Sign Language dataset</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-5400-6817</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Бобырь</surname><given-names>М. 
В.</given-names></name><name name-style="western" xml:lang="en"><surname>Bobyr</surname><given-names>M. V.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Бобырь Максим Владимирович - доктор технических наук, профессор кафедры программной инженерии.</p><p>ул. 50 лет Октября, д. 94, Курск 305040</p><p>Researcher ID G-2604-2013</p></bio><bio xml:lang="en"><p>Maxim V. Bobyr - Dr. of Sci. (Engineering), Professor of the Software Engineering Department, Southwest State University.</p><p>50 Let Oktyabrya str. 94, Kursk 305040</p><p>Researcher ID G-2604-2013</p></bio><email xlink:type="simple">fregat_mn@rambler.ru</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0009-0007-8271-7660</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Асеев</surname><given-names>А. А.</given-names></name><name name-style="western" xml:lang="en"><surname>Aseev</surname><given-names>A. A.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Асеев Артем Андреевич - аспирант кафедры программной инженерии.</p><p>ул. 50 лет Октября, д. 94, Курск 305040</p></bio><bio xml:lang="en"><p>Artem A. Aseev - Post-Graduate Student of the Software Engineering Department, Southwest State University.</p><p>50 Let Oktyabrya str. 
94, Kursk 305040</p></bio><email xlink:type="simple">aseeff.artem@yandex.ru</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Юго-Западный государственный университет</institution></aff><aff xml:lang="en"><institution>Southwest State University</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2025</year></pub-date><pub-date pub-type="epub"><day>29</day><month>11</month><year>2025</year></pub-date><volume>29</volume><issue>3</issue><fpage>86</fpage><lpage>98</lpage><permissions><copyright-statement>Copyright &#x00A9; Бобырь М.В., Асеев А.А., 2025</copyright-statement><copyright-year>2025</copyright-year><copyright-holder xml:lang="ru">Бобырь М.В., Асеев А.А.</copyright-holder><copyright-holder xml:lang="en">Bobyr M.V., Aseev A.A.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://izvestswsu.elpub.ru/jour/article/view/1499">https://izvestswsu.elpub.ru/jour/article/view/1499</self-uri><abstract><sec><title>Цель исследования</title><p>Цель исследования. Задача распознавания жестов в системах компьютерного зрения имеет важное значение для разработки доступных интерфейсов взаимодействия человека с компьютером, в том числе и для людей с ограниченными возможностями. 
Традиционные методы, например использование ручного выделения признаков (HOG, SIFT) в сочетании с классификаторами типа SVM, обладают ограниченной точностью и чувствительны к изменениям освещения, фона и позы руки. Целью данной работы является построение и обучение сверточной нейронной сети (CNN) для эффективной классификации жестов на основе набора данных Sign Language MNIST. В рамках исследования решались задачи предобработки данных, проектирования архитектуры модели, её обучения и оценки качества распознавания на тестовом наборе.</p></sec><sec><title>Методы</title><p>Методы. Использовались библиотеки TensorFlow и Keras для реализации CNN. Модель включает сверточные слои для извлечения локальных признаков, слой Flatten для векторизации, полносвязные слои с функцией активации ReLU и выходной слой с Softmax. Обучение проводилось с использованием оптимизатора Adam и функции потерь sparse_categorical_crossentropy на 27 455 изображениях, тестирование — на 7 172 примерах.</p></sec><sec><title>Результаты</title><p>Результаты. Предложенная модель достигла точности 89,14 % на тестовом наборе данных после 18 эпох обучения, что превосходит результаты традиционных методов (HOG + SVM – 70,1 %) и простых нейронных сетей (78,4 %).</p></sec><sec><title>Заключение</title><p>Заключение. Применение сверточных нейронных сетей для классификации жестов является эффективным подходом, обеспечивающим высокую точность и устойчивость к вариациям входных данных, что делает его перспективным для задач компьютерного зрения и разработки систем жестового взаимодействия.</p></sec></abstract><trans-abstract xml:lang="en"><sec><title>Relevance</title><p>Relevance. Gesture recognition in computer vision systems is important for the development of accessible human-computer interaction interfaces, including for people with disabilities. 
Traditional methods, such as manual feature extraction (HOG, SIFT) in combination with SVM classifiers, have limited accuracy and are sensitive to changes in lighting, background, and hand pose.</p></sec><sec><title>Purpose of research</title><p>Purpose of research. The aim of this work is to build and train a convolutional neural network (CNN) for efficient gesture classification based on the Sign Language MNIST dataset. The study addressed the problems of data preprocessing, model architecture design, training, and recognition quality assessment on the test set.</p></sec><sec><title>Methods</title><p>Methods. TensorFlow and Keras libraries were used to implement the CNN. The model includes convolutional layers for local feature extraction, a Flatten layer for vectorization, fully connected layers with a ReLU activation function, and an output layer with Softmax. The training was performed using the Adam optimizer and the sparse_categorical_crossentropy loss function on 27,455 images, and testing was performed on 7,172 examples.</p></sec><sec><title>Results</title><p>Results. The proposed model achieved 89.14% accuracy on the test dataset after 18 training epochs, which outperforms traditional methods (HOG + SVM - 70.1%) and simple neural networks (78.4%).</p></sec><sec><title>Conclusion</title><p>Conclusion. 
The use of convolutional neural networks for gesture classification is an effective approach that provides high accuracy and is robust to variations in input data, making it promising for computer vision and gesture interaction systems.</p></sec></trans-abstract><kwd-group xml:lang="ru"><kwd>нейронная сеть</kwd><kwd>сверточная нейронная сеть</kwd><kwd>полносвязный слой</kwd><kwd>функция активации</kwd><kwd>функция потерь</kwd><kwd>Sign Language MNIST</kwd><kwd>CPU</kwd><kwd>GPU</kwd></kwd-group><kwd-group xml:lang="en"><kwd>neural network</kwd><kwd>convolutional neural network</kwd><kwd>fully connected layer</kwd><kwd>activation function</kwd><kwd>loss function</kwd><kwd>Sign Language MNIST</kwd><kwd>CPU</kwd><kwd>GPU</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Gradient-based learning applied to document recognition / Y. LeCun, L. Bottou, Y. Bengio, P. Haffner // Proceedings of the IEEE. 1998. № 86(11). P. 2278–2324.</mixed-citation><mixed-citation xml:lang="en">LeCun Y., Bottou L., Bengio Y., Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE. 1998; 86(11): 2278-2324.</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Krizhevsky A., Sutskever I., Hinton G. ImageNet Classification with Deep Convolutional Neural Networks // Advances in Neural Information Processing Systems. 2012. Vol. 25. P. 1097–1105.</mixed-citation><mixed-citation xml:lang="en">Krizhevsky A., Sutskever I., Hinton G. ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems. 2012; 25: 1097-1105.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Воронцов К. В. 
Машинное обучение и анализ данных // Труды международной научной конференции "Нейроинформатика". М.: МФТИ, 2020. 452 с.</mixed-citation><mixed-citation xml:lang="en">Vorontsov K. V. Machine learning and data analysis. In: Trudy mezhdunarodnoi nauchnoi konferentsii "Neiroinformatika" = Proceedings of the international scientific conference "Neuroinformatics". Moscow; 2020. 452 p. (In Russ.).</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Петров И. В., Смирнов А. А. Применение сверточных нейронных сетей для классификации изображений в задачах компьютерного зрения // Искусственный интеллект и принятие решений. 2021. № 2. С. 45-58.</mixed-citation><mixed-citation xml:lang="en">Petrov I. V., Smirnov A. A. Application of convolutional neural networks for image classification in computer vision problems. Iskusstvennyi intellekt i prinyatie reshenii = Artificial Intelligence and Decision Making. 2021; (2): 45-58. (In Russ.).</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Китенко А. М. Метод поиска и разметки артефактов на изображениях с использованием алгоритмов детекции и сегментации // Системы анализа и обработки данных. 2021. № 4(84). С. 7-18.</mixed-citation><mixed-citation xml:lang="en">Kitenko A. M. Method for searching and marking artifacts in images using detection and segmentation algorithms. Sistemy analiza i obrabotki dannykh = Data analysis and processing systems. 2021; (4): 7-18. (In Russ.).</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Robust Hand Gesture Recognition Using HOG-9ULBP Features and SVM Model / J. Li, C. Li, J. Han, et al. // Electronics. 2022. Vol. 11(7). P. 988.</mixed-citation><mixed-citation xml:lang="en">Li J., Li C., Han J., et al. 
Robust Hand Gesture Recognition Using HOG-9ULBP Features and SVM Model. Electronics. 2022; 11 (7): 988.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Козлов С. В., Иванова Е. П. Сравнительный анализ архитектур глубоких нейронных сетей для распознавания образов // Программные продукты и системы. 2022. № 3. С. 28-36.</mixed-citation><mixed-citation xml:lang="en">Kozlov S. V., Ivanova E. P. Comparative analysis of deep neural network architectures for pattern recognition. Programmnye produkty i sistemy = Software products and systems. 2022; (3): 28-36. (In Russ.).</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Simonyan K., Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition // International Conference on Learning Representations (ICLR). 2015. arXiv:1409.1556.</mixed-citation><mixed-citation xml:lang="en">Simonyan K., Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition. International Conference on Learning Representations (ICLR). 2015; arXiv:1409.1556.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Kumar R., Patel S., Sharma M. Enhancing Sign Language Detection through MediaPipe and Convolutional Neural Networks // arXiv preprint. 2024. arXiv:2406.03729v1.</mixed-citation><mixed-citation xml:lang="en">Kumar R., Patel S., Sharma M. Enhancing Sign Language Detection through MediaPipe and Convolutional Neural Networks. arXiv preprint, 2024, arXiv:2406.03729v1.</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Ioffe S., Szegedy C. 
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift // Proceedings of the 32nd International Conference on Machine Learning (ICML). 2015. P. 448-456.</mixed-citation><mixed-citation xml:lang="en">Ioffe S., Szegedy C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of the 32nd International Conference on Machine Learning (ICML), 2015. P. 448-456.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Семенов Д. А., Кузнецов М. И. Оптимизация процесса обучения сверточных нейронных сетей с использованием адаптивных алгоритмов // Информационные технологии. 2023. Т. 29, № 4. С. 195-203.</mixed-citation><mixed-citation xml:lang="en">Semenov D. A., Kuznetsov M. I. Optimization of the Training Process of Convolutional Neural Networks Using Adaptive Algorithms. Informatsionnye tekhnologii = Information Technologies. 2023; 29(4): 195-203. (In Russ.).</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Nair V., Hinton G. E. Rectified Linear Units Improve Restricted Boltzmann Machines // Proceedings of the 27th International Conference on Machine Learning (ICML). 2010. P. 807-814.</mixed-citation><mixed-citation xml:lang="en">Nair V., Hinton G. E. Rectified Linear Units Improve Restricted Boltzmann Machines. Proceedings of the 27th International Conference on Machine Learning (ICML), 2010. P. 807-814.</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Deep Residual Learning for Image Recognition / K. He, X. Zhang, S. Ren, J. Sun // IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016. P. 770-778.</mixed-citation><mixed-citation xml:lang="en">He K., Zhang X., Ren S., Sun J. Deep Residual Learning for Image Recognition. 
IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016. P. 770-778.</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">Kingma D. P., Ba J. Adam: A Method for Stochastic Optimization // International Conference on Learning Representations (ICLR). 2015. arXiv:1412.6980.</mixed-citation><mixed-citation xml:lang="en">Kingma D. P., Ba J. Adam: A Method for Stochastic Optimization. International Conference on Learning Representations (ICLR), 2015, arXiv:1412.6980.</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Dropout: A Simple Way to Prevent Neural Networks from Overfitting / N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov // Journal of Machine Learning Research. 2014. Vol. 15, № 1. P. 1929-1958.</mixed-citation><mixed-citation xml:lang="en">Srivastava N., Hinton G., Krizhevsky A., Sutskever I., Salakhutdinov R. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research. 2014; 15(1): 1929-1958.</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">Going deeper with convolutions / C. Szegedy, W. Liu, Y. Jia, et al. // IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2015. P. 1-9.</mixed-citation><mixed-citation xml:lang="en">Szegedy C., Liu W., Jia Y., et al. Going deeper with convolutions. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2015. P. 1-9.</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">Фаворская М. Н., Пахирка А. И. Построение карт глубины при обнаружении презентационных атак в системах распознавания лиц // Информационные и математические технологии в науке и управлении. 2022. № 3(27). С. 
40-48.</mixed-citation><mixed-citation xml:lang="en">Favorskaya M. N., Pakhirka A. I. Construction of depth maps for detection of presentation attacks in face recognition systems. Informatsionnye i matematicheskie tekhnologii v nauke i upravlenii = Information and mathematical technologies in science and management. 2022; (3): 40-48. (In Russ.).</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">Sign Language Transformers: Joint End-to-End Sign Language Recognition and Translation / N. C. Camgoz, O. Koller, S. Hadfield, R. Bowden // IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020. P. 10023-10033.</mixed-citation><mixed-citation xml:lang="en">Camgoz N. C., Koller O., Hadfield S., Bowden R. Sign Language Transformers: Joint End-to-End Sign Language Recognition and Translation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020. P. 10023-10033.</mixed-citation></citation-alternatives></ref><ref id="cit19"><label>19</label><citation-alternatives><mixed-citation xml:lang="ru">Исследование устройства нечеткого цифрового фильтра для робота-манипулятора / М.В. Бобырь, Н.А. Милостная, В.А. Булатников, М.Ю. Лунева // Известия Юго-Западного государственного университета. 2020. Т. 24, № 1. С. 115-129. https://doi.org/10.21869/2223-1560-2020-24-1-115-129</mixed-citation><mixed-citation xml:lang="en">Bobyr M. V., Milostnaya N. A., Bulatnikov V. A., Luneva M. Yu. Fuzzy Digital Filter Device Study for the Robot Manipulator. Izvestiya Yugo-Zapadnogo gosudarstvennogo universiteta = Proceedings of the Southwest State University. 2020; 24(1): 115-129. (In Russ.). https://doi.org/10.21869/2223-1560-2020-24-1-115-129</mixed-citation></citation-alternatives></ref><ref id="cit20"><label>20</label><citation-alternatives><mixed-citation xml:lang="ru">Бобырь М. В., Нассер А. А., Абдулджаббар М. А. 
Исследование свойств мягкого алгоритма нечетко-логического вывода // Известия Юго-Западного государственного университета. 2016. № 1. С. 31-49.</mixed-citation><mixed-citation xml:lang="en">Bobyr M. V., Nasser A. A., Abduljabbar M. A. Study of the properties of a soft algorithm for fuzzy-logical inference. Izvestiya Yugo-Zapadnogo gosudarstvennogo universiteta = Proceedings of the Southwest State University. 2016; (1): 31-49. (In Russ.).</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
