Use of a neural network in creating a digital assistant for blind and visually impaired people
https://doi.org/10.26425/2658-3445-2022-5-3-73-82
Abstract
Ongoing research in image processing shows considerable scope for developing new neural networks that can help people with a wide range of tasks. The authors chose to focus on assisting people with visual impairments. The article considers a convolutional neural network based on the Mask R-CNN model for segmenting objects in an image. In the course of the research, the authors studied a large number of algorithms that can process images quickly and accurately, such as Faster R-CNN, which was the most efficient in 2020. The analysis showed that using Mask R-CNN can significantly increase the efficiency of performing tasks, since it is the most recent model of those examined. As a result of the study, a neural network was developed that is capable of identifying and distinguishing a large number of objects in an image. The next step is to refine the algorithm and use additional means of interaction with the system hardware to increase the speed of the neural network. In the future, the resulting neural network will be integrated into the Digital Assistant for the Blind and Visually Impaired Persons application. This application is expected to improve the daily life of people with visual impairments, and can become the basis for other, larger projects related, for example, to unmanned vehicles, as well as services whose work is directly based on image processing.
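As an illustration only, the following minimal sketch shows how the output of an instance segmentation model such as Mask R-CNN (a label, a confidence score and a binary mask per object) might be turned into a short description for an assistant application. It is not the authors' implementation: the function names `mask_area` and `describe_detections`, the dictionary layout, the 0.7 threshold and the sample data are all hypothetical assumptions.

```python
def mask_area(mask):
    """Count the pixels set to 1 in a binary mask given as a list of rows."""
    return sum(sum(row) for row in mask)


def describe_detections(detections, min_score=0.7):
    """Build a short description from Mask R-CNN-style detections.

    Each detection is assumed (hypothetically) to be a dict with the keys
    "label", "score" and "mask"; min_score is an illustrative threshold.
    """
    # Keep only detections the model is reasonably confident about.
    kept = [d for d in detections if d["score"] >= min_score]
    if not kept:
        return "No objects recognized."
    # List objects with larger masks first, since they occupy more of the image.
    kept.sort(key=lambda d: mask_area(d["mask"]), reverse=True)
    labels = [d["label"] for d in kept]
    return "Detected, largest first: " + ", ".join(labels) + "."


# Hypothetical sample data: tiny 2x3 masks stand in for full-size ones.
sample = [
    {"label": "chair", "score": 0.85, "mask": [[0, 1, 1], [0, 0, 0]]},
    {"label": "person", "score": 0.98, "mask": [[1, 1, 1], [1, 1, 0]]},
    {"label": "dog", "score": 0.40, "mask": [[1, 1, 1], [1, 1, 1]]},
]

print(describe_detections(sample))
```

In this sketch the low-confidence "dog" detection is discarded before the remaining objects are ordered by mask area; a real application would receive labels, scores and masks from the trained network and would likely pass the resulting text to a speech synthesizer.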
About the Authors
S. O. Plotnikov
Russian Federation
Sergey O. Plotnikov - Student
Moscow
D. Yu. Smetanin
Russian Federation
Dmitry Yu. Smetanin - Student
Moscow
A. V. Basova
Russian Federation
Angelika V. Basova - Student
Moscow
I. A. Lvutin
Russian Federation
Ilya A. Lvutin - Student
Moscow
M. N. Belousova
Russian Federation
Maria N. Belousova - Cand. Sci. (Econ.), Assoc. Prof. at the Information Systems Department
Moscow
For citations:
Plotnikov S.O., Smetanin D.Yu., Basova A.V., Lvutin I.A., Belousova M.N. Use of a neural network in creating a digital assistant for blind and visually impaired people. E-Management. 2022;5(3):73-82. (In Russ.) https://doi.org/10.26425/2658-3445-2022-5-3-73-82