The main difficulty in automating the retrieval of objects from an enterprise's heterogeneous distributed information bases is unifying disparate content that is presented from different points of view and stored under different data-organization paradigms. The article formulates the problem of developing graphematic analysis for recognizing images of technical documentation and converting graphic information into machine-readable form. It describes in detail the mechanisms needed to solve this problem (stop-word removal, stemming, and lemmatization) and develops an algorithm for searching text structures using templates. The article proposes implementing the graphematic analysis algorithm as the first module in the automatic processing of natural-language texts, which makes it possible to isolate semantically significant constructions from semi-structured resources using special graphematic descriptors. The proposed implementation can isolate complex natural-language structures such as direct speech, and can detect and replace abbreviations.
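The preprocessing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abbreviation dictionary, the stop-word list, the suffix list, the descriptor names (`WORD`, `NUMBER`, `PUNCT`), and every function name below are assumptions made for illustration, and the suffix stripper is a toy stand-in for a real stemmer or lemmatizer.

```python
import re

# Hypothetical resources; the abstract does not list the actual dictionaries.
ABBREVIATIONS = {"fig.": "figure", "no.": "number", "approx.": "approximately"}
STOP_WORDS = {"the", "a", "an", "of", "in", "is", "to", "and"}
SUFFIXES = ("ing", "ed", "es", "s")  # toy suffix list, not a real stemmer

# Template for one text structure: direct speech in straight double quotes.
DIRECT_SPEECH_RE = re.compile(r'"([^"]+)"')


def expand_abbreviations(text):
    """Replace known abbreviations (a word followed by a dot) with full forms."""
    def repl(match):
        token = match.group(0)
        return ABBREVIATIONS.get(token.lower(), token)
    return re.sub(r"\b[A-Za-z]+\.", repl, text)


def tokenize(text):
    """Split text into words, numbers, and single punctuation marks."""
    return re.findall(r"[A-Za-z]+|\d+(?:\.\d+)?|[^\sA-Za-z\d]", text)


def describe(token):
    """Assign an assumed graphematic descriptor to a token."""
    if token.isalpha():
        return "WORD"
    if re.fullmatch(r"\d+(?:\.\d+)?", token):
        return "NUMBER"
    return "PUNCT"


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Expand abbreviations, drop stop words and non-words, then stem."""
    tokens = tokenize(expand_abbreviations(text))
    words = [t.lower() for t in tokens if describe(t) == "WORD"]
    return [stem(w) for w in words if w not in STOP_WORDS]


def find_direct_speech(text):
    """Return the quoted fragments matched by the direct-speech template."""
    return DIRECT_SPEECH_RE.findall(text)


if __name__ == "__main__":
    sample = 'The operator said "check valve no. 5" and closed Fig. 2.'
    print(find_direct_speech(sample))
    print(preprocess(sample))
```

Running abbreviation expansion before tokenization keeps the trailing dot of an abbreviation from being mistaken for a sentence boundary, which is one reason a graphematic pass is usually placed first in a text-processing pipeline.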
22 June 2022
PROCEEDINGS OF THE II INTERNATIONAL CONFERENCE ON ADVANCES IN MATERIALS, SYSTEMS AND TECHNOLOGIES: (CAMSTech-II 2021)
29–31 July 2021
Krasnoyarsk, Russian Federation
Research Article
Development of the mechanism for graphical analysis formation for pattern recognition of technical documentation and converting graphic information into machine-readable form
Anastasia Petrushevskaya a)
Saint-Petersburg State University of Aerospace Instrumentation (SUAI), ul. Bolshaya Morskaya, 67, lit. A, St. Petersburg, 190000, Russia
Alexey Rabin b)
Saint-Petersburg State University of Aerospace Instrumentation (SUAI), ul. Bolshaya Morskaya, 67, lit. A, St. Petersburg, 190000, Russia
b) Corresponding author: [email protected]
AIP Conf. Proc. 2467, 040026 (2022)
Citation
Anastasia Petrushevskaya, Alexey Rabin; Development of the mechanism for graphical analysis formation for pattern recognition of technical documentation and converting graphic information into machine-readable form. AIP Conf. Proc. 22 June 2022; 2467 (1): 040026. https://doi.org/10.1063/5.0094467