Integrated system of intelligent analysis of information as a way to develop the digital organization

Analysis of data mining systems and of the development of document management systems. Development of an integrated system for intelligent information analysis. A model for bringing statistical data of any format into a common source for a digital organization.

Category: Programming, computers and cybernetics
Type: article
Language: English
Date added: 01.02.2024
File size: 1.6 MB

Posted at http://www.allbest.ru/

Integrated system of intelligent analysis of information as a way to develop the digital organization

Iurii Nikitin, Doctor of Technical Sciences, Leading Researcher,

The Institute for Superhard Materials of NAS Ukraine

Tatyana Sydorenko, Graduate student

The State University of Infrastructure and Technology

Abstract

Based on an analysis of systems of intelligent data analysis, of the intellectual analysis of texts and of the development of document management systems, a model is scientifically substantiated and proposed for bringing statistical data of any format into a common source, an information platform. The platform is created for data management, analysis and monitoring within an integrated system of intelligent information analysis that supports the data management organization of any business entity. Information platforms for data management are implemented in accordance with the requirements of current legislation and with the preparation of relevant provisions or procedures for information interaction between the various user entities.

Keywords: intellectual data analysis, intellectual analysis of texts, document management, digital organization, model, integrated system.

Introduction

Formulation of the problem. The organizations of business entities of any kind are the main elements of innovation systems at the global, national and local (regional) scales. Open innovation practices and digitalization are becoming the key approaches to developing a new type of organization: the digital data management organization of a business entity.

The system of digital data analysis is one of the key mechanisms that can support the activities of such organizations and ensure their development into digital organizations.

Data mining looks for patterns and correlations in large volumes of data and turns them into valuable knowledge (analytics). This knowledge helps to define the requirements for organizing data management and to improve the organization's work processes, competitive advantages, decision-making, strategic planning and forecasting of future trends.

Modern requirements of dynamic changes in the external environment force organizations to integrate external and internal structured and unstructured data and metadata, creating new knowledge that becomes the assets of the organizations in the form of new analytical documents and results of research.

However, the growing number of documents and the growing amount of information added to databases make it difficult to find the necessary information and to manage it in organizations. Intelligent data analysis can significantly accelerate the search for, research into and acquisition of new knowledge. To date, there is a lack of research on how organizations implement and integrate data mining and text mining into their information and document management landscape at different organizational levels, and most existing data mining and text mining systems are not integrated. These circumstances determine the relevance and importance of this study, which concerns the development of a system of intellectual analysis of information on the way to the formation and development of a digital organization.

Ukraine has taken many steps towards introducing electronic document flow in many sectors of the economy. In particular, the qualified electronic signature is actively used: almost all reporting submitted by a business entity is signed with one. Electronic document management in Ukraine is regulated by the Laws of Ukraine "On electronic trust services", "On electronic documents and electronic document management" and "On electronic communications". According to these laws, electronic documents have the same legal force as their paper counterparts; that is, in tax audits, court hearings and similar cases it is enough to provide documents in electronic form.

For most business entities, electronic document management is a convenient and effective part of their business processes. Business entities create electronic offices and electronic services, and almost all of them use official electronic mailboxes and comply with the requirements of electronic correspondence. This is a key factor in reducing costs, reducing the burden on employees and optimizing many technological processes, which increases business efficiency and improves the activities of economic entities.

Introducing digital processes into the activities of an organization makes it possible to eliminate redundant work processes and to develop the internal work of economic entities using digital tools.

Analysis of recent research and publications

Today, data is undoubtedly a valuable resource for organizations. Organizations increasingly see data as an asset that will help them succeed now and in the future. Over the past two decades, more and more companies have moved the data processing function into a separate department as internal data warehouses have grown and data-related tasks have become more differentiated and specialized [1].

The process of data mining first sorts large data sets, then identifies patterns and establishes relationships in order to perform data analysis and solve problems [2]. The field of application of intelligent data analysis is very wide [3-8].

The volume of raw data is growing rapidly every year. New technologies have created large amounts of structured and unstructured data in available databases and repositories [9, 10].

Structured data refers to data stored in transactional databases or data warehouses. Unstructured data mainly refers to textual data (reports, articles, project proposals, emails, business memos, etc.) because it is not as easy to process or manage as numerical data stored in a relational database. In this sense, intelligent data analysis includes intelligent text analysis [11, 12].

More often, however, data mining and text mining are seen as two different concepts: data mining is concerned with discovering knowledge only from structured data, while text mining is concerned with discovering knowledge from unstructured textual data. The term "intellectual analysis of information" is used to denote the integration of data analysis and text analysis to prepare and provide analytics for decision-making [13].
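The integration of data analysis and text analysis described above can be illustrated with a minimal sketch. It assumes a hypothetical set of project records, each with one structured field (`budget`) and one unstructured field (`report`). The record layout, the field names and the simple term-matching step are illustrative assumptions and are not taken from the approach in [13].

```python
import re
from collections import Counter
from statistics import mean

# Hypothetical records: a structured field (budget) plus unstructured text (report).
RECORDS = [
    {"id": 1, "budget": 120.0, "report": "Delay in supply. Delay in testing."},
    {"id": 2, "budget": 80.0, "report": "Testing completed on schedule."},
    {"id": 3, "budget": 200.0, "report": "Supply delay caused cost overrun."},
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(records):
    """Text analysis step: count how many records mention each term."""
    counts = Counter()
    for record in records:
        counts.update(set(tokenize(record["report"])))
    return counts


def mean_budget_by_term(records, term):
    """Integration step: relate a text feature to a structured field.

    Returns the mean budget of records that mention the term and of
    records that do not. Both groups are assumed to be non-empty.
    """
    with_term = [r["budget"] for r in records if term in tokenize(r["report"])]
    without_term = [r["budget"] for r in records if term not in tokenize(r["report"])]
    return mean(with_term), mean(without_term)


if __name__ == "__main__":
    print(term_counts(RECORDS)["delay"])
    print(mean_budget_by_term(RECORDS, "delay"))
```

In this toy data set, the text feature (a mention of "delay") splits the structured field into two groups whose averages can be compared, which is the kind of combined evidence neither analysis gives on its own.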

The rise of remote work is forcing many organizations to look for a viable document management system (DMS). Although document management systems differ in effectiveness, their functions are quite similar. However, the software of a document management system needs appropriate adaptation. An important direction in the development of document management systems is the use of intelligent document processing, which opens up new opportunities for further automation of the document management of organizations [14-16].

The methods of intelligent data analysis are becoming an integral part of the activities of organizations in order to identify patterns, associations, anomalies and statistically significant structures and events in data. The growth of computing power has led to a combination of experiments and computer simulations [17-19].

The dynamic changes in the external environment force organizations, on the one hand, to track structured and unstructured data in external databases and data warehouses and, on the other hand, to form, control, store and manage every part of their internal data. Integrating external and internal structured and unstructured data and metadata creates connections between sets of raw data. This produces new knowledge that becomes an organizational asset in the form of new analytical documents, which can be managed by applying document management approaches on the way to the development of digital organizations.

Unresolved parts of a common problem

Published work shows that the number of studies on the application of intelligent data analysis and intelligent text analysis in organizations has been growing rapidly in recent years. However, to date, there is not enough research on how organizations implement and integrate the intelligent analysis of information and document management into the landscape of their digital information systems at different organizational levels, and most of the existing systems of intelligent data analysis and intelligent text analysis are not integrated. These circumstances determine the relevance and importance of this study, which concerns the development of an integrated system of intellectual analysis of information on the way to the formation and development of a digital organization for various business entities.

The purpose of this article is to scientifically justify and develop an integrated system of intellectual analysis of information on the way to the formation and development of a digital organization for various business entities.

Research methodology

The methodological approach of the study is based on a set of analytical methods: theoretical generalization, expert evaluation and synthesis.

The methodological approach to forming an integrated system of intellectual analysis of information and document management for the organizations of various business entities is based on the application of:

* standard process of intelligent data analysis, namely: "definition of tasks"; "data collection and sorting of large data sets"; "identification of regularities - data preparation"; "data analysis - modeling"; "modeling evaluation"; "use of data analysis";

* the process of new knowledge discovery based on intelligent data analysis, namely: data generation from web databases and repositories; choice of algorithm; data collection and processing; choice of a machine learning model; determination of focused goals and areas; forecasting of new knowledge;

* the standard process of document flow management, namely: search and collection of documents; the process of capturing and converting documents into digital documents; sorting and storage of digital documents in databases, distribution of digital documents;

* analytical models: descriptive (description of trends in existing data), diagnostic (diagnosis of why a result or certain events occurred), predictive (forecasting based on results);

* machine learning models that use different algorithms to predict accurate results;

* factorial and statistical analysis methods, clustering methods, knowledge discovery methods and decision tree algorithms and association rules;

* hybrid organizational structure of the management group;

* component architectures (internal databases and data repositories; external databases and data repositories; databases and repositories of new knowledge; local and cloud servers; evaluation subsystem modules; user interfaces).
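One of the method families listed above, association rules, can be sketched in a few lines. The sketch below assumes hypothetical "transactions", each being the set of tags attached to one document. The tag names and the thresholds are illustrative assumptions, and the exhaustive search over pairs is a simplification rather than a full algorithm such as Apriori.

```python
from itertools import combinations

# Hypothetical transactions: the set of tags attached to each document.
TRANSACTIONS = [
    {"contract", "invoice", "supplier"},
    {"contract", "invoice"},
    {"invoice", "report"},
    {"contract", "invoice", "report"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """List rules lhs -> rhs over item pairs that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            if s < min_support:
                continue
            c = confidence({lhs}, {rhs}, transactions)
            if c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules


if __name__ == "__main__":
    for lhs, rhs, s, c in pair_rules(TRANSACTIONS):
        print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

Support measures how common a combination is, and confidence measures how reliably the right-hand side follows the left-hand side; both thresholds would need to be tuned to the organization's data.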

Statement of the main material

Based on the analysis of approaches, methods and systems of intellectual data analysis, of the intellectual analysis of texts, of the application of intellectual analysis to the discovery of new knowledge, and of the document management approaches of organizations [1-19], a model (Fig. 1) and an integrated system of intellectual analysis of information (Fig. 2) are proposed for the formation of a digital organization.

The model is based on three subsystems: a subsystem of intelligent information analysis, a subsystem of forecasting new knowledge and a subsystem of document circulation. All three subsystems rest on a digital platform for intelligent data analysis and document management, which is linked to cloud databases, web databases, online data repositories and metadata. Users interact with the digital platform through a user interface (Fig. 1).

Fig. 1. Model of the integrated system of data mining and document management for the organizations of various business entities.

Based on the developed model, the integrated system of intellectual analysis of information for the organizations of various business entities is substantiated. It includes a subsystem of intellectual analysis of information (data, texts, documents), a subsystem of new knowledge discovery based on intelligent data analysis, and a subsystem of document management of the organization (Fig. 2).

The system is integrated by combined software that brings together various types of external databases and metadata, as well as the internal databases of the organization, on an analytical platform. The platform includes a document management subsystem, a data mining and text mining subsystem, and a subsystem of new knowledge discovery.

Fig. 2. Integrated system of intellectual analysis of information for the organizations of various business entities.

The main elements of the subsystem of intellectual analysis of information: analytical group of intellectual analysis of information; local and cloud servers; external databases, repositories and metadata, collections (file systems, search indexes, etc.); interface.

The main elements of the organization's document management subsystem: the organization's document flow management group; local and cloud servers; internal databases, repositories and collections (file systems, search indexes, etc.); interface. The document management subsystem provides: cloud access to all documents from any device; inputting documents from multiple sources and providing different ways to transfer documents to the platform; control of documents by version and time; security (encryption during transmission and storage); intelligent management; extended document indexing; print on demand.
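Two of the functions listed above, control of documents by version and time and extended document indexing, can be illustrated with a minimal in-memory sketch. The class name `DocumentStore`, its methods and the word-level index are hypothetical and chosen only for illustration; a production document management system would add persistence, access control and encryption.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Version:
    number: int
    text: str
    saved_at: datetime


@dataclass
class DocumentStore:
    """Hypothetical in-memory store with version control and a search index."""

    versions: dict = field(default_factory=lambda: defaultdict(list))
    index: dict = field(default_factory=lambda: defaultdict(set))

    def save(self, doc_id, text):
        """Append a new timestamped version and re-index the document."""
        history = self.versions[doc_id]
        version = Version(len(history) + 1, text, datetime.now(timezone.utc))
        history.append(version)
        # Drop stale index entries, then index the latest text only.
        for doc_ids in self.index.values():
            doc_ids.discard(doc_id)
        for token in set(re.findall(r"\w+", text.lower())):
            self.index[token].add(doc_id)
        return version.number

    def latest(self, doc_id):
        """Return the most recent version of a stored document."""
        return self.versions[doc_id][-1]

    def search(self, term):
        """Return ids of documents whose latest version contains the term."""
        return sorted(self.index.get(term.lower(), set()))


if __name__ == "__main__":
    store = DocumentStore()
    store.save("a", "Draft contract")
    store.save("a", "Signed contract")
    store.save("b", "Invoice draft")
    print(store.latest("a").number, store.search("draft"))
```

Earlier versions stay in the history, so a document can be audited or rolled back, while the index always reflects the current text.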

The main elements of the subsystem of new knowledge discovery: analytical group of new knowledge discovery; application programming interface; external web databases, online data and metadata repositories; algorithm; machine learning model.

All three subsystems of the integrated system of intellectual analysis of information are connected to the internal local and cloud databases of the organization and to external cloud web databases and metadata.

The general architecture of the integrated system of intellectual analysis of information and document management for the organizations of various business entities includes the following components:

* Internal databases and data warehouses; external databases and data repositories; databases and repositories of new knowledge;

* Local and cloud servers (contain data that is ready for processing); evaluation subsystem modules (search for regularities); user interface (displays the result in an understandable form);

* Data repositories, as part of data sources in the form of plain text, spreadsheets or multimedia.

The integration scheme of the proposed architecture tightly couples all databases and storages at three levels (data level, transformation level, user interface) and can integrate web technologies, information search, data analysis, computer graphics, image analysis, etc.
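The three levels named above can be sketched as three small groups of functions. The two sources, their formats (CSV and JSON), the field names and the shared schema are hypothetical and serve only to show how records of different formats reach one interface.

```python
import csv
import io
import json

# Data level: two hypothetical sources in different formats.
INTERNAL_CSV = "name,amount\nAlpha,10\nBeta,15\n"
EXTERNAL_JSON = '[{"title": "Gamma", "value": 7.5}]'


def load_internal(text):
    """Data level: read rows from an internal CSV export."""
    return list(csv.DictReader(io.StringIO(text)))


def load_external(text):
    """Data level: read records from an external JSON feed."""
    return json.loads(text)


def to_common(record, source, name_key, value_key):
    """Transformation level: map a source record to one shared schema."""
    return {
        "source": source,
        "name": record[name_key],
        "value": float(record[value_key]),
    }


def integrate():
    """Combine both sources into a single list of records in the shared schema."""
    rows = [to_common(r, "internal", "name", "amount") for r in load_internal(INTERNAL_CSV)]
    rows += [to_common(r, "external", "title", "value") for r in load_external(EXTERNAL_JSON)]
    return rows


def render(rows):
    """User interface level: present the integrated rows as plain text."""
    return "\n".join(f"{r['source']:<9}{r['name']:<6}{r['value']:>6.1f}" for r in rows)


if __name__ == "__main__":
    print(render(integrate()))
```

Keeping the mapping to the shared schema in one place means that a new source only needs a loader and a pair of key names, while the interface code stays unchanged.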

The integration management system is based on standards for the unification of interfaces (CWM, JDM, OLE DB), for the storage and transfer of data mining models (PMML), for the organization of the data mining process as a whole (CRISP-DM) and for metadata exchange between various software products and repositories (CWM), as well as on document management standards (ISO 19475:2021, ISO/TS 19475-1:2018, ISO 9001).

The integrated management system of intelligent information analysis and document management has the following components:

* Management platform that integrates three subsystems (the document management subsystem of the organization, the subsystem of intellectual analysis of information, the subsystem of new knowledge discovery);

* External databases and repositories of data and metadata;

* Databases and repositories of existing data of the organization;

* Databases and repositories of new knowledge (selection and classification of new knowledge).

The platform of intellectual analysis of information and document flow of the organization includes: an interface for the document management of the organization, an interface for the intellectual analysis of information, and an interface for new knowledge discovery.

The management of the integrated system of intelligent information analysis and document management is provided by:

* The team of the document management subsystem, which includes four specialists: the main manager of document management, a programmer manager (maintains software and infrastructure components), a security and user manager (responsible for document and data access level operations), a content manager (responsible for access to the directory of files and templates of documents and data);

* The team of the intellectual analysis subsystem, which includes six specialists: head of analytics, analyst (responsible for the interpretation of data and text), specialist (uses methods of intellectual analysis of data and of the text of documents), information architect (works with large volumes of data and texts, determines database and storage architecture), information engineer (responsible for collecting and converting data and document texts), software engineer (maintains software, tests and maintains infrastructure components);

* The team of the subsystem of new knowledge discovery, which includes four specialists: team leader, analyst, machine learning engineer and a modeling and calculation specialist.

The process of functioning of the proposed system at the level of subsystems:

* processing of requests for document management (search; selection; capture; storage; editing; verification; integration; indexing; content indexes; storage; archiving; printing and distribution);

* intellectual analysis of information (data identification; data collection; data preparation, that is, cleaning, transformation and reduction of data; data processing; identifying patterns in data; evaluating patterns; processing a collection of texts; obtaining new information from a collection of texts; presenting knowledge in analytical documents: descriptive, diagnostic, prognostic, complex);

* intelligent prediction for new knowledge discovery (generation of defined data from online (web) databases, documents and information; algorithm selection; data processing; generation of data focused on a specific area; choosing a machine learning model; collection and processing of data in a specific area; calculations and testing; assessment of new knowledge in a specific area).
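The data preparation step named in the list above (cleaning, transformation and reduction of data) can be sketched as follows. The sample records, the column names and the choice of a z-score as the transformation are assumptions made for illustration only.

```python
from statistics import mean, pstdev

# Hypothetical raw records with a missing value and an exact duplicate.
RAW = [
    {"unit": "A", "docs": 10, "hours": 5.0},
    {"unit": "B", "docs": 20, "hours": None},
    {"unit": "A", "docs": 10, "hours": 5.0},
    {"unit": "C", "docs": 30, "hours": 9.0},
]


def clean(records):
    """Cleaning: drop records with missing values and exact duplicates."""
    seen, result = set(), []
    for record in records:
        key = tuple(sorted(record.items()))
        if None in record.values() or key in seen:
            continue
        seen.add(key)
        result.append(record)
    return result


def transform(records, column):
    """Transformation: add a z-score of the chosen numeric column."""
    values = [r[column] for r in records]
    mu, sigma = mean(values), pstdev(values)
    return [
        {**r, f"{column}_z": (r[column] - mu) / sigma if sigma else 0.0}
        for r in records
    ]


def reduce_columns(records, keep):
    """Reduction: keep only the columns needed for the analysis."""
    return [{k: r[k] for k in keep} for r in records]


if __name__ == "__main__":
    prepared = reduce_columns(transform(clean(RAW), "hours"), ["unit", "hours_z"])
    print(prepared)
```

Each step returns new records instead of modifying its input, so the raw data remain available for verification after the pipeline has run.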

To form an integrated system of intellectual analysis of information and document management for the organizations of various business entities, a hybrid organizational structure of the management group is proposed. It includes a centralized data processing group and members appointed from among the employees of units, who help the units achieve their goals. The centralized data processing group consists of a data architect, a data engineer, a data analyst, a data specialist and a machine learning engineer. The chief analyst manages the actions of the centralized management group and of its members from among the employees of the divisions.

Staff of the centralized management group should have the following competencies:

* Chief analyst (implements data science functions, has experience in the subject area, leadership and strategic abilities);

* Data architect (works with large volumes of data and determines the architecture of databases from various information sources);

* Data engineer (tests and maintains infrastructure components that are developed by the data architect);

* Data analyst (responsible for data collection and interpretation);

* Data specialist (solves tasks using methods of machine learning and intelligent data analysis);

* Machine learning engineer (combines software development and modeling skills).

The centralized management group, together with members from among the employees of units, works in three groups: the analytical group of intellectual analysis of information, the analytical group of new knowledge discovery, and the document flow management group.

Users of the integrated system: managers, staff, customers, partners, investors, interested parties. Through a graphical interface, users can access:

* Data and metadata (data marketing, data libraries, data publications, projects and reports, business activity data, investment activity data, financial planning data, accounting data, inventory control data, personnel management data);

* Analytical information (online analytics, complex analytics, graphic analytics, descriptive analytics, diagnostic analytics, predictive analytics and forecast analytics).
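Three of the analytics types listed above can be illustrated on a toy example. The two monthly series below are invented for the illustration, and the specific techniques (summary statistics, Pearson correlation and a least-squares line) are one possible choice for the descriptive, diagnostic and predictive views, not the only one.

```python
from statistics import mean

# Hypothetical monthly series: documents processed and staff hours spent.
DOCS = [100, 120, 140, 160, 180]
HOURS = [52, 61, 69, 81, 90]


def describe(values):
    """Descriptive analytics: summarize what the existing data look like."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def pearson(xs, ys):
    """Diagnostic analytics: strength of the linear link between two series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Predictive analytics: least-squares line y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


if __name__ == "__main__":
    slope, intercept = fit_line(DOCS, HOURS)
    print(describe(HOURS))
    print(round(pearson(DOCS, HOURS), 3))
    print(round(slope * 200 + intercept, 1))  # forecast hours for 200 documents
```

A forecast from such a line is only as good as the assumption that the relationship stays linear outside the observed range.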

After the necessary software has been purchased and installed in the organization of a business entity, it must be properly adapted to eliminate risks, which requires data encryption during transmission, automatic backup and emergency recovery. Usually, qualified specialists of the information department can effectively integrate the software, using the platform of intellectual analysis of information and document flow to store unstructured and duplicated files, e-mail and chat and to solve information content management tasks.
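Two of the safeguards mentioned above, automatic backup and emergency recovery, can be sketched with the Python standard library. The function names and the file layout are hypothetical. Encryption during transmission is not shown here, since it is normally delegated to the transport layer (for example TLS).

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup(source, backup_dir):
    """Copy a file into backup_dir and verify the copy by checksum."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / Path(source).name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source} is corrupted")
    return target


def restore(backup_path, destination):
    """Emergency recovery: copy the backup back to the working location."""
    shutil.copy2(backup_path, destination)
    return Path(destination)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "report.txt"
        original.write_text("quarterly report")
        copy = backup(original, Path(tmp) / "backup")
        original.unlink()  # simulate loss of the working file
        print(restore(copy, original).read_text())
```

Verifying the checksum at backup time means a damaged copy is detected immediately, not at the moment it is needed for recovery.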

Conclusions and prospects for further research in this field

The theoretical and practical approaches to the creation and development of data mining systems and automated document management systems, and the approaches to new knowledge discovery using data mining and machine learning, were analyzed. The model and the integrated system of intellectual analysis of information for the formation of a digital organization for various business entities are scientifically substantiated and proposed.

The discovery of new knowledge is aimed at solving the problems of organizational, business and investment activities; at decision-making based on intelligent data analysis in large databases and metadata; and at finding solutions based on patterns found in the data, applying those solutions to defined tasks and moving towards the formation of a digital organization for various business entities.

References

1. Craig Stedman. What is data mining? The ultimate guide. https://www.techtarget.com/searchbusinessanalytics/definition/data-mining

2. Basic Concept of Classification (Data Mining). https://www.geeksforgeeks.org/basic-concept-classification-data-mining/?ref=rp

3. Integrating Business Process Management and Data Mining for Organizational Decision Making. https://rcs.cic.ipn.mx/2015_100/Integrating%20Business%20Process%20Management%20and%20Data%20Mining%20for%20Organizational%20Decision%20Making.pdf

4. Jianye Zhang, Peng Zhang. Design and Implementation of Flight Data Mining System. 2016. https://link.springer.com/chapter/10.1007/978-3-662-53430-4_6

5. Data Mining for Scientific Applications. https://extension.ucsd.edu/courses-and-programs/data-mining-for-scientific-applications

6. Robert Grossman, Simon Kasif, Reagan Moore, David Rocke, Jeff Ullman. Data Mining Research: Opportunities and Challenges. 1998. https://sites.bu.edu/phenogeno/files/2014/06/grossman98-Data-minin-research-opportunities.pdf

7. Efstathios Kirkos, Yannis Manolopoulos. Data Mining in Finance and Accounting: a review of current research trends. http://delab.csd.auth.gr/papers/ICESA04km.pdf

8. Data Mining Applications in Accounting and Finance. https://www.igi-global.com/chapter/data-mining-applications-in-accounting-and-finance/107264

9. Shubhnoor Gil. 10 Powerful Data Mining Tools For 2022. Big Data, Data Driven, Data Engineering, data mining. 2021. https://hevodata.com/learn/data-mining-tools/

10. Robert Grossman, Simon Kasif, Reagan Moore, David Rocke, Jeff Ullman. Data Mining Research: Opportunities and Challenges. A Report of three NSF Workshops on Mining Large, Massive, and Distributed Data. 1998. https://sites.bu.edu/phenogeno/files/2014/06/grossman98-Data-minin-research-opportunities.pdf

11. Basic Concept of Classification (Data Mining). https://www.geeksforgeeks.org/basic-concept-classification-data-mining/?ref=rp

12. Data Mining Models. https://www.javatpoint.com/data-mining-models

13. Quanzhi Li, Yi-fang Brook Wu. Information Mining: Integrating Data Mining and Text Mining for Business Intelligence. Association for Information Systems, AIS Electronic Library (AISeL). AMCIS 2006. https://core.ac.uk/download/pdf/301339819.pdf

14. Data Mining and Document Mining. https://www.ecmconnection.com/doc/data-mining-and-document-mining-0001

15. Sanaa Alwidian, Hani Bani-Salameh, Alaa Alslaity. Text data mining: a proposed framework and future perspectives. International Journal of Business Information Systems, No. 18(2), 2015, pp. 127-140. https://www.researchgate.net/publication/272072355_Text_data_mining_a_proposed_framework_and_future_perspectives

16. Text Mining in Data Mining. 2021. https://www.geeksforgeeks.org/text-mining-in-data-mining/?ref=rp

17. Dane Morgan, Gerbrand Ceder. Data Mining in Materials Development. Handbook of Materials Modeling, pp. 395-421. https://link.springer.com/chapter/10.1007/978-1-4020-3286-8_19

18. Chandrika Kamath, Ya Ju Fan. Data Mining in Materials Science and Engineering. Informatics for Materials Science and Engineering, 2013, pp. 17-36. https://www.sciencedirect.com/science/article/pii/B9780123943996000023

19. Curtarolo S. et al. The high-throughput highway to computational materials design. Nat. Mater., No. 12, 2013, pp. 191-201.
