Research on the Construction Technology of University Data Resource Catalogs Based on Machine Learning
- DOI
- 10.2991/978-94-6463-417-4_3
- Keywords
- Data Catalog; Big Data; Data Governance; Natural Language Processing
- Abstract
As universities build out their information systems, individual departments construct their business systems independently, which leads to heterogeneous data characteristics and information silos. This paper studies how to construct a unified data resource catalog for universities. Under the current national strategy for the digitalization of education and the requirements for informatization development in universities, establishing a clear and orderly data resource catalog is crucial: it helps build a comprehensive digital architecture, increases the value extracted from data, and supports data sharing and decision-making. Traditional data integration faces several challenges: integration tools interfere with business systems, metadata cannot be synchronized in real time, data standards are inconsistent across business systems, and metadata lacks semantic information. To address these issues, this paper proposes a method for constructing a university data resource catalog based on the Hudi Lakehouse and details the key work in four areas: data lake research, the design of university data mapping dictionaries, column semantic recognition methods, and data resource catalog construction technology. The method overcomes the problems of connection interference, metadata change detection, and recognition of column semantic information in metadata, and establishes a unified data resource catalog for universities. This work is expected to support university data management and governance, promote data sharing and utilization, and serve as a useful reference for the future operational models and informatization construction of universities.
- Copyright
- © 2024 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Ying Zhang
AU  - Ying Guo
AU  - Shangxu Liu
AU  - Xiaohan Yang
AU  - Bowen Sun
PY  - 2024
DA  - 2024/05/07
TI  - Research on the Construction Technology of University Data Resource Catalogs Based on Machine Learning
BT  - Proceedings of the 2024 5th International Conference on Big Data and Informatization Education (ICBDIE 2024)
PB  - Atlantis Press
SP  - 14
EP  - 29
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-417-4_3
DO  - 10.2991/978-94-6463-417-4_3
ID  - Zhang2024
ER  -