Snezhana Sulova, Olga Marinova
METADATA MANAGEMENT FRAMEWORK FOR BUSINESS INTELLIGENCE DRIVEN DATA LAKES
Abstract:
Data lakes (DL) provide powerful capabilities for processing and utilizing large and diverse data, helping organizations adapt to the modern environment and extract maximum value from the information at their disposal. Effective data analysis yields actionable knowledge, which is a competitive advantage for organizations. Metadata management is a key element in ensuring the full functionality of data lakes. At the same time, it is a dynamic and under-researched area that reflects the rapid development of information technology and the business need for effective data management. The research is based on a thorough analysis of existing publications on the topic. For this purpose, up-to-date and relevant open access publications from the last 15 years, indexed in Scopus and Web of Science and matching the keywords "data lake" and "metadata", are identified. Based on this literature review, the main challenges in data lake metadata management are highlighted. The goal of the research is to summarize the existing models for metadata management in data lakes and to propose a new conceptual framework, together with implementation steps, that can serve as a useful guide for designing and implementing metadata management models in heterogeneous data warehouses. Adopting the concept requires a detailed study of the data management model in a specific organization, a measurement of the level of effectiveness after the model’s implementation, and the use of additional metrics to confirm its feasibility. These tasks are left for future research. Another limitation of the proposed framework is that it does not address in depth the rules and standards for ensuring data security, which are of the highest priority in sectors such as finance, defence and healthcare.
In addition, further research could analyse the level of satisfaction with the transformation of metadata management processes.