A Survey on Association Rule Mining for Enterprise Architecture Model Discovery

  • State of the Art
  • Open access
  • Published: 21 December 2023


  • Carlos Pinheiro (ORCID: orcid.org/0000-0002-8687-7027) 1, 2
  • Sergio Guerreiro (ORCID: orcid.org/0000-0002-8627-3338) 2, 3
  • Henrique S. Mamede (ORCID: orcid.org/0000-0002-5383-9884) 4


Association Rule Mining (ARM) is a field of data mining (DM) that attempts to identify correlations among database items. It has been applied in various domains to discover patterns, provide insight into different topics, and build understandable, descriptive, and predictive models. Enterprise Architecture (EA), in turn, is a coherent set of principles, methods, and models suitable for designing organizational structures. It uses viewpoints derived from EA models to express different concerns about a company and its IT landscape, such as organizational hierarchies, processes, services, applications, and data. EA mining is the use of DM techniques to obtain EA models. This paper presents a literature review to identify the newest and most cited ARM algorithms and techniques suitable for EA mining that focus on automating the creation of EA models from existing data in application systems and services. It systematically identifies and maps fourteen candidate algorithms into four categories useful for EA mining: (i) General Frequent Pattern Mining, (ii) High Utility Pattern Mining, (iii) Parallel Pattern Mining, and (iv) Distributed Pattern Mining. Based on that, it discusses some possibilities and illustrates them with a prototype that hypothesizes an ARM application for EA mining.


1 Introduction

Association Rule Mining (ARM) is defined by Liu et al. (2021) as a data mining (DM) field that attempts to identify correlations among the items in a database. It is an essential DM technique applied in numerous domains to discover association patterns among data in huge databases (Sinaei and Fatemi 2018). The notion of ARM was first established by Agrawal et al. (1993), who proposed the Apriori algorithm to identify frequent patterns (FP) and the corresponding association rules. Since then, a wide variety of applications have applied ARM to discover trends and patterns in vast sets of data, to determine the associations, correlations, and frequent patterns among the items present in a database, and to relate the presence of an item to the other items in a transaction (Menaga and Saravanan 2021).

In a different context, Enterprise Architecture (EA) includes the concerns, architecture principles, models, views and frameworks of an organization (Greefhorst and Proper 2011 ). EA models represent different viewpoints for managing different concerns of the company and its IT landscape, such as processes, services, and information systems (The Open Group 2018 ). Perez-Castillo et al. ( 2019 ) describe EA mining as the usage of data mining techniques to obtain an up-to-date view of EA models. According to them, EA modelling is still done with a low degree of automation, reinforcing the importance and opportunities to develop EA mining techniques. Thus, this paper considers the plausibility of applying DM principles and techniques to the EA context.

Based on these contexts, the main objectives of this paper are to gather the applications of ARM techniques in the field of EA and gather the available methods to correlate data related to EA concerns using ARM to support enterprise architectural modelling. Hence, it performs a literature review to assess whether previous research exists in this domain. The review followed a well-known process described by Kitchenham ( 2004 ) to answer two research questions regarding how ARM techniques have been used to build enterprise architectural viewpoints and the latest published ARM techniques applicable to IT-related architecture mining.

This literature review is one of the foundational parts of a research initiative seeking to apply artificial intelligence techniques to automatically build EA models, such as high-level enterprise business processes and information system models (Pinheiro et al. 2021). Usually, events are logged in industrial applications without an explicit correlation, and it is time-consuming to analyze and correlate them manually. Associating different events from different data sets without an explicit link falls within the problem-solving space of ARM. Some successful examples of correlating different data sets apply neural networks to fuse and transform diverse data from different data sources into uniform data (Cai et al. 2022; Noori et al. 2020), or use user-defined rules to create a single semantically integrated event log for a data mining process (Modaresnezhad et al. 2019; Onan 2019). Hence, considering its viability, this paper seeks to identify the latest published and most cited ARM techniques and algorithms applicable to EA mining, spanning from 2018 to 2023.

Despite the great diversity of algorithms, this review mapped 14 candidate algorithms into four categories that solve different groups of problems, require different equipment, and represent different processing needs regarding the volume of data and processing time, all of which new experiments in EA mining must take into account: (i) General Frequent Pattern Mining, (ii) High Utility Pattern Mining, (iii) Parallel Pattern Mining, and (iv) Distributed Pattern Mining. It therefore aims to help choose which ARM solution to apply.

The rest of this paper is structured as follows. Section 2 presents the relevant background about ARM, DM, and EA to provide foundational knowledge and a justification for this review. Section 3 describes the research methodology. Section 4 reports the results, with a data synthesis and the result for each research question. Section 5 discusses the findings, implications, and insights for EA modelling. Finally, Sect. 6 summarizes the results and outlines future research.

2 Background

This section introduces the general foundations and main concepts of DM, ARM and EA Mining and justifies the need for this review.

2.1 Motivation

EA management literature indicates that the EA model is still maintained through manual activities with a low degree of automation, which represents one of the most significant challenges for EA management (Farwick et al. 2016; Perez-Castillo et al. 2019). Hence, to make the modelling of the current EA more agile and automated, and thereby better support architectural decisions, it is essential to automate at least the discovery of the AS-IS architecture model. ARM techniques can play a crucial role here by quickly providing new insights and bringing to light hidden, relevant knowledge about the current state of the EA. This literature review was therefore motivated by the need to automate the discovery of EA models by collecting and aggregating data from various data sources, so that enterprise architects can analyze current EA models and reach architectural insights and decisions faster. It explores the possibilities of applying ARM to the EA mining context.

2.2 Enterprise Architecture Mining

Enterprise Architecture models represent different viewpoints for managing different concerns of a company and its IT landscape, such as processes, services, and information systems (The Open Group 2018 ). These models cover concepts that reflect the business and IT perspective and require constant updates in response to the company’s continuous transformations. As these models expand, it becomes harder to keep the relevant information in the architecture up-to-date (Farwick et al. 2016 ).

The term EA Mining is related to applying DM techniques to get an up-to-date view of EA models (Perez-Castillo et al. 2019 ). It is essential for optimizing the components of EA, planning changes, and ensuring alignment with strategic and business objectives that also help to reduce risks and costs for organizations. At the same time, the speed and volatility we have witnessed in business changes have demanded an even faster response from the architecture. Hence, companies are continuously redefining their business goals, and it requires them to constantly review and adapt their processes as well as build new views that allow them to predict the impact of changes quickly.

2.3 DM and ARM

DM is an emerging field that has received increasing attention over the last two decades, with many studies seeking to apply it to a wide variety of applications with the aim of discovering trends and patterns from vast sets of data (Menaga and Saravanan 2021 ). To do that, it uses a variety of specialized algorithms that have been evolving to improve processing time, memory and storage space usage, accuracy improvement, and the utility of mined data (Datta and Mali 2021 ; Liu et al. 2021 ; Luna et al. 2019 ). DM facilitates the discovery of information related to associations, sequences, classification, clustering, and predictions (Laudon and Laudon 2021 , p. 266; Liu et al. 2021 ). These systems perform high-level analysis for patterns or trends but can also break down the data to reveal more detail if necessary (Laudon and Laudon 2021 , p. 267).

ARM is an unsupervised DM task that extracts interesting associations and frequent patterns from item sets in transactional databases or other data stores (Liu et al. 2021). ARM determines the associations, correlations, and frequent patterns among the items present in a database, and it identifies when an item’s presence depends on the other items in a transaction (Menaga and Saravanan 2021). Traditional ARM approaches are usually based on the support-confidence framework. The support (sup) measures how frequently an itemset occurs in the database and expresses the itemset’s popularity. The confidence (conf) measures the probability that the consequent occurs in a transaction that contains the antecedent and expresses the strength of the rule (Datta and Mali 2021).

ARM technique identifies an antecedent (or condition) and a consequent (or result) that have a conditional connection present in a transaction, such as defined below (Liu et al. 2021 , p. 3):

A = Antecedent, C = Consequent, T = Transaction

A → C, “If an event occurs, then an outcome event will happen.”

A and C are different sets, thus A ∩ C = ∅

A ⊂ T, then, it is expected that C ⊂ T, “Despite the condition A and the consequent C being different sets, both are contained in the same uniquely identified transaction T”.
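The support and confidence measures for a rule A → C can be illustrated with a minimal Python sketch. The transactions and item names below are hypothetical and serve only as an example; the functions follow the usual support-confidence definitions rather than any specific implementation from the reviewed works.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy transactions; each one is the set of items logged together.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"login", "create_order", "send_invoice"}),
    frozenset({"login", "create_order"}),
    frozenset({"login", "update_profile"}),
    frozenset({"create_order", "send_invoice"}),
]


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions T that contain every item of the itemset."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Estimate of sup(A ∪ C) / sup(A), i.e. how often C appears when A does."""
    a, c = frozenset(antecedent), frozenset(consequent)
    if a & c:
        raise ValueError("A and C must be disjoint (A ∩ C = ∅)")
    sup_a = support(a, transactions)
    return support(a | c, transactions) / sup_a if sup_a else 0.0
```

In this toy data, the rule {create_order} → {send_invoice} has the antecedent in three of four transactions and both sets together in two of them, so its confidence is two thirds.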

ARM techniques represent one of the central parts of Knowledge Discovery from Databases (KDD), described by Gullo ( 2015 ) as the process of identifying new, valid, potentially useful and understandable patterns in data. In the KDD context, the term “pattern” is a concept that refers to how a subset of the data is expressed in some language or model to represent the relation of its items. The KDD process is a sequence of steps summarized in Fig.  1 .

figure 1

(Adapted from Gullo 2015 )

The basic flow of the KDD process

2.3.1 Select Data

The first step in the KDD process is to select data. It is related to defining goals that drive the KDD, extracting data from different data sources to feed the ARM process, and selecting the data that will be worked with, usually creating a transactional event log. Thus, different data are integrated and merged into a unified transaction data log with data that supports the mining objectives.
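As a rough illustration of merging several sources into a unified transaction log, the following Python sketch groups records by a transaction identifier. It assumes, purely for the example, that every record already carries a shared correlation identifier; the source names, identifiers, and event names are hypothetical, and real logs often lack such an explicit link.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical records from two sources: (correlation id, event or item name).
APP_LOG = [("t1", "create_order"), ("t2", "create_order"), ("t1", "reserve_stock")]
SERVICE_LOG = [("t1", "send_invoice"), ("t2", "send_invoice"), ("t3", "login")]


def build_transaction_log(*sources: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Merge several sources into one log keyed by transaction identifier."""
    log: Dict[str, Set[str]] = defaultdict(set)
    for source in sources:
        for transaction_id, item in source:
            log[transaction_id].add(item)
    return dict(log)
```

Each value of the resulting dictionary is one transaction, that is, the set of items that a mining algorithm would later treat as occurring together.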

2.3.2 Preparing Data and Transformation

Data preparation and transformation cover the essential steps of cleaning the data, reducing its dimensions, and preparing it to be mined so that the goals defined in the data selection can be achieved. The step can be divided into two groups of tasks. Data preparation extracts the data and removes noise, which means removing incompatible details from the raw data and defining proper strategies for handling missing data fields. Data transformation converts data into types suitable for mining, pursuing the reduction and projection of the data to derive a representation suitable for the specific goal. It is typically accomplished with transformation techniques or methods that find invariant data representations, such as discretization and fuzzification.
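A minimal sketch of one such transformation, equal-width discretization with an explicit strategy for missing fields, is given below in Python. The function name, the bin labels, and the choice of equal-width bins are assumptions made for illustration; fuzzification is not shown.

```python
from typing import List, Optional


def discretize(values: List[Optional[float]], bins: int = 3) -> List[str]:
    """Equal-width discretization; missing values get an explicit label."""
    present = [v for v in values if v is not None]
    if not present:
        return ["missing"] * len(values)
    low, high = min(present), max(present)
    width = (high - low) / bins or 1.0  # avoid a zero width when all values are equal
    labels = []
    for v in values:
        if v is None:
            labels.append("missing")
        else:
            # Clamp the maximum value into the last bin.
            index = min(int((v - low) / width), bins - 1)
            labels.append(f"bin{index}")
    return labels
```

Turning a numeric attribute such as a response time into a small set of labels lets it be treated as an item in a transaction.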

2.3.3 Frequent Pattern Mining

Frequent Pattern Mining (FPM) is the step that applies algorithms and techniques to discover patterns and hidden rules. In this stage, the patterns and trends in the data are established by choosing a proper data-mining method and algorithms, including FIM (Frequent Itemset Mining), classification, and clustering, among others. FIM extracts sets of items that frequently appear in transactional data. Classification takes a collection of records as input, where each record is composed of a set of attributes, and one of the attributes denotes the class of the record. The goal is to find a model for the class attribute as a function of the values of the other attributes. Clustering aims to identify a finite set of groups of objects so that the objects within the same cluster are similar, whereas the objects belonging to different clusters are dissimilar.

2.3.4 Visualize and Interpret Patterns

In this step, users get insights and new knowledge from patterns discovered by analyzing their results. It is, therefore, essential to provide tools to aid users in the task of interpreting and evaluating the discovered patterns and consolidating knowledge.

This subsection outlined the steps of the KDD process to clarify the position of this research. This paper focuses on the FPM step and identifies some possible applications of the identified methods in EA modelling.

3 Research Methodology

According to Kiteley and Stogdon ( 2014 ), a literature review may be conducted as a research methodology focusing on gathering what is currently known about a specific subject or problem. Therefore, a literature review can be used to consolidate understanding, gather findings, or highlight the most convincing proposals in the published literature thus far. In this context, this review followed the process proposed by Kitchenham ( 2004 ), composed of three phases, as illustrated in Fig.  2 .

figure 2

Kitchenham’s systematic review process

3.1 Planning

In the planning phase, this paper sets two main objectives based on the context, problem, and motivations previously presented as the drivers for this research. Firstly, gather the applications of ARM techniques in the field of EA, and secondly, gather the available methods in the literature to correlate or associate data related to enterprise architecture concerns using ARM to support enterprise architectural or process modelling. Therefore, it seeks to answer the questions below to review the state of the art for these objectives.

RQ 1 —How are ARM techniques used for building enterprise architectural viewpoints?

RQ 2 —What are the latest ARM techniques applicable to IT-related architecture mining?

RQ 1 focuses on gathering the usage of ARM, specifically in EA. RQ 2 seeks a broader scope to identify techniques applied to other IT-related architecture that may not have been applied to EA yet. Thus, the review was split into two search lines for each RQ.

3.1.1 Plan for RQ 1 (How are ARM Techniques Used for Building Enterprise Architectural Viewpoints?)

3.1.1.1 Search Strategy for RQ 1

This first search focuses on the enterprise architect’s population to identify the usage of ARM techniques to build architectural models. For that reason, the following search string was elaborated.

figure a

3.1.1.2 Study Selection Plan for RQ 1

Table 1 describes the inclusion and exclusion criteria for selecting papers regarding RQ 1.

For this selection, the criteria for not complying with the objective sought to exclude works unrelated to the application of ARM to support architectural modelling. For instance, works that explain an ARM architecture in a medical science application.

3.1.2 Plan for RQ 2 (What are the Latest ARM Techniques Applicable to IT-Related Architecture Mining?)

3.1.2.1 Search Strategy for RQ 2

This second search focuses on the enterprise architect’s population. However, unlike the first search, this one aims to find applications of ARM to build any level of IT-related architecture, correlating data structures, such as software architecture, infrastructure IT architecture, application architecture and others.

It intends to identify the best candidate approaches to correlate data mined in system logs to build architectural models. Hence, the search string that follows was elaborated with that in mind.

figure b

3.1.2.2 Study Selection Plan for RQ 2

In RQ 2, the inclusion and exclusion criteria have also been defined, as described in Table 2 .

For both research questions, the two search strings were planned to be applied to all fields on the Web of Science, IEEE Xplore, ACM Digital Library, and Scopus.

In this case, the criteria for not complying with the objective excluded works that do not present an ARM technique applicable to any IT-related architecture mining.

3.2 Conduction for RQ 1 (How are ARM Techniques Used for Building Enterprise Architectural Viewpoints?)

The search string was applied to all fields on the Web of Science, IEEE Xplore, ACM Digital Library, and Scopus between 16 February 2023 and 21 February 2023, resulting in a total of 145 papers.

3.2.1 Study Selection

The selection process, illustrated in Fig.  3 , resulted in 11 papers selected from a total of 145 papers. No paper was excluded by not being published through a peer-reviewed process, and none was reported as predatory.

figure 3

RQ 1 study selection

As 11 papers are a small number of works to read, no further quality-based filter was applied. Notably, not a single paper describes any use of ARM for EA models; some only mention the possible application of data mining to some EA contexts.

3.2.2 Data Extraction

Figure 4 depicts the extracted papers by year. Despite the absence of papers that explicitly use ARM, it shows that interest in EA mining has risen over the last 5 years. Figure 5 shows that most papers are available on Scopus, followed by ACM.

figure 4

EA mining by year

figure 5

EA mining per source

3.3 Conduction for RQ 2 (What are the Latest ARM Techniques Applicable to IT-Related Architecture Mining?)

3.3.1 Study Selection

For this case, uses of ARM in IT whose main focus lies in disciplines such as healthcare, image processing, individual recommendation, Geographic Information Systems (GIS), and other domain-specific applications unrelated to EA or IT architecture were also considered out of scope. The same applied to many granular topics, such as IoT-related telecom infrastructure and error detection in software source code. The selection process, depicted in Fig. 6, resulted in 25 papers selected for analysis.

figure 6

RQ 2 study selection

3.3.1.1 Quality Assessment

After applying the inclusion and exclusion criteria, a quality analysis was also carried out to assess the quality and adherence of papers selected for the context being investigated, basically answering the seven quality questions listed here, which are related to the objective of this research:

Was the designed study aimed at any kind of architecture modelling automation?

Was the research method clearly described?

Was the method of association rule mining detailed and well explained?

Was it validated with a solid and clear evaluation method?

Were outcomes assessed using objective criteria?

Was there any comparison to other reference algorithms?

Are all study questions answered?

Each question had three possible answers: “Yes”, “Partially”, and “No”, with values corresponding to 1.0, 0.5, and 0.0, respectively. For the first question, the analysis considered whether the paper addressed some aspect related to architecture modelling. The second question verified whether the research method was clearly described (“Yes”), not described but somehow perceivable (“Partially”), or not identified (“No”). The third question checked whether the mining method was described and exemplified (“Yes”), described at a high level without presenting its implementation (“Partially”), or neither (“No”). The fourth question assessed whether the work presented an empirical and repeatable validation (“Yes”), presented an experiment with some identified weaknesses, such as not presenting the data set used (“Partially”), or presented no validation (“No”). The fifth question analyzed whether the outcomes were objectively evaluated based on experiment results (“Yes”), only partially evaluated (“Partially”), or not evaluated (“No”). The sixth question confirmed whether the method was compared with others (“Yes”) or not (“No”). Finally, the seventh question checked whether the research questions were clearly defined and answered (“Yes”), were not directly identifiable although the problem and whether it was answered could be perceived (“Partially”), or neither (“No”).

The quality analysis had a maximum of seven points, and works that scored fewer than three points were not incorporated, mainly because they probably made only a minor contribution. The graph plotted in Fig. 7 shows the distribution of the quality scores of the selected papers.
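The scoring scheme can be sketched in a few lines of Python. The answer values and the three-point threshold come from the description above; the paper identifiers and the answers used in the usage note are hypothetical.

```python
from typing import Dict, List

ANSWER_VALUES = {"Yes": 1.0, "Partially": 0.5, "No": 0.0}
MIN_SCORE = 3.0  # works scoring below this value were not incorporated


def quality_score(answers: List[str]) -> float:
    """Sum the values of the seven quality answers for one paper."""
    if len(answers) != 7:
        raise ValueError("expected one answer per quality question (7)")
    return sum(ANSWER_VALUES[a] for a in answers)


def select_papers(assessments: Dict[str, List[str]]) -> List[str]:
    """Keep the papers whose total score reaches the threshold."""
    return [paper for paper, answers in assessments.items()
            if quality_score(answers) >= MIN_SCORE]
```

For example, a hypothetical paper answered with three “Yes”, two “Partially”, and two “No” scores 4.0 and is kept, whereas one with a single “Yes” and two “Partially” scores 2.0 and is left out.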

figure 7

Quality score distribution of selected papers

3.3.2 Data Extraction

Due to the large number of available algorithms, this research did not intend to cover all of them, including well-known and older ones. Presumably, previous advances have already been incorporated into new versions of the core algorithms. Therefore, this research focused on the latest algorithms reported in the literature since 2018 or referenced by works during this period. Nevertheless, algorithms with the best performance in the experiments are referenced even when published before 2018. Figure 8 shows the distribution of selected papers by year.

figure 8

Distribution of selected papers by year

Figure  9 depicts the quantity and percentage of papers selected by the search database. Notably, the ACM Digital Library was responsible for half of the papers in this search. At the end of the process, the 15 papers listed in Table 3 served as primary references for the detailed analysis of the methods and algorithms. Some other algorithms were also added to these references and used in other sections ahead for algorithm comparative analysis. Regarding the location where the works have been published, they are quite dispersed in this list of papers. However, two journals stand out with two publications each: the ACM Transactions on Knowledge Discovery from Data and the Proceedings of the ACM on Measurement and Analysis of Computing Systems .

figure 9

RQ 2 paper selected by source

4 Results Report

This section summarizes the main works that contributed to this review. It represents the knowledge base available in the analyzed literature on ARM and architecture modelling automation that may help in EA AS-IS modelling automation.

4.1 Data Synthesis

Before going into the results of each research question, this subsection presents a synthesis of the selected works.

Many algorithms have been proposed for ARM. This research identifies the latest techniques applied to four distinct categories: general local sequential mining, high utility mining, parallel mining, and distributed mining.

Firstly, for RQ 1, the objective was to identify the use of ARM specifically in EA Mining. However, no paper demonstrated a direct application of ARM to EA. Despite that, we present and discuss some works highlighting insights into EA mining.

Secondly, for RQ 2, the goal was to identify the use of ARM in any IT-related architecture. Accordingly, Table 3 provides an overview of the contributions found from the perspective of RQ 2, indicating where these contributions were published.

4.2 Results for RQ 1 (How are ARM Techniques Used for Building Enterprise Architectural Viewpoints?)

This topic highlights some studies aligned with data mining for enterprise architectural model automation, as reported in the literature.

Neaga and Harding (2005) suggest that extending existing enterprise modelling and integration architectures by incorporating KDD and DM systems could significantly improve the decision-making process and business performance. They presented the conceptual design and development of an enterprise modelling and integration framework using knowledge discovery and data mining techniques. They suggested an approach that builds on existing reference EA and modelling frameworks by introducing new enterprise views, such as mining and knowledge views. However, they did not develop an ARM process for EA; they only recommended the use of knowledge discovery and data mining system libraries, such as PolyAnalyst™, Clementine, Weka, and ArMiner.

Gustavsson and Planstedt ( 2005 ) developed the fractal information fusion model (FIF), which is a model for the simulation of multiple hypotheses and intentions of different agents for military purposes, such as different behaviors of opponents and other agents. It aimed to provide an agent architecture that aligns with the global initiatives in an enterprise architecture initiative for the Swedish armed forces. They collect data and fuse them to predict results based on the integration of information from different sources about the behavior of a particular system to support decisions relating to this system. In their approach, a data sensor is constantly fed to the database using data-mining techniques, among other methods. However, the use of data mining techniques is not detailed.

Perez-Castillo et al. (2019) observed that enterprise application logs might be a powerful source of knowledge and, combined with data mining techniques, may raise the level of automation in Enterprise Architecture modelling. Thus, they developed a reverse engineering approach based on a set of predefined architectural relationships to draw a dynamic model in ArchiMate (The Open Group 2019). Despite indicating the use of data mining techniques to extract and associate data based on textual terms and grammar rules, the authors focused more on explaining the ArchiMate model generation and did not detail the ARM algorithm used in either of the two subsequent works (Pérez-Castillo et al. 2020; 2021). However, their work solved part of the job of transforming the data resulting from the ARM process into a standardized graphical EA model in ArchiMate. At the same time, it opens up opportunities to complement their approach through a deeper application of ARM techniques.

Despite these studies, which form the background of this literature review, the use of ARM in the field of EA modelling remains an open topic that few researchers have explored. The literature review process did not capture works that apply ARM to EA. Although some of them cite DM techniques, these works have not yet explored the full potential of ARM.

4.3 Results for RQ 2 (What are the Latest ARM Techniques Applicable to IT-Related Architecture Mining?)

Due to the large number of algorithms, it is important to define a criterion to help select algorithms related to the different challenges that impact EA mining efforts. Covering the abstraction levels of EA models requires combining different ARM solutions. For instance, mining knowledge from other architectural models will not involve a high volume of data, but accuracy may be essential. In contrast, mining data from application system logs will probably demand a more performant approach, or even require a distributed and more expensive solution, due to the volume of data.

Thus, to obtain the latest advances in ARM techniques that can be applied to EA mining, the algorithms and methods identified were initially grouped into some ARM approaches that address different application opportunities as described and analyzed ahead.

In this context, Fig.  10 summarizes the timeline and comparisons observed in the literature gathered in this paper, which emphasizes the most cited and the latest advanced algorithm, and considers the experiments reported by comparative studies. It helps to quickly identify and select the candidate algorithms for future research developments, or at least the newer and most cited candidates applicable to EA mining purposes.

figure 10

ARM timeline evolution

Most algorithms are presented as an evolution of the Apriori algorithm, developed by Agrawal et al. ( 1993 ), which continues to be one of the most cited algorithms (Gan et al. 2019 ; Luna et al. 2019 ). On the other hand, researchers were aware of the difficulty and limitations of local sequential mining and the importance of proposing not only efficient algorithmic solutions but also novel approaches (parallel and distributed computing) to handle such a problem (Luna et al. 2019 ).

Based on the seminal work by Agrawal et al. (1993), the contributions in terms of algorithms were classified into four different categories. These categories have different requirements and logic and represent different processing needs regarding the volume of data and processing time, which must be taken into account in new experiments in EA mining.

General Frequent Pattern Mining

High Utility Pattern Mining

Parallel and Multi-thread Mining

Distributed Mining

FIM is the root of the pattern-mining field, which encompasses multiple tasks that aim to extract sets of items that frequently (or infrequently) appear in data in multiple forms and for various purposes. Sometimes, it has been used interchangeably with the term ARM due to the shared objective of obtaining frequent patterns (Luna et al. 2019). General Frequent Pattern Mining groups ARM algorithms based on serial or sequential mining. Sequential, in this case, means algorithms that mine following a linear sequence logic, in contrast with algorithms that aim to mine temporal or hierarchical sequences. Parallel mining algorithms segment the data and process the segments in parallel, and, in a more advanced approach, distributed mining algorithms run in a distributed environment. High-utility mining, in turn, tries to obtain rules that maximize utility and to generate fewer rules in all cases.
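The idea of segmenting the data and processing the segments in parallel can be sketched as follows. This is only an illustration of the partition, count, and merge pattern, not one of the surveyed algorithms; the function names are hypothetical, and threads are used merely to keep the example self-contained, whereas a real setup would more likely rely on separate processes or a cluster framework.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import FrozenSet, List


def count_items(partition: List[FrozenSet[str]]) -> Counter:
    """Local item counts for one data segment."""
    counts: Counter = Counter()
    for transaction in partition:
        counts.update(transaction)
    return counts


def parallel_item_counts(transactions: List[FrozenSet[str]], workers: int = 2) -> Counter:
    """Split the data into segments, count each in a worker, and merge the results."""
    size = max(1, -(-len(transactions) // workers))  # ceiling division
    partitions = [transactions[i:i + size] for i in range(0, len(transactions), size)]
    total: Counter = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for local in pool.map(count_items, partitions):
            total.update(local)  # adds the local counts to the global ones
    return total
```

Because item counts are additive across segments, the merged result equals what a single sequential pass would produce.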

Indeed, at first glance, it is essential to consider the novelty of the algorithm and that the newer ones would bring improvements over the older ones. However, it is also essential to observe the relevance of the proposals, considering their adoption in the research field. In this sense, those that are most used are identified by the number of references (citations). Although the older ones tend to be cited more often, they still reinforce the usefulness and importance of the algorithm.

4.3.1 General Frequent Pattern

FIM is an essential task within data analysis since it is responsible for extracting frequently occurring events, patterns, or items in data. Insights from such pattern analysis offer important benefits in decision-making processes (Luna et al. 2019 ).

Apriori was the first algorithm, according to Luna et al. (2019). Many algorithms have since been proposed as improvements, and Apriori continues to be used as a baseline in a considerable number of studies. One of the most significant improvements over Apriori was FP-Growth (Han et al. 2000), which uses a pattern tree structure for storing and compressing information about patterns. Another key algorithm is ECLAT (Equivalence CLAss Transformation), which associates each pattern with the list of transactions in which it occurs. Sometime later, dECLAT (diffsets with ECLAT) (Zaki and Gouda 2003) outperformed the running time of ECLAT and FP-Growth on dense datasets. In 2004, the algorithm named LCM (Linear time Closed itemset Miner) won the FIMI'04 competition on Frequent Itemset Mining Implementations (footnote 1), in which many implementations and solutions with remarkably high performance for the time were presented.
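The vertical representation behind ECLAT and the diffset idea behind dECLAT can be illustrated with a minimal sketch. The toy transactions, function names, and support threshold below are assumptions made for illustration only; this is not the implementation evaluated in the cited works.

```python
from itertools import combinations

# Hypothetical toy transactions (transaction id -> items); not taken from the paper.
transactions = {
    1: {"a", "b", "c"},
    2: {"a", "c"},
    3: {"a", "d"},
    4: {"b", "c"},
}

def build_tidlists(transactions):
    """Vertical layout: map each item to the set of transaction ids containing it."""
    tidlists = {}
    for tid, items in transactions.items():
        for item in items:
            tidlists.setdefault(item, set()).add(tid)
    return tidlists

def frequent_pairs(tidlists, min_support):
    """The support of a pair is the size of the intersection of its items' tidlists."""
    result = {}
    for x, y in combinations(sorted(tidlists), 2):
        tids = tidlists[x] & tidlists[y]
        if len(tids) >= min_support:
            result[(x, y)] = tids
    return result

tidlists = build_tidlists(transactions)
pairs = frequent_pairs(tidlists, min_support=2)

# Diffset idea: instead of the tidlist of {a, c}, keep only the transactions
# that contain the prefix "a" but not the extension "c".
diffset_ac = tidlists["a"] - tidlists["c"]
support_ac = len(tidlists["a"]) - len(diffset_ac)

print(pairs)       # frequent pairs with their tidlists
print(support_ac)  # support of {a, c} recovered from the diffset
```

On dense data, diffsets tend to be much smaller than the tidlists they replace, which is one intuition for the running-time gains reported for dECLAT on dense datasets.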

Liu et al. ( 2022 ) proposed a new algorithm named Evolutive Frequent Pattern Tree-based Incremental Knowledge Discovery (EFPT-IKD). It is based on double-evolving frequent pattern trees that can trace the dynamically evolving data by an incremental sliding window. One tree records frequent patterns from the historical data, and the other records incremental frequent items. The structures of the double frequent pattern trees and their relationships are updated periodically according to the emerging data and a sliding window. According to the author’s experiment, this incremental knowledge discovery algorithm surpassed FUP (Fast Update) (Cheung et al. 1996a ) and PARMTRD (Parallel Association Rules Based Multiple-Topic Relationships Detection) (Liu et al. 2018 ).

Based on this analysis, the strongest candidates for general frequent pattern mining are dECLAT and LCM, since they present clear improvements over Apriori, FP-Growth and ECLAT. However, it is important to note that these three algorithms remain relevant since they are frequently reported and serve as bases for new algorithm proposals. Finally, EFPT-IKD is the latest algorithm found so far. It probably performs better, although it has not been directly compared with dECLAT and the other algorithms cited in this review.

There is a subgroup in ARM where the individual events can be arranged in predefined hierarchies or temporal sequences that demonstrate different properties for different applications. In particular, hierarchy mining, such as FEM (Frequent Episode Mining), reported by Ao et al. (2019), allows mining the events in episodes that belong to different levels of the event hierarchy, containing abstract concepts that are not clear in the original input data. On the other hand, temporal sequence mining is related to techniques that use constructs from the field of business process modelling to represent frequent patterns that go beyond sequential patterns and can express rich ordering relations that include concurrent execution, choices, and repetition (Tax et al. 2018). It usually includes features related to the timestamp or location, and in some applications, it is represented using graphs instead of a flat representation (Luna et al. 2019). LA-FEMH (LArge-scale Frequent Episode Mining with Hierarchies), presented by Ao et al. (2019), mines events using a hierarchy-aware partition strategy, which divides the input sequence to produce a balanced workload partition, and is also a scalable distributed mining framework.

This view is expressed in Table 4 , which identifies the evolution and latest number of citations detected for each category of algorithm.

4.3.2 High Utility Improvement in Mining

High utility itemset mining is a prevalent data mining problem that considers utility factors. The designed pruning strategies help reduce the visitation of unnecessary nodes in the search space, which reduces the time required by the algorithm (Wu et al. 2019 ).

According to Wu et al. (2019), High-Utility Itemset (HUI) mining is a data mining problem that considers utility factors, such as the quantity and unit profit of items, in addition to the frequency measure used in transactional mining. In this field, they designed HUI-PR, a HUI algorithm that applies pruning strategies to reduce the visitation of unnecessary nodes in the search space. In addition to the utility mining strategy, their approach reduces the time required for mining. In terms of memory usage, the algorithm also outperformed the state-of-the-art D2HUP (Liu et al. 2012) and EFIM (Zida et al. 2015) algorithms in experiments using six real datasets (chess, mushroom, connect, accidents, and retail), which showed the effectiveness of HUI-PR.
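The difference between frequency and utility can be made concrete with a small sketch. It only computes the utility of an itemset as quantity times unit profit, summed over the transactions that contain the whole itemset; it does not reproduce the pruning strategies of HUI-PR. The items, quantities, and profits are hypothetical.

```python
# Hypothetical unit profits and transactions (item -> purchased quantity).
unit_profit = {"bread": 1, "milk": 2, "wine": 10}
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "wine": 1},
    {"milk": 3, "wine": 2},
    {"bread": 4, "milk": 2},
]

def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if all(item in t for item in itemset))

def itemset_utility(itemset, transactions, unit_profit):
    """Sum of quantity * unit profit over the transactions containing the itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total

for itemset in [("bread", "milk"), ("milk", "wine")]:
    print(itemset,
          "support:", support(itemset, transactions),
          "utility:", itemset_utility(itemset, transactions, unit_profit))
```

In this toy data, the pair (milk, wine) is less frequent than (bread, milk) but has a higher utility, which is the kind of pattern that a purely frequency-based miner would rank lower.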

LA-FEMH+ (Ao et al. 2019) is an extension of the LA-FEMH algorithm that supports other episode mining tasks, such as mining maximal and closed episodes in the context of event hierarchies. The authors demonstrated the effectiveness of the approach through an implementation based on MapReduce on Apache Spark and performed experimental studies on both synthetic and real-world datasets, including financial sequences and natural language text.

Beyond traditional frequent itemset mining, for contexts where it is desirable to avoid obtaining too many useless rules, LA-FEMH+ and HUI-PR are the newest and most promising candidates identified in this search, as shown in Table 5.

4.3.3 Parallel ARM

This group presents the algorithms that generally use parallel sequence mining logic based on multi-thread architectures with shared memory, or in other words, a single computer with multiple processors, usually by extending the existing serial algorithms (Luna et al. 2019 ).

Phan (2018) argues that closed frequent itemset mining is one of the fundamental tasks in ARM. However, it is time-consuming, and most algorithms find closed frequent itemsets over a search space whose results are not reused in subsequent mining runs. To solve this issue, he proposed the NOV-CFI algorithms, a novel approach to quickly detect closed frequent itemsets from transactional databases using an array of co-occurrences and occurrences of kernel items in at least one transaction. Besides being reusable, the approach is also easily expanded to distributed systems. His experimental results show that the algorithms are better than NAFCP (Le and Vo 2015) and CHARM (Zaki and Hsiao 2002).

According to Ao et al. (2019), Frequent Episode Mining (FEM) aims to mine frequent sub-sequences from a single long event sequence and is one of the essential building blocks for the sequence mining research field. However, episode frequencies may fail to hold the anti-monotonicity property. For instance, frequency definitions such as all occurrence, minimal occurrence, and head frequency may yield a frequency for a sub-episode that is lower than that of its super-episode. Thus, they developed PEM (Peak Episode Miner), a specialized local miner that performs efficient episode mining in local processes with the help of the proposed tree-like structure and concise scanning.

Yildirim Taşer et al. (2020) presented MTARM (Multitask Association Rule Miner). Instead of discovering rules from single tasks, it focuses on discovering frequent association rules across different tasks by applying an algorithm that considers all tasks collectively, along with the relations between them. The underlying assumption is that the rules of all tasks, or at least a subset of them, are similar to one shared rule set with slight differences, and that a rule may not be frequent in the entire dataset but may be frequent in a group of specific tasks. Their experiment did not compare MTARM with other algorithms, but it evaluated three different versions, each based on one of the well-known reference algorithms. The Eclat version outperformed the one based on FP-Growth, which performed better than the Apriori version. Note that this result should be expected given the evolution line of these algorithms, which historically evolved precisely in this sequence, as presented by Luna et al. (2019) and Wu et al. (2019). Despite that, it is one of the latest algorithms identified in this review and seems to present a good solution for mining frequent rules related to multiple tasks. Table 6 summarizes the algorithms for parallel ARM.

4.3.4 Distributed Algorithms

This group presents algorithms based on distributed architectures. It is distinct from parallel architecture because, in distributed computing, each processor shares nothing and has its own private main memory and storage.

In their survey, Luna et al. (2019) designate AprioriMR (Luna et al. 2018), BIGMiner (Chon and Kim 2018), MRQAR (Martín et al. 2018) and G3P-LSC (Padillo et al. 2018) as the latest published approaches thus far. AprioriMR is a series of algorithms based on MapReduce and Hadoop that improves the performance of sequential mining in the distributed big data environment. It outperformed distEclat according to the AprioriMR experiments. BIGMiner is a fast and scalable MapReduce-based frequent itemset mining method that generates equal-sized sub-databases called transaction chunks and performs support counting based only on transaction chunks and bitwise operations, without generating and shuffling intermediate data. MRQAR is a framework for sequential quantitative ARM in Big Data based on MapReduce on Apache Spark.
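The general idea of counting support per transaction chunk with bitwise operations can be sketched as follows. This is a single-process illustration under assumed data and function names; it is not BIGMiner itself and involves no MapReduce or Hadoop machinery.

```python
def item_bitmaps(chunk):
    """Encode a chunk as item -> integer bitmap (bit i is set if transaction i has the item)."""
    bitmaps = {}
    for pos, items in enumerate(chunk):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << pos)
    return bitmaps

def chunk_support(itemset, bitmaps):
    """AND the bitmaps of all items of a non-empty itemset and count the set bits."""
    items = list(itemset)
    acc = bitmaps.get(items[0], 0)
    for item in items[1:]:
        acc &= bitmaps.get(item, 0)
    return bin(acc).count("1")

def total_support(itemset, chunks):
    """Each chunk is counted independently; only the small per-chunk counts are summed."""
    return sum(chunk_support(itemset, item_bitmaps(chunk)) for chunk in chunks)

# Hypothetical database split into two equal-sized transaction chunks.
chunks = [
    [{"a", "b"}, {"a"}],
    [{"a", "b", "c"}, {"b"}],
]

print(total_support({"a", "b"}, chunks))
print(total_support({"a"}, chunks))
```

Because each chunk yields only an integer count per candidate itemset, the chunks could be processed on separate workers and their counts added afterwards, which is the property that makes this style of counting attractive in a distributed setting.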

Ao et al. ( 2019 ) propose LA-FEMH (LArge-scale Frequent Episode Mining with Hierarchies), a scalable distributed framework for frequent episode mining from complex sequences with event hierarchies. They implement the proposed framework on Apache Spark.

According to some authors, distributed systems allow quasi-linear scalability; thus, such approaches are becoming increasingly common. It is also notable that most of the existing proposals based on distributed computing considered the MapReduce framework (Luna et al. 2019 ). This group of algorithms should be used when scalability is a critical factor. In this sense, LA-FEMH is the latest algorithm identified, followed by AprioriMR, BIGMiner, MRQAR and G3P-LSC, all of which were published in 2018. For Distributed ARM, the algorithms are summarized in Table 7 .

This section presented classes of algorithms that apply to diverse needs. When sequential local mining is enough, the general class applies, including hierarchy, time series and utility mining. When a performance boost is needed, a parallel approach may be helpful. In the case of huge volumes and reduced run time needs, a distributed approach seems to be the right choice. Each of these classes has different requirements in terms of infrastructure, from the more elementary, which requires a good computer, to the more complex, which requires a complex set of infrastructure components and tools.

4.4 Validity Evaluation

This review followed a protocol that was developed and based on a well-known method for the literature review (Kitchenham and Charters 2007 ) so that other reviews following the same protocol may achieve similar results. Nevertheless, the main threat to validity is presenting a research bias. Hence, to mitigate this risk, the research objectives and the protocol were previously aligned among the authors to increase confidence in the results achieved. Furthermore, ARM is a much larger field than the one presented here. This study left many methods out of this analysis due to the small number of citations, even though they may be used to extract information and knowledge for the EA model. However, this review prioritized those most cited for each category analyzed, implying that some techniques and opportunities may remain unexplored. The objective of presenting and exploring the use of ARM in the context of EA mining was achieved. Nonetheless, this paper presents a partial and non-exhaustive view of the field’s opportunities, leaving space for further explorations.

5 Discussion

This section provides a consolidated view of the literature previously presented, discusses the findings and opportunities for using ARM techniques for EA model mining and creates hypotheses for its application.

5.1 Findings

Although ARM has been applied to a variety of specific fields, this research did not find papers related to the direct application of ARM to EA modelling, which reinforces the opportunities in this field.

There is a large quantity and variety of algorithms for ARM, which makes the task of choosing which algorithm to use more difficult. For example, according to Gan et al. (2019), SPMF (Sequential Pattern Mining Framework, footnote 2), an ARM library for Java, provides a collection of 120 algorithms, and it does not yet cover any parallel algorithms. Regarding the selection of the ARM approach and algorithms, it is also challenging to correlate studies and algorithms to find the best and latest advances, mainly due to the high number and variations of the algorithms.

It is also observable that most papers do not clearly describe the research methodology followed, especially regarding how they gathered the state of the art. Thus, there is low confidence in the quality of the related work presented by these papers in comparative experiments, which may have been biased or may have ignored advances that were relatively new at the time of the work. Most methods were compared with different evolutions of the same classical or older algorithms, such as those based on the Apriori (Agrawal et al. 1993), FUP (Cheung 1997) and FP-Growth (Han et al. 2004) algorithms, implying that some of them look similar even though they have not yet been compared with each other. Thus, it would be a valuable contribution to compare the latest proposed algorithms under the same topic, as listed in Appendix A: Algorithm Comparison, and previously introduced in the result of the review in response to RQ 2 (online appendices are available via http://link.springer.com ).

5.2 Implications

This review did not successfully identify studies demonstrating concrete examples and challenges to using ARM for EA Mining. It does not mean there are no references to aid research advances in the field, as ARM has been used in many fields. Some applications and algorithms may be adapted to the EA mining context. Moreover, the existence of a large number of algorithms and their variations to attend to various specific needs proves ARM’s flexibility and suggests the feasibility of its application to EA mining.

5.3 Application Hypothesis

At this point, a hypothesis for applying some of the approaches found to EA modelling provides a high-level, initial view of how EA modelling could use ARM. Although no such application was found in the literature, it is in principle possible to extract a variety of models and viewpoints through ARM. It is therefore possible to simulate the application of ARM to EA modelling to share insights for future work, even at an initial and superficial level. In this regard, Fig. 11 shows the ArchiMate structure for EA views and provides the starting point for the analysis.

Fig. 11 ArchiMate 3 core framework (adapted from The Open Group 2019) (Perez-Castillo et al. 2019)

Firstly, it is possible to identify rules and extract viewpoints related to Business Processes using time series mining. Niazmand (2022) applied some knowledge to Graph Mining that identifies and relates data entities. The same approach seems to be easily adapted to correlate business entities, which would allow extracting business objects and business processes, including identifying actors and their roles in the business process. Frequent Event Mining is suited to gathering behavior in any of the three layers. Combined with a temporal event perspective, it can help to identify business processes dynamically. In this sense, Tax et al. (2018) identify a sequence of events through local process mining. Thus, to adapt the idea to EA mining, it is necessary to determine the granularity of processes that is adequate for EA and to map which kind of event represents this granularity from the EA perspective. The Lin et al. (2020) approach, used to detect malfunctions in services, hardware and software, could be adapted to mining from this kind of solution knowledge by enabling components to be fulfilled dynamically at the technology layer, such as IT network systems.

These few hypotheses only explore possible uses of ARM to model some aspects of an EA model. However, the key finding of this review is that ARM is applicable to almost any application that demands knowledge from data, which is the case of EA mining. It is a field that few studies have explored and that presents an excellent opportunity for future research.

EA usually deals with the general views of company architecture, encompassing all business activities, capabilities, information, and technology of an enterprise. Here, this study describes some opportunities to explore the application of identified ARM techniques to contribute to the discovery, build the current state of some EA viewpoints, and address enterprise transversal architecture-wise concerns. It is not limited to the TOGAF framework. However, this study uses TOGAF and ArchiMate as references (The Open Group 2018 , 2019 ). According to Greefhorst and Proper ( 2011 ), TOGAF is a standardized method for enterprise architecture maintained by The Open Group, a consortium of hundreds of organizations, including companies, governmental organizations and research institutes. ArchiMate is a modelling language adopted by the Open Group, developed as part of a research project to provide a language for describing enterprise architectures. Together, they define a method and a language for building enterprise architectures by identifying relevant building blocks in three domains: Business Architecture, Information System Architecture (with Data and Application Architecture), and Technological Architecture.

For the business architecture domain, it seems feasible to create rules based on indicators that may help identify and correlate events and process flows that occur among the business processes within the enterprise architecture. This may enable deeper control of the processes, identify hidden process flows and adapt the business process views in the architecture to these actual flows. It may also correlate relevant events, such as those related to an enterprise's key performance indicators (KPI) of business processes. Business objectives are usually defined in the business architecture model and broken down into one or more Objectives and Key Results (OKR) that must be actively monitored. For instance, through ARM, it may be possible to associate events related to the enterprise business objective of reducing customer churn, which happens when the customer stops consuming the company's products or services. Thus, ARM expands EA management beyond the definition of OKR, enabling its active monitoring as a tool of the enterprise architecture itself. It allows quick modelling and drives courses of corrective action to continuously improve processes related to these enterprise business indicators.

Still, in mapping business processes, temporal-related mining may identify patterns of business interactions from inter or intra-organizational processes and depict how other partner companies, or even internal business areas, interact with some line of business, for example. It may help identify how the best partners interact with the business product and services and the behavior of more problematic partners. Thus, the enterprise and business architect may plan a course of action to help business partners improve their business interaction with the product.

In the information system domain, it seems feasible to automatically:

Obtain a model of enterprise applications supporting one or more process executions,

Identify services and application functions used by the process,

Search for patterns of processing inefficiencies, such as finding frequent disruptions that may indicate bad architectural design or event problems related to gaps in data and absent or noisy information,

Generate, based on mining objectives, new indicators and information measures to improve the monitoring of the application's architectural performance and check new behavior patterns of applications.

In addition, combining EA mining with process mining (van der Aalst et al. 2012 ) takes advantage of existing process mining approaches as an EA mining accelerator. Process mining is a DM approach focused on learning the actual execution flow of processes. Despite this focus, it can aid in extracting more aggregated information about the processes in a bottom-up approach to build an enterprise architecture viewpoint. The challenge is to find a suitable way to incorporate and aggregate the process mining results into the EA model visualizations.

In the technology architecture, using the ARM approach can contribute to demonstrating how stakeholder concerns are being addressed by collecting indicators for these concerns from events as close as possible to real-time. For instance, some key EA stakeholders may be concerned about monitoring customer success. In this sense, it may be feasible to use ARM to correlate customer behaviors with product or service details that positively or negatively impact these customers. Another example is the frequency patterns at which consumers access customer service or product/service support or identify patterns in the customer experience that reinforce or hinder their success in their journeys with the company. It may also map and analyze the relationship between application components and the technology that supports it and show how applications use technology like cloud services. For example, in a multi-cloud environment, it may analyze cloud and cost consumption patterns by enterprise applications to find ways to optimize the IT cost operation. The IT operation may be a subject of pattern mining to show the frequency of outages, high latency peaks and others contributing to automatic monitoring services, but mainly to get insight related to services and IT-based product performance. It can also provide an extended view of matrix diagrams depicting the software distribution, how technology platforms support applications, and how intensively the technology is used for applications. It can be depicted using a tree map chart or another kind of graph that can be incorporated into the enterprise architecture modelling tool.

Furthermore, it is vital to perform gap analyses on the overall architecture, across all the architecture domains, to validate that the EA models support the business objectives, principles, policies, and constraints. For instance, a frequent pattern of process failure, or the most frequent failure to scale up services, could reveal inefficiency related to the business objective of keeping customers experiencing excellence, or expose patterns of interruptions that may violate business continuity as an architectural principle. In addition, ARM techniques may support and complement views on core relationships and hidden dependencies among the business, application, or technology components. They can also provide new ways to perform impact analysis, which is essential to understanding any hidden and broader impacts, providing tools to monitor the real impacts of changes made in an EA component as they occur and to identify architecture bottlenecks at the run-time of those components.

Exploring ARM techniques in a bottom-up modelling strategy to discover enterprise architectural viewpoints can bring economic benefits to organizations, consuming less specialized time than gathering information based on interviews and inspections to discover and keep the current architecture up to date. It reduces the risk of errors provoked by manual modelling and misunderstanding and supports better and faster architectural decision-making (Perez-Castillo et al. 2019 ). On the other hand, it might provide new ways for architecture visualization and planning top-down courses of action based on insights obtained through applying ARM to EA mining.

However, based on the works gathered in this review, the application of ARM in enterprise architecture models is still in its infancy. It constitutes a promising research field that may help automate an enterprise architect’s tasks. It can provide new viewpoints and visualizations for EA concerns to support a more agile EA management, finding new ways to optimize the architecture regarding services, infrastructure, and business alignment.

5.4 Example

To exemplify, let us consider a company with physical stores, a mobile app and an e-commerce portal, a logistics partner responsible for product delivery and a partner that provides payment services. Now let us consider the following hypothetical sequence of service calls through API (Application Programming Interface) calls for a general e-commerce context, where the API calls reflect the most common behavior of users following a path in search of products. Once they identify the desired product, they put it in a shopping cart, which is created when the first product is added. In the sequence, the users add and remove products from the cart. Then, when satisfied, users check out the cart and are asked to create an account or log in. Once logged in, they pay, and once the payment is approved, an order is created and sent to the logistics sector for delivery. At the end of this hypothetical typical process, both logistics and the user confirm the product delivery and its reception by the user. Figure 12 illustrates this hypothetical process.

Fig. 12 Example of user buying process

For this study, the first API call is the one that creates the shopping cart. To build an EA Process View, one could apply an ARM process that correlates Process Events in a temporal sequence, considering each API request as an atomic event and identifying in the sequence which event represents the Antecedent Activity and which one is the Consequent Activity, grouping and classifying the event context. The candidate activities are identified and grouped considering the same caller of the APIs. Then, through temporal association rules, it is possible to identify the candidate sequences and assign a classification context to each sequence, identifying and building individual instances for each Process, each of which is composed of a chain of activities with antecedents and consequents under the same context. The Process Instances may be grouped based on their first and last activities. Thus, the different inner path sequences may be seen as variations of the same process group and filtered to remove what is considered irrelevant at the EA level. Finally, these activities should be mapped to ArchiMate elements.
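The steps described above can be illustrated with a minimal sketch. The API call log, operation names, and thresholds are hypothetical, each caller is assumed to correspond to a single process instance, and confidence is computed here over directly-following calls only; a real implementation would need a richer notion of context and temporal windows.

```python
from collections import Counter, defaultdict

# Hypothetical API call log: (caller id, timestamp, API operation).
LOG = [
    ("u1", 1, "create_cart"), ("u1", 2, "add_item"), ("u1", 3, "checkout"),
    ("u1", 4, "pay"), ("u1", 5, "create_order"),
    ("u2", 1, "create_cart"), ("u2", 2, "add_item"), ("u2", 3, "remove_item"),
    ("u2", 4, "add_item"), ("u2", 5, "checkout"), ("u2", 6, "pay"),
    ("u2", 7, "create_order"),
    ("u3", 1, "create_cart"), ("u3", 2, "add_item"), ("u3", 3, "checkout"),
]

def sequences_by_caller(log):
    """Group events by caller and order them in time (one instance per caller, by assumption)."""
    grouped = defaultdict(list)
    for caller, _ts, op in sorted(log, key=lambda e: (e[0], e[1])):
        grouped[caller].append(op)
    return dict(grouped)

def temporal_rules(sequences, min_support, min_confidence):
    """Count directly-follows pairs (antecedent -> consequent) and keep the strong ones."""
    pair_counts, antecedent_counts = Counter(), Counter()
    for ops in sequences.values():
        for antecedent, consequent in zip(ops, ops[1:]):
            pair_counts[(antecedent, consequent)] += 1
            antecedent_counts[antecedent] += 1
    rules = {}
    for (antecedent, consequent), n in pair_counts.items():
        # Confidence: share of the antecedent's follow-ups that are this consequent.
        confidence = n / antecedent_counts[antecedent]
        if n >= min_support and confidence >= min_confidence:
            rules[(antecedent, consequent)] = confidence
    return rules

def group_by_endpoints(sequences):
    """Group process instances by their first and last activities."""
    groups = defaultdict(list)
    for caller, ops in sequences.items():
        groups[(ops[0], ops[-1])].append(caller)
    return dict(groups)

sequences = sequences_by_caller(LOG)
rules = temporal_rules(sequences, min_support=2, min_confidence=0.7)
groups = group_by_endpoints(sequences)

for (antecedent, consequent), confidence in sorted(rules.items()):
    print(f"{antecedent} -> {consequent} (confidence {confidence:.2f})")
print(groups)
```

In this toy log, the rarely observed detour through `remove_item` falls below the thresholds and is filtered out, leaving a chain from `create_cart` to `create_order` whose activities could then be mapped to ArchiMate business process elements.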

The resulting model is an ArchiMate viewpoint depicted in Fig. 13. Despite some differences in the sequences, which do not all correspond precisely to the same graph, all paths start with the creation of the cart and finish with the final update of the order status. In the end, the process sequence for the view is assembled based on the most frequent sequence that involves every activity once.

Fig. 13 ArchiMate business process view prototype

The previous example is based on a hypothetical scenario, and it is also a high-level demonstration of how to apply ARM to build a process to transform data mined through ARM to an ArchiMate model. This is only one possibility among many others. Nonetheless, this view still needs implementation with an actual case in future work.

6 Conclusion and Future Work

Exploring ARM techniques to discover enterprise architectural viewpoints is the most valuable innovation proposed by this paper. It promises to enable more agile discovery and maintenance of EA models and improve its governance through architectural performance visualizations by discovering hidden knowledge from architecture execution data. Hence, this review also aimed to find the state-of-the-art ARM techniques applicable to discover significant association rules for building enterprise architectural models and viewpoints.

ARM, in general, seems to be easily described in enterprise architecture as an application tool for data architecture to support diverse needs, analogous to any other tools of business intelligence or data science. However, this research presented ARM as a tool to capture, design, and evolve the enterprise architecture model itself in all its domains. The discussion presented and exemplified ARM application cases related to some projected results for the enterprise architecture as described in the TOGAF framework, which is one of the most known EA frameworks. However, these examples are most likely only a small part of the possibilities which were described in the initial and high-level insights and probably open space for deepening these and other opportunities.

One important recommendation is that in the case discussed where ARM is used for EA mining, ARM should be designed and modelled in the EA and stay present as part of the EA model. It provides new complementary visualizations that enable enriching knowledge about the architecture with less effort and can support bottom-up EA modelling strategies. This capability is especially helpful in designing further architectural improvements in the organizational structures, products, services, and adjacent technologies. It also potentially contributes to linking the high-level and abstract world of EA to its concrete realization based on the actual data collected from business processes and application execution, making the architecture management work more dynamically with information near real-time. It is also more realistic, with fewer misunderstandings and errors, and more insightful with knowledge discovery from data.

Based on the few papers available that investigate the application of data mining techniques to automate EA modelling tasks, it is possible to affirm that few studies have explored the field, in contrast with the high number found in ARM research. However, some possibilities for its application were explored in this paper, using as a basis some viewpoints within the TOGAF domains: business, information system and technology. These applications have the potential to provide accurate information on the actual and current state of the enterprise architecture, saving experts' time, avoiding architecture misunderstandings and supporting faster and better data-driven decisions using ARM techniques.

From the perspective of the algorithms suitable for an EA mining strategy, fourteen candidate algorithms were identified for four distinct topics, together with more than 21 algorithms that they surpassed, for a total of 35 ARM algorithms covered. Despite the great diversity and the difficulty in relating these algorithms and tracking their evolution, the newest and most cited algorithms were mapped. The map also indicates the algorithms with lower performance. It will help future analyses focus on the best algorithms and avoid wasting time analyzing outdated ones. This analysis can also be a starting point for further investigations in the field.

In future work, it will be helpful to conduct laboratory experiments to compare the performance of candidate algorithms identified in this research against the requirements for its application in the EA modelling context. Furthermore, it seems necessary to confirm the comparison among the latest algorithms with new experiments to generate a firmer list of selected algorithms for EA mining.

The opportunities for architecture viewpoints presented in this research were a very preliminary insight into ARM applicability to EA mining. It strongly indicates its utility. However, it must be complemented with a comprehensive observation to provide a broader view of the opportunities of its applications.

https://ceur-ws.org/Vol-126/.

https://www.philippe-fournier-viger.com/spmf/index.php?link=algorithms.php.

Agarwal RC, Aggarwal CC, Prasad VVV (2000) Depth first generation of long patterns. In: Proceedings of the sixth ACM SIGKDD international conference on knowledge discovery and data mining, Boston. ACM, pp 108–118. https://doi.org/10.1145/347090.347114

Aggarwal A, Toshniwal D (2018) Spatio-temporal frequent itemset mining on web data. In: 2018 IEEE international conference on data mining workshops, pp 1160–1165. https://doi.org/10.1109/ICDMW.2018.00166

Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on management of data. ACM, New York, pp 207–216. https://doi.org/10.1145/170035.170072

Agrawal R, Shafer J (1996) Parallel mining of association rules: design, implementation and experience. IBM Research Division, San Jose

Ao X, Shi H, Wang J, Zuo L, Li H, He Q (2019) Large-scale frequent episode mining from complex event sequences with hierarchies. ACM Trans Intell Syst Technol 10(4):1–26. https://doi.org/10.1145/3326163

Barkhordari M, Niamanesh M (2018) Kavosh: an effective map-reduce-based association rule mining method. J Big Data 5(1):25. https://doi.org/10.1186/s40537-018-0129-4

Cai K, Chen H, Ai W, Miao X, Lin Q, Feng Q (2022) Feedback convolutional network for intelligent data fusion based on near-infrared collaborative IoT technology. IEEE Trans Ind Inform 18(2):1200–1209. https://doi.org/10.1109/TII.2021.3076513

Chen J (2010) An updown directed acyclic graph approach for sequential pattern mining. IEEE Trans Knowl Data Eng 22(7):913–928. https://doi.org/10.1109/TKDE.2009.135

Cheung D, Han J, Ng V, Wong C (1996a) Maintenance of discovered association rules in large databases: an incremental updating technique. In: Proceedings 1996 international conference on data engineering, New Orleans. https://doi.org/10.1109/ICDE.1996.492094

Cheung DW, Han J, Ng VT, Fu AW, Fu Y (1996b) A fast distributed algorithm for mining association rules. In: Fourth international conference on parallel and distributed information systems, pp 31–42. https://doi.org/10.1109/PDIS.1996.568665

Cheung DW, Lee SD, Kao B (1997) A general incremental technique for maintaining discovered association rules. In: Database systems for advanced applications ’97, pp 185–194. https://doi.org/10.1142/9789812819536_0020

Chon K-W, Kim M-S (2018) BIGMiner: a fast and scalable distributed frequent pattern miner for big data. Cluster Comput 21(3):1507–1520. https://doi.org/10.1007/s10586-018-1812-0

da Cunha DS, Xavier RS, Ferrari DG, Vilasbôas FG, de Castro LN (2018) Bacterial colony algorithms for association rule mining in static and stream data. Math Probl Eng 2018:e4676258. https://doi.org/10.1155/2018/4676258

Datta S, Mali K (2021) Significant association rule mining with high associability. In: 5th international conference on intelligent computing and control systems, pp 1159–1164. https://doi.org/10.1109/ICICCS51141.2021.9432237

Djenouri Y, Djenouri D, Belhadi A, Cano A (2019) Exploiting GPU and cluster parallelism in single scan frequent itemset mining. Inform Sci 496:363–377. https://doi.org/10.1016/j.ins.2018.07.020

Farwick M, Schweda CM, Breu R, Hanschke I (2016) A situational method for semi-automated enterprise architecture documentation. Softw Syst Model 15(2):397–426. https://doi.org/10.1007/s10270-014-0407-3

Gan W, Lin JC-W, Fournier-Viger P, Chao H-C, Yu PS (2019) A survey of parallel sequential pattern mining. ACM Trans Knowl Discov Data 13(3):1–34. https://doi.org/10.1145/3314107

Greefhorst D, Proper E (2011) The role of enterprise architecture. In: Greefhorst D, Proper E (eds) Architecture principles: the cornerstones of enterprise architecture. Springer, Heidelberg, pp 7–29. https://doi.org/10.1007/978-3-642-20279-7_2

Gullo F (2015) From patterns in data to knowledge discovery: what data mining can do. Phys Proc 62:18–22. https://doi.org/10.1016/j.phpro.2015.02.005

Gustavsson PM, Planstedt T (2005) The road towards multi-hypothesis intention simulation agents architecture—fractal information fusion modeling. In: Proceedings of the winter simulation conference. https://doi.org/10.1109/WSC.2005.1574548

Han JW, Pei J, Yin YW (2000) Mining frequent patterns without candidate generation. SIGMOD Rec 29(2):1–12. https://doi.org/10.1145/335191.335372

Han J, Pei J, Yin Y, Mao R (2004) Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data Min Knowl Discov 8(1):53–87. https://doi.org/10.1023/B:DAMI.0000005258.31418.83

Karthik S, Medvidovic N (2019) Automatic detection of latent software component relationships from online Q&A sites. In: IEEE/ACM 7th international workshop on realizing artificial intelligence synergies in software engineering, pp 15–21. https://doi.org/10.1109/RAISE.2019.00011

Kitchenham B, Charters S (2007) Guidelines for performing systematic literature reviews in software engineering. EBSE technical report EBSE-2007–01. Keele, Staffs, and Durham. https://citeseerx.ist.psu.edu/doc/10.1.1.117.471 . Accessed 6 Mar 2022

Kitchenham B (2004) Procedures for performing systematic reviews. Keele University. https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=29890a936639862f45cb9a987dd599dce9759bf5 . Accessed 9 May 2022

Kiteley R, Stogdon C (2014) Literature reviews in social work. Sage, London. https://doi.org/10.4135/9781473957756

Laudon K, Laudon JP (2021) Management information systems: managing the digital firm, global edition. Pearson. https://books.google.com.br/books?id=AqJXzgEACAAJ . Accessed 14 Nov 2021

Le T, Vo B (2015) An N-list-based algorithm for mining frequent closed patterns. Expert Syst Appl 42(19):6648–6657. https://doi.org/10.1016/j.eswa.2015.04.048

Li H, Wang Y, Zhang D, Zhang M, Chang EY (2008) Pfp: parallel fp-growth for query recommendation. In: Proceedings of the ACM conference on recommender systems, pp 107–114. ACM, New York. https://doi.org/10.1145/1454008.1454027

Liang Y-H, Wu S-Y (2015) Sequence-growth: a scalable and effective frequent itemset mining algorithm for big data based on MapReduce framework. In: IEEE international congress on big data, pp 393–400. https://doi.org/10.1109/BigDataCongress.2015.65

Lin F, Muzumdar K, Laptev NP, Curelea M-V, Lee S, Sankar S (2020) Fast dimensional analysis for root cause investigation in a large-scale service environment. Proc ACM Meas Anal Comput Syst 4(2):1–23. https://doi.org/10.1145/3392149

Lin M-Y, Lee P-Y, Hsueh S-C (2012) Apriori-based frequent itemset mining algorithms on MapReduce. In: Proceedings of the 6th international conference on ubiquitous information management and communication. ACM, New York. https://doi.org/10.1145/2184751.2184842

Liu X, Zhang X, Wang Y, Zhou J, Helal S, Xu Z, Cao S (2018) PARMTRD: parallel association rules based multiple-topic relationships detection. In: Jin H et al (eds) Web Services—ICWS 2018. Springer, Cham, pp 422–436. https://doi.org/10.1007/978-3-319-94289-6_27

Liu X, Niu X, Fournier-Viger P (2021) Fast Top-K association rule mining using rule generation property pruning. Appl Intell 51(4):2077–2093. https://doi.org/10.1007/s10489-020-01994-9

Liu X, Zheng L, Zhang W, Zhou J, Cao S, Yu S (2022) An evolutive frequent pattern tree-based incremental knowledge discovery algorithm. ACM Trans Manag Inf Syst 13(3):1–20. https://doi.org/10.1145/3495213

Liu J, Wang K, Fung BCM (2012) Direct discovery of high utility itemsets without candidate generation. In: IEEE 12th international conference on data mining, pp 984–989. https://doi.org/10.1109/ICDM.2012.20

Luna JM, Padillo F, Pechenizkiy M, Ventura S (2018) Apriori versions based on MapReduce for mining frequent patterns on big data. IEEE Trans Cybern 48(10):2851–2865. https://doi.org/10.1109/TCYB.2017.2751081

Luna JM, Fournier-Viger P, Ventura S (2019) Frequent itemset mining: a 25 years review. Wires Data Min Knowl Discov 9(6):e1329. https://doi.org/10.1002/widm.1329

Martín D, Martínez-Ballesteros M, García-Gil D, Alcalá-Fdez J, Herrera F, Riquelme-Santos JC (2018) MRQAR: a generic MapReduce framework to discover quantitative association rules in big data problems. Knowl-Based Syst 153:176–192. https://doi.org/10.1016/j.knosys.2018.04.037

Menaga D, Saravanan S (2021) GA-PPARM: CONSTRAINT-based objective function and genetic algorithm for privacy preserved association rule mining. Evolut Intell. https://doi.org/10.1007/s12065-021-00576-z

Modaresnezhad M, Vahdati A, Nemati H, Ardestani A, Sadri F (2019) A rule-based semantic approach for data integration, standardization and dimensionality reduction utilizing the UMLS: application to predicting bariatric surgery outcomes. Comput Biol Med 106:84–90. https://doi.org/10.1016/j.compbiomed.2019.01.019

Moens S, Aksehirli E, Goethals B (2013) Frequent itemset mining for big data. In: IEEE international conference on big data, pp 111–118. https://doi.org/10.1109/BigData.2013.6691742

Neaga EI, Harding JA (2005) An enterprise modeling and integration framework based on knowledge discovery and data mining. Int J Prod Res 43(6):1089–1108. https://doi.org/10.1080/00207540412331322939

Niazmand E (2022) Enhancing query answer completeness with query expansion based on synonym predicates. In: Companion proceedings of the web conference, pp 354–358. ACM, New York. https://doi.org/10.1145/3487553.3524198

Noori FM, Riegler M, Uddin MZ, Torresen J (2020) Human activity recognition from multiple sensors data using multi-fusion representations and CNNs. ACM Trans Multimed Comput Commun Appl 16(2):1–19. https://doi.org/10.1145/3377882

Onan A (2019) Two-stage topic extraction model for bibliometric data analysis based on word embeddings and clustering. IEEE Access 7:145614–145633. https://doi.org/10.1109/ACCESS.2019.2945911

Padillo F, Luna JM, Herrera F, Ventura S (2018) Mining association rules on big data through MapReduce genetic programming. Integr Comput-Aided Eng 25(1):31–48. https://doi.org/10.3233/ICA-170555

Perez-Castillo R, Ruiz-Gonzalez F, Genero M, Piattini M (2019) A systematic mapping study on enterprise architecture mining. Enterp Inform Syst 13(5):675–718. https://doi.org/10.1080/17517575.2019.1590859

Pérez-Castillo R, Ruiz F, Piattini M (2020) A decision-making support system for enterprise architecture modelling. Decis Support Syst 131:113249. https://doi.org/10.1016/j.dss.2020.113249

Pérez-Castillo R, Caivano D, Ruiz F, Piattini M (2021) ArchiRev—reverse engineering of information systems toward archimate models an industrial case study. J Softw Evol Proc 33(2):e2314. https://doi.org/10.1002/smr.2314

Phan H (2018) NOV-CFI: a novel algorithm for closed frequent itemsets mining in transactional databases. In: Proceedings of the VII international conference on network, Communication and computing, pp 58–63. ACM, New York. https://doi.org/10.1145/3301326.3301363

Pinheiro CR, Guerreiro S, Mamede HS (2021) Automation of enterprise architecture discovery based on event mining from API gateway logs: state of the art. In: IEEE 23rd conference on business informatics, pp 117–124. https://doi.org/10.1109/CBI52690.2021.10062

Sinaei S, Fatemi O (2018) Run-time mapping algorithm for dynamic workloads using association rule mining. J Syst Arch 91:1–10. https://doi.org/10.1016/j.sysarc.2018.09.005

De Stefano M, Pecorelli F, Tamburri DA, Palomba F, De Lucia A (2020) Splicing community patterns and smells: a preliminary study. In: Proceedings of the IEEE/ACM 42nd international conference on software engineering workshops, pp 703–710. ACM, New York. https://doi.org/10.1145/3387940.3392204

Tax N, Sidorova N, Haakma R, van der Aalst WMP (2018) Mining local process models with constraints efficiently: applications to the analysis of smart home data. In: 14th international conference on intelligent environments, pp 56–63. https://doi.org/10.1109/IE.2018.00016

The Open Group (2018) The TOGAF® standard, version 9.2. https://publications.opengroup.org/standards/togaf/c182 . https://pubs.opengroup.org/architecture/togaf9-doc/arch/index.html . Accessed 28 Apr 2022

The Open Group (2019) ArchiMate® 3.1 Specification. https://pubs.opengroup.org/architecture/archimate3-doc/ . Accessed 15 Apr 2022

Uno T, Kiyomi M, Arimura H (2004) LCM ver. 2: efficient mining algorithms for frequent/closed/maximal itemsets. FIMI ’04, p 126. https://ceur-ws.org/Vol-126/uno.pdf . Accessed 27 Feb 2022

van der Aalst W, Adriansyah A, de Medeiros AKA, Arcieri F, Baier T, Blickle T, Wynn M (2012) Process mining manifesto. In: Daniel F et al (eds) Business Process Management Workshops. Springer, Heidelberg, pp 169–194. https://doi.org/10.1007/978-3-642-28108-2_19

Wu JM-T, Lin JC-W, Tamrakar A (2019) High-utility itemset mining with effective pruning strategies. ACM Trans Knowl Discov Data 13(6):1–22. https://doi.org/10.1145/3363571

Xun Y, Zhang J, Qin X (2016) FiDoop: parallel mining of frequent itemsets using MapReduce. IEEE Trans Syst Man Cybern: Syst 46(3):313–325. https://doi.org/10.1109/TSMC.2015.2437327

Yildirim Taşer P, Birant KU, Birant D (2020) Multitask-based association rule mining. Turk J Elec Eng Comput Sci 28(2):933–955. https://doi.org/10.3906/elk-1905-88

Zaki MJ (2000) Scalable algorithms for association mining. IEEE Trans Knowl Data Eng 12(3):372–390. https://doi.org/10.1109/69.846291

Zaki MJ, Gouda K (2003) Fast vertical mining using diffsets. In: Proceedings of the 9th ACM SIGKDD international conference on knowledge discovery and data mining, pp 326–335. ACM, New York. https://doi.org/10.1145/956750.956788

Zaki MJ, Hsiao C-J (2002) CHARM: an efficient algorithm for closed itemset mining. In: Proceedings of the SIAM international conference on data mining, pp 457–473. https://doi.org/10.1137/1.9781611972726.27

Zida S, Fournier-Viger P, Lin JC-W, Wu C-W, Tseng VS (2015) EFIM: a highly efficient algorithm for high-utility itemset mining. In: Sidorov G, Galicia-Haro SN (eds) Advances in artificial intelligence and soft computing. Springer, Cham, pp 530–546. https://doi.org/10.1007/978-3-319-27060-9_44

Acknowledgements

This work is financed by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, within project LA/P/0063/2020. The second author was supported by national funds through Fundação para a Ciência e a Tecnologia (FCT) with reference UIDB/50021/2020 (INESC-ID).

Open access funding provided by FCT|FCCN (b-on).

Author information

Authors and affiliations.

Universidade de Trás-os-Montes e Alto Douro, Vila Real, Portugal

Carlos Pinheiro

INESC-ID, Rua Alves Redol 9, 1000-029, Lisbon, Portugal

Carlos Pinheiro & Sergio Guerreiro

Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal

Sergio Guerreiro

Department of Science and Technology, INESC TEC, Universidade Aberta, Lisbon, Portugal

Henrique S. Mamede

Corresponding author

Correspondence to Carlos Pinheiro.

Additional information

Accepted after two revisions by Hans-Georg Fill.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 85 KB)

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Pinheiro, C., Guerreiro, S. & Mamede, H.S. A Survey on Association Rule Mining for Enterprise Architecture Model Discovery. Bus Inf Syst Eng (2023). https://doi.org/10.1007/s12599-023-00844-5

Received: 04 May 2023

Accepted: 05 October 2023

Published: 21 December 2023

DOI: https://doi.org/10.1007/s12599-023-00844-5

  • Association rule mining
  • Data mining
  • Enterprise architecture mining
  • Enterprise architecture modelling
  • Artificial intelligence

Research on Association Rule Mining

The problem of mining association rules (see association rule learning at Wikipedia) was introduced in Agrawal et al. (1993).

The aim of association rule mining is to find interesting and useful patterns in a transaction database. Each transaction in the database contains a set of items and a transaction identifier (e.g., a market basket). Association rules are rules of the form X → Y where X and Y are two disjoint subsets of all available items. X is called the antecedent or LHS (left-hand side), and Y is called the consequent or RHS (right-hand side). Association rules satisfy constraints on minimum support (a measure of rule significance) and minimum confidence (a measure of rule strength).
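The definitions above can be made concrete with a small sketch. It assumes the usual formulation in which the support of an itemset is the fraction of transactions containing it, and the confidence of X → Y is the support of X ∪ Y divided by the support of X. The function names and the toy market-basket data are illustrative and do not come from any of the packages listed on this page.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(lhs: Iterable[str], rhs: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Estimated P(rhs | lhs) for the rule lhs -> rhs."""
    x, y = frozenset(lhs), frozenset(rhs)
    if x & y:
        raise ValueError("antecedent and consequent must be disjoint")
    s_x = support(x, transactions)
    if s_x == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(x | y, transactions) / s_x


# Hypothetical market-basket data: one frozenset per transaction.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support({"bread", "milk"}, transactions)           # 2 of 4 transactions
rule_confidence = confidence({"bread"}, {"milk"}, transactions)   # 0.5 / 0.75
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule would be kept only if both values reach the user-chosen minimum support and minimum confidence thresholds.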

Research on association rules focuses on efficient algorithms, measures of rule interestingness, sequence mining, and using association rules for classification.

  • Comprehensive list of interest measures for association rules
  • The free sample chapter Association Analysis: Basic Concepts and Algorithms from the popular textbook Introduction to Data Mining by Tan, Steinbach and Kumar provides a great introduction to association rule mining. R Code accompanying the book chapter is available in Chapter 5 of the web book An R Companion for Introduction to Data Mining.
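As an illustration of what an interest measure looks like, the sketch below computes lift, one widely used measure that is not named explicitly in the text above. Lift compares the observed support of X ∪ Y with the support expected if X and Y were independent. The data and function names are hypothetical.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def lift(lhs: Iterable[str], rhs: Iterable[str],
         transactions: List[FrozenSet[str]]) -> float:
    """supp(lhs and rhs together) / (supp(lhs) * supp(rhs)).

    Values above 1 suggest the two sides co-occur more often than expected
    under independence, values below 1 less often.
    """
    x, y = frozenset(lhs), frozenset(rhs)
    expected = support(x, transactions) * support(y, transactions)
    if expected == 0:
        return 0.0  # at least one side never occurs
    return support(x | y, transactions) / expected


# Hypothetical market-basket data: one frozenset per transaction.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

lift_bread_milk = lift({"bread"}, {"milk"}, transactions)
lift_bread_butter = lift({"bread"}, {"butter"}, transactions)
print(f"lift(bread -> milk)={lift_bread_milk:.3f}")
print(f"lift(bread -> butter)={lift_bread_butter:.3f}")
```

In this toy data, bread and butter occur together more often than independence would predict, while bread and milk occur together slightly less often.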

Our Implementations (in R)

  • arules: An R extension package for mining association rules and frequent itemsets with R. It provides an easy-to-use and flexible platform for experiments and research.
  • arulesViz: Add-on for arules to visualize association rules.
  • arulesCBA: Add-on for arules to perform association rule-based classification.
  • arulesSequences: Add-on for arules to handle and mine frequent sequences.
  • arulesNBMiner: Implementation of the mining algorithm and estimation procedure developed in Michael Hahsler. A model-based frequency constraint for mining associations from transaction data. Data Mining and Knowledge Discovery, 13(2):137-166, September 2006. NBMiner is an add-on to arules.

Other Implementations

  • Christian Borgelt’s implementations of Apriori, Eclat and other algorithms (C)
  • mlxtend: Frequent Itemsets via Apriori Algorithm (Python)
  • Frequent pattern mining implementations from Bart Goethals (C++)
  • Implementations by Mohammed J. Zaki with focus on sequence mining (C++)
  • A C++ Frequent Itemset Mining Template Library by Bodon/Racz/Schmidt-Thieme (C++)
  • Frequent Itemset Mining Implementations Repository (FIMI)
  • Weka, a collection of machine learning algorithms for data mining tasks. (Java)

Data Sets

  • UCI KDD Archive, an online repository of large data sets which encompasses a wide variety of data types, analysis tasks, and application areas.
  • Traces available in the Internet Traffic Archive. Data sets with packet traces, HTTP logs and more.
  • KDD Cup Data, data sets and results for the annual Data Mining and Knowledge Discovery competition organized by ACM Special Interest Group on Knowledge Discovery and Data Mining.
  • FIMI Dataset Repository
  • Active Learning Challenge (Causality Workbench)
  • KDnuggets - Datasets

Conferences

  • KDnuggets: Meetings
  • Kmining: Conferences
  • Top Conferences in Data Mining by MS Academic Search
  • ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)
  • ACM Symposium On Applied Computing (SAC)
  • ACM SIGMOD Conference (SIGMOD)
  • European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD)
  • Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD)
  • SIAM Conference on Data Mining (SDM)
  • IEEE International Conference on Data Mining (ICDM)
  • International Symposium on Intelligent Data Analysis (IDA)
  • International Conference on Machine Learning (ICML)

Journals

  • ACM Transactions on Knowledge Discovery from Data (TKDD) - ACM
  • Data Mining and Knowledge Discovery (DAMI) - Springer
  • Data & Knowledge Engineering (DKE) - Elsevier
  • Statistical Analysis and Data Mining - Wiley
  • Statistics and Computing - Springer
  • Knowledge and Information Systems: An International Journal - Springer
  • IEEE Transactions on Knowledge and Data Engineering (TKDE) - IEEE
  • International Journal of Data Warehousing and Mining (IJDWM) - IGI Publishing
  • International Journal of Business Intelligence and Data Mining (IJBIDM) - Inderscience
  • International Journal of Information Technology & Decision Making (IJITDM) - World Scientific
  • Intelligent Data Analysis: An International Journal - IOS Press
  • Journal of Database Management (JDM) - Idea Group
  • Journal of Computational and Graphical Statistics (JCGS) - American Statistical Association (ASA)
  • SIGKDD Explorations - ACM
  • INFORMS Journal on Computing (JOC) - INFORMS
  • Journal of Intelligent Information Systems - Springer
  • Machine Learning - Springer
  • Journal of Machine Learning Research (JMLR) - SPARC
  • Transactions on Machine Learning and Data Mining - ibai Publishing
  • Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery - Wiley

Our Publications

Computer Science > Human-Computer Interaction

Title: Exploring Convergence in Relation using Association Rules Mining: A Case Study in Collaborative Knowledge Production

Abstract: This study delves into the pivotal role played by non-experts in knowledge production on open collaboration platforms, with a particular focus on the intricate process of tag development that culminates in the proposal of new glitch classes. Leveraging the power of Association Rule Mining (ARM), this research endeavors to unravel the underlying dynamics of collaboration among citizen scientists. By meticulously quantifying tag associations and scrutinizing their temporal dynamics, the study provides a comprehensive and nuanced understanding of how non-experts collaborate to generate valuable scientific insights. Furthermore, this investigation extends its purview to examine the phenomenon of ideological convergence within online citizen science knowledge production. To accomplish this, a novel measurement algorithm, based on the Mann-Kendall Trend Test, is introduced. This innovative approach sheds illuminating light on the dynamics of collaborative knowledge production, revealing both the vast opportunities and daunting challenges inherent in leveraging non-expert contributions for scientific research endeavors. Notably, the study uncovers a robust pattern of convergence in ideology, employing both the newly proposed convergence testing method and the traditional approach based on the stationarity of time series data. This groundbreaking discovery holds significant implications for understanding the dynamics of online citizen science communities and underscores the crucial role played by non-experts in shaping the scientific landscape of the digital age. Ultimately, this study contributes significantly to our understanding of online citizen science communities, highlighting their potential to harness collective intelligence for tackling complex scientific tasks and enriching our comprehension of collaborative knowledge production processes in the digital age.
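The abstract does not describe the proposed convergence measure beyond saying it is based on the Mann-Kendall Trend Test. The sketch below shows only the classic Mann-Kendall test (S statistic, tie-corrected variance and continuity-corrected Z score), not the authors' algorithm. The input series, imagined here as the confidence of one tag association measured over successive time windows, is a hypothetical example.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def mann_kendall(series: Sequence[float]) -> Tuple[int, float]:
    """Return (S, Z) for the classic Mann-Kendall monotonic trend test."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")

    # S sums the sign of every later-minus-earlier difference.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S under the no-trend hypothesis, corrected for tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5)
                   for t in Counter(series).values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0

    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


# Hypothetical series: confidence of one tag association per time window.
confidence_over_time = [0.42, 0.47, 0.51, 0.55, 0.61, 0.66]
s, z = mann_kendall(confidence_over_time)
print(f"S={s}, Z={z:.2f}")
```

With a two-sided test at the 5% level, an absolute Z above roughly 1.96 would be read as evidence of a monotonic trend in the series.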

Application Research on Fast Discovery of Association Rules Based on Air Transportation

Supreme Court Rules on Important Impact Fee Case

  • McKaia Dykema
  • Community & Economic Development
  • Legal Advocacy

Co-authored by: Amanda Karras, Executive Director/General Counsel, International Municipal Lawyers Association (IMLA)

This month, the Supreme Court issued a unanimous decision in Sheetz v. El Dorado County , which is a case involving government “Takings,” specifically ones that involve the government’s use of impact fees. Impact fees are typically a one-time payment that local governments levy on a property developer for new development projects. Municipalities use these fees to offset the financial impact that new development places on public infrastructure, such as roads and utilities.

In its ruling, the Court narrowly determined that legislatively enacted impact fees are not exempt from the requirements set forth in two previous property rights cases (Nollan v. California Coastal Commission and Dolan v. City of Tigard, Oregon). As such, local governments that impose impact fees will now be subject to a standard requiring them to demonstrate the relationship and relative impact of the development on the community. Specifically, cities will have to show that conditions (impact fees) for obtaining a land-use permit have an “essential nexus” (relationship) to the government’s land-use interest and a “rough proportionality” between the burden placed on the property owner and the effects of the proposed land use.

This case involves the County of El Dorado’s traffic impact mitigation fee, which it adopted via the General Plan to require new development to help finance the construction of new roads and the widening of existing roads. The amount of the fee is set by formula, following a nexus study conducted by the County, and is generally based on the location and type of the project. In assessing the fee, the County does not make any “individualized determinations” as to the nature and extent of the traffic impacts caused by a particular project on state and local roads.

A resident of the County applied for a building permit to construct a single-family home on his property, which the County agreed to issue on the condition that he pay the impact fee. The property owner paid the fee, and the permit was issued, but he then challenged the fee as invalid under the Takings Clause of the Fifth Amendment. He argued that the fee was an unconstitutional condition under Nollan and Dolan as the County did not make an individualized determination that an “essential nexus” and “rough proportionality” existed between the traffic impacts caused by his project and the need for improvements to state/local roads.

The California Court of Appeal held that the Nollan and Dolan “essential nexus” and “rough proportionality” tests do not apply to legislative exactions that are generally applicable to a broad class of property owners like the one at issue in this case, so the impact fee was constitutional. However, in a 9-0 decision authored by Justice Barrett, the Supreme Court reversed that decision, concluding that “[t]he Takings Clause does not distinguish between legislative and administrative permit conditions.” The Court reasoned that the text, history, and precedent support its conclusion that legislatures are not exempt from the Takings Clause. And because the Takings Clause applies equally to legislators and administrators, it “prohibits legislatures and agencies alike from imposing unconstitutional conditions on land-use permits.” The Court found that when a building permit is conditioned on something unrelated to the land-use interest, then the tests set forth in Nollan and Dolan will apply.

What Does This Mean for Local Governments and Their Ability to Issue Impact Fees?

The Court’s decision is narrow and offers a silver lining: the justices ruled only on the issue of whether legislatively enacted fees are subject to the heightened scrutiny tests required under the referenced precedents.

Importantly, that means the decision does not prevent local governments from enacting reasonable permitting conditions, including impact fees, via legislation.

However, given this recent decision, local governments will now want to ensure that all such legislatively imposed impact fees comply with Nollan and Dolan’s requirements.

Because of this ruling, local governments may now expect litigants to challenge their impact fees for compliance with the Nollan and Dolan requirements. Lower courts will also be ruling on the other issues that were raised in the case, including “whether a permit condition imposed on a class of properties must be tailored with the same degree of specificity as a permit condition that targets a particular development.” Local governments in jurisdictions that allow the collection of impact fees will want to advocate for positions that do not require individualized inquiries into these legislatively enacted fees.

Read the Amicus Brief

The Local Government Legal Center filed an amicus brief joined by NLC, IMLA, the National Association of Counties (NACo), and the Government Finance Officers Association (GFOA).

About the Author

McKaia Dykema is the Legislative Research Manager on the Federal Advocacy team at the National League of Cities.



COMMENTS

  1. A survey on the use of association rules mining techniques ...

    To this end, the paper uses association rules and other techniques such as clustering. ... The research reported in this paper was partially supported by the COPKIT project under the European Union's Horizon 2020 research and innovation program (grant agreement No 786687), the Andalusian government and the FEDER operative program under the ...

  2. A comprehensive review of visualization methods for association rule

    The selection criteria were as follows: (1) the research paper addresses any kind of ARM and its connection with visualization, and the research must be peer reviewed, i.e., published in a refereed conference, journal paper, book chapter or monograph. ... New ideas in the visualization of association rules. This section reviews papers dealing with ...

  3. A Comparative Analysis of Association Rule Mining Algorithms

    The paper experimented on two datasets, a small dataset with 37 records and the large Adventure dataset of Microsoft with 172,459 records, while the software provides association rules ...

  4. (PDF) Association Rules: Problems, solutions and new ...

    Universidad de Salamanca, Plaza Merced S/N, 37008, Salamanca. Abstract. Association rule mining is an important component of data mining. In the last years a great number of ...

  5. (PDF) Research on Association Rule Mining

    Research on Association Rule Mining. Ziauddin, Shahid Kammal, Khaiuz Zaman Khan, Muhammad Ijaz Khan. Gomal University, D.I.Khan (Pakistan). Abstract: Association Rule Mining is ...

  6. Research on an Improved Association Rule Mining Algorithm

    Mining association rules plays an important role in data mining because of its wide applicability in market analysis: it expresses how tangible products and services relate to each other and how they tend to group together. The paper proposes an Apriori algorithm with riddling compression and carries out a simulation; the result demonstrates that the Apriori algorithm with riddling compression can ...

  7. Efficient Analysis of Pattern and Association Rule Mining Approaches

    fields (hierarchical association rules [8]). This paper is organized as follows: in Section 2, we briefly describe association rules mining. Section 3 summarizes kinds of frequent pattern mining and association rule mining. Section 4 details a review of association rules approaches. In Section 5, we describe

  8. A Survey on Association Rule Mining for Enterprise ...

    Association Rule Mining (ARM) is a field of data mining (DM) that attempts to identify correlations among database items. It has been applied in various domains to discover patterns, provide insight into different topics, and build understandable, descriptive, and predictive models. On the one hand, Enterprise Architecture (EA) is a coherent set of principles, methods, and models suitable for ...

  9. Journal of Physics: Conference Series PAPER OPEN ACCESS ...

    algorithm and the Fp-growth algorithm idea, this paper proposes an improved association rule algorithm based on the Fp-tree and constructs the general process of personalized recommendation with association rules. ... so personalized technology has become a research hotspot. Association rules, as one of the technical methods of Web mining, are ...

  10. Research on Association Rule Mining

    The aim of association rule mining is to find interesting and useful patterns in a transaction database. Each transaction in the database contains a set of items and a transaction identifier (e.g., a market basket). Association rules are rules of the form X → Y where X and Y are two disjoint subsets of all available items.

  11. Research of association rule algorithm based on data mining

    Association rule mining is an important part of the field of data mining; its algorithm performance directly affects the efficiency of data mining and the integrity and effectiveness of the final mining results. Based on the existing association rule mining algorithms, this paper studies and analyzes their efficiency and effectiveness, and according to the efficiency defects ...

  12. [2404.15440] Exploring Convergence in Relation using Association Rules

    This study delves into the pivotal role played by non-experts in knowledge production on open collaboration platforms, with a particular focus on the intricate process of tag development that culminates in the proposal of new glitch classes. Leveraging the power of Association Rule Mining (ARM), this research endeavors to unravel the underlying dynamics of collaboration among citizen ...

  13. (PDF) Summary of Association Rules

    1. Summary of Association Rules. Foxiao Zhan 1, a, Xiaolan Zhu 1, b, Lei Zhang 1, 2, *, Xuexi Wang 1, c, Lu Wang 1, d, Chaoyi Liu 1, e. 1 Department of Computer Technology and application Qinghai ...

  14. PDF A SURVEY OF ASSOCIATION RULES

    The objective of this paper is to provide a thorough survey of previous research on association rules. In the next section we give a formal definition of association rules. Section 3 contains the description of sequential and parallel algorithms as well as other algorithms to find association rules.

  15. Market Basket Analysis: Identify the Changing Trends of Market Data

    This paper discusses the data mining technique of association rule mining and provides a new algorithm that may help examine customer behaviour and assist in increasing sales. ... Manpreet Kaur and Shivani Kang / Procedia Computer Science ...

  16. Association Rule Mining Research Papers

    View Association Rule Mining Research Papers on Academia.edu for free. ... Rare association rules are those that only appear infrequently even though they are highly associated with very specific data. ... In this paper we present an association rule mining algorithm that is suited for continuous valued attributes commonly found in scientific ...

  17. Association Rule Mining: Applications in Various Areas

    This paper presents the various areas in which the association rules are applied for effective decision making in a wide variety of applications such as: market basket analysis, medical diagnosis, bio-medical literature, protein sequences, census data, logistic regression, fraud detection in web, CRM of credit card business etc.

  18. Application Research on Fast Discovery of Association Rules Based on

    Abstract: This paper introduces the popular Apriori algorithm and its improvements and uses a large volume of complex air transportation data to quickly discover association rules. Because traditional association rule mining produces too many rules and the Apriori algorithm is not widely used, this paper chooses an improved algorithm to prune redundant rules based on ...

  19. (PDF) An Improved Apriori Algorithm For Association Rules

    In this paper we present a new scheme for extracting association rules that considers the time, number of database scans, memory consumption, and the interestingness of the rules.

  20. PDF Using Association Rule Mining to Enrich User Profiles with Research

    Lule Ahmedi, et al.: Using Association Rule Mining to Enrich User Profiles with Research Paper.. mining process in two parts: mining of user association rules and mining of papers association rules. The strategy is to combine the results of both approaches, by checking the support of one and the other. If the support is lower

  21. [PDF] Research and application of association rules in practice

    Research and application of association rules in practice teaching BBS. Z. Lang. Published 21 November 2015. Computer Science. TLDR. An improved Apriori algorithm is proposed to overcome shortcomings of the traditional Apriori algorithm when it deals with enormous frequent itemsets and the association rules of correlative ...

  22. Data Mining Techniques in Association Rule : A Review

    In this paper we present a three-step visualization method for mining market basket association rules. These steps include discovering frequent itemsets, mining association rules and finally ...

  23. Supreme Court Rules on Important Impact Fee Case

    Co-authored by: Amanda Karras, Executive Director/General Counsel, International Municipal Lawyers Association (IMLA). This month, the Supreme Court issued a unanimous decision in Sheetz v. El Dorado County, which is a case involving government "Takings," specifically ones that involve the government's use of impact fees. Impact fees are typically a one-time payment that local governments ...