The Hash-Based Apriori Technique for Association Rule Mining and Data Sanitization
Abstract
Private information can be exposed under many circumstances, and it must be sanitized before it is shared in order to address privacy concerns. Data mining techniques can collect large amounts of data in a short time, and the information gathered by powerful machine learning techniques may reveal highly sensitive content pertaining to an individual or an organization. The degree of sensitivity of data belonging to a business or an agency may vary, and only authorized individuals and organizations should have access to it. Access restrictions alone, however, are not a complete way to secure such data: they can reduce the utility of a data mining solution, and a user may still be able to re-identify sensitive data. Mechanisms are therefore needed to protect confidential information by developing data mining tools and procedures that can be applied to databases, even though doing so reduces the trustworthiness of the data mining results. In this article, we propose a data sanitization strategy that uses a frequent itemset classification approach with a modified Apriori algorithm. The challenge is to preserve the information needed for important decisions while preventing the disclosure of sensitive rules through association rule mining. The data sanitization strategy is used to thoroughly investigate several sequential pattern algorithms for ensuring the privacy of large amounts of data. Our research shows that the approach is efficient and scalable, and that it provides a meaningful improvement over the methods used in existing systems.
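To make the two ideas named in the abstract concrete, the following is a minimal sketch and not the modified Apriori algorithm proposed in this article, whose details are not given here. It assumes a generic hash-based pruning step in the style of well-known hash-filtered Apriori variants (pair counts are hashed into buckets, and a pair is kept as a candidate only if its bucket count reaches the support threshold), followed by a simple sanitization step that removes one item from supporting transactions until a sensitive itemset is no longer frequent. The function names, the bucket count `n_buckets`, and the choice of which item to remove are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs_hashed(transactions, min_support, n_buckets=11):
    """Return frequent 2-itemsets, using hashed bucket counts to prune candidates.

    Illustrative only: a generic hash-filter pass, not the article's algorithm.
    """
    item_counts = Counter()
    bucket_counts = [0] * n_buckets

    def bucket(pair):
        # Hypothetical hash function; any function of the pair would do.
        return hash(pair) % n_buckets

    # Pass 1: count single items and hash every pair into a bucket.
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket(pair)] += 1

    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: count only pairs of frequent items that fall in a frequent bucket.
    # A bucket count is an upper bound on a pair's support, so no frequent
    # pair is ever pruned by this filter.
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t) & frequent_items)
        for pair in combinations(items, 2):
            if bucket_counts[bucket(pair)] >= min_support:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_support}


def sanitize(transactions, sensitive, min_support):
    """Return a copy of the data in which `sensitive` is no longer frequent.

    Assumption: one item of the sensitive itemset (the smallest, chosen only
    for determinism) is removed from supporting transactions until the
    itemset's support drops below min_support.
    """
    sensitive = set(sensitive)
    victim = min(sensitive)
    result = [set(t) for t in transactions]
    support = sum(1 for t in result if sensitive <= t)
    for t in result:
        if support < min_support:
            break
        if sensitive <= t:
            t.discard(victim)
            support -= 1
    return result


if __name__ == "__main__":
    # Hypothetical toy database of market-basket transactions.
    transactions = [
        {"bread", "milk"},
        {"bread", "diaper", "beer"},
        {"milk", "diaper", "beer"},
        {"bread", "milk", "diaper", "beer"},
        {"bread", "milk", "diaper"},
    ]
    print(frequent_pairs_hashed(transactions, min_support=3))
    cleaned = sanitize(transactions, {"beer", "diaper"}, min_support=3)
    print(frequent_pairs_hashed(cleaned, min_support=3))
```

In this toy setting the sanitized database no longer yields the sensitive pair as frequent, while the remaining frequent pairs are unaffected. Real sanitization schemes must also weigh the side effects on non-sensitive itemsets, which this sketch does not attempt to minimize.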