Author: bhaveshgawri

Description:
Frequent Itemset Generation and Association Rule Mining
Language: Python
Repository: git://github.com/bhaveshgawri/Apriori-Algorithm.git
Created: 2018-02-23T14:41:57Z
Project page: https://github.com/bhaveshgawri/Apriori-Algorithm


Apriori-Algorithm

Frequent Itemset Generation and Association Rule Mining

Includes:

  • Apriori Algorithm: Finds frequent itemsets in a set of transactions.
  • Rule generation: Generates ‘interesting’ rules from the frequent itemsets.
  • Hashing: Itemsets are hashed so that their support counts can be looked up in almost constant time.
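
The first and third items above can be illustrated with a minimal sketch. This is not the repository's code: the function name `frequent_itemsets`, the use of `frozenset` keys in a `dict` as the hash table, and the toy transactions are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return a dict mapping each frequent itemset (frozenset) to its count."""
    transactions = [frozenset(t) for t in transactions]
    min_count = min_support * len(transactions)

    # 1-itemsets: one pass, counting each item in a dict (a hash table).
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}

    frequent = dict(current)
    k = 2
    while current:
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            # Prune: keep a candidate only if all its (k-1)-subsets are frequent.
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count: each candidate's running count lives in a dict keyed by frozenset.
        counts = dict.fromkeys(candidates, 0)
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1
    return frequent


# Toy data, made up for illustration (not taken from groceries.csv).
toy = [
    ["milk", "bread", "butter"],
    ["milk", "bread"],
    ["milk", "butter"],
    ["bread", "butter"],
    ["milk", "bread", "butter"],
]
result = frequent_itemsets(toy, min_support=0.5)
for itemset, count in result.items():
    print(sorted(itemset), "| support:", count)
```

With a support threshold of 0.5 on these five toy transactions, the three single items and the three pairs are frequent, while the full triple (present in only two transactions) is pruned at the counting step.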

Requirements:

Only Python 3 is required to run this algorithm. No other packages need to be installed.

Python 3 is installed by default on most Linux distributions.

Mining the Rules:

First make the Apriori file executable and then simply run it as:

  $ chmod +x ./Apriori
  $ ./Apriori

This creates a directory named results at your current path and stores the results in a file inside it.

Name of file: The file is named after the support and confidence values used for the run, for example s=0.01 c=0.5.
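
As a rough sketch of how such an output path could be built: the helper `results_path` and its `base_dir` parameter are hypothetical, and the `s=<support> c=<confidence>` pattern is inferred from the sample file name shown in the Output section, not taken from the repository's code.

```python
import os
import tempfile


def results_path(support, confidence, base_dir="."):
    """Create <base_dir>/results if needed and return the output file path."""
    out_dir = os.path.join(base_dir, "results")
    os.makedirs(out_dir, exist_ok=True)  # no error if the directory already exists
    # Assumed naming pattern, e.g. "s=0.01 c=0.5".
    return os.path.join(out_dir, f"s={support} c={confidence}")


# Demo in a throwaway directory so the current path is left untouched.
demo_dir = tempfile.mkdtemp()
demo_path = results_path(0.01, 0.5, base_dir=demo_dir)
print(demo_path)
```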

Parameters:

Support: The default value of support is 0.02

Confidence: The default value of confidence is 0.45

Input Transaction file: Currently set as groceries.csv.

All the parameters can be changed from within the code.
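
A minimal sketch of how these parameters might be declared and used. The constant names, the `load_transactions` helper, and the assumed input layout (one transaction per line, items separated by commas) are illustrative assumptions, not the repository's actual code.

```python
import csv
import io

# Defaults stated in this README.
MIN_SUPPORT = 0.02
MIN_CONFIDENCE = 0.45
INPUT_FILE = "groceries.csv"


def load_transactions(lines):
    """Parse an iterable of CSV lines into a list of item sets."""
    transactions = []
    for row in csv.reader(lines):
        items = frozenset(item.strip() for item in row if item.strip())
        if items:  # skip blank lines
            transactions.append(items)
    return transactions


# In-memory stand-in for the input file; rows are made up for illustration.
sample = io.StringIO("whole milk,yogurt\nsoda\nwhole milk,soda,yogurt\n")
transactions = load_transactions(sample)

# A fractional support threshold corresponds to this many transactions.
min_count = MIN_SUPPORT * len(transactions)
print(len(transactions), min_count)
```

To read a real file instead of the in-memory sample, the same helper can be called as `load_transactions(open(INPUT_FILE, newline=""))`, ideally inside a `with` block.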

Output:

Sample contents of the file s=0.01 c=0.5 (a run with support 0.01 and confidence 0.5):

  Frequent Itemsets:
  1-itemsets:
  ['liquor'] | support: 109
  ['dessert'] | support: 365
  ['sliced cheese'] | support: 241
  ['bottled water'] | support: 1087
  ['oil'] | support: 276
  ['yogurt'] | support: 1372
  ....
  Count: 88
  2-itemsets:
  ['whole milk', 'citrus fruit'] | support: 300
  ['other vegetables', 'margarine'] | support: 194
  ['whipped/sour cream', 'citrus fruit'] | support: 107
  ['whole milk', 'cream cheese'] | support: 162
  ['rolls/buns', 'citrus fruit'] | support: 165
  ['soda', 'citrus fruit'] | support: 126
  ...
  Count: 213
  ...
  Total number of frequent itemset(s): 333
  Rules:
  ['citrus fruit', 'root vegetables'](174) -> ['other vegetables'](1903) | confidence: 0.5862068965517241
  ['root vegetables', 'tropical fruit'](207) -> ['other vegetables'](1903) | confidence: 0.5845410628019324
  ['curd', 'yogurt'](170) -> ['whole milk'](2513) | confidence: 0.5823529411764706
  ['other vegetables', 'butter'](197) -> ['whole milk'](2513) | confidence: 0.5736040609137056
  ['root vegetables', 'tropical fruit'](207) -> ['whole milk'](2513) | confidence: 0.5700483091787439
  ....
  Total number of rules: 14

Format of Rules:

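[LHS itemset](LHS count) -> [RHS itemset](RHS count) | confidence: confidence value

Each count is the number of transactions that contain that itemset. The confidence is the number of transactions containing both the LHS and the RHS, divided by the LHS count. For example, a confidence of 0.5862068965517241 with an LHS count of 174 corresponds to 102 transactions containing both sides (102 / 174).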
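
The rule format can be reproduced with a short sketch. The functions `generate_rules` and `format_rule` are hypothetical names, the counts in `frequent` are made up for illustration, and the code assumes that every subset of a frequent itemset is also present in the dictionary, as Apriori guarantees.

```python
from itertools import combinations


def generate_rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence) tuples, highest confidence first."""
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty LHS and a non-empty RHS
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                rhs = itemset - lhs
                # confidence = count(LHS and RHS together) / count(LHS)
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, rhs, confidence))
    rules.sort(key=lambda rule: rule[2], reverse=True)
    return rules


def format_rule(lhs, rhs, confidence, frequent):
    """Render one rule in the format described above."""
    return (
        f"{sorted(lhs)}({frequent[lhs]}) -> "
        f"{sorted(rhs)}({frequent[rhs]}) | confidence: {confidence}"
    )


# Made-up counts for illustration only.
frequent = {
    frozenset(["curd"]): 5,
    frozenset(["yogurt"]): 8,
    frozenset(["curd", "yogurt"]): 4,
}
rules = generate_rules(frequent, min_confidence=0.6)
lines = [format_rule(lhs, rhs, conf, frequent) for lhs, rhs, conf in rules]
print("\n".join(lines))
```

In this toy case only curd -> yogurt passes the threshold (4 / 5 = 0.8), while yogurt -> curd (4 / 8 = 0.5) is dropped.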