Module learnrdr :: Class RDRLearnerMDL

Class RDRLearnerMDL

rules_adhoc.RDRLearner --+
                         |
                        RDRLearnerMDL

Class for learning ripple-down rules from data using the minimum description length principle.

Instance Methods

learn(self, data_file, features_file, max_depth, c, pos_symbol, neg_symbol, min_length, max_length, regenerate)
Learn RDR from data using MDL and display the results.

greedy_set_cover_mdl(self, data, pos_data, ruleset, possible_rules, datapoint_bits, covered_data)
Find greedy set cover using MDL.
Returns: Ruleset

find_possible_rules(self, features, data, depth, covered_data, pos_length, datapoint_bits)
Find possible rules given a set of data points and features.
Returns: (list of Rules, Rule, integer)

find_rules(self, data, features, depth)
Find rules from data.
Returns: Ruleset

Inherited from rules_adhoc.RDRLearner: check_data, count_errors, data_for_rule, greedy_set_cover, pos_data_for_depth

Method Details

learn(self, data_file, features_file, max_depth, c, pos_symbol, neg_symbol, min_length, max_length, regenerate)

Learn RDR from data using MDL and display the results.

Parameters:
  • data_file (string) - file containing the training data.
  • features_file (string) - file containing the features, or "all" to generate all possible substrings with lengths from min_length to max_length.
  • max_depth (integer) - maximum depth of rules.
  • c (float) - the value of parameter c in MDL.
  • pos_symbol (string) - symbol signifying positive example in data.
  • neg_symbol (string) - symbol signifying negative example in data.
  • min_length (integer) - minimum length for substrings to be generated.
  • max_length (integer) - maximum length for substrings to be generated.
  • regenerate (boolean) - whether to regenerate the possible rules each time a rule is added to the ruleset.
Overrides: rules_adhoc.RDRLearner.learn
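
As a rough illustration of the "all" option for features_file, the sketch below generates every distinct substring whose length lies between min_length and max_length. The function name generate_substring_features and the assumption that data points are plain strings are hypothetical; the module's own generation code is not shown on this page and may differ.

```python
def generate_substring_features(words, min_length, max_length):
    """Return the sorted distinct substrings of `words` whose length is
    between min_length and max_length (inclusive).

    Hypothetical helper for illustration; not part of learnrdr.
    """
    features = set()
    for word in words:
        for length in range(min_length, max_length + 1):
            # Slide a window of the current length across the word.
            # If the word is shorter than `length`, the range is empty.
            for start in range(len(word) - length + 1):
                features.add(word[start:start + length])
    return sorted(features)
```

Calling generate_substring_features(["abc"], 1, 2) yields the substrings of length 1 and 2 of "abc". Returning a sorted list rather than a set keeps the feature order reproducible between runs.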

greedy_set_cover_mdl(self, data, pos_data, ruleset, possible_rules, datapoint_bits, covered_data)

Find greedy set cover using MDL.

Parameters:
  • data (dict) - data points and their classifications.
  • pos_data (dict) - data points to be classified by the rules and their classifications.
  • ruleset (Ruleset) - rules already found.
  • possible_rules (list) - possible rules that can be used for classification.
  • datapoint_bits (float) - bits needed to encode one data point.
  • covered_data (dict) - data points already covered by a rule.
Returns: Ruleset
Ruleset that was found.
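
The general idea of a greedy set cover with an MDL-style stopping rule can be sketched as follows. This is not the method's actual implementation: the scoring used by greedy_set_cover_mdl is not documented on this page. The sketch assumes that each candidate rule is represented by the set of data points it covers, that encoding one rule costs a fixed rule_bits (a hypothetical parameter), and that each data point left uncovered costs datapoint_bits. A rule is added only while doing so shortens the total description.

```python
def greedy_set_cover_mdl_sketch(pos_data, candidate_rules, datapoint_bits, rule_bits):
    """Greedily pick rules while each pick reduces the description length.

    pos_data        -- iterable of data points that should be covered
    candidate_rules -- dict mapping a rule name to the set of points it covers
                       (hypothetical representation, not learnrdr's Rule class)
    datapoint_bits  -- bits needed to encode one data point
    rule_bits       -- assumed fixed cost in bits of encoding one rule
    Returns (chosen rule names, points still uncovered).
    """
    uncovered = set(pos_data)
    remaining = dict(candidate_rules)
    chosen = []
    while uncovered and remaining:
        # Greedy step: the rule covering the most still-uncovered points.
        best = max(remaining, key=lambda name: len(remaining[name] & uncovered))
        newly_covered = remaining[best] & uncovered
        # MDL-style test: bits saved on data points versus bits spent on the rule.
        saving = len(newly_covered) * datapoint_bits - rule_bits
        if saving <= 0:
            break
        chosen.append(best)
        uncovered -= newly_covered
        del remaining[best]
    return chosen, uncovered
```

With a small rule_bits the loop behaves like an ordinary greedy set cover; as rule_bits grows, rules that cover only a few points stop paying for themselves and are left out.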

find_possible_rules(self, features, data, depth, covered_data, pos_length, datapoint_bits)

Find possible rules given a set of data points and features.

Parameters:
  • data (dict) - data points and their classifications.
  • depth (integer) - maximum depth for exceptions.
  • covered_data (dict) - data points already covered by a rule.
  • pos_length (integer) - number of unclassified elements.
  • datapoint_bits (float) - bits needed to encode one data point.
  • features (list) - possible features for classification.
Returns: (list of Rules, Rule, integer)
Possible rules, best rule and its index.
Overrides: rules_adhoc.RDRLearner.find_possible_rules

find_rules(self, data, features, depth)

Find rules from data.

Parameters:
  • data (dict) - data points and their classifications.
  • features (list) - possible features for classification.
  • depth (integer) - maximum depth for exceptions for the rules to be found.
Returns: Ruleset
Ruleset that was found.
Overrides: rules_adhoc.RDRLearner.find_rules
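
To make the depth parameter concrete, here is a minimal sketch of a ripple-down rule with nested exceptions. The RuleSketch class is hypothetical and is not the Rule or Ruleset type used by this module. It assumes that a feature is a substring, that a rule fires when its feature occurs in the data point, and that a rule without exceptions has depth 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RuleSketch:
    """Hypothetical ripple-down rule: a substring feature, a label, and exceptions."""
    feature: str
    label: str
    exceptions: List["RuleSketch"] = field(default_factory=list)

    def classify(self, datapoint: str) -> Optional[str]:
        # The rule does not apply unless its feature occurs in the data point.
        if self.feature not in datapoint:
            return None
        # The first exception that fires overrides this rule's label.
        for exception in self.exceptions:
            verdict = exception.classify(datapoint)
            if verdict is not None:
                return verdict
        return self.label

    def depth(self) -> int:
        # Assumed convention: a rule with no exceptions has depth 1.
        return 1 + max((e.depth() for e in self.exceptions), default=0)
```

For example, RuleSketch("ing", "+", [RuleSketch("str", "-")]) labels "walking" as "+" but "string" as "-", because the exception on "str" overrides the parent rule. Under the assumed convention this rule has depth 2, so limiting depth bounds how far exceptions can nest.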