Heuristics (DL-Learner Javadoc)

java.lang.Object
- org.dllearner.learningproblems.Heuristics

```
public class Heuristics
extends Object
```
Implementation of various heuristics. The methods can be used in learning problems and various evaluation scripts. They are verified in unit tests and, thus, should be fairly stable.

Author:

Jens Lehmann

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`Heuristics.HeuristicType`

Constructor Summary

Constructors
Constructor and Description
`Heuristics()`

Method Summary

Modifier and Type	Method and Description
`static double`	`divideOrZero(int numerator, int denominator)`
`static double`	`getAScore(double recall, double precision)` Computes arithmetic mean of precision and recall, which is called "A-Score" here (A=arithmetic), but is not an established notion in machine learning.
`static double`	`getAScore(double recall, double precision, double beta)` Computes the arithmetic mean of precision and recall weighted by beta, which is called "A-Score" here (A=arithmetic), but is not an established notion in machine learning.
`static double[]`	`getAScoreApproximationStep1(double beta, int nrOfPosExamples, int nrOfInstanceChecks, int nrOfSuccessfulInstanceChecks)` In the first step of the AScore approximation, we estimate recall (taking the factor beta into account).
`static double[]`	`getAScoreApproximationStep2(int nrOfPosClassifiedPositives, double[] recallInterval, double beta, int nrOfRelevantInstances, int nrOfInstanceChecks, int nrOfSuccessfulInstanceChecks)` In step 2 of the A-Score approximation, the precision and the overall A-Score are estimated based on the estimated recall.
`static double[]`	`getConfidenceInterval95Wald(int total, int success)` Computes the 95% confidence interval of an experiment with boolean outcomes, e.g. heads or tails coin throws.
`static double`	`getConfidenceInterval95WaldAverage(int total, int success)` Computes the 95% confidence interval average of an experiment with boolean outcomes, e.g. heads or tails coin throws.
`static double`	`getFScore(double recall, double precision)` Computes F1-Score.
`static double`	`getFScore(double recall, double precision, double beta)` Computes F-beta-Score.
`static double[]`	`getFScoreApproximation(int nrOfPosClassifiedPositives, double recall, double beta, int nrOfRelevantInstances, int nrOfInstanceChecks, int nrOfSuccessfulInstanceChecks)` This method can be used to approximate the F-Measure and thereby save many instance checks.
`static double`	`getFScoreBalanced(double recall, double precision, double beta)`
`static double`	`getJaccardCoefficient(int elementsIntersection, int elementsUnion)` Computes the Jaccard coefficient of two sets.
`static double`	`getMatthewsCorrelationCoefficient(int tp, int fp, int tn, int fn)`
`static double[]`	`getPredAccApproximation(int nrOfPositiveExamples, int nrOfNegativeExamples, double beta, int nrOfPosExampleInstanceChecks, int nrOfSuccessfulPosExampleChecks, int nrOfNegExampleInstanceChecks, int nrOfNegativeNegExampleChecks)`
`static double`	`getPredictiveAccuracy(int nrOfExamples, int nrOfPosClassifiedPositives, int nrOfNegClassifiedNegatives)`
`static double`	`getPredictiveAccuracy(int nrOfPosExamples, int nrOfNegExamples, int nrOfPosClassifiedPositives, int nrOfNegClassifiedNegatives, double beta)`
`static double`	`getPredictiveAccuracy2(int nrOfExamples, int nrOfPosClassifiedPositives, int nrOfPosClassifiedNegatives)`
`static double`	`getPredictiveAccuracy2(int nrOfPosExamples, int nrOfNegExamples, int nrOfPosClassifiedPositives, int nrOfNegClassifiedNegatives, double beta)`
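`boolean`	`isTooWeak(int nrOfPositiveExamples, int nrOfPosClassifiedPositives, double noise)` Computes whether a hypothesis is too weak, i.e. it has more errors on the positive examples than allowed by the noise parameter.
`boolean`	`isTooWeak2(int nrOfPositiveExamples, int nrOfNegClassifiedPositives, double noise)` Computes whether a hypothesis is too weak, i.e. it has more errors on the positive examples than allowed by the noise parameter.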
`static double`	`p1(int success, int total)`
`static double`	`p3(double p1, int total)`

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - Heuristics
```
public Heuristics()
```
- Method Detail
  - getFScore
```
public static double getFScore(double recall,
                               double precision)
```
    Computes F1-Score.
    
    Parameters:
    
    recall - Recall.
    
    precision - Precision.
    
    Returns:
    
    Harmonic mean of precision and recall.
  - getFScore
```
public static double getFScore(double recall,
                               double precision,
                               double beta)
```
    Computes F-beta-Score.
    
    Parameters:
    
    recall - Recall.
    
    precision - Precision.
    
    beta - Weights precision and recall. If beta is >1, then recall is more important than precision.
    
    Returns:
    
    Harmonic mean of precision and recall weighted by beta.
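
    The weighting described above can be illustrated with the textbook F-beta formula. This is a minimal sketch, not the DL-Learner source: the class name `FBetaSketch`, the method name `fBeta`, and the choice to return 0 for a zero denominator are assumptions made for illustration.

```java
public class FBetaSketch {

    // Textbook F-beta: (1 + beta^2) * P * R / (beta^2 * P + R).
    // With beta = 1 this is the harmonic mean of precision and recall (F1).
    public static double fBeta(double recall, double precision, double beta) {
        double betaSquared = beta * beta;
        double denominator = betaSquared * precision + recall;
        // Assumption of this sketch: a zero denominator yields a score of 0.
        if (denominator == 0) {
            return 0;
        }
        return (1 + betaSquared) * precision * recall / denominator;
    }

    public static void main(String[] args) {
        // Low recall, perfect precision.
        System.out.println(fBeta(0.5, 1.0, 1.0)); // F1
        // beta > 1 puts more weight on the (low) recall, so the score drops.
        System.out.println(fBeta(0.5, 1.0, 2.0)); // F2
    }
}
```

    With recall 0.5 and precision 1.0, F1 is 2/3 and F2 is 5/9, which shows how a beta above 1 penalizes poor recall more strongly.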
  - getFScoreBalanced
```
public static double getFScoreBalanced(double recall,
                                       double precision,
                                       double beta)
```
  - getAScore
```
public static double getAScore(double recall,
                               double precision)
```
    Computes arithmetic mean of precision and recall, which is called "A-Score" here (A=arithmetic), but is not an established notion in machine learning.
    
    Parameters:
    
    recall - Recall.
    
    precision - Precision.
    
    Returns:
    
    Arithmetic mean of precision and recall.
  - getAScore
```
public static double getAScore(double recall,
                               double precision,
                               double beta)
```
    Computes the arithmetic mean of precision and recall weighted by beta, which is called "A-Score" here (A=arithmetic), but is not an established notion in machine learning.
    
    Parameters:
    
    recall - Recall.
    
    precision - Precision.
    
    beta - Weights precision and recall. If beta is >1, then recall is more important than precision.
    
    Returns:
    
    Arithmetic mean of precision and recall weighted by beta.
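
    This page does not state how beta enters the arithmetic mean. One plausible reading, shown here purely as an assumption, weights recall by beta and normalizes by `1 + beta`; the class name `AScoreSketch` and the method name `aScore` are hypothetical and the library may compute the value differently.

```java
public class AScoreSketch {

    // Assumed weighting (not confirmed by the Javadoc):
    // (precision + beta * recall) / (1 + beta).
    // With beta = 1 this is the plain arithmetic mean.
    public static double aScore(double recall, double precision, double beta) {
        return (precision + beta * recall) / (1 + beta);
    }

    public static void main(String[] args) {
        System.out.println(aScore(0.4, 0.8, 1.0)); // unweighted mean
        System.out.println(aScore(0.4, 0.8, 3.0)); // recall counts three times as much
    }
}
```

    Under this assumed formula, recall 0.4 and precision 0.8 give roughly 0.6 for beta = 1 and roughly 0.5 for beta = 3, since the lower recall dominates as beta grows.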
  - getJaccardCoefficient
```
public static double getJaccardCoefficient(int elementsIntersection,
                                           int elementsUnion)
```
    Computes the Jaccard coefficient of two sets.
    
    Parameters:
    
    elementsIntersection - Number of elements in the intersection of the two sets.
    
    elementsUnion - Number of elements in the union of the two sets.
    
    Returns:
    
    #intersection divided by #union.
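
    As an illustration of the two counts this method expects, the following sketch derives them from two concrete sets and then divides. The class `JaccardSketch`, the method `jaccard`, and the treatment of an empty union as 0 are assumptions of this example, not the library's code.

```java
import java.util.HashSet;
import java.util.Set;

public class JaccardSketch {

    // Jaccard coefficient: |A intersect B| / |A union B|.
    public static double jaccard(int elementsIntersection, int elementsUnion) {
        // Assumption of this sketch: an empty union yields 0.
        if (elementsUnion == 0) {
            return 0;
        }
        return elementsIntersection / (double) elementsUnion;
    }

    public static void main(String[] args) {
        Set<String> a = Set.of("x", "y", "z");
        Set<String> b = Set.of("y", "z", "w");

        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b); // {y, z}

        Set<String> union = new HashSet<>(a);
        union.addAll(b); // {x, y, z, w}

        System.out.println(jaccard(intersection.size(), union.size())); // 2 / 4
    }
}
```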
  - getPredictiveAccuracy
```
public static double getPredictiveAccuracy(int nrOfExamples,
                                           int nrOfPosClassifiedPositives,
                                           int nrOfNegClassifiedNegatives)
```
  - getPredictiveAccuracy
```
public static double getPredictiveAccuracy(int nrOfPosExamples,
                                           int nrOfNegExamples,
                                           int nrOfPosClassifiedPositives,
                                           int nrOfNegClassifiedNegatives,
                                           double beta)
```
  - getPredictiveAccuracy2
```
public static double getPredictiveAccuracy2(int nrOfExamples,
                                            int nrOfPosClassifiedPositives,
                                            int nrOfPosClassifiedNegatives)
```
  - getPredictiveAccuracy2
```
public static double getPredictiveAccuracy2(int nrOfPosExamples,
                                            int nrOfNegExamples,
                                            int nrOfPosClassifiedPositives,
                                            int nrOfNegClassifiedNegatives,
                                            double beta)
```
  - getMatthewsCorrelationCoefficient
```
public static double getMatthewsCorrelationCoefficient(int tp,
                                                       int fp,
                                                       int tn,
                                                       int fn)
```
  - getConfidenceInterval95Wald
```
public static double[] getConfidenceInterval95Wald(int total,
                                                   int success)
```
    Computes the 95% confidence interval of an experiment with boolean outcomes, e.g. heads or tails coin throws. It uses the very efficient, but still accurate Wald method.
    
    Parameters:
    
    success - Number of successes, e.g. number of times the coin shows heads.
    
    total - Total number of tries, e.g. total number of times the coin was thrown.
    
    Returns:
    
    A two element double array, where element 0 is the lower border and element 1 the upper border of the 95% confidence interval.
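
    For orientation, the textbook Wald interval for a proportion is p ± 1.96 · sqrt(p(1 − p)/n). The sketch below implements that plain form and clamps the borders to [0, 1]. It is not the library's code: the description of `getAScoreApproximationStep1` mentions a "modified Wald method", so the actual implementation may adjust the counts before applying the formula. The names `WaldSketch` and `wald95` are hypothetical.

```java
public class WaldSketch {

    // Plain (unadjusted) 95% Wald interval for a binomial proportion.
    // Returns {lower border, upper border}, clamped to [0, 1].
    public static double[] wald95(int total, int success) {
        if (total < 1 || success < 0 || success > total) {
            throw new IllegalArgumentException("need 0 <= success <= total and total >= 1");
        }
        double p = success / (double) total;
        // 1.96 is the z-value for a two-sided 95% interval.
        double halfWidth = 1.96 * Math.sqrt(p * (1 - p) / total);
        return new double[] { Math.max(0, p - halfWidth), Math.min(1, p + halfWidth) };
    }

    public static void main(String[] args) {
        // 50 heads in 100 throws.
        double[] interval = wald95(100, 50);
        System.out.println(interval[0] + " .. " + interval[1]);
    }
}
```

    For 50 successes in 100 tries the half width is 1.96 · 0.05 = 0.098, so the interval runs from about 0.402 to about 0.598.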
  - getConfidenceInterval95WaldAverage
```
public static double getConfidenceInterval95WaldAverage(int total,
                                                        int success)
```
    Computes the 95% confidence interval average of an experiment with boolean outcomes, e.g. heads or tails coin throws. It uses the very efficient, but still accurate Wald method.
    
    Parameters:
    
    success - Number of successes, e.g. number of times the coin shows heads.
    
    total - Total number of tries, e.g. total number of times the coin was thrown.
    
    Returns:
    
    The average of the lower border and upper border of the 95% confidence interval.
  - isTooWeak
```
public boolean isTooWeak(int nrOfPositiveExamples,
                         int nrOfPosClassifiedPositives,
                         double noise)
```
    Computes whether a hypothesis is too weak, i.e. it has more errors on the positive examples than allowed by the noise parameter.
    
    Parameters:
    
    nrOfPositiveExamples - The number of positive examples in the learning problem.
    
    nrOfPosClassifiedPositives - The number of positive examples that the hypothesis classified as positive.
    
    noise - The noise parameter is a value between 0 and 1, which indicates how noisy the example data is (0 = no noise, 1 = completely random). If a hypothesis contains more errors on the positive examples than the noise value multiplied by the number of all examples, then the hypothesis is too weak.
    
    Returns:
    
    True if the hypothesis is too weak and false otherwise.
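
    The check described above can be sketched as a comparison between the number of misclassified positive examples and a noise-scaled threshold. The text refers to "the number of all examples", but the method only receives the number of positive examples, so this sketch assumes the threshold is `noise * nrOfPositiveExamples`. That choice, the strict comparison, and the names `TooWeakSketch` and `tooWeak` are assumptions, not the library's implementation.

```java
public class TooWeakSketch {

    // Assumed rule: a hypothesis is too weak if it misses more positive
    // examples than noise * nrOfPositiveExamples.
    public static boolean tooWeak(int nrOfPositiveExamples,
                                  int nrOfPosClassifiedPositives,
                                  double noise) {
        int errorsOnPositives = nrOfPositiveExamples - nrOfPosClassifiedPositives;
        return errorsOnPositives > noise * nrOfPositiveExamples;
    }

    public static void main(String[] args) {
        // 10 positives, noise 0.2: up to 2 misses are tolerated.
        System.out.println(tooWeak(10, 7, 0.2)); // 3 misses
        System.out.println(tooWeak(10, 9, 0.2)); // 1 miss
    }
}
```

    Under this assumption, covering 7 of 10 positives at noise 0.2 counts as too weak, while covering 9 of 10 does not.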
  - isTooWeak2
```
public boolean isTooWeak2(int nrOfPositiveExamples,
                          int nrOfNegClassifiedPositives,
                          double noise)
```
    Computes whether a hypothesis is too weak, i.e. it has more errors on the positive examples than allowed by the noise parameter.
    
    Parameters:
    
    nrOfPositiveExamples - The number of positive examples in the learning problem.
    
    nrOfNegClassifiedPositives - The number of positive examples that the hypothesis classified as negative.
    
    noise - The noise parameter is a value between 0 and 1, which indicates how noisy the example data is (0 = no noise, 1 = completely random). If a hypothesis contains more errors on the positive examples than the noise value multiplied by the number of all examples, then the hypothesis is too weak.
    
    Returns:
    
    True if the hypothesis is too weak and false otherwise.
  - p1
```
public static double p1(int success,
                        int total)
```
  - p3
```
public static double p3(double p1,
                        int total)
```
  - getFScoreApproximation
```
public static double[] getFScoreApproximation(int nrOfPosClassifiedPositives,
                                              double recall,
                                              double beta,
                                              int nrOfRelevantInstances,
                                              int nrOfInstanceChecks,
                                              int nrOfSuccessfulInstanceChecks)
```
    This method can be used to approximate the F-Measure and thereby save many instance checks. It assumes that all positive examples (or instances of a class) have already been tested via instance checks, i.e. recall is already known and only precision is approximated.
    
    Parameters:
    
    nrOfPosClassifiedPositives - Number of positive examples (instances of a class) that are classified as positive.
    
    recall - The already known recall.
    
    beta - Weights precision and recall. If beta is >1, then recall is more important than precision.
    
    nrOfRelevantInstances - Number of relevant instances, i.e. number of instances, which would have been tested without approximations. TODO: relevant = pos + neg examples?
    
    nrOfInstanceChecks - Performed instance checks for the approximation.
    
    nrOfSuccessfulInstanceChecks - Number of performed instance checks that were successful.
    
    Returns:
    
    A two element array, where the first element is the computed F-beta score and the second element is the length of the 95% confidence interval around it.
  - getAScoreApproximationStep1
```
public static double[] getAScoreApproximationStep1(double beta,
                                                   int nrOfPosExamples,
                                                   int nrOfInstanceChecks,
                                                   int nrOfSuccessfulInstanceChecks)
```
    In the first step of the AScore approximation, we estimate recall (taking the factor beta into account). This is not much more than a wrapper around the modified Wald method.
    
    Parameters:
    
    beta - Weights precision and recall. If beta is >1, then recall is more important than precision.
    
    nrOfPosExamples - Number of positive examples (or instances of the considered class).
    
    nrOfInstanceChecks - Number of positive examples (or instances of the considered class) which have been checked.
    
    nrOfSuccessfulInstanceChecks - Number of positive examples (or instances of the considered class), where the instance check returned true.
    
    Returns:
    
    A two element array, where the first element is the recall multiplied by beta and the second element is the length of the 95% confidence interval around it.
  - getAScoreApproximationStep2
```
public static double[] getAScoreApproximationStep2(int nrOfPosClassifiedPositives,
                                                   double[] recallInterval,
                                                   double beta,
                                                   int nrOfRelevantInstances,
                                                   int nrOfInstanceChecks,
                                                   int nrOfSuccessfulInstanceChecks)
```
    In step 2 of the A-Score approximation, the precision and the overall A-Score are estimated based on the estimated recall.
    
    Parameters:
    
    nrOfPosClassifiedPositives - Number of positive examples (instances of a class) that are classified as positive.
    
    recallInterval - The estimated recall, which needs to be given as a two element array with the first element being the mean value and the second element being the length of the interval (to be compatible with the step1 method).
    
    beta - Weights precision and recall. If beta is >1, then recall is more important than precision.
    
    nrOfRelevantInstances - Number of relevant instances, i.e. number of instances, which would have been tested without approximations.
    
    nrOfInstanceChecks - Performed instance checks for the approximation.
    
    nrOfSuccessfulInstanceChecks - Number of performed instance checks, which returned true.
    
    Returns:
    
    A two element array, where the first element is the estimated A-Score and the second element is the length of the 95% confidence interval around it.
  - getPredAccApproximation
```
public static double[] getPredAccApproximation(int nrOfPositiveExamples,
                                               int nrOfNegativeExamples,
                                               double beta,
                                               int nrOfPosExampleInstanceChecks,
                                               int nrOfSuccessfulPosExampleChecks,
                                               int nrOfNegExampleInstanceChecks,
                                               int nrOfNegativeNegExampleChecks)
```
  - divideOrZero
```
public static double divideOrZero(int numerator,
                                  int denominator)
```
